[aprssig] Screen Scraping of findU
Steve Dimse
steve at dimse.com
Fri Jan 6 12:28:49 EST 2006
A real problem with screen scraping has developed at findU over the
last couple of days. The program Weather Display added a feature that
displays real-time weather for any station on APRS. Unfortunately,
the author implemented this by hitting findU's rawwx.cgi every
second for each user of the feature. He has agreed to remove it from
future versions and parse the data directly, but the existing code is
already in the wild.
Yesterday I wrote an automated routine to identify screen scrapers
and block their IP addresses automatically, but that ran afoul of an
authorized user with a dynamic IP. Instead, I will periodically run
the detection code manually and, without any warning, lock out IP
addresses that are screen scraping. The prohibition on scraping is not new, but
because my time has been extremely limited due to hurricane recovery
and configuring the new server, I have not been policing as
carefully as I should have. The situation reached a critical point in
the last few days when the added load of end-of-year maintenance
pushed the server over the brink, resulting in loss of data being fed
to the National Weather Service. With the screen scrapers shut down I
will have enough capacity. If you run a script manually to grab
lat/lon or some other data, it will not trigger the code that
searches the log. You will be noticed, though, if you hit the server
on an automated basis to grab data rather than parsing it yourself.
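
As a rough illustration of the kind of log search described above (this
is not the actual findU code), here is a minimal Python sketch. It
assumes an Apache-style common log format with entries in chronological
order; the function name find_scrapers and the threshold of 30 hits per
minute are hypothetical choices made only for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed Apache common log format; the real findU log layout may differ.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+)')


def find_scrapers(lines, path_fragment="rawwx.cgi",
                  max_hits=30, window=timedelta(minutes=1)):
    """Return the set of IPs that requested path_fragment more than
    max_hits times within any sliding window of the given length.

    The threshold and window are hypothetical; lines must be in
    chronological order.
    """
    recent = defaultdict(list)  # ip -> timestamps still inside the window
    flagged = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like log entries
        ip, stamp, path = match.groups()
        if path_fragment not in path:
            continue
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        hits = recent[ip]
        hits.append(when)
        # Discard requests that have fallen out of the sliding window.
        while hits and when - hits[0] > window:
            hits.pop(0)
        if len(hits) > max_hits:
            flagged.add(ip)
    return flagged
```

With these assumed settings, a client polling once per second is flagged
after about half a minute, while an occasional manual request never
comes close to the threshold. The result is only a list of candidates
for manual review, since blocking automatically can catch an authorized
user on a dynamic IP.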
If you are screen scraping and really believe this is the only way
you can get the data, explain your situation to me.
Steve K4HG