[aprssig] Screen Scraping of findU

Steve Dimse steve at dimse.com
Fri Jan 6 12:28:49 EST 2006


A real problem with screen scraping has developed at findU over the
last couple of days. The program Weather Display added a feature that
displays the weather of any station on APRS in real time.
Unfortunately, the author implemented this by hitting findU's
rawwx.cgi every second for each user of the feature. He has agreed
to remove it from future versions and parse the data directly, but
the code is in the wild.

Yesterday I wrote an automated routine to identify screen scrapers
and block their IPs automatically, but that ran afoul of an authorized
user with a dynamic IP. Instead, I will periodically be running the
detection code manually and locking out, without any warning, IP
addresses which are screen scraping. The prohibition on scraping is
not new, but because my time has been extremely limited due to
hurricane recovery and configuring the new server, I have not been
policing as carefully as I should have. The situation reached a
critical point in the last few days when the added load of
end-of-year maintenance pushed the server over the brink, resulting
in loss of data being fed to the National Weather Service. With the
screen scrapers shut down I will have enough capacity. If you
manually run a script to grab lat/lon or some other data, it will not
trigger the code searching the log. You will be noticed, though, if
you hit the server on an automated basis to grab data rather than
parsing it yourself.
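
For illustration only, a log scan of this kind might look like the
following Python sketch. It is not the routine actually run on findU.
The Common Log Format layout, the function name find_scrapers, and the
thresholds min_hits and max_avg_gap are all assumptions made for the
example; only the script name rawwx.cgi comes from the post.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed Common Log Format; the real findU log layout is not given here.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(?:GET|POST) (\S+)')


def find_scrapers(lines, script="rawwx.cgi", min_hits=60, max_avg_gap=5.0):
    """Return {ip: (hit_count, avg_gap_seconds)} for IPs that look automated.

    An IP is flagged when it requested `script` at least `min_hits` times
    and the average spacing between those requests is at most
    `max_avg_gap` seconds. Both thresholds are hypothetical.
    """
    hits = defaultdict(list)
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        ip, stamp, path = m.groups()
        if script not in path:
            continue
        hits[ip].append(datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"))

    flagged = {}
    for ip, times in hits.items():
        # At least two hits are needed to measure a gap between them.
        if len(times) < max(min_hits, 2):
            continue
        times.sort()
        span = (times[-1] - times[0]).total_seconds()
        avg_gap = span / (len(times) - 1)
        if avg_gap <= max_avg_gap:
            flagged[ip] = (len(times), avg_gap)
    return flagged
```

With thresholds like these, a client polling once a second is flagged
after a minute of log, while a handful of manual requests falls below
min_hits and is ignored. Because the sketch only reports addresses and
blocks nothing itself, the result can be reviewed by hand first, which
matters when an authorized user sits on a dynamic IP.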

If you are screen-scraping, and really believe this is the only way you
can get the data, explain your situation to me.

Steve K4HG



