[aprssig] So long.... APRS-IS officially has lost 1/2it's servers.

AE5PL Lists HamLists at ametx.com
Sun Mar 2 07:25:51 EST 2008


I need to clarify, as this makes it sound like a Java problem when it
really is an issue that stems from the basic design of APRS-IS.

On APRS-IS, every non-duplicate packet is available to be seen by every
system.  This is the basis of APRS-IS.  Now, look at peak loading on the
"5's", where we are seeing 100+ APRS packets per second hitting a core
server (and remote servers taking a full feed, for that matter).  That
doesn't sound like much, 100 packets per second, and it isn't.  The input
queues and the basic packet parsing have no problem keeping up.  The
problem is when you look at the output side of the server.  Each one of
these packets must be processed individually for each output queue.
This is so server-side filtering can occur and so no faulty connection
can slow down the server's processing.  That means that 100 pps has now
mushroomed to 100+(200*100)=20,100 packet operations per second that the
server is processing, assuming 200 connections.  That is the killer.
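
To make that arithmetic concrete, here is a minimal Python sketch.  The
function name and the simple cost model (one handling on input plus one
per output queue) are illustrative assumptions only, not code from any
APRS-IS server.

```python
def packets_processed_per_second(input_pps: int, connections: int) -> int:
    """Rough count of per-second packet handling on one server.

    Assumed model: each inbound packet is handled once on the input
    side, then once more for every output queue (one per connection).
    """
    input_side = input_pps
    output_side = connections * input_pps
    return input_side + output_side


# The figures from the paragraph above: 100 pps in, 200 connections.
print(packets_processed_per_second(100, 200))  # 20100
```

The point of the sketch is that the output term scales with the number
of connections, so adding clients costs far more than adding input.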

Dave mentioned a draconian measure that could be put in place that would
put this issue to rest by eliminating almost 1/3 of the APRS-IS packet
load.  That measure would be to stop passing any packets from a
non-verified (logged in without a valid password) connection.  When I
originally wrote javAPRSSrvr, we (server authors including Steve Dimse)
had extended discussions about this.  I implemented an algorithm in
javAPRSSrvr that reduced what a non-verified connection could send
compared to what was allowed before.  The current
javAPRSSrvr algorithm only allows position packets and positionless
weather packets to be accepted by a server from a non-verified
connection.  This was done specifically to allow CWOP to use APRS-IS.
They have outgrown the ham network and have begun the migration to their
own network.  Their reports will still be available to hams if the hams
connect to FireNet (a specialized subnet of APRS-IS).  Ham reports will
still be available to CWOP if the ham uses a verified connection.
Bottom line is that there will be a point in time when the core servers
(and probably most others) will turn off the rest of non-verified
packets from passing on APRS-IS.  This is a network run by hams for
hams.  Hams can easily obtain validation codes for their software.
Non-hams (non-verified connections) will still be able to -receive- any
packets they want; they just won't be able to send any.
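
As a rough illustration of the kind of rule described above (only
position and positionless weather packets accepted from a non-verified
connection), here is a minimal Python sketch.  It is not javAPRSSrvr
code.  The function names are hypothetical, the callsigns in the usage
lines are made up, and it assumes TNC2-style "header:payload" lines and
classifies a packet only by the first payload character (APRS data type
identifiers '!', '=', '/', '@' for position and '_' for positionless
weather).  Mic-E and other encodings are deliberately ignored here.

```python
# Assumed APRS data type identifiers (first character of the payload).
POSITION_TYPES = frozenset("!=/@")
POSITIONLESS_WEATHER = "_"


def payload_of(packet: str) -> str:
    """Return the text after the first ':' (empty if there is none)."""
    _header, sep, payload = packet.partition(":")
    return payload if sep else ""


def accept_from_unverified(packet: str) -> bool:
    """Hypothetical rule: pass only position or positionless weather."""
    payload = payload_of(packet)
    if not payload:
        return False
    return payload[0] in POSITION_TYPES or payload[0] == POSITIONLESS_WEATHER


def accept(packet: str, verified: bool) -> bool:
    """Verified connections pass everything; others are restricted."""
    return verified or accept_from_unverified(packet)


# Made-up example packets.
weather = "CW0001>APRS,TCPXX*:_10090556c220s004g005t077"
message = "N0CALL>APRS,TCPIP*::AE5PL    :hello"
print(accept(weather, verified=False))  # True
print(accept(message, verified=False))  # False
print(accept(message, verified=True))   # True
```

Turning off the rest of the non-verified packets, as described above,
would amount to accept_from_unverified() always returning False.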

This change will be heavily advertised and timed to try to have as
little effect on ham operations as possible while giving the CWOP
players some window of opportunity to point their software to the CWOP
network.  In the meantime, there is now a third.aprs.net on APRS-IS
which should lessen the "Port Full" issues.

73,

Pete Loveall AE5PL
pete at ae5pl dot net

> -----Original Message-----
> From: Dave Anderson KG4YZY
> Posted At: Sunday, March 02, 2008 2:06 AM
> Subject: RE: [aprssig] So long.... APRS-IS officially has lost 1/2it's
> servers.
> 
> The current architecture simply cannot grow much more.  In my opinion
> we've found a glass ceiling of what Java can do for server code when it
> comes to packet per second processing.  The band-aids put in place will
> only hold so much longer.
>
> My 8 core box on fourth with 300 connects ran 45% CPU load!!!!  The 4
> core box at third ran 90-95% load.  These are pure simple facts, anyone
> wanting to dispute them I can gladly post the MRTG graphs showing just
> how much bandwidth and CPU power these boxes ran on for the past year.
>
> Every packet that flows thru a javaprsrvr has 6 hash tables to flow
> thru, and with filtering, that takes processor power.  At the core
> level, it's insanely higher as srvr-to-server links mean each server is
> sent what each other server is heard, so that quadruples the packet
> count right there per second with 4 servers.
>
> The server code also only has one input queue thread that -everything-
> has to flow thru on the core servers before it can be processed, and is
> why the input queues were backing up on the servers and causing the
> crashes with the sudden inrush of CWOP stations - remember these CWOP
> stations are almost all using NTP synchronized time to all poll at the
> -exact- same second on the 5's.



