[aprssig] APRS IS Issue ?
Gerry Creager
gerry.creager at tamu.edu
Mon Mar 3 08:16:35 EST 2008
Howdy, Stan,
Stan N0YXV wrote:
> Ok you can all shoot me now but I thought I'd add a little fuel to the
> fire. Honestly I'm not trying to make people upset but a couple of
> thoughts came to mind that I thought might be interesting for future
> growth and learning. Sometimes we learn the most from our mistakes.
>
> 1.) We knew for a month that 3rd and 4th would be gone. Why wasn't a new
> THIRD put into place before the deadline? Why did we have to wait until
> the two remaining servers were full before we figured out that we needed
> to increase the threshold on FIRST? Wonder what would have happened if
> we had run out of capacity during a big disaster like a Katrina? The
> object of servers shouldn't be to meet the demand but to be able to have
> enough overhead for the unexpected.
My fault. I've been traveling too much, and when home, I've been doing
little things like, literally, putting out the RAID shelf fire that cost
me 4TB of disk and data. And doing sysadmin work on a few systems 'til
I can get a new-hire in place. So, simply:
I dropped the ball.
I'll try to do better in the future. I'd coordinated to get the new
server up but didn't get everything done.
Sorry! Really! I'll try to take care in the future.
> 2.) When it was decided that the core take on the CWOP program it would
> appear that there wasn't a lot of preplanning. I mean we knew long
> before the core took the project on that weather stations reported in a
> 5 minute (or less) fashion. Couldn't a person calculate the amount of
> traffic and server resources from what was already an existing service?
> Not to anticipate a growth in CWOP data and plan ahead would seem to be
> a huge mistake. (Sounds like that was already admitted to but just
> wanted to bring it up to get it back into the discussion.)
Looking simply at the traffic, we did calculate that there'd be adequate
hardware and software resources to handle the CWOP data. We missed a
bit of processing information dealing with filter software and hashing.
When the Christmas spike occurred, we identified that a problem had
been unmasked and started the evaluation process, did additional
troubleshooting and determined what the real problems were.
A recommendation to split CWOP off to its own servers was made a year
ago but, for a variety of reasons, not the least of which was that at
the time we thought load wouldn't be an issue, we (APRS Core sysops and
CWOP management) tabled it. This year, it was no longer optional.
I had started doing some planning along these lines and was able to do
some quick work to get volunteer servers up 'til CWOP servers can be
funded and managed in a different manner. Again for some good reasons,
continuing to fund and operate CWOP servers as pure volunteer efforts
isn't a great choice. We're working on a more permanent arrangement for
CWOP servers and management. We're also faced with the imminent
retirement of a key individual in CWOP and his duties will have to be
assumed somehow.
> I'll get my flame suit ready but my intent isn't to put more salt into
> old wounds but to see if there isn't something we can learn from our
> past mistakes so that we don't repeat them.
I'm trying to make sure we don't make these mistakes again while I'm
involved.
gerry
--
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843