[nos-bbs] First crash for jnos2.0e-hack

Jay Nugent jjn at nuge.com
Tue Sep 19 23:33:55 EDT 2006


On Tue, 19 Sep 2006, (Skip) K8RRA wrote:

> Regretfully I had neither the debugger nor a good record of events
> leading up to the crash Maiko.
> As an early warning only, the one obvious error was on the sysop console
> top line: CONV=1 LNKS=0
> Links should have been =1.
> The "convers link" was "...connecting...retries=8" and the reason for
> losing the link was not known.
> When the user (me) entered "/b" on the user session (not sysop), jnos
> never responded but crashed.
> I'll create some events logging and I have already instantiated gdb, so
> if it reoccurs a few times I'll make a formal report.
> My suspicion is that I violated a few rules in creating the operating
> conditions at the time of crash.
> I don't suspect the hack is the cause at this moment...

   The CONV feature has been a crash problem from way back.  And it 
persists into the V2.0 versions.  So your 'hack' was not the cause, Skip.

   The CONV crash is seen when LINKS are established, flap, or disconnect.
We also see it when users disconnect (/b).  My suspicion is that the crash
occurs at the moments when the connectivity or user lists, held in tables
in RAM, are reordered or sorted.  This is just a theory of mine, as I've
not looked at the code or performed any traces.
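
   To illustrate the kind of bug this theory points at, here is a minimal
C sketch.  It is NOT taken from the JNOS source: the struct and function
names (cuser, add_user, drop_user, count_users) are made up for the
example, and a plain singly linked list is only an assumption about how
such a user table might be kept.  The point is that an entry must be
unlinked from the list before it is freed, and the walk must never touch
the freed entry afterwards.  Getting that order wrong is a classic way to
crash on a disconnect.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-table entry; not the real JNOS structure. */
struct cuser {
    char name[16];
    struct cuser *next;
};

/* Push a new entry on the front of the list and return the new head. */
static struct cuser *add_user(struct cuser *head, const char *name)
{
    struct cuser *n = malloc(sizeof *n);
    assert(n != NULL);
    strncpy(n->name, name, sizeof n->name - 1);
    n->name[sizeof n->name - 1] = '\0';
    n->next = head;
    return n;
}

/*
 * Remove every entry matching 'name' and return the new head.
 * 'pp' always points at the link that leads to the current entry, so the
 * entry is unlinked first and freed second, and the loop never reads
 * memory that has already been freed.
 */
static struct cuser *drop_user(struct cuser *head, const char *name)
{
    struct cuser **pp = &head;

    while (*pp != NULL) {
        struct cuser *cur = *pp;
        if (strcmp(cur->name, name) == 0) {
            *pp = cur->next;   /* unlink before freeing */
            free(cur);
        } else {
            pp = &cur->next;   /* advance only past entries we keep */
        }
    }
    return head;
}

/* Count the entries currently in the list. */
static int count_users(const struct cuser *head)
{
    int n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

   The unsafe variant would free the current entry and then read its
'next' field to continue the walk.  That may appear to work when memory
is plentiful and fail when the freed block is reused quickly, which would
at least be consistent with low-RAM nodes crashing more often, though that
link is speculation as well.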

   I have collected a number of log error messages, showing crash
pointers, of several crashes over the last year (all for JNOS v2.0a) if 
those could be of any use to Maiko.  

   As we are pretty heavy users of the CONVerse bridge here in Michigan 
(usually 18 to 20 nodes linked 7x24) we would like to see this eventually 
get fixed.  Some nodes that ran under DOS that had little RAM available 
were the most vulnerable to the crashes.  More RAM certainly helped a lot.  
But the crashes are sometimes seen on the Linux-based nodes as well.

      --- Jay Nugent  WB8TKL
          o Chair, ARRL Michigan Section "Digital Radio Group" (DRG)
          o Michigan AMPRnet IP Address Coordinator
"Getting rid of terrorism is like getting rid of dandruff.  It cannot
 be done completely no matter how hard you try." -- Gore Vidal
| Jay Nugent   jjn at nuge.com    (734)484-5105    (734)544-4326/Fax        |
| Nugent Telecommunications  [www.nuge.com]     (734)649-0850/Cell       |
|   Internet Consulting/Linux SysAdmin/Engineering & Design/ISP Reseller |
| ISP Monitoring [www.ispmonitor.net] ISP & Modem Performance Monitoring |
| Web-Pegasus    [www.webpegasus.com] Web Hosting/DNS Hosting/Shell Accts|
| LinuxNIC, Inc. [www.linuxnic.net]   Registrar of the .linux TLD        |
 11:01pm  up 249 days, 9 min,  5 users,  load average: 0.10, 0.05, 0.01