[nos-bbs] Re: GW-List - very tech question - NOS stkutil() spikes !

Barry Siegfried k2mf at mfnos.net
Tue Apr 4 16:20:32 EDT 2006


[maiko at pcs.mb.ca wrote]:

> Rob, Barry, Thomas, others ? This one is for you ...
>
> I'm trying to debug a stack violation somewhere (I believe
> it to be a stack violation since I'm getting a lot of the dreaded
> _int_malloc() SIGSEGV occurrences on mbx_incom() for telnet).
>
> I've solved the SIGSEGV, by bumping up the stack allotment in
> the newproc call, BUT that's a bandaid solution ...
>
> I've run into a very reproducible problem where if someone
> telnets into the BBS from outside, and does a Nodes, JNOS will
> crash.  It happens more than I would like.  It's the only *bug*
> that is quite frankly BUGGING the crap out of me.  I'm relatively
> happy with the rest of the stuff.
>
> During the course of running the netrom dump command at the BBS
> prompt, the stkutil() of the curproc is about 0x6e8, and that is
> a normal value for it.  HOWEVER, on one occurrence, and this really
> shocked me since I was not expecting it, the stkutil() showed
> 0x1700 !!!  Same place in the code that I normally expect to see
> the smaller value.
>
> Call it a spike if you want.
>
> Anyone else ever run into this ?

*Some* routines can do this, particularly on *slower* CPUs.

> This might be more helpful ...
>
>   stkutil() returns 0x6e8
>
>   tprintf (string);
>
>   stkutil() returns 0x1700
>
> That's a big jump for a *simple* tprintf !
>
> It does settle down back to 0x6e8 after a few seconds. This
> is in the loop where the Node info is being printed to the
> telnet session after the user types in the Nodes command.
>
> Perhaps I need to look at the tprintf code ...

There is nothing wrong with the tprintf() code (or I would expect
there not to be anything wrong with it in JNOS).  It isn't tprintf()
itself that is creating problems for you but rather it is the data
which tprintf() is trying to print.

Look for the parts of the code which use qsort() instead.  You
will find it to be utilized in ARP, IP route and Net/Rom nodes
printing routines where the printed output is normally sorted.

The idea behind using qsort() was to allocate a block of memory
large enough to hold the printed output for the entire list that
needed to be sorted.  Once all the data has been dumped into this
allocated area of memory, qsort() sorts it before it is actually
printed to the user.

You will also notice in these subfunctions that if there is not
enough memory available for calloc() to allocate a block that can
hold the printed output for the entire list, then calloc() returns
a NULL pointer, which signals the subfunction NOT to sort the data
but rather to print it out to the user one line at a time in the
order that it is encountered in the list.
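
As a rough sketch of that pattern (this is NOT the actual JNOS code;
dump_names(), LINELEN and the simulate_nomem flag are all made up for
illustration, and plain fprintf() stands in for tprintf()):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINELEN 32  /* assumed fixed width of one formatted entry */

/* qsort() comparator: each element is one NUL-terminated line */
static int cmp_line(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/*
 * Print n names to out.  Returns 1 if the output was sorted,
 * 0 if the buffer could not be allocated and the names were
 * printed in list order instead.  simulate_nomem is a
 * hypothetical switch that forces the fallback path.
 */
int dump_names(const char *const *names, size_t n,
               int simulate_nomem, FILE *out)
{
    char *buf = simulate_nomem ? NULL : calloc(n, LINELEN);
    size_t i;

    if (buf == NULL) {
        /* no memory: print one line at a time, unsorted */
        for (i = 0; i < n; i++)
            fprintf(out, "%s\n", names[i]);
        return 0;
    }

    /* dump every entry into the block first ... */
    for (i = 0; i < n; i++)
        snprintf(buf + i * LINELEN, LINELEN, "%s", names[i]);

    /* ... then sort the whole block and print it */
    qsort(buf, n, LINELEN, cmp_line);
    for (i = 0; i < n; i++)
        fprintf(out, "%s\n", buf + i * LINELEN);

    free(buf);
    return 1;
}
```

Both the size of the block and the amount of work handed to qsort()
scale with the number of entries in the list.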

Obviously, the more elements in the list to be printed, the larger
the amount of memory which must be allocated, the harder qsort()
has to work to sort it all and the more stack will be utilized
as qsort() is working.
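
How much stack a given library's qsort() really uses depends on how
that library implements it (the C standard does not say), so the
following is only an illustration.  It is a deliberately naive
recursive quicksort, not any real qsort(), and sort_depth() and its
counters are made-up names.  It records the deepest recursion level
reached, which is a stand-in for stack consumption:

```c
#include <stddef.h>

static size_t depth_now, depth_max;   /* recursion depth counters */

static void swap_int(int *a, int *b)
{
    int t = *a; *a = *b; *b = t;
}

/* Naive quicksort on v[lo..hi], last element used as the pivot */
static void naive_qsort(int *v, long lo, long hi)
{
    long i, store;

    if (lo >= hi)
        return;
    if (++depth_now > depth_max)
        depth_max = depth_now;

    store = lo;
    for (i = lo; i < hi; i++)
        if (v[i] < v[hi])
            swap_int(&v[i], &v[store++]);
    swap_int(&v[store], &v[hi]);

    naive_qsort(v, lo, store - 1);
    naive_qsort(v, store + 1, hi);
    depth_now--;
}

/* Sort v[0..n-1]; return the deepest recursion level reached */
size_t sort_depth(int *v, size_t n)
{
    depth_now = depth_max = 0;
    if (n > 1)
        naive_qsort(v, 0, (long)n - 1);
    return depth_max;
}
```

With this pivot choice an already-sorted list is the worst case: the
recursion goes n - 1 levels deep for n entries, so the stack needed
grows in step with the length of the list.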

This is most likely what is grabbing your stack during Net/Rom nodes
printing.

73, de Barry, K2MF >>
           o
          <|>      Barry Siegfried
+---------/-\---------------------------+
| Internet | bgs at mfnos.net              |
| HomePage | http://www.mfnos.net/~bgs  |
+----------+----------------------------+
| Amprnet  | k2mf at nnj.k2mf.ampr.org     |
| PBBS     | k2mf at k2ge.#cnj.nj.usa.noam |
+----------+----------------------------+



