[nos-bbs] nos-bbs Digest, Vol 211, Issue 5

jj ve1jot at eastlink.ca
Sat Nov 19 13:44:42 EST 2022


Could it be the extraneous data that Linux ax25/netrom adds to netrom
supervisory frames?

On 2022-11-19 13:00, nos-bbs-request at lists.tapr.org wrote:
> Today's Topics:
>
>     1. Re: Analysis of recent frequent netrom related crashes
>        (maiko at pcsinternet.ca)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 18 Nov 2022 13:42:59 -0600
> From: maiko at pcsinternet.ca
> To: nos-bbs at lists.tapr.org
> Subject: Re: [nos-bbs] Analysis of recent frequent netrom related
> 	crashes
> Message-ID: <433c4361ee8af0f74913858e9a798c67 at pcsinternet.ca>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Interestingly enough, the crashing 'seems' to have stopped.
>
> All of this started a while ago after I added a new wormhole
> to another system (FBB over BPQ netrom), but that system seems
> to have suddenly disappeared from the ether. I think they are
> having amprnet connectivity issues, so this will be much more
> difficult to track down now as I don't have a source to figure
> this out with. I am trying to track down the version of BPQ,
> hoping it will help me figure out what to do on my end.
>
> There is something about the netrom traffic or states that is
> causing JNOS to crash in the NR4 level code, but I have yet to
> figure it out ... it's very confusing what is going on ...
>
> Maiko / VE4KLM
>
> On 2022-11-13 11:12, Maiko (Personal) wrote:
>> Okay, last one for now, and learning as I go ...
>>
>> Perhaps I need to set the NR4TDISC a lot lower (default) ?
>>
>>     jnos> netrom tdisc
>>     NR4 redundancy timer (sec): 120
>>
>> Experiences anyone ? But still, even with a smaller timeout value,
>> there is a 'risk' of a crash, making me think the current way of
>> doing a circuit table lookup and reusing entries seems not to be
>> the brightest way of doing it ? Thinking a 'rewrite', ugh, no.
>>
>> Maiko / VE4KLM
>>
>> On 2022-11-13 10:47 a.m., Maiko (Personal) wrote:
>>> I am guessing (hopefully this shows up in my debugs) ...
>>>
>>> IF the local side requests a netrom layer 4 disconnect, then JNOS
>>> should probably free the callback there and then, instead of waiting
>>> for the final disconnect (which may not get to us). I figure it would
>>> not hurt to remove it at that point, since effectively it is done.
>>>
>>> I could put in a timer based garbage collection, but I think it's
>>> best to get rid of the callback data ASAP or else it will crash.
>>>
>>> Anyways ...
>>>
>>> Maiko / VE4KLM
>>>
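The two options weighed above (drop the callback as soon as the local side asks for the disconnect, or reclaim it later from a timer) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the JNOS code: the circuit struct, the table, the function names and the STALE_AFTER_S constant are all hypothetical, and the value 120 is simply borrowed from the tdisc output quoted in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define NCIRCUITS     20   /* table size quoted in the thread */
#define STALE_AFTER_S 120  /* hypothetical timeout, value borrowed from the tdisc output */

enum cstate { CS_CONNECTED, CS_DISCPENDING };

/* Hypothetical circuit entry, not the real JNOS structure. */
struct circuit {
    enum cstate state;
    time_t      disc_sent;  /* when the local side asked to disconnect */
};

/* A NULL slot means "free"; a non-NULL slot always points at live memory. */
static struct circuit *table[NCIRCUITS];

/* Allocate an entry in the first free slot; returns the slot or -1. */
static int circuit_open(void)
{
    for (int i = 0; i < NCIRCUITS; i++) {
        if (table[i] == NULL) {
            table[i] = calloc(1, sizeof *table[i]);
            if (table[i] == NULL)
                return -1;
            table[i]->state = CS_CONNECTED;
            return i;
        }
    }
    return -1;
}

/* Free the entry and clear the slot in one step. */
static void circuit_release(int slot)
{
    assert(slot >= 0 && slot < NCIRCUITS);
    free(table[slot]);
    table[slot] = NULL;
}

/* Option A: release at the moment the local disconnect is requested. */
static void local_disconnect_eager(int slot)
{
    circuit_release(slot);
}

/* Option B: only mark the entry, and let a periodic sweep reclaim it. */
static void local_disconnect_deferred(int slot, time_t now)
{
    assert(slot >= 0 && slot < NCIRCUITS && table[slot] != NULL);
    table[slot]->state = CS_DISCPENDING;
    table[slot]->disc_sent = now;
}

/* Timer-driven garbage collection; returns how many entries were freed. */
static int sweep_stale(time_t now)
{
    int freed = 0;
    for (int i = 0; i < NCIRCUITS; i++) {
        struct circuit *c = table[i];
        if (c != NULL && c->state == CS_DISCPENDING &&
            difftime(now, c->disc_sent) >= STALE_AFTER_S) {
            circuit_release(i);
            freed++;
        }
    }
    return freed;
}
```

Option A never leaves an entry waiting on a final disconnect that may not arrive. Option B still covers that case, but only after the timeout has passed, so the entry stays in the table in the meantime.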
>>> On 2022-11-13 10:37 a.m., Maiko (Personal) wrote:
>>>> Good morning,
>>>>
>>>> Slightly technical post ...
>>>>
>>>> This has been driving me nuts the past few months, it just seems
>>>> to have started, perhaps because I took on a new netrom neighbour
>>>> or two, I just don't know, but I think I know the reasons for all
>>>> the crashes. After a few days of inserting some very heavy debugs
>>>> into the code, this is where I am at this morning :
>>>>
>>>> JNOS keeps a table of netrom callbacks, the default is 20. When a
>>>> new connection happens, it gets put into the table, and when it's
>>>> done with, it is supposed to be removed from the table. However,
>>>> this removal is ONLY DONE when the state of the connection becomes
>>>> disconnected. What appears to be happening is that the table entry
>>>> for a specific connection looks valid, but the data behind it has
>>>> already disappeared and JNOS did not remove the entry, so crash !!!
>>>>
>>>> What this suggests to me is that I did not get the final NETROM
>>>> disconnect, so JNOS still thinks the callback data is valid, but
>>>> in fact it is not, the memory has disappeared, so every few days
>>>> you get a crash in the nr4subr.c functions, like :
>>>>
>>>>    Program received signal SIGSEGV, Segmentation fault.
>>>>    0x000000000047fdd9 in match_n4circ (index=23, id=71, user=0x2081457
>>>>    "\236\226d\240\212\234b\236\226d\240\212\234b", node=0x208145e
>>>>    "\236\226d\240\212\234b") at nr4subr.c:138
>>>>    138   if ((int)(cb->yournum) == index && (int)(cb->yourid) == id
>>>> AND
>>>>
>>>>    Program received signal SIGSEGV, Segmentation fault.
>>>>    0x00007f96411f9780 in __memcmp_avx2_movbe () from /lib64/libc.so.6
>>>>    (gdb) where
>>>>    #0  0x00007f96411f9780 in __memcmp_avx2_movbe () from /lib64/libc.so.6
>>>>    #1  0x0000000000482727 in nrresetlinks (rp=0x22c5550) at nr3.c:1441
>>>>    #2  0x000000000047ca22 in doobsotick () at nrcmd.c:1316
>>>>
>>>> It is very consistent, so I am running into cases where I am not
>>>> getting the final netrom layer 4 disconnect, so the callback
>>>> remains, but JNOS needs to loop through the whole circuit table to
>>>> find valid ones to match up with, and this invalid one just happens
>>>> to still be in the table and kablewee :]
>>>>
>>>> Anyways, I hope to have a fix of sorts for this 'soon', very
>>>> frustrating. But again, why has this suddenly started happening
>>>> at the frequency it has for the past 3 months, possibly more ?
>>>>
>>>> Jack, this is probably what you are experiencing as well.
>>>>
>>>> Maiko / VE4KLM
>>>>
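A minimal sketch of the failure mode described in this message, assuming a table of pointers rather than whatever layout JNOS actually uses. The struct, the table and the circuit_* functions are hypothetical; only the yournum and yourid field names are taken from the gdb trace. The idea is that freeing a control block and clearing its slot in the same step leaves nothing stale for a later table scan to dereference.

```c
#include <assert.h>
#include <stdlib.h>

#define NCIRCUITS 20  /* the message gives 20 as the default table size */

/* Hypothetical control block; only yournum/yourid appear in the gdb trace. */
struct cb {
    int yournum;  /* remote circuit index */
    int yourid;   /* remote circuit id */
};

/* Hypothetical table: a NULL slot is free, a non-NULL slot is live. */
static struct cb *circuits[NCIRCUITS];

/* Store a new control block in the first free slot; returns the slot or -1. */
static int circuit_add(int yournum, int yourid)
{
    for (int i = 0; i < NCIRCUITS; i++) {
        if (circuits[i] == NULL) {
            struct cb *cb = malloc(sizeof *cb);
            if (cb == NULL)
                return -1;
            cb->yournum = yournum;
            cb->yourid = yourid;
            circuits[i] = cb;
            return i;
        }
    }
    return -1;
}

/* Free the block AND clear the slot, so the table never holds a
 * pointer to memory that has already been released. */
static void circuit_free(int slot)
{
    assert(slot >= 0 && slot < NCIRCUITS);
    free(circuits[slot]);
    circuits[slot] = NULL;
}

/* Scan the whole table, skipping empty slots before touching any field. */
static int circuit_match(int index, int id)
{
    for (int i = 0; i < NCIRCUITS; i++) {
        struct cb *cb = circuits[i];
        if (cb == NULL)
            continue;
        if (cb->yournum == index && cb->yourid == id)
            return i;
    }
    return -1;
}
```

If the memory were released somewhere else while the slot kept its old pointer, the scan in circuit_match would read freed memory, which is the kind of access that can end in a SIGSEGV like the ones shown in the traces.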


