[nos-bbs] Corrected: JNOS 2.0k memory issue?
Michael Fox - N6MEF
n6mef at mefox.org
Sat Aug 13 15:51:13 EDT 2016
Thanks. I’ll give it a try.
From: nos-bbs [mailto:nos-bbs-bounces at tapr.org] On Behalf Of Boudewijn (Bob) Tenty
Sent: Saturday, August 13, 2016 12:36 AM
To: TAPR xNOS Mailing List <nos-bbs at tapr.org>
Subject: Re: [nos-bbs] Corrected: JNOS 2.0k memory issue?
You can test for memory leaks with valgrind, see http://www.cprogramming.com/debugging/valgrind.html how to use it.
Strange, as all modern PC operating systems free a program's RAM when it terminates, except that shared memory is not usually reclaimed.
I have heard about some issues with memory for sockets sometimes.
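As a rough illustration of the valgrind suggestion above, the sketch below builds a command line that runs a program under valgrind's leak checker. The binary path `./jnos` and the log file name are assumptions for illustration only; the valgrind options shown (`--leak-check=full`, `--show-leak-kinds=all`, `--track-origins=yes`, `--log-file=`) are standard memcheck options.

```python
import shlex


def valgrind_command(binary, args=(), log_file="valgrind-jnos.log"):
    """Build an argv list that runs `binary` under valgrind's leak checker.

    `binary` and `log_file` are placeholders; adjust them to the local install.
    """
    return [
        "valgrind",
        "--leak-check=full",       # report each leaked block with a stack trace
        "--show-leak-kinds=all",   # include "still reachable" blocks as well
        "--track-origins=yes",     # trace where uninitialised values came from
        f"--log-file={log_file}",  # write the report to a file, not the console
        binary,
        *args,
    ]


if __name__ == "__main__":
    # "./jnos" is an assumed path to the JNOS executable.
    print(shlex.join(valgrind_command("./jnos")))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run`. Programs run considerably slower under valgrind, so forwarding sessions may take much longer than usual.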
On 2016-08-13 12:38 AM, Michael Fox - N6MEF wrote:
Corrected. 100M instead of 100K.
From: Michael Fox - N6MEF [mailto:n6mef at mefox.org]
Sent: Friday, August 12, 2016 9:36 PM
To: TAPR xNOS Mailing List <nos-bbs at tapr.org>
Subject: JNOS 2.0k memory issue?
I’ve been trying to nail down what condition might be causing the hangs with 2.0k. I discovered what may be a memory problem with JNOS. This may be why I haven’t noticed the problem on my test machine (12GB) but I did notice on my production machine (4GB). Here’s what I found:
I have a machine with 4GB of RAM. I boot it up (without starting JNOS), and free memory (reported by “free”) will stay at about 2.3G for as long as you want to watch it. I used “watch free” with the default 2 second refresh rate.
I start up JNOS and free memory drops to about 1.8G or lower.
As forwarding starts to occur, the amount of free memory drops lower and lower, and will bottom out at about 100M bytes. So JNOS is taking about 2.2G of RAM at that point (2.3G – 100M).
When memory goes this low, the JNOS console window becomes unresponsive. And the status line isn’t updated. For example, as I watch “tail -f nos.log”, I can see new forwarding connections happen, but the BBS call signs don’t appear on the status display.
About the time that certain events happen in nos.log, such as “reintegrating data [F] from eol issue”, free memory will jump back up – not as high as before, but up to about 1.4G. This goes on in cycles of lower and higher memory utilization as different forwarding sessions happen.
If I wait until all forwarding sessions have finished and JNOS is quiescent, and then use exit 0, my linux free memory is at 1.4G and it will stay there for as long as you watch it. (Remember, it was at 2.3G before starting JNOS!)
If I start up JNOS again, free memory will drop to about 1.0G after JNOS completes its startup. If I exit 0, free memory stays at 1.0G for as long as you want to watch it.
After reboot, free memory is back to 2.3G and will stay there (as long as JNOS is not started!). This cycle is repeatable.
This seems to suggest that:
* JNOS is not returning all memory to the OS. Each invocation of JNOS leaves less free memory in the system.
* While JNOS is running, certain conditions during forwarding cause JNOS to consume a large amount of memory and hold it for an extended period until some other condition happens. Even then, not all is returned.
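One way to check the two points above is to watch how much memory the JNOS process itself holds, separately from the system-wide "free" figure. The sketch below reads the resident set size (`VmRSS`) from `/proc/<pid>/status` on Linux. It is only an illustration: the function names are made up here, and the process ID would have to be looked up by hand (for example with `pidof`).

```python
import re


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kib(pid):
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kib(status_file.read())
```

If `read_vm_rss_kib(pid)` stays small while system free memory keeps falling, the memory is probably being held somewhere other than the process's own resident pages.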
To help with troubleshooting, I prepared a little script that logs the output from “free” to a file with timestamps, so it can be compared alongside the nos.log. I captured the two scenarios above, an initial run (ending with 1.4G free) and a second run (ending with 1.0G free), using 2 second intervals in the free memory log. Maybe you have development tools that provide better info. But if you’d like the data, let me know.
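A logger of the kind described above could look roughly like the following. This is not the script from the original message, which was not posted; it is a minimal sketch that reads `/proc/meminfo` (the same source the `free` command uses) rather than parsing the output of `free`. The file name `free.log` and the choice of fields are assumptions. `MemAvailable`, `Buffers` and `Cached` are logged next to `MemFree` on the assumption that they may help when comparing against nos.log.

```python
import time
from datetime import datetime

# Fields to log, in KiB. MemFree corresponds to the "free" column of free(1).
FIELDS = ("MemTotal", "MemFree", "MemAvailable", "Buffers", "Cached")


def parse_meminfo(text):
    """Parse /proc/meminfo text into a {field_name: integer_value} dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def format_line(values, now):
    """Format one timestamped log line; fields that are missing show as NA."""
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    columns = " ".join(f"{key}={values.get(key, 'NA')}" for key in FIELDS)
    return f"{stamp} {columns}"


def log_forever(path="free.log", interval=2.0):
    """Append one line every `interval` seconds until interrupted (Ctrl-C)."""
    with open(path, "a") as out:
        while True:
            with open("/proc/meminfo") as meminfo:
                values = parse_meminfo(meminfo.read())
            out.write(format_line(values, datetime.now()) + "\n")
            out.flush()
            time.sleep(interval)
```

Calling `log_forever()` before starting JNOS gives a log with the same 2 second interval used above. The timestamp format is one choice among many, so it may need adjusting to match the format used in nos.log.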
Also, this may not be limited to 2.0k. I found similar behavior in 2.0j. However, the condition seems to be exacerbated by compression. And because of some of the B2F forwarding errors that were fixed in 2.0k, I was not using compression on the 4GB machines. Therefore, perhaps the exact same conditions aren’t present in 2.0j. So maybe that’s why 2.0j hasn’t completely hung on the lower-memory units like 2.0k did.