[nos-bbs] smtp queue run overlap/hang

Michael Fox - N6MEF n6mef at mefox.org
Wed Jul 13 12:25:00 EDT 2016


I'm having an occasional problem with SMTP queue hangs.  I don't have it
nailed down yet, but it seems to occur only when there is a large amount of
SMTP mail in the queue and it takes longer than "smtp timer" to deliver.  (I
have "smtp timer" set to 600, i.e. 10 minutes.)  My supposition is that when
"smtp timer" fires, JNOS gets confused if it is still processing the queue
from the previous scan.

The symptoms are:

*  This only happens when there is a large amount of SMTP mail in the queue
(many messages, each of which is a few KB in length) such that it takes
longer than "smtp timer" to deliver (over a 1200 baud radio link).

*  "tcp status" shows the machine is in an smtp session with another
machine, but there is no activity on the radio port.

*  Lock files exist for the stuck messages.  I presume this prevents the
next queue run from touching them.

I get alerted to the problem because I have a shell script run by cron which
looks at the mail queue for messages that are older than 2 x "smtp timer".
Before the shell script, messages could be stuck for a very long time and I
would never know it.
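
A check like the one described above could be sketched as follows.  This
is not my actual shell script, just a minimal Python illustration.  The
function name and the queue directory path are hypothetical, and it assumes
that a file's modification time is a reasonable stand-in for how long the
message has been queued.

```python
import os
import time

SMTP_TIMER_SECONDS = 600  # the "smtp timer" value mentioned in this post


def find_stale_files(queue_dir, max_age_seconds, now=None):
    """Return sorted names of regular files in queue_dir older than max_age_seconds.

    Age is measured from each file's modification time (an assumption;
    the real queue may record message age differently).
    """
    if now is None:
        now = time.time()
    stale = []
    for entry in os.scandir(queue_dir):
        if not entry.is_file():
            continue  # skip subdirectories and other non-file entries
        age = now - entry.stat().st_mtime
        if age > max_age_seconds:
            stale.append(entry.name)
    return sorted(stale)
```

Run from cron, something like find_stale_files("/path/to/mail/queue",
2 * SMTP_TIMER_SECONDS) would list the files worth alerting on.  The path
is a placeholder, not the real JNOS queue location.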

To overcome the situation, what I do is:

*  Check "tcp status" to find the control block number for the hung smtp
session

*  Use "tcp reset <control-block-#>" to kill the session

*  Use "smtp kick" to start things back up again

*  Alternatively, restarting JNOS also works, but is obviously more
disruptive.

I plan to increase "smtp timer" from 600 to 900 to see if the problem
occurs less often.  But if my assumption is correct, this would only reduce
how often the problem occurs, not eliminate it, and it would do so at the
expense of slower mail delivery.
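
To put rough numbers on this, here is a back-of-the-envelope estimate in
Python.  It is a best-case sketch only: the 3000-byte message size is an
assumed stand-in for "a few KB", and the calculation ignores protocol
overhead, acknowledgements, retries and transmitter keying delays, so real
delivery will be slower.  The function names are made up for illustration.

```python
def estimated_delivery_seconds(message_count, message_bytes,
                               link_bits_per_second, efficiency=1.0):
    """Best-case time to send the given messages over the link.

    efficiency is the assumed fraction of the raw link rate that carries
    message data (1.0 means no overhead at all, which is optimistic).
    """
    total_bits = message_count * message_bytes * 8
    return total_bits / (link_bits_per_second * efficiency)


def max_messages_per_interval(interval_seconds, message_bytes,
                              link_bits_per_second, efficiency=1.0):
    """How many whole messages fit into one timer interval, best case."""
    seconds_per_message = estimated_delivery_seconds(
        1, message_bytes, link_bits_per_second, efficiency)
    return int(interval_seconds // seconds_per_message)


# Assumed 3000-byte messages over a 1200 bit/s link, no overhead:
print(max_messages_per_interval(600, 3000, 1200))  # 30
print(max_messages_per_interval(900, 3000, 1200))  # 45
```

Under those assumptions, going from 600 to 900 raises the best-case limit
from about 30 queued messages to about 45.  A larger backlog would still
outlast the timer, which is why I expect the change to reduce the problem
rather than cure it.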

Can someone with C language skills look at the code to see if it handles the
case where "smtp timer" fires while the previous queue is still being
processed?  If that case is handled, how is it handled?
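
To illustrate the kind of handling I am asking about, here is a small
Python sketch of a queue runner that skips a timer tick if the previous run
is still in progress.  This is NOT the JNOS code and I am not claiming JNOS
works this way.  The class and method names are invented, and a Python lock
is used only as a stand-in for whatever "busy" flag the real code might
keep.

```python
import threading


class QueueRunner:
    """Runs a queue-processing function, refusing overlapping runs."""

    def __init__(self, process_queue):
        self._process_queue = process_queue  # callable that delivers the queue
        self._busy = threading.Lock()        # held while a run is in progress
        self.skipped_ticks = 0               # timer ticks ignored due to overlap

    def on_timer(self):
        """Called on each timer tick; returns True if a run was performed."""
        # Non-blocking acquire: if a run is already active, do not start
        # a second one on top of it; just note the skipped tick.
        if not self._busy.acquire(blocking=False):
            self.skipped_ticks += 1
            return False
        try:
            self._process_queue()
            return True
        finally:
            # Always clear the busy state, even if delivery raised an error,
            # so one failed run cannot block every later run.
            self._busy.release()
```

The point of the sketch is the two cases I would like someone to look for
in the real code: what happens when the timer fires during an active run,
and whether the "busy" state is always cleared when a run ends abnormally.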

BTW, this is NOT new to 2.0k.rc.

Thanks,

Michael

N6MEF
