[nos-bbs] Start script with auto restart loop
Michael Fox - N6MEF
n6mef at mefox.org
Thu Mar 5 11:46:15 EST 2020
Jimmy,
We start a screen session in detached mode to run a shell script.
    screen -d -m -S jnos-console $shellscript
The $shellscript runs JNOS in a while loop. The gist of it is:
    exit_code=-1
    while [ $exit_code -ne 0 ]
    do
        # clean up from previous runs
        # run jnos
        exit_code=$?
        # clean up and log stuff
        # depending on $exit_code: send email to sysop, possibly pause
        # briefly, other stuff
    done
We chose a sysop restart exit code (we use 99 because we didn't see it used
in the JNOS code). We use that to create three different possible
situations for exiting or continuing the while loop.
jnos> exit 0
  -- causes the while loop to exit
  -- notifies the sysops by email that jnos was stopped normally

jnos> exit 99
  -- stays within the while loop
  -- notifies the sysops by email that jnos was restarted by a sysop
     and includes the tail of nos.log and mail.log
  -- Note that we try to remember to write a remark to the log before
     we restart. That way, all sysops see the log comment in the email.
     Example:
       jnos> remark N6MEF doing something ...
       jnos> exit 99

any other exit (jnos failure/crash/abort)
  -- stays within the while loop
  -- notifies the sysops by email that jnos crashed with $exit_code
     and includes the tail of nos.log and mail.log
  -- pauses for 30 seconds before restarting, to avoid a runaway restart
     loop that could use enough CPU to make it hard to log in and stop it
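
For anyone who wants a starting point, the three cases above could be sketched
roughly like this. It is not our actual script. The paths, the sysop address,
the 20-line tail, the function names, and the use of a local "mail" command are
all assumptions made for illustration; only the 0 / 99 / anything-else logic
and the 30 second pause come from the description above.

```shell
#!/bin/sh
# Sketch only: paths, addresses, and helper names are assumptions,
# not taken from the real script.
JNOS_CMD="${JNOS_CMD:-./jnos}"            # hypothetical path to the jnos binary
NOS_LOG="${NOS_LOG:-./nos.log}"           # hypothetical log locations
MAIL_LOG="${MAIL_LOG:-./mail.log}"
SYSOPS="${SYSOPS:-sysops@example.org}"    # placeholder address
RESTART_CODE=99                           # sysop restart code
CRASH_PAUSE="${CRASH_PAUSE:-30}"          # seconds to wait after a crash

# Map an exit code to one of the three situations.
classify_exit() {
    case "$1" in
        0)               echo stopped ;;
        "$RESTART_CODE") echo restarted ;;
        *)               echo crashed ;;
    esac
}

# notify SUBJECT [logs]: mail the sysops, optionally with log tails.
# Assumes a working "mail" command (mailx or similar) is installed.
notify() {
    {
        echo "$1"
        if [ "${2:-}" = "logs" ]; then
            for f in "$NOS_LOG" "$MAIL_LOG"; do
                echo "== tail of $f"
                tail -n 20 "$f" 2>/dev/null
            done
        fi
    } | mail -s "$1" "$SYSOPS"
}

run_loop() {
    exit_code=-1
    while [ "$exit_code" -ne 0 ]; do
        # (clean up from previous runs would go here)
        exit_code=0
        "$JNOS_CMD" || exit_code=$?       # capture the exit status of jnos
        case "$(classify_exit "$exit_code")" in
            stopped)   notify "JNOS stopped normally" ;;
            restarted) notify "JNOS restarted by a sysop" logs ;;
            crashed)   notify "JNOS crashed with exit code $exit_code" logs
                       sleep "$CRASH_PAUSE" ;;
        esac
    done
}

# Start the loop only if a jnos binary is actually present.
if [ -x "$JNOS_CMD" ]; then
    run_loop
else
    echo "jnos not found at $JNOS_CMD" >&2
fi
```

Saved under some name of your choosing and made executable, it would be started
with the same screen command shown earlier, with $shellscript pointing at it.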
By running the loop in the shell script called by screen, we can log into
linux from anywhere using SSH, reattach to the screen session to get the
jnos console (screen -r), and then restart or do whatever, and we stay
attached within the screen session until we decide to detach (CTRL-A D).
We've used this on our 6 machines for 11 years. We find JNOS to be very
stable. But maybe once every few months it will apparently choke on a
bulletin (last item in the log before crashing - no other indication). No
problem -- it restarts and lets us know what happened via email.
To protect against a hang (as opposed to a crash/abnormal exit), we also
have a separate "watchdog" type of script that pings JNOS regularly and, if
JNOS doesn't respond, will send it a kill signal. The kill causes JNOS to
exit with other than 0 or 99, so the above loop lets us know what happened
and then continues after a pause.
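
A watchdog along those lines might look something like the sketch below. Again,
this is not our script. The address, the process name, the check interval, the
"three misses in a row" threshold, and the choice of a plain SIGTERM via pkill
are all assumptions. The ping options are the Linux (iputils) ones.

```shell
#!/bin/sh
# Watchdog sketch: every value and helper name here is an assumption.
JNOS_IP="${JNOS_IP:-192.0.2.1}"     # placeholder address of the JNOS stack
JNOS_PROC="${JNOS_PROC:-jnos}"      # hypothetical process name
MAX_FAILS="${MAX_FAILS:-3}"         # consecutive missed pings before killing
INTERVAL="${INTERVAL:-60}"          # seconds between checks
fails=0

# Succeeds if JNOS answers one ping within 5 seconds (Linux ping options).
jnos_alive() {
    ping -c 1 -W 5 "$JNOS_IP" >/dev/null 2>&1
}

# Send the default signal (SIGTERM) to the jnos process.
kill_jnos() {
    pkill -x "$JNOS_PROC" || echo "no $JNOS_PROC process found" >&2
}

# One check: reset the counter on success, kill after MAX_FAILS misses.
check_once() {
    if jnos_alive; then
        fails=0
    else
        fails=$((fails + 1))
        if [ "$fails" -ge "$MAX_FAILS" ]; then
            kill_jnos
            fails=0
        fi
    fi
}

watchdog() {
    while :; do
        check_once
        sleep "$INTERVAL"
    done
}

# Run forever only when invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    watchdog
fi
```

When a process is terminated by a signal, the shell reports an exit status
above 128, so the restart loop would see it as neither 0 nor 99 and treat it
like any other crash.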
Michael, N6MEF
-----Original Message-----
From: nos-bbs <nos-bbs-bounces at lists.tapr.org> On Behalf Of K3CHB
Sent: Thursday, March 5, 2020 6:22 AM
To: TAPR xNOS Mailing List <nos-bbs at lists.tapr.org>
Subject: [nos-bbs] Start script with auto restart loop
Hello.
I am using JNOS in a screen session over here and would like to add a
restart provision to my startnos script.
Has anyone come up with a good solution for screen instances of jnos?
I am using a raspbian (systems) on a rasp pi 4.
Thanks
Jimmy
K3CHB
--
Sent from Open Mail on Android.