Bash script exit codes
Posted: Thu Feb 23, 2017 6:58 am
by dLans
Hello!
We've been working on implementing SAP job monitoring. The last difficulty we face is returning the correct status codes to Nagios. I've tried simply "exit 0" or "exit 2", but Nagios always shows the "OK" status. We run the script below to check whether the output contains "ind". If it does not, it should report CRITICAL. This job always breaks, so there is no "ind" in the output. With the echo "fout" we made sure the if statement was working correctly, which it is.
What are we doing wrong?
Code:
STATE_OK=0
STATE_CRITICAL=2
output=$(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1) 2>&1
if [[ $(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1) =~ "ind" ]]
then
echo "goed"
# echo $output | tr --delete '|'
exitstatus=$STATE_OK
exit $exitstatus
else
echo "fout"
# echo $output | tr --delete '|'
exitstatus=$STATE_CRITICAL
exit $exitstatus
fi
The actual output we get:
Code:
DE30 MRP RUN (15:00 EXTRA) History = DE30 MRP RUN (15:00 EXTRA) | 15000002 | Afgebroken, started at 2017-02-22,15:00:30 terminated at 2017-02-22,15:01:15
Kind regards,
Dennis Lans
Re: Bash script exit codes
Posted: Thu Feb 23, 2017 1:07 pm
by gormank
The text output and the return value are not the same thing. Run the check and then echo $? to see the return value. You can print critical, but return zero and the service will be green. I've done it...
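A minimal sketch of that distinction, assuming a made-up function `fake_check` that stands in for a plugin: the text it prints says CRITICAL, but its return value is 0, and the return value is what decides the service state.

```shell
#!/bin/bash
# Hypothetical stand-in for a plugin: the text says CRITICAL,
# but the exit status says OK.
fake_check() {
    echo "CRITICAL - job aborted"
    return 0
}

text=$(fake_check)
status=$?   # exit status of the command substitution, i.e. of fake_check

echo "text:   $text"
echo "status: $status"
```

Running the real plugin by hand and following it with `echo $?` gives the same kind of comparison.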
I'd move the exits to outside the conditional...
Your script echoes goed or fout, but neither appears in the output you posted, which suggests a logic problem in the if line that keeps the return value from being set.
Maybe try this (take out the echo $exitstatus when it works):
Code:
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
exitstatus=$STATE_WARNING
output=$(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1 2>&1)
if [[ $output =~ "ind" ]]
then
echo "goed"
# echo $output | tr --delete '|'
else
echo "fout"
# echo $output | tr --delete '|'
fi
echo $output
echo $exitstatus
exit $exitstatus
Re: Bash script exit codes
Posted: Thu Feb 23, 2017 4:28 pm
by avandemore
gormank wrote:The text output and the return value are not the same thing. Run the check and then echo $? to see the return value. You can print critical, but return zero and the service will be green.
This is a somewhat common stumbling block. Here are some details on exactly how plugins are expected to return data:
https://nagios-plugins.org/doc/guidelines.html
https://mathias-kettner.de/checkmk_localchecks.html
http://www.yourownlinux.com/2014/06/how ... cript.html
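As a rough sketch of the convention the plugin guidelines describe: a plugin prints a line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The helper name `report` and the sample message are made up for illustration.

```shell
#!/bin/bash
# Exit codes a Nagios plugin is expected to return.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Hypothetical helper: print the status line and return the matching code.
report() {
    local state=$1 message=$2
    case $state in
        "$STATE_OK")       echo "OK - $message" ;;
        "$STATE_WARNING")  echo "WARNING - $message" ;;
        "$STATE_CRITICAL") echo "CRITICAL - $message" ;;
        *)                 echo "UNKNOWN - $message"; state=$STATE_UNKNOWN ;;
    esac
    return "$state"
}

rc=0
report "$STATE_CRITICAL" "job aborted" || rc=$?
echo "exit status would be: $rc"
# A real plugin would finish with: exit "$rc"
```

Keeping the text and the code in one place makes it harder for them to disagree.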
Re: Bash script exit codes
Posted: Thu Feb 23, 2017 5:07 pm
by gormank
Sorry, I moved things around and messed it up. It seems to work right if the original script is changed so the if is like this:
if [[ $output =~ "ind" ]]
Code:
# ./test.sh ind
some nonsense string containing | ind
goed
some nonsense string containing ind
status: 0
# ./test.sh
some nonsense string containing |
fout
some nonsense string containing
status: 2
# echo $?
2
Code:
#!/bin/bash
# Nagios plugin exit codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
exitstatus=$STATE_WARNING
# Real check, disabled for this test (2>&1 belongs inside the substitution):
#output=$(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1 2>&1)
output="some nonsense string containing | $1"
echo "$output"
# [[ ... =~ ... ]] is a bash feature, hence the bash shebang
if [[ $output =~ "ind" ]]
then
    echo "goed"
    echo "$output" | tr --delete '|'
    exitstatus=$STATE_OK
else
    echo "fout"
    echo "$output" | tr --delete '|'
    exitstatus=$STATE_CRITICAL
fi
echo "status: $exitstatus"
exit $exitstatus
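A side note on the `2>&1` in the original script: written after the closing parenthesis, it does not put stderr into the variable. A small sketch of the difference, using a made-up function `noisy` in place of the real check:

```shell
#!/bin/bash
# Hypothetical command that writes one line to stdout and one to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Redirect outside the substitution: stderr is not captured in the variable.
outside=$(noisy) 2>&1

# Redirect inside the substitution: both streams are captured.
inside=$(noisy 2>&1)

printf 'outside: %s\n' "$outside"
printf 'inside:  %s\n' "$inside"
```

This only matters if the check writes something useful to stderr, but it is cheap to get right.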
Re: Bash script exit codes
Posted: Thu Feb 23, 2017 5:36 pm
by dwhitfield
@dLans, was @gormank's answer useful for you?
Thanks @gormank!
Re: Bash script exit codes
Posted: Tue Feb 28, 2017 4:10 am
by dLans
Hi,
My apologies for the delay in response; it was Carnival and many of us were not in the office.
First of all thank you for putting some time into this, it is much appreciated! I tried the script and I get the below output:
Code:
DE30 MRP RUN (15:00 EXTRA) History = DE30 MRP RUN (15:00 EXTRA) 15003102 Be?ind. , started at 2017-02-27,15:00:30 finished at 2017-02-27,15:02:11
fout
DE30 MRP RUN (15:00 EXTRA) History = DE30 MRP RUN (15:00 EXTRA) 15003102 Be?ind. , started at 2017-02-27,15:00:30 finished at 2017-02-27,15:02:11
status: 2
The status reported within Nagios is OK. I changed the string it is looking for to something that is there and the output changes, but the status remains OK. All of our other checks are working correctly; it's only this custom check that has issues :/
Re: Bash script exit codes
Posted: Tue Feb 28, 2017 4:38 am
by dLans
Even when I completely remove all text and then simply type:
Code:
STATE_CRITICAL=2
exit $STATE_CRITICAL
I still get the OK status within Nagios (but the output is obviously gone). GRR!
Re: Bash script exit codes
Posted: Tue Feb 28, 2017 5:25 am
by dLans
Okay, so it turned out to be in the command. I removed the $ARG1$ $ARG2$ values and now it instantly works as expected.
So from:
$USER1$/check_sap_DE30MRPRUN15:00EXTRA -H $HOSTADDRESS$ $ARG1$ $ARG2$
To:
$USER1$/check_sap_DE30MRPRUN15:00EXTRA -H $HOSTADDRESS$
I have no idea why, but I'm glad it is finally working =) Thank you both!
Re: Bash script exit codes
Posted: Tue Feb 28, 2017 10:04 am
by dwhitfield
If you were running it through NRPE and not escaping the arguments correctly, that could be a reason. (There is no indication of that in the thread, but it's the first thing that comes to mind.)
That said, it sounds like this issue has been resolved. Is it okay if we lock this thread? Thanks for choosing the Nagios forums!