Bash script exit codes

dLans
Posts: 40
Joined: Tue May 27, 2014 1:54 am

Bash script exit codes

Post by dLans »

Hello!

We've been working on implementing SAP job monitoring. The last difficulty we face is returning the correct status codes to Nagios. I've tried a plain "exit 0" or "exit 2", but Nagios always shows the "OK" status. We run the script below to check whether the output contains "ind". If it does not, it should report CRITICAL. This job always aborts, so there is no "ind" in the output. With the echo "fout" we confirmed that the if statement takes the expected branch.

What are we doing wrong? :(

Code: Select all

STATE_OK=0
STATE_CRITICAL=2

output=$(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1) 2>&1

if [[ $(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1) =~ "ind" ]]
        then
                echo "goed"
#                echo $output | tr --delete '|'
                exitstatus=$STATE_OK
                exit $exitstatus
        else
                echo "fout"
#                echo $output | tr --delete '|'
                exitstatus=$STATE_CRITICAL
                exit $exitstatus
fi
The actual output we get:

Code: Select all

DE30 MRP RUN (15:00 EXTRA) History = DE30 MRP RUN (15:00 EXTRA) | 15000002 | Afgebroken, started at 2017-02-22,15:00:30 terminated at 2017-02-22,15:01:15
Kind regards,
Dennis Lans
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Bash script exit codes

Post by gormank »

The text output and the return value are not the same thing. Run the check and then echo $? to see the return value. You can print critical, but return zero and the service will be green. I've done it...
I'd move the exits to outside the conditional...
Your script echoes "goed" or "fout", but neither appears in the output you posted, which suggests a logic problem in the if line, so the return value is never set.
Maybe try this (take out the echo $exitstatus when it works):

Code: Select all

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
exitstatus=$STATE_WARNING

output=$(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1) 2>&1

if [[ $output =~ "ind" ]]
        then
                echo "goed"
#                echo $output | tr --delete '|'
        else
                echo "fout"
#                echo $output | tr --delete '|'
fi
echo $output
echo $exitstatus
exit $exitstatus
Last edited by gormank on Thu Feb 23, 2017 4:37 pm, edited 1 time in total.
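To make the difference between printed text and return value concrete, here is a minimal sketch. Both functions are hypothetical stand-ins for a plugin and are not part of check_sap or Nagios. They print the same alarming line, but only the return value differs, and that value is what decides the state.

```shell
#!/bin/bash
# Hypothetical stand-in for a plugin: prints an alarming message but returns 0.
misleading_check() {
    echo "CRITICAL - job aborted"
    return 0   # the state would be OK, because only this value counts
}

# Same message, but the return value matches it.
honest_check() {
    echo "CRITICAL - job aborted"
    return 2
}

status=0
misleading_check || status=$?
echo "status: $status"

status=0
honest_check || status=$?
echo "status: $status"
```

Both calls print the same text; the first reports status 0 and the second status 2.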
avandemore
Posts: 1597
Joined: Tue Sep 27, 2016 4:57 pm

Re: Bash script exit codes

Post by avandemore »

gormank wrote:The text output and the return value are not the same thing. Run the check and then echo $? to see the return value. You can print critical, but return zero and the service will be green.
This is a somewhat common stumbling block. Here are some details on exactly how plugins are expected to return data:

https://nagios-plugins.org/doc/guidelines.html
https://mathias-kettner.de/checkmk_localchecks.html
http://www.yourownlinux.com/2014/06/how ... cript.html
Previous Nagios employee
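As a rough illustration of the convention described in the plugin guidelines linked above (return code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN, plus a line of text on stdout), a script could be organised along these lines. The helper names state_label and report are made up for this sketch and are not taken from any Nagios library.

```shell
#!/bin/bash
# Return codes defined by the Nagios plugin guidelines.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# state_label (hypothetical helper): map a return code to its label.
state_label() {
    case "$1" in
        "$STATE_OK")       echo "OK" ;;
        "$STATE_WARNING")  echo "WARNING" ;;
        "$STATE_CRITICAL") echo "CRITICAL" ;;
        *)                 echo "UNKNOWN" ;;
    esac
}

# report (hypothetical helper): print one status line and return the code.
report() {
    local code=$1 message=$2
    echo "$(state_label "$code") - $message"
    return "$code"
}

# Demonstration with a made-up message.
rc=0
report "$STATE_CRITICAL" "SAP job aborted" || rc=$?
echo "return code: $rc"
```

In a real plugin the last step would be something like calling report and then exiting with its return value, so the text and the state cannot drift apart.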
gormank
Posts: 1114
Joined: Tue Dec 02, 2014 12:00 pm

Re: Bash script exit codes

Post by gormank »

Sorry, I moved things around and messed it up. It seems to work right if the original script is changed so the if is like this:

if [[ $output =~ "ind" ]]

Code: Select all

# ./test.sh ind
some nonsense string containing | ind
goed
some nonsense string containing  ind
status: 0

# ./test.sh
some nonsense string containing |
fout
some nonsense string containing
status: 2

# echo $?
2

Code: Select all

#!/bin/bash

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
exitstatus=$STATE_WARNING

#output=$(/usr/local/nagios/libexec/check_sap job_de30mrprunextra pr1 2>&1)
output="some nonsense string containing | $1"

echo $output

if [[ $output =~ "ind" ]]
        then
                echo "goed"
                echo $output | tr --delete '|'
                exitstatus=$STATE_OK
        else
                echo "fout"
                echo $output | tr --delete '|'
                exitstatus=$STATE_CRITICAL
fi
echo status: $exitstatus
exit $exitstatus
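One side note on the original script: it places 2>&1 after the closing parenthesis of the command substitution. In that position the redirect applies to the assignment, so anything the command writes to stderr does not end up in the variable. The sketch below shows the difference; noisy is a hypothetical stand-in for check_sap that writes one line to each stream.

```shell
#!/bin/bash
# noisy (hypothetical): one line on stdout, one line on stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Redirect outside the substitution: only stdout lands in the variable.
outside=$(noisy) 2>&1

# Redirect inside the substitution: both streams land in the variable.
inside=$(noisy 2>&1)

printf 'outside: [%s]\n' "$outside"
printf 'inside:  [%s]\n' "$inside"
```

If the plugin reports errors on stderr and the pattern match should see them, the redirect needs to sit inside the parentheses.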
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Bash script exit codes

Post by dwhitfield »

@dLans, was @gormank's answer useful for you?

Thanks @gormank!
dLans
Posts: 40
Joined: Tue May 27, 2014 1:54 am

Re: Bash script exit codes

Post by dLans »

Hi,

My apologies for the delayed response; it was Carnival and many of us were out of the office :) First of all, thank you for putting time into this, it is much appreciated! I tried the script and got the output below:

Code: Select all

DE30 MRP RUN (15:00 EXTRA) History = DE30 MRP RUN (15:00 EXTRA) 15003102 Be?ind. , started at 2017-02-27,15:00:30 finished at 2017-02-27,15:02:11
fout
DE30 MRP RUN (15:00 EXTRA) History = DE30 MRP RUN (15:00 EXTRA) 15003102 Be?ind. , started at 2017-02-27,15:00:30 finished at 2017-02-27,15:02:11
status: 2
The status reported within Nagios is OK. I changed the string it is looking for to something that is there and the output changes, but the status remains OK. All of our other checks are working correctly; it's only this custom check that has issues :/
dLans
Posts: 40
Joined: Tue May 27, 2014 1:54 am

Re: Bash script exit codes

Post by dLans »

Even when I completely remove all text and then simply type:

Code: Select all

STATE_CRITICAL=2
exit $STATE_CRITICAL
I still get the OK status within Nagios (but the output is obviously gone). GRR!
dLans
Posts: 40
Joined: Tue May 27, 2014 1:54 am

Re: Bash script exit codes

Post by dLans »

Okay, so it turned out to be in the command. I removed the $ARG1$ $ARG2$ values and now it instantly works as expected.

So from:
$USER1$/check_sap_DE30MRPRUN15:00EXTRA -H $HOSTADDRESS$ $ARG1$ $ARG2$
To:
$USER1$/check_sap_DE30MRPRUN15:00EXTRA -H $HOSTADDRESS$

I have no idea why, but I'm glad it is finally working =) Thank you both!
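The thread does not establish why the extra arguments mattered. For anyone who hits something similar, one way to see what a script actually receives is a small debugging helper like the one below. describe_args is a made-up name, and the address and values in the example call are placeholders, not taken from this setup.

```shell
#!/bin/bash
# describe_args (hypothetical debugging helper, not part of Nagios):
# print the argument count, then each argument in brackets so that
# empty or oddly split values become visible.
describe_args() {
    echo "argc=$#"
    local i=1 arg
    for arg in "$@"; do
        echo "arg$i=[$arg]"
        i=$((i + 1))
    done
}

# Example call with placeholder values, including one empty argument.
describe_args -H 192.0.2.10 "" "pr1"
```

Calling describe_args "$@" at the top of the check script, with its output sent to a temporary log file, would show what the command definition passes in for $ARG1$ and $ARG2$.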
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Bash script exit codes

Post by dwhitfield »

If you were running it through NRPE and not escaping the arguments correctly, that could be a reason. (no indication of that in the thread, but that's the first thing that comes to mind)

That said, it sounds like this issue has been resolved. Is it okay if we lock this thread? Thanks for choosing the Nagios forums!