
Exit Code 2 but still OK status in Nagios XI

Posted: Tue Sep 17, 2013 3:34 am
by fogier
We have a problem with multiple services (different scripts). They exit with exit code 2, but Nagios still shows them with a green icon (OK status).
We can force an error by locking a specific account on an Oracle database.

check_oracle_health.ng

Code:

[root@SERVER libexec]# ./check_nrpe -H server -c check_oracle_db1 -a "--mode=connection-time --method=sqlplus --units=% --warning=1 --critical=2"
Oracle check dbsid: CRITICAL
cannot connect to nagios/***@dbsid. ORA-28000: the account is locked  .
Knipsel.PNG
nrpe.cfg

Code:

command[check_oracle_db1]=/usr/bin/sudo -u oracle /beheer/scripts/nagios/check_oracle_health.ng 1 $ARG1$
Script check_oracle_health.ng

Code:

SIDNR=$1
ORATAB=/etc/oratab
VLGNR=0
# extract the value of the 2nd parameter, which contains mode=; needed to make the chkfile unique
MODE=`echo "$2" | awk -F= '{print $2}' -`
CHKFILE="/tmp/check_oracle_health_$SIDNR$MODE.dat"
rm $CHKFILE

cat $ORATAB | while read LINE
do
  case $LINE in
        *\#*)                ;;        #comment-line in oratab
        *)
        if [ "`echo $LINE | awk -F: '{print $3}' -`" != "A" ] ; then
        if [ "`echo $LINE | awk -F: '{print $3}' -`" = "Y" ] ; then
            VLGNR=$((VLGNR + 1))

            if [ "$VLGNR" = "$SIDNR" ] ; then
               DB=`echo $LINE | awk -F: '{print $1}' -`
               OH=`echo $LINE | awk -F: '{print $2}' -`
               ID="nagios/nagios@"$DB

               PARM=$2" "$3" "$4" "$5" "$6" --connect="$ID" --environment=ORACLE_HOME="$OH

#              echo "Db "$DB
#              echo "parm:" $PARM
#              echo "seq. no. sid:" $VLGNR" and parm is "$1" and sid is "$DB

               touch $CHKFILE

               UITVOER=`/beheer/scripts/nagios/check_oracle_health $PARM`
               echo "${UITVOER}"
               if [ "`echo ${UITVOER} | awk '{print $4}' -`" = "CRITICAL" ] ; then
                   exit 2
               fi
               if [ "`echo ${UITVOER} | awk '{print $4}' -`" = "OK" ] ; then
                   exit 0
               fi

#              RET=$?
#              exit $RET
               echo "Reaching this point should not be possible"
               exit 1

            fi
        fi
        fi
        ;;
   esac
done

# Variables set inside the while loop cannot be read outside it.
# That is why we rely on whether a small file exists or not.
if [ ! -f "$CHKFILE" ] ; then

    echo "Oracle check sid seq. no. $SIDNR: CRITICAL"
    echo "No ORATAB entry $SIDNR found at: " `date`
    exit 2

fi
I've tested this on Nagios XI 2012 v1.6 and v2.3, with the same result. We use the CentOS 64-bit vSphere appliance from your website.

Re: Exit Code 2 but still OK status in Nagios XI

Posted: Tue Sep 17, 2013 8:32 am
by BanditBBS
It looks to me like the "exit 2" is being returned as text instead of as an exit code. It should not appear in the returned text, as it does in the image you attached. I can't see why that's happening in the script, but I haven't looked hard yet.
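
One way to tell the two apart is to print the exit status right after a run. Nagios maps status 0 to OK, 1 to WARNING, 2 to CRITICAL and 3 to UNKNOWN, whatever the text says. A minimal sketch, where fake_plugin is a hypothetical stand-in for the real check_nrpe call:

```shell
#!/bin/sh
# fake_plugin is a hypothetical stand-in for the real plugin call:
# it prints a status line and returns exit status 2 (CRITICAL).
fake_plugin() {
    echo "Oracle check dbsid: CRITICAL"
    return 2
}

# Capture the text and the exit status separately.
# The status of the assignment is the status of the command substitution.
OUTPUT=$(fake_plugin)
STATUS=$?

echo "text:   $OUTPUT"
echo "status: $STATUS"
```

On the Nagios server the same idea is to run the check_nrpe command by hand and then echo $? on the next line.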

Re: Exit Code 2 but still OK status in Nagios XI

Posted: Tue Sep 17, 2013 8:55 am
by fogier
Thanks for replying.

I edited the script in the meantime. We had added echo "exit 2" to the script to confirm that it really ends there.
After that we removed it again, as you can see in the script. Sorry for the misunderstanding.

Re: Exit Code 2 but still OK status in Nagios XI

Posted: Tue Sep 17, 2013 1:09 pm
by sreinhardt
So at this point is it working, or are you still having difficulties?

Re: Exit Code 2 but still OK status in Nagios XI

Posted: Wed Sep 18, 2013 12:45 am
by fogier
No, we still have the problem. I just removed the echo line between screenshots.

Re: Exit Code 2 but still OK status in Nagios XI

Posted: Wed Sep 18, 2013 9:52 am
by slansing
It's possibly exiting with a 0 before it reaches the end and exits with a 2, per these lines:

Code:

if [ "`echo ${UITVOER} | awk '{print $4}' -`" = "OK" ] ; then
                   exit 0
               fi
Have you tried commenting these out and seeing whether it makes it to the end, then actually exits without echoing 2?

Re: Exit Code 2 but still OK status in Nagios XI

Posted: Thu May 08, 2014 3:19 am
by fogier
There was indeed a problem in the script itself.
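
The thread does not say what the problem turned out to be. One plausible candidate, offered here as an assumption and not as the confirmed cause, is the "cat $ORATAB | while read LINE" construct. In bash (default settings) and dash, a loop on the receiving end of a pipe runs in a subshell, so an exit 2 inside it ends only that subshell and the script carries on after done. The final if then finds the check file, skips its body, and the script ends with status 0. The sketch below reproduces this with two hypothetical functions, piped_loop and redirected_loop, and a made-up oratab line:

```shell
#!/bin/sh
# Each function body is wrapped in "( ... )" so it behaves like a separate
# script whose exit status can be inspected. Names and sample data are
# illustrative only.

# Variant 1: loop fed by a pipe, as in the posted script.
piped_loop() (
    printf 'db1:/sample/home:Y\n' | while read -r LINE
    do
        exit 2          # ends only the subshell that runs the loop
    done
    if false ; then     # stands in for the final "[ ! -f $CHKFILE ]" test
        exit 2
    fi
    # Falling off the end: the status is that of the "if", which is 0.
)

# Variant 2: loop fed by a redirection, so it runs in the current shell.
redirected_loop() (
    while read -r LINE
    do
        exit 2          # ends the whole function body
    done <<EOF
db1:/sample/home:Y
EOF
)

piped_loop;      PIPED=$?
redirected_loop; REDIRECTED=$?
echo "piped loop status:      $PIPED"
echo "redirected loop status: $REDIRECTED"
```

If this is the cause, feeding the loop with a redirection from $ORATAB in place of the cat pipe, or saving $? right after done and exiting with it, would let the intended status reach NRPE. The same subshell behaviour would also explain the script's own comment that variables set inside the loop are not visible outside it.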