[Nagios-devel] Nagios sometimes shows wrong status


Hi!

I've seen strange behavior from Nagios with a very simple check script.

The relevant part of the script:
#########################################################################
# Count the metastat lines that mention a maintenance or resync state.
MAINTCNT="`/usr/sbin/metastat |grep -i maint |wc -l`"
RESYNCNT="`/usr/sbin/metastat |grep -i resync |wc -l`"

NOTOK=0
status=$STATE_UNKNOWN

# A resync in progress is reported as WARNING ...
if [ $RESYNCNT -gt 0 ]; then
    NOTOK=1
    TEXT="WARNING - One or more disks are in resync state."
    status=$STATE_WARNING
fi

# ... but a disk in maintenance state overrides it with CRITICAL.
if [ $MAINTCNT -gt 0 ]; then
    NOTOK=1
    TEXT="CRITICAL - One or more disks are in maintenance state."
    status=$STATE_CRITICAL
fi

# On a problem: print the message, log the exit code for debugging, exit.
if [ $NOTOK -eq 1 ]; then
    echo "$TEXT"
    datum=`date`
    echo $datum $status >> /tmp/svm.debug
    exit $status
fi

echo "OK - There is no maintenance necessary!"
exit $STATE_OK

#########################################################################
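
The logic above can also be written so that metastat runs only once and its output is classified in a single pass. This is only a minimal sketch, not the script in use: classify_svm is a hypothetical helper name, the exit codes are the standard Nagios plugin values (assumed here, because the excerpt does not show where the STATE_* variables are defined), and a POSIX shell is assumed.

```shell
#!/bin/sh
# Standard Nagios plugin exit codes (assumption: the original script
# defines or sources these somewhere outside the excerpt).
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# classify_svm (hypothetical helper): reads metastat-style text on stdin,
# prints one status line and returns the matching plugin exit code.
# Maintenance takes precedence over resync, as in the original script.
classify_svm() {
    output=$(cat)
    if printf '%s\n' "$output" | grep -i maint >/dev/null; then
        echo "CRITICAL - One or more disks are in maintenance state."
        return $STATE_CRITICAL
    fi
    if printf '%s\n' "$output" | grep -i resync >/dev/null; then
        echo "WARNING - One or more disks are in resync state."
        return $STATE_WARNING
    fi
    echo "OK - There is no maintenance necessary!"
    return $STATE_OK
}

# Intended use on the monitored host (not executed here):
#   /usr/sbin/metastat | classify_svm
#   exit $?
```

Because the message and the exit code come from the same branch, they cannot disagree inside the script itself; if Nagios still records OK with a CRITICAL message, the mismatch would have to arise after the script exits.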

When I execute the script from the command line, the return code is always 2 and the output is always "CRITICAL - One or more disks are in maintenance state." (because there is one dead disk), which is correct.

When Nagios executes the script, the output is always "CRITICAL - One or more disks are in maintenance state." but the return code is sometimes 0 and sometimes 2, which is wrong.

Snippet from nagios.log:
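
To compare the exit code the script believes it returns with the one its caller actually receives, the check could be run through a small logging wrapper. This is a sketch under assumptions: run_and_log is a hypothetical name, the log path is whatever the caller passes in, and a POSIX shell is assumed.

```shell
#!/bin/sh
# run_and_log (hypothetical debugging wrapper).
# Usage: run_and_log LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appends a timestamp, the exit code and the output to
# LOGFILE, then re-emits the output and returns the same exit code.
run_and_log() {
    logfile=$1
    shift
    rc=0
    output=$("$@" 2>&1) || rc=$?
    printf '%s rc=%s output=%s\n' "$(date)" "$rc" "$output" >> "$logfile"
    printf '%s\n' "$output"
    return "$rc"
}
```

The wrapper sees the exit code from the outside, the same way any parent process does, so its log can be lined up against the entries the script writes itself.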
[1243410051] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243410063] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410061
[1243410071] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243410083] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410081
[1243410091] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243410124] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410122
[1243410131] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411031] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243411316] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411323] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411320
[1243411326] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243411363] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411361
[1243411366] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411370] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411368
[1243411376] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243411391] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411389
[1243411396] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411398] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411396
[1243411406] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;3;CRITICAL - One or more disks are in maintenance state.
[1243411407] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411405



/tmp/svm.debug confirms the command-line result:
> cat /tmp/svm.debug
Wed May 27 08:21:33 GMT 2009 2
Wed May 27 08:22:28 GMT 2009 2
Wed May 27 08:22:39 GMT 2009 2
Wed May 27 08:22:46 GMT 2009 2
Wed May 27 08:23:00 GMT 2009 2
Wed May 27 08:23:11 GMT 2009 2
Wed May 27 08:23:46 GMT 2009 2
Wed May 27 08:24:01 GMT 2009 2
Wed May 27 08:27:09 GMT 2009 2
Wed May 27 08:27:19 GMT 2009 2
Wed May 27 08:27:35 GMT 2009 2
Wed May 27 08:27:50 GMT 2009 2
Wed May 27 08:27:56 GMT 2009 2
Wed May 27 08:29:01 GMT 2009 2
Wed May 27 08:32:55 GMT 2009 2
Wed May 27 08:34:01 GMT 2009 2
Wed May 27 08:37:55 GMT 2009 2
Wed May 27 08:39:01 GMT 2009 2
Wed May 27 08:39:55 GMT 2009 2
Wed May 27 08:44:01 GMT 2009 2
Wed May 27 08:44:55 GMT 2009 2

and so on...

Any ideas what is going wrong here?


Best regards,
Michael

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]