Re: [Nagios-devel] Nagios sometimes shows wrong status

Hi,

sorry, here is the right snippet from nagios.log:

[1243412547] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412545
[1243412553] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;3;CRITICAL - One or
more disks are in maintenance state.
[1243412558] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412556
[1243412565] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412563
[1243412579] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412577
[1243412583] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL -
One or more disks are in maintenance state.
[1243412590] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412588
[1243412593] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or
more disks are in maintenance state.
[1243412625] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412623
[1243412828] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412826
[1243412838] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412836
[1243412854] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412851
[1243412869] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412866
[1243412875] EXTERNAL COMMAND:
SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243412866
...
[1243413483] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL -
One or more disks are in maintenance state.
...
[1243413603] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or
more disks are in maintenance state.
[1243413903] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL -
One or more disks are in maintenance state.
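
The contradiction in the entries above (state field OK while the plugin output starts with CRITICAL) can be picked out of a larger log mechanically. This is only a rough sketch: it assumes the SERVICE ALERT field layout seen in the snippet (host;service;state;state type;attempt;output) and that each alert sits on a single line in the real nagios.log (the wrapping above comes from the mail client). The function name find_mismatches is made up for illustration.

```shell
#!/bin/sh
# find_mismatches: read nagios.log lines on stdin and print every
# SERVICE ALERT whose state field disagrees with the leading word of
# the plugin output (for example state OK, output "CRITICAL - ...").
find_mismatches() {
    awk -F';' '
        /SERVICE ALERT:/ {
            state = $3
            # First word of the plugin output, e.g. "CRITICAL"
            split($6, words, " ")
            claimed = words[1]
            # Only compare when the output starts with a known state name
            if (claimed ~ /^(OK|WARNING|CRITICAL|UNKNOWN)$/ && claimed != state)
                print
        }
    '
}

# Usage: find_mismatches < nagios.log
```

Run against the snippet above (with the wrapped lines rejoined), this should list the OK;SOFT entries and skip the CRITICAL;SOFT ones.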


and the clocks of the two systems are in sync :-)

# date && ssh acgweb1 date
Wednesday, May 27, 2009 2:02:20 PM GMT
Wed May 27 14:02:20 GMT 2009


regards,
michael

> On 27/05/09 04:52 AM, Michael Prochaska wrote:
>> Hi!
>>
>> I've seen a strange behavior of nagios with a very simple check script.
>>
>> the relevant part of the script:
>> #########################################################################
>> MAINTCNT="`/usr/sbin/metastat |grep -i maint |wc -l`"
>> RESYNCNT="`/usr/sbin/metastat |grep -i resync |wc -l`"
>>
>> NOTOK=0
>> status=$STATE_UNKNOWN
>>
>> if [ $RESYNCNT -gt 0 ]; then
>> NOTOK=1
>> TEXT="WARNING - One or more disks are in resync state. "
>> status=$STATE_WARNING
>> fi
>>
>> if [ $MAINTCNT -gt 0 ]; then
>> NOTOK=1
>> TEXT="CRITICAL - One or more disks are in maintenance state."
>> status=$STATE_CRITICAL
>> fi
>>
>>
>> if [ $NOTOK -eq 1 ]; then
>> echo $TEXT
>> datum=`date`
>> echo $datum $status >> /tmp/svm.debug
>> exit $status
>> fi
>>
>> echo "OK - There is no maintenance necessary!"
>> exit $STATE_OK
>>
>> #########################################################################
>>
>> When executing the script from the command line, the return code is always 2
>> and the output is always "CRITICAL - One or more disks are in maintenance
>> state." (because there is one dead disk) => that's ok
>>
>> When Nagios executes the script, the output is always "CRITICAL - One or
>> more disks are in maintenance state." but the return code is sometimes 0
>> and sometimes 2 => that's not good
>>
>> snippet from nagios.log:
>> [1243410051] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL -
>> One or more disks are in maintenance state.
>> [1243410063] EXTERNAL COMMAND:
>> SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410061
>> [1243410071] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or
>> more disks are in maintenance state.
>> [1243410083] EXTERNAL COMMAND:
>> SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410081
>> [1243410091] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL -
>> One or more disks are in maintenance state.
>> [1243410124] EXTERNAL COMMAND:
>> SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410122
>> [1243410131] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or
>> more disks are in maintenance state.
>> [1243411031] SERVICE ALERT: acgweb1;BASIC_

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]