Nagios mess up with passive service checks

Posted: Wed Aug 08, 2012 7:41 am
by nagsteff
Hi all,

I'm new to this forum, but I have been working with Nagios for a few years.
For most problems I found a solution on the Internet, but not for this one:

I use Nagios Core 3.3.1.
I receive SNMP traps and submit them to the nagios.cmd file as passive checks. This works fine; I use only the CRITICAL and OK states.
-> Nagios shows me the service states, all OK.
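
A minimal sketch of such a submission, for readers unfamiliar with the mechanism. The original script was not posted, so the function names here are hypothetical; only the `PROCESS_SERVICE_CHECK_RESULT` line format (timestamp, host, service, return code, output) is the standard external command syntax.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the usual plugin convention: 0 = OK, 2 = CRITICAL.
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit_passive_result(command_file, host, service, return_code, output):
    """Write a single result line to the Nagios command file (a named pipe)."""
    # command_file is whatever `command_file` points to in nagios.cfg.
    with open(command_file, "w") as pipe:
        pipe.write(format_passive_result(host, service, return_code, output))
```

A trap handler would call `submit_passive_result(...)` once per received trap, with 2 for SignalLost and 0 for SignalFound.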

Now the problem: sometimes I get the two traps (CRITICAL and OK) in very quick succession, one right after the other. Nagios then ends up with the wrong status.

For context: SignalLost = CRITICAL, SignalFound = OK
This is in the log:

11:38:20 nagios-srv nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;172.16.1.26;Video_248;2;SignalLost Video_248
11:38:21 nagios-srv nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;172.16.1.26;Video_248;0;SignalFound Video_248
11:38:22 nagios-srv nagios: PASSIVE SERVICE CHECK: FAST-SRV-DVR17;Video_248;0;SignalFound Video_248
11:38:22 nagios-srv nagios: SERVICE ALERT: FAST-SRV-DVR17;Video_248;OK;HARD;1;SignalFound Video_248
11:38:22 nagios-srv nagios: PASSIVE SERVICE CHECK: FAST-SRV-DVR17;Video_248;2;SignalLost Video_248
11:38:22 nagios-srv nagios: SERVICE ALERT: FAST-SRV-DVR17;Video_248;CRITICAL;HARD;1;SignalLost Video_248

=> The service should now be OK, but it is CRITICAL!
The external commands arrive in the right order: first "SignalLost" (CRITICAL), then "SignalFound" (OK).
But Nagios then processes the passive service checks in the opposite order: first "SignalFound" (OK), then "SignalLost" (CRITICAL). -> Now I have the wrong status!
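
With about 300 services, spotting this by eye is tedious. The following is only a rough sketch of how the reordering could be detected from the log: it compares the order of return codes in the EXTERNAL COMMAND lines with the order in the PASSIVE SERVICE CHECK lines for one service. The function name is hypothetical and the field positions are assumed from the log excerpt above.

```python
LOG = """\
11:38:20 nagios-srv nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;172.16.1.26;Video_248;2;SignalLost Video_248
11:38:21 nagios-srv nagios: EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;172.16.1.26;Video_248;0;SignalFound Video_248
11:38:22 nagios-srv nagios: PASSIVE SERVICE CHECK: FAST-SRV-DVR17;Video_248;0;SignalFound Video_248
11:38:22 nagios-srv nagios: PASSIVE SERVICE CHECK: FAST-SRV-DVR17;Video_248;2;SignalLost Video_248
"""


def status_order(log_text, service):
    """Return (submitted, processed) return-code sequences for one service."""
    submitted, processed = [], []
    for line in log_text.splitlines():
        fields = line.split(";")
        if "EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT" in line:
            # fields: prefix, host, service, return code, output
            if len(fields) >= 4 and fields[2] == service:
                submitted.append(int(fields[3]))
        elif "PASSIVE SERVICE CHECK:" in line:
            # fields: prefix + host, service, return code, output
            if len(fields) >= 3 and fields[1] == service:
                processed.append(int(fields[2]))
    return submitted, processed


submitted, processed = status_order(LOG, "Video_248")
if submitted != processed:
    print("Results for Video_248 were processed out of order")
```

The host is matched by service description only, because the command uses the address (172.16.1.26) while the log entry uses the host name (FAST-SRV-DVR17).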

Can somebody help me? I have already gone through the settings in the Nagios config file, but didn't find anything related to this behaviour.
This happens every few days, and I have about 300 services like this.

Thank you very much
Steff

Re: Nagios mess up with passive service checks

Posted: Thu Aug 09, 2012 8:24 am
by nagsteff
I added a one-second sleep to my script, and this solved the problem.
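
For anyone with the same problem, a minimal sketch of what such a workaround might look like. The actual script was not posted, so the function name, the command-file path, and the exact placement of the sleep (here: after every submission) are assumptions.

```python
import time

# Assumed path; use the value of `command_file` from your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def submit_then_pause(host, service, return_code, output,
                      command_file=COMMAND_FILE,
                      pause_seconds=1.0, sleep=time.sleep):
    """Write one passive check result, then wait before returning.

    The pause keeps two results for the same service (for example
    SignalLost followed by SignalFound) at least pause_seconds apart.
    """
    line = (
        f"[{int(time.time())}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )
    with open(command_file, "w") as pipe:
        pipe.write(line)
    # Assumed placement of the sleep: right after each submission.
    sleep(pause_seconds)
```

Note that a blocking sleep slows down the trap handler itself, so with many services sending traps at once the one-second delays add up.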