Re: Service checks go stale 30s after passive check received
Posted: Thu Dec 07, 2017 4:17 am
by invade
tgriep wrote: Do you ever see that service go into an OK state after receiving the Passive check?
Isn't that what the log shows? E.g.:
[Wed Nov 22 22:39:21 2017] PASSIVE SERVICE CHECK: host;System-Partitions;0;DISK OK
Passive check received OK.
[Wed Nov 22 22:39:21 2017] SERVICE ALERT: host;System-Partitions;OK;HARD;1;DISK OK
Service set to an OK state.
[Wed Nov 22 22:40:01 2017] Warning: The results of service 'System-Partitions' on host 'host' are stale by 0d 1h 15m 30s (threshold=0d 0h 14m 0s). I'm forcing an immediate check of the service.
Warning about service checks being stale.
[Wed Nov 22 22:40:11 2017] SERVICE ALERT: host;System-Partitions;CRITICAL;HARD;1;CRITICAL: No Recent Passive Service Checks.
Service set to a CRITICAL state.
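To put numbers on the log excerpt above, here is a minimal Python sketch that parses the two relevant lines. It only does arithmetic on the text shown in this thread; the helper names `parse_stale` and `log_time` are made up for illustration and are not part of Nagios.

```python
import re
from datetime import datetime, timedelta

# Matches the "stale by ... (threshold=...)" part of the warning quoted above.
STALE_RE = re.compile(
    r"stale by (\d+)d (\d+)h (\d+)m (\d+)s "
    r"\(threshold=(\d+)d (\d+)h (\d+)m (\d+)s\)"
)


def parse_stale(line):
    """Return (staleness, threshold) as timedeltas, or None if no match."""
    m = STALE_RE.search(line)
    if m is None:
        return None
    n = [int(x) for x in m.groups()]
    stale = timedelta(days=n[0], hours=n[1], minutes=n[2], seconds=n[3])
    threshold = timedelta(days=n[4], hours=n[5], minutes=n[6], seconds=n[7])
    return stale, threshold


def log_time(line):
    """Parse the leading [Wed Nov 22 22:39:21 2017] timestamp of a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")


received = "[Wed Nov 22 22:39:21 2017] PASSIVE SERVICE CHECK: host;System-Partitions;0;DISK OK"
warning = (
    "[Wed Nov 22 22:40:01 2017] Warning: The results of service "
    "'System-Partitions' on host 'host' are stale by 0d 1h 15m 30s "
    "(threshold=0d 0h 14m 0s). I'm forcing an immediate check of the service."
)

elapsed = log_time(warning) - log_time(received)  # time since the passive check arrived
stale, threshold = parse_stale(warning)           # what the warning itself reports

print("elapsed since passive check:", elapsed)
print("reported staleness:", stale, "threshold:", threshold)
```

The warning was logged only 40 seconds after the passive check arrived, yet it reports the result as 1h 15m 30s stale, so the staleness cannot be based on when the server received the result.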
Apologies if I've misunderstood.
Re: Service checks go stale 30s after passive check received
Posted: Thu Dec 07, 2017 1:58 pm
by tgriep
Sorry, I wasn't clear. I meant in the GUI.
Re: Service checks go stale 30s after passive check received
Posted: Fri Dec 08, 2017 5:02 am
by invade
Problem is resolved. The issue was that the clock was wrong on all the hosts where the service checks kept going stale.
The attached screenshot was taken just after a check was successfully received from the host:
[Fri Dec 8 09:31:18 2017] PASSIVE SERVICE CHECK: ccc-gr-aio;Service-Asterisk;0;OK - Asterisk Service is running / Dec-08 08:50:01 EET
So the check was received on the Nagios server at 09:31:18 GMT (11:31:18 EET), but the clock on the host read 08:50:01 EET, i.e. 06:50:01 GMT. This seems to have set the "Last Check" time to 06:50:01 GMT, which is older than the freshness threshold, so the check was marked stale.
So, although the Nagios server was receiving checks regularly, the check results must carry some sort of timestamp, and because that timestamp was wrong, the server concluded that each result was already hours old and therefore stale.
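The arithmetic behind that can be sketched in a few lines of Python. This is only an illustration of the clock skew, not how Nagios computes freshness internally. Two assumptions: EET is treated as a fixed UTC+2 offset, and the 14-minute threshold is borrowed from the earlier System-Partitions warning, since the threshold for Service-Asterisk isn't shown in this thread.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EET modelled as a fixed UTC+2 offset (winter time, no DST).
EET = timezone(timedelta(hours=2), "EET")

# When the Nagios server logged the passive check (server log is in GMT).
received_at = datetime(2017, 12, 8, 9, 31, 18, tzinfo=timezone.utc)

# What the host's own clock said, taken from the plugin output in the log line.
host_clock = datetime(2017, 12, 8, 8, 50, 1, tzinfo=EET)

# Assumption: same threshold as in the earlier System-Partitions warning.
threshold = timedelta(minutes=14)

host_clock_gmt = host_clock.astimezone(timezone.utc)
skew = received_at - host_clock_gmt   # how far behind the host clock was
looks_stale = skew > threshold        # an "old" timestamp beyond the threshold

print("host clock in GMT:", host_clock_gmt.time())
print("skew:", skew)
print("older than threshold:", looks_stale)
```

With these values the host clock was 2h 41m 17s behind, far more than the 14-minute threshold, which fits with every result looking stale the moment it arrived.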
We have now corrected the time on the hosts and the checks are no longer going stale.
Many thanks for the assistance with this problem. We now have a much quieter alerting system.
Re: Service checks go stale 30s after passive check received
Posted: Fri Dec 08, 2017 2:08 pm
by kyang
Glad you resolved your issue!
Did you have any more questions or are we okay to lock this up?
Re: Service checks go stale 30s after passive check received
Posted: Sun Dec 10, 2017 7:28 am
by invade
All sorted now. This thread can be marked as resolved. Thanks.
Re: Service checks go stale 30s after passive check received
Posted: Mon Dec 11, 2017 10:03 am
by kyang
Sounds good! I'll be closing this thread!
If you have any more questions, feel free to create another thread.
Thanks for using the Nagios Support Forum!