
NRDS Host Down and Passive Checks

Posted: Fri Sep 02, 2016 2:30 pm
by blariv
hi,

We had a Red Hat VM server go down. However, during the outage the passive disk/swap/NTP checks never showed any interruption, while ping showed a noticeable drop. I have attached a screenshot of what I'm looking at. I'm not sure why we would still be collecting stats while the server is down.

Also, here is a snippet of the event log showing that the server was down but flapping.

Host Notification | 2016-09-02 01:45:41 | HOST NOTIFICATION: wilsonh;bihana04.na.hasbro.com;FLAPPINGSTART (UP);xi_host_notification_handler;1
Runtime Warning | 2016-09-02 01:45:41 | HOST FLAPPING ALERT: bihana04.na.hasbro.com;STARTED; Host appears to have started flapping (23.9% change > 20.0% threshold)
Host Recovery | 2016-09-02 01:45:41 | HOST ALERT: bihana04.na.hasbro.com;UP;HARD;2;1
Host Down | 2016-09-02 01:45:35 | HOST ALERT: bihana04.na.hasbro.com;DOWN;SOFT;1;CRITICAL - bihana04.na.hasbro.com: rta nan, lost 100%
Host Recovery | 2016-09-02 01:45:23 | HOST ALERT: bihana04.na.hasbro.com;UP;HARD;4;1
Host Down | 2016-09-02 01:44:50 | HOST ALERT: bihana04.na.hasbro.com;DOWN;SOFT;3;CRITICAL - bihana04.na.hasbro.com: rta nan, lost 100%
Host Down | 2016-09-02 01:43:38 | HOST ALERT: bihana04.na.hasbro.com;DOWN;SOFT;2;CRITICAL - bihana04.na.hasbro.com: rta nan, lost 100%
Host Down | 2016-09-02 01:42:26 | HOST ALERT: bihana04.na.hasbro.com;DOWN;SOFT;1;CRITICAL - bihana04.na.hasbro.com: rta nan, lost 100%

thanks!

Re: NRDS Host Down and Passive Checks

Posted: Sun Sep 04, 2016 9:45 pm
by eloyd
Nagios server doesn't know that the remote host is down for passive checks. Unless you have freshness checking enabled and specify a "I haven't heard anything in 30 minutes, time to trigger an alarm" service check, then Nagios will assume the service continues in whatever state it was in prior to the server downtime.
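
For reference, here is a minimal sketch of what such a freshness check could look like as a Nagios object definition. The service description, the `generic-service` template, and the `passive_stale_critical` command name are assumptions for illustration, and the 1800-second threshold simply mirrors the "30 minutes" mentioned above. When the last passive result is older than the threshold, Nagios runs the service's `check_command` even though active checks are otherwise disabled for it.

```
define service {
    use                     generic-service          ; assumed template
    host_name               bihana04.na.hasbro.com
    service_description     Disk Usage               ; hypothetical, must match the name NRDS submits
    active_checks_enabled   0                        ; results normally arrive passively
    passive_checks_enabled  1
    check_freshness         1                        ; watch the age of the last passive result
    freshness_threshold     1800                     ; seconds (30 minutes) before a result counts as stale
    check_command           passive_stale_critical   ; hypothetical command, assumed to be defined elsewhere
}
```

In Nagios XI these settings would typically be made through the Core Config Manager rather than by editing files by hand.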

Re: NRDS Host Down and Passive Checks

Posted: Tue Sep 06, 2016 9:28 am
by rkennedy
eloyd wrote: Nagios server doesn't know that the remote host is down for passive checks. Unless you have freshness checking enabled and specify a "I haven't heard anything in 30 minutes, time to trigger an alarm" service check, then Nagios will assume the service continues in whatever state it was in prior to the server downtime.
Thanks @eloyd!

This is correct. If you're concerned about it, it might be worth assigning check_dummy 0 as the check command for the service. That command is what runs as the active check when nothing has been heard from the passive checks, and it will always come back in an OK state.
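
A rough sketch of how check_dummy might be wired up as that command. Both command names are hypothetical, and `$USER1$` is assumed to point at the plugin directory as in a default install. The first argument to check_dummy is the state to return (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so state 0 puts the service in OK whenever the command runs, while a nonzero state would turn a missing passive result into an alert instead.

```
define command {
    command_name    passive_stale_ok          ; hypothetical name
    command_line    $USER1$/check_dummy 0 "No passive result received, assuming OK"
}

define command {
    command_name    passive_stale_critical    ; hypothetical name
    command_line    $USER1$/check_dummy 2 "No passive result received"
}
```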

Re: NRDS Host Down and Passive Checks

Posted: Tue Sep 06, 2016 9:43 am
by blariv
thanks!!

Re: NRDS Host Down and Passive Checks

Posted: Tue Sep 06, 2016 10:12 am
by lmiltchev
@blariv is it all right if we close this topic?

Re: NRDS Host Down and Passive Checks

Posted: Tue Sep 06, 2016 10:16 am
by blariv
absolutely and thanks again for the help.