Host checks not changing even when system down.

EchoKev
Posts: 40
Joined: Tue Jul 02, 2013 11:35 am

Post by EchoKev »

I took over this Nagios instance, and I am trying to figure out why we get so many notifications when our IPSec connection to a remote site randomly goes down. I was under the assumption that when a host becomes unreachable or down, notifications for the service checks under it would not be sent either, i.e. service notifications depend on the host being up.

What we are experiencing is that when the IPSec connection goes down, none of the service checks are able to connect and every one of them sends an alert. When I look at the host checks in Nagios, they say the hosts have been OK for 160+ days, while all the services under those hosts are down (or, after the IPSec connection is back, all of them have been up for the same number of minutes).

I verified this by creating a new host, setting its IP to a valid address, and forcing the check so that it went green. I then changed the IP address in the config to an invalid one and restarted Nagios, and the host stays green/OK even though its IP is no longer reachable.

Is there any way to determine why Nagios says the hosts are still OK when they are unreachable, even though the status shows it was last updated 10 seconds ago?

System is CentOS 6.4
Nagios version is 4.0.1

Any help to stop the 500 emails we are getting when the IPSec dies at 2am would be greatly appreciated.
eloyd
Cool Title Here
Posts: 2190
Joined: Thu Sep 27, 2012 9:14 am
Location: Rochester, NY

Re: Host checks not changing even when system down.

Post by eloyd »

Without knowing why IPsec comes into play for your host check, it's hard to say. All other things being equal, you could always use a dependency to make sure that service X on host X requires service Y on host Y to be in a particular state before service check X will be triggered.
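
A minimal sketch of such a dependency, assuming hypothetical object names (`vpn-gateway`, `IPsec Tunnel`, `remote-web01`, `HTTP`) that do not come from this thread; substitute your own hosts and services:

```cfg
# Hypothetical names: "IPsec Tunnel" on vpn-gateway is the master service,
# "HTTP" on remote-web01 is the service that depends on it.
define servicedependency {
    host_name                       vpn-gateway
    service_description             IPsec Tunnel
    dependent_host_name             remote-web01
    dependent_service_description   HTTP
    execution_failure_criteria      c,u    ; skip active checks of the dependent service while the master is CRITICAL or UNKNOWN
    notification_failure_criteria   c,u    ; suppress its notifications in the same master states
}
```

With one definition per dependent service this gets verbose quickly, so it probably suits a handful of services better than a whole remote site.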
Eric Loyd • http://everwatch.global • 844.240.EVER • @EricLoyd
I'm a Nagios Fanatic! • Join our public Nagios Discord Server!

Re: Host checks not changing even when system down.

Post by EchoKev »

The only thing the IPSec has to do with the Nagios checks is that it connects the network Nagios runs in to the remote network. When the IPSec tunnel goes down, Nagios cannot reach the remote network.

The issue is that the host checks never fail, so we get hundreds of alerts for the services that are then unreachable.

Nagios is not actually checking the hosts to see if they are up. It seems not to be running any host checks at all, and just reports the state the hosts were in when the check was last forced.

For example, I set up a host that worked, then changed the host config file to point to an IP that does not belong to any host, so that Nagios should report the host as down. I restarted Nagios, and it still says the host is fine.

Code: Select all

Host State Information
Host Status:	UP (for 0d 1h 23m 35s+)
Status Information:	PING OK - Packet loss = 0%, RTA = 10.11 ms
Performance Data:	rta=10.111000ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
Current Attempt:	1/10  (HARD state)
Last Check Time:	05-29-2014 15:09:03
Check Type:	ACTIVE
Check Latency / Duration:	0.000 / 0.029 seconds
Next Scheduled Active Check:  	05-30-2014 15:19:03
Last State Change:	N/A
Last Notification:	N/A (notification 0)
Is This Host Flapping?	N/A
In Scheduled Downtime?	NO
Last Update:	05-30-2014 15:17:54  ( 0d 0h 0m 5s ago)
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: Host checks not changing even when system down.

Post by abrist »

I would make a host which uses the ipsec check as its host check. Make it the parent of everything on the other side of the tunnel - that way, if the tunnel is down, all the children are marked as unreachable.
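
A rough sketch of that layout, under stated assumptions: the host names, the addresses (taken from a documentation range), the `generic-host` template and the `check_ipsec_tunnel` command are all hypothetical placeholders, not objects from this thread.

```cfg
# Placeholder names and addresses throughout. check_ipsec_tunnel stands in
# for whatever command you use to test the tunnel.
define host {
    use                 generic-host        ; assumes a template with this name exists
    host_name           ipsec-tunnel
    alias               IPsec tunnel to remote site
    address             192.0.2.1
    check_command       check_ipsec_tunnel  ; hypothetical command
    max_check_attempts  3
}

define host {
    use                 generic-host
    host_name           remote-web01
    address             192.0.2.10
    check_command       check-host-alive
    max_check_attempts  3
    parents             ipsec-tunnel        ; if this host's check fails while its parent is DOWN, it is marked UNREACHABLE
}
```

A child is only marked UNREACHABLE when its own host check fails, so this depends on the host checks actually running. Whether UNREACHABLE hosts notify is controlled by the `u` flag in each host's `notification_options`.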
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.

Re: Host checks not changing even when system down.

Post by eloyd »

abrist wrote:I would make a host which uses the ipsec check as its host check. Make it the parent of everything on the other side of the tunnel - that way, if the tunnel is down, all the children are marked as unreachable.
Bingo. I was just trying to figure out how to write that when Andy posted this. :-)

Re: Host checks not changing even when system down.

Post by EchoKev »

I will try that for dealing with the notifications, but it doesn't explain why Nagios isn't changing the status of the host or actually running the host check.

Thanks again.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Host checks not changing even when system down.

Post by tmcdonald »

What happens when you:

1.) Manually run a ping against that non-existent host from the Nagios server?

2.) Run the check_ping plugin against the non-existent host from the Nagios server?
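
For step 2, it helps to run the plugin with the same arguments the host check uses. The thresholds in the performance data posted earlier (3000/5000 ms RTA, 80%/100% packet loss) look like those of the stock `check-host-alive` command from the sample `commands.cfg`. It is shown here as an assumption about this setup; the definition on the actual server may differ.

```cfg
# Stock sample definition. Verify it against the commands.cfg on your server.
define command {
    command_name    check-host-alive
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
```

If `$USER1$` points at `/usr/local/nagios/libexec` (the usual default, but an assumption here), the equivalent manual run would be `/usr/local/nagios/libexec/check_ping -H <address> -w 3000.0,80% -c 5000.0,100% -p 5`, ideally executed as the user Nagios runs as.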
Former Nagios employee

Re: Host checks not changing even when system down.

Post by EchoKev »

tmcdonald wrote:What happens when you:

1.) Manually run a ping against that non-existent host from the Nagios server?

2.) Run the check_ping plugin against the non-existent host from the Nagios server?

1) --- xxx.xxx.xxx.xxx ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4224ms

2) It gives the expected error: CRITICAL - Host Unreachable

Re: Host checks not changing even when system down.

Post by tmcdonald »

Go ahead and PM me your /usr/local/nagios/var/objects.cache file and I'll take a look.

Tech Note: PM received, stored in appropriate location on network drive
Former Nagios employee

Re: Host checks not changing even when system down.

Post by EchoKev »

PM sent.