Services of unreachable host

diverseft
Posts: 40
Joined: Wed May 23, 2012 6:13 am

Services of unreachable host

Post by diverseft »

Hi Guys

My first post here. Hopefully I can explain this clearly :)

I have recently installed Nagios on an Ubuntu box and everything is working well. I have SNMP checks for server hardware as well as generic checks for Windows services, memory, CPU, etc. All works fine. On top of this I have my "Nagios network" organised with parent/child relationships to tell whether hosts are unreachable, to aid in troubleshooting and notification management.

Example:
Firewall - Managed Switch - Server/printer/other device (If firewall is down, managed switch and server/printer are unreachable).
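A chain like this is normally declared with the `parents` directive on each child host. The following is a minimal sketch only: the host names and addresses are hypothetical, and the templates are assumed to be the ones from the sample templates.cfg.

```
# Hypothetical host names and addresses, for illustration only.
define host {
    use         generic-host      ; assumed template name
    host_name   site1-firewall
    address     192.0.2.1
}

define host {
    use         generic-switch    ; assumed template name
    host_name   site1-switch
    address     192.0.2.2
    parents     site1-firewall    ; unreachable when the firewall is down
}

define host {
    use         windows-server    ; assumed template name
    host_name   site1-server
    address     192.0.2.10
    parents     site1-switch      ; unreachable when the switch is down or unreachable
}
```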

My question is this: I have one site that goes down regularly (ADSL line=poop). The firewall goes down and all hosts behind it fall into an unreachable state, so that is good. As all the child hosts are in an unreachable state, what status should the services of the child hosts be in?

The reason I ask is that when a site goes down, I only want to be notified about the device that is down. I don't want to be notified of all child hosts services.

What is currently happening is that check_SNMP services fall into an unknown state on the printer behind the firewall. Services using checks such as 'check_local_procs' or 'check_nt' fall into a critical state on the server behind the firewall. In my templates.cfg, I have a template that defines that I am notified of warning, critical and recovery for Windows hosts (not unknown), so the unknown state is fine as I won't get notified. Critical is where the problem lies. The critical error I get is, for example: "CRITICAL - Socket timeout after 10 seconds". I am just wondering whether Nagios should realise that a parent host is down and therefore put a child host's services into an unreachable state along with the host they belong to.
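For reference, a service template of the kind described above might look roughly like this. It is a sketch under assumptions: the template name is hypothetical and `generic-service` is assumed to exist as in the sample templates.cfg; only the `notification_options` line reflects what is described in this post.

```
# Hypothetical template name; only notification_options mirrors the post.
define service {
    name                  windows-service-example
    use                   generic-service   ; assumed parent template
    notification_options  w,c,r             ; warning, critical, recovery (no "u" for unknown)
    register              0                 ; template only, not a real service
}
```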

The only thing I can think of is that it is related to that particular plugin, which is designed to fall into a critical state if the result of the check is out of range. But I then go back to "should Nagios not hold off checking the child host/services until the parent is back up?"

Any advice would be massively appreciated

Thanks

T
agriffin
Posts: 876
Joined: Mon May 09, 2011 9:36 am

Re: Services of unreachable host

Post by agriffin »

Are you getting notifications about services whose hosts are down or unreachable? My understanding was that that shouldn't happen. If you are only concerned about the service states, rather than notifications, I can only say that that's expected behavior.
diverseft
Posts: 40
Joined: Wed May 23, 2012 6:13 am

Re: Services of unreachable host

Post by diverseft »

I am getting service notifications about unreachable hosts.
agriffin
Posts: 876
Joined: Mon May 09, 2011 9:36 am

Re: Services of unreachable host

Post by agriffin »

Are these active or passive checks? Are the volatile or obsess_over_service options set in their service definitions or templates?
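In a service definition or template, the relevant lines would look something like the sketch below. The host name, description and check command are placeholders; note that the volatile option is spelled `is_volatile` in the object config.

```
# Placeholder host, description and command; look for the last four directives.
define service {
    use                    generic-service     ; assumed template name
    host_name              site1-server        ; hypothetical host
    service_description    CPU Load
    check_command          check_nt!CPULOAD!-l 5,80,90
    active_checks_enabled  1                   ; 1 = actively scheduled checks
    passive_checks_enabled 1                   ; 1 = passive results accepted
    is_volatile            0                   ; 0 = not volatile
    obsess_over_service    1                   ; 1 = obsess over this service
}
```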
diverseft
Posts: 40
Joined: Wed May 23, 2012 6:13 am

Re: Services of unreachable host

Post by diverseft »

Hi agriffin

It appears that as the link between the sites drops on and off, the devices are coming online again and then becoming unreachable again. I'm guessing it's not being detected as flapping because the up/down statuses are not changing frequently or quickly enough.
agriffin
Posts: 876
Joined: Mon May 09, 2011 9:36 am

Re: Services of unreachable host

Post by agriffin »

I think the volatile option will still have this effect though, even if the services aren't flapping. So is that option set for these services?