Hiding cluster service checks

m_verafin
Posts: 3
Joined: Thu Nov 08, 2012 3:31 pm

Hiding cluster service checks

Post by m_verafin »

We have a sqlserver database clustered across 3 machines: if sqlserver on A fails, sqlserver on B is started and users are directed to B. Following the documentation, I set up 3 services that check whether sqlserver is running on each machine, then created another service that uses the check_cluster plugin to see whether at least one of the 3 services is OK. If so, the cluster service returns OK; if all 3 are down, it returns Critical. The services all work wonderfully. The 3 sqlserver checks have notifications disabled, so we only get alerts from the cluster service. However, we like to display all our hostgroups on a large monitor so people can quickly see the state of the entire network. Since we only ever have one sqlserver instance running, only one machine shows as all green, and the other 2 machines show a Critical error, which is their sqlserver service check.
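
For illustration only, the decision rule described above (OK while at least one node is OK, Critical when all are down) can be sketched in Python. This is not the check_cluster plugin itself; the function name `cluster_state` is hypothetical, and only the standard Nagios plugin return codes (0 = OK, 2 = Critical, 3 = Unknown) are taken as given.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def cluster_state(member_states):
    """Return the cluster state for a list of per-node service states.

    Hypothetical helper: OK if any member is OK, CRITICAL if none is,
    UNKNOWN if no member states were supplied at all.
    """
    states = list(member_states)
    if not states:
        return UNKNOWN
    return OK if any(s == OK for s in states) else CRITICAL
```

With one node active and two on standby, `cluster_state([OK, CRITICAL, CRITICAL])` evaluates to OK, which matches the behaviour described: the cluster service stays green even though two of the individual checks are red.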

I'm not interested in the individual sqlserver checks, only the cluster service. Is there any way to hide the sqlserver service checks in the Core web interface so that the hostgroups page shows all green? I tried setting their register directive to 0, but then the cluster service can't use them properly, and I see no option in the service object definition that controls whether the service is displayed.
jsmurphy
Posts: 989
Joined: Wed Aug 18, 2010 9:46 pm

Re: Hiding cluster service checks

Post by jsmurphy »

Unfortunately, Nagios doesn't deal with clusters particularly gracefully, and there isn't really much you can do about this. If you were really desperate, you could create service dependencies that stop check execution on the "down" servers when one of the servers is up, and create a wrapper script that injects an "OK" status into the down hosts when the up host runs the check.

It's an ugly hack... but it would work if you were really desperate.
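
A rough sketch of what the injection half of such a wrapper might look like, assuming it submits passive results through the Nagios external command file using the PROCESS_SERVICE_CHECK_RESULT command. The command file path, the function names, and the host and service names are assumptions for illustration; the real path is whatever command_file is set to in nagios.cfg, and this only has an effect if external commands are enabled and the target services accept passive checks.

```python
import time

# Assumption: a common default for source installs. Check the
# command_file setting in nagios.cfg for the actual location.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def passive_result(host, service, return_code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    return (
        f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def mask_standby_nodes(standby_hosts, service, command_file=COMMAND_FILE):
    """Submit an OK (0) result for `service` on every standby host.

    Hypothetical helper: the caller decides which hosts are standby,
    for example after the check on the active node has succeeded.
    """
    with open(command_file, "w") as cmd:
        for host in standby_hosts:
            cmd.write(
                passive_result(
                    host, service, 0,
                    "OK - standby node, another cluster member is active",
                )
            )
```

A wrapper around the real sqlserver check could call something like `mask_standby_nodes(["sqlB", "sqlC"], "sqlserver")` (hypothetical names) whenever the check on the active node returns OK. As noted above, this hides genuine failures on the standby nodes, so it remains a display workaround rather than real monitoring.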