host_down_disable_service_checks - Set services to critical

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
ghugon
Posts: 23
Joined: Tue May 07, 2019 7:55 am

Post by ghugon »

Hi,

A couple of months ago we decided to enable this option: host_down_disable_service_checks=1, both for performance reasons and because it makes sense not to check or notify on services that would be in a critical state anyway.
But this is an issue for people who like to use maps: they see services in an OK state even though they are not OK (because Nagios stopped checking them when the host went down and kept the last known state).
Is there any way those services could be set to something other than OK? Unknown, critical, anything really...
Maybe Nagios could make one last check when the host is down so the services are in the right state?

Thanks in advance for your answers.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: host_down_disable_service_checks - Set services to criti

Post by mbellerue »

ghugon wrote:... it makes sense to not check and notify services that would be in a critical state anyway.
...
Maybe Nagios could make one last check when the host is down so the services are in the right state?
This sounds like a job for Host and Service dependencies. If you set up, let's say CPU, disk, and memory checks for a specific host, those checks can be dependent upon that specific host being available. But this does mean that host checks would need to be enabled.

Here is some more information on dependencies.
https://assets.nagios.com/downloads/nag ... ncies.html
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
ghugon
Posts: 23
Joined: Tue May 07, 2019 7:55 am

Re: host_down_disable_service_checks - Set services to criti

Post by ghugon »

I'm not sure if it is what I need because it looks like it only does service-service and host-host dependencies.
Wouldn't I need something like a host-service dependency? Maybe a parent-child dependency?

Also, is there a way to set this up in XI or do I need to do it manually in some config files?
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: host_down_disable_service_checks - Set services to criti

Post by mbellerue »

You're absolutely right, it is host-host, service-service. Let's say you have a router with 5 machines behind it.

You could set up a ping check to the router. That would count as a service. Then all of the service checks for those 5 machines could be dependent upon the ping check to the router.

Similarly, you can have a ping check to a host, and all of the service checks on the host are dependent upon the ping check.

To set this up in XI, go to Configure -> Core Configuration Manager -> Service Dependencies -> Add New

For the example of a ping check to a single host, select Manage Hosts and choose the host in question, then select Manage Services and choose the Ping check. Next, select Manage Dependent Hosts and choose the same host again, then select Manage Dependent Services and choose all of the other services on that host.

For the router example, in Manage Hosts we'd select the router, and in Manage Services we'd select the Ping check. But in Manage Dependent Hosts we would select the 5 machines that are behind the router, and in Manage Dependent Services we would select the services associated with those hosts. This assumes that all 5 machines have the same services. If they have different services, you would have to make a new Service Dependency entry for each machine that has unique services.
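
For readers who maintain object definitions outside the Core Configuration Manager, the router example can be sketched roughly as follows. This is only an illustration: the helper `service_dependency`, the host names, and the service names are made up, and the failure criteria (`c,u`) are an assumption rather than anything recommended in this thread. The directive names are the standard Nagios Core `servicedependency` ones.

```python
def service_dependency(host, service, dependent_hosts, dependent_services):
    """Render one Nagios Core servicedependency object definition as text.

    host/service: the master service (e.g. the Ping check on the router).
    dependent_hosts/dependent_services: what depends on that master service.
    """
    return "\n".join([
        "define servicedependency {",
        f"    host_name                      {host}",
        f"    service_description            {service}",
        f"    dependent_host_name            {','.join(dependent_hosts)}",
        f"    dependent_service_description  {','.join(dependent_services)}",
        # Assumed criteria: skip checks and notifications for the dependent
        # services while the master service is CRITICAL or UNKNOWN.
        "    execution_failure_criteria     c,u",
        "    notification_failure_criteria  c,u",
        "}",
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical names: one router, two machines behind it, same services.
    print(service_dependency("router1", "Ping", ["srv1", "srv2"], ["CPU", "Disk"]))
```

As in the CCM steps, a single definition like this only works when every dependent host has all of the listed services; hosts with unique services would need their own definitions.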
ghugon
Posts: 23
Joined: Tue May 07, 2019 7:55 am

Re: host_down_disable_service_checks - Set services to criti

Post by ghugon »

Alright, thanks for the very detailed instructions.
Am I going to have a problem with the parameter host_down_disable_service_checks set to 1 though?
Because let's say a host goes down and the ping service is not checked before the host check (which is also a ping, by the way): it will stay in its last known state, meaning OK. And I don't want that. I want all my services to be critical when a host goes down and to still not be checked until the host comes back up.
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: host_down_disable_service_checks - Set services to criti

Post by mbellerue »

Ah, you're absolutely right. Those will leave the services set to their last known state. The only other thing I can think of would be to use an event handler to submit a passive check result to each service and mark it as critical. That solution doesn't scale well at all though, as it would have to be built to handle each service, for each host. Depending on the size of the system, the work needed to set all of the services to critical manually may very well negate whatever performance benefits you received from host_down_disable_service_checks.
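
A minimal sketch of what such an event handler script might do is shown below. It is not a tested XI integration. It assumes the external command file is at the usual default path (check `command_file` in nagios.cfg), that the services accept passive check results, and that the handler is given the list of service names for the host; the function names and the "Host is down" output text are made up for illustration. `PROCESS_SERVICE_CHECK_RESULT` is the standard Nagios external command for submitting a passive service result.

```python
import time

# Assumed default location of the Nagios external command file (a named pipe).
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"

CRITICAL = 2  # standard plugin return code for CRITICAL


def passive_result_line(host, service, code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}\n"


def mark_services_critical(host, services, command_file=COMMAND_FILE):
    """Submit one passive CRITICAL result per service of the given host."""
    lines = [
        passive_result_line(host, service, CRITICAL, "Host is down")
        for service in services
    ]
    with open(command_file, "w") as fh:
        fh.writelines(lines)
```

A host event handler could call `mark_services_critical` when it is passed `$HOSTSTATE$` equal to `DOWN` for `$HOSTNAME$`. The list of services still has to come from somewhere, which is the per-host, per-service bookkeeping that makes this approach hard to scale.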

The only scalable way to do this would be to set that host_down_disable_service_checks back to 0. Maybe you can recover that performance another way.

Here are a couple of documents that may help, if you haven't already read through them.
https://assets.nagios.com/downloads/nag ... ctices.pdf
https://assets.nagios.com/downloads/nag ... zation.pdf
ghugon
Posts: 23
Joined: Tue May 07, 2019 7:55 am

Re: host_down_disable_service_checks - Set services to criti

Post by ghugon »

Alright, we have pretty much already implemented everything in the best practices guide.
We have just shy of 2.5k hosts and around 20k services, but we are still able to compile in less than 5 minutes, which I'm pretty happy about.

Looks like I'll have to disable host_down_disable_service_checks then.
What do you recommend doing when a host goes down then? If I recall correctly, before enabling host_down_disable_service_checks we were just suppressing the services' notifications. Is that enough, or can we do something better?
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: host_down_disable_service_checks - Set services to criti

Post by mbellerue »

You shouldn't get notifications for services being down if the host is marked as down. There may be a case where a host goes down between host checks and a service check kicks off in that window; you may get a notification that way. But once a host is marked as down, a service check will run, see that the service is unreachable, check the host's status, see that the host is down, and therefore not send a notification.

If you have for certain received notifications in the past that didn't meet that condition mentioned above, then we may have something else we need to address. But we'll need to have another instance of that pop up before we can troubleshoot it.
ghugon
Posts: 23
Joined: Tue May 07, 2019 7:55 am

Re: host_down_disable_service_checks - Set services to criti

Post by ghugon »

Understood, I'll make a new forum post if I have this kind of issue.
Thanks for your help.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: host_down_disable_service_checks - Set services to criti

Post by scottwilkerson »

ghugon wrote:Understood, I'll make a new forum post if I have this kind of issue.
Thanks for your help.
Great!

Locking thread
Former Nagios employee