feature request
Posted: Wed May 04, 2016 8:11 am
by benhank
I read the knowledge-base article:
Code: Select all
https://support.nagios.com/kb/article.php?id=504&show_category=164
My question is: can the Nagios AI be modified to check whether the host is in a hard down state, avoiding the need for the info in the knowledge-base article?
We have 2200 hosts and 8300+ services already in Nagios and it would be a beast to redo the service check intervals.
My suggestion would be that if a service goes down, Nagios would internally check the status of the host as it is already reported in Nagios, or wait until the host check is run, before reporting the service as down. I really hope that made sense...
Re: feature request
Posted: Wed May 04, 2016 9:34 am
by lmiltchev
You can easily change the check_interval on services in bulk via the "Bulk Modifications" tool. It's part of Nagios XI Enterprise Edition. If you don't want to purchase Enterprise, you could always start a 60-day free (fully functional) trial and get the job done.

Re: feature request
Posted: Wed May 04, 2016 9:39 am
by benhank
True, but when you use the Bulk Modifications tool, its changes are applied locally, which creates a heap of dependencies as well as a ton of files that have to be manipulated manually one by one. We use templates and such to manage our cfgs, and this would break that.
Re: feature request
Posted: Wed May 04, 2016 1:54 pm
by ssax
tmcdonald did add host_down_disable_service_checks to Core 4.1.1 (if you're running XI 5+). This will disable service checks from even occurring if the host is down.
You would just set this in your nagios.cfg:
Code: Select all
host_down_disable_service_checks=1
Now, if your service check interval were still lower than your host check interval, you would still get the alert, because Nagios doesn't know yet that the host is down. That is why you should always have your host check interval smaller than your service check interval.
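To make the interval point concrete, here is a rough sketch in Python. It is a deliberately simplified model, not how Nagios actually schedules checks: it assumes strictly periodic checks and ignores retries, soft/hard states, and on-demand checks. The function names and the offset parameters are made up for illustration.

```python
import math


def next_check_at(outage_time: float, interval: float, offset: float = 0.0) -> float:
    """Time of the first scheduled check at or after outage_time.

    Checks are assumed to run at offset, offset + interval, offset + 2*interval, ...
    (a simplification; real scheduling is not this regular).
    """
    if outage_time <= offset:
        return offset
    periods = math.ceil((outage_time - offset) / interval)
    return offset + periods * interval


def service_sees_outage_first(
    outage_time: float,
    host_interval: float,
    service_interval: float,
    host_offset: float = 0.0,
    service_offset: float = 0.0,
) -> bool:
    """True if a service check runs before the next host check after the outage."""
    service_check = next_check_at(outage_time, service_interval, service_offset)
    host_check = next_check_at(outage_time, host_interval, host_offset)
    return service_check < host_check
```

With a host interval of 5 minutes and a service interval of 1 minute, an outage at 0.5 minutes is seen by the service check first in this model, so the service alert goes out while the host is still considered up. Swapping the two intervals lets the host check get there first.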
Are you saying that if a service is detected as down/unreachable it should automatically perform a host check (reach out and check) to see if it's down before alerting?
Re: feature request
Posted: Wed May 04, 2016 2:55 pm
by benhank
No, that would blitz the downed host with a lot of network traffic, which would not be ideal.
Perhaps if a service is down, one check could fire off to the host. That service and any others would wait for that result before anything else happens.
The scenario:
5 services are associated with host1.
host1 has a check scheduled for 1:00 pm.
1 of the 5 services does a check at 12:55 and is down. The Nagios logic then says, "Hold up, fellas (the other services as well as the downed one), gimme a sec before I report this, I'm gonna check the host." Seeing that the host won't be checked for another 5 minutes, it shoots off one check. If the check comes back as down, the host gets reported as down and everything proceeds as normal. If the check comes back with the host up, the normal process continues as well.
Any other scenario I can think of would result in the services waiting for the scheduled host check before reporting a problem with the service, which causes Nagios to be late in reporting an outage.
However, my suggestion should be optional in both scenarios. In other words, "If you want this option, it's there."
Sorry for the bad punctuation, I am rushing this post.
Perhaps the option could be called "wait for host status" and/or "check host before notifying" or something.
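The proposed behaviour can be sketched roughly like this in Python. This is only an illustration of the idea in this post, not anything that exists in Nagios: the class name, the `check_host` callback, and the notification strings are all hypothetical, and the stored host result never expires here, which a real implementation would need to handle.

```python
from typing import Callable, Dict, List


class WaitForHostStatus:
    """Hypothetical gate: one on-demand host check per host, shared by its services."""

    def __init__(self, check_host: Callable[[str], bool]) -> None:
        self._check_host = check_host        # assumed to return True if the host is up
        self._host_up: Dict[str, bool] = {}  # result of the single on-demand check
        self.notifications: List[str] = []

    def service_failed(self, host: str, service: str) -> None:
        # Only the first failing service triggers a host check; later ones reuse it.
        if host not in self._host_up:
            self._host_up[host] = self._check_host(host)
            if not self._host_up[host]:
                self.notifications.append(f"HOST DOWN: {host}")
        if self._host_up[host]:
            # Host is up, so the service problem is reported as usual.
            self.notifications.append(f"SERVICE PROBLEM: {host}/{service}")
```

If all 5 services on host1 fail while the host is down, `check_host` is called once and the only notification is the host one. If the host turns out to be up, each failing service is reported normally.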
Re: feature request
Posted: Wed May 04, 2016 4:32 pm
by tmcdonald
benhank wrote:No, that would blitz the downed host with a lot of network traffic, which would not be ideal.
Nope. It takes the host status from memory instead of re-checking the host each time. This was done intentionally to save CPU cycles, and it has the nice side effect of reducing network usage as well. The situation you are describing seems similar to parents and dependencies/on-demand checks:
https://assets.nagios.com/downloads/nag ... ility.html
https://assets.nagios.com/downloads/nag ... hecks.html
Re: feature request
Posted: Mon Aug 29, 2016 3:30 pm
by benhank
OK, thanks. You can lock this.