feature request

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

feature request

Post by benhank »

I read the knowledge-base article:

https://support.nagios.com/kb/article.php?id=504&show_category=164

My question is: can Nagios XI be modified to check whether the host is in a hard down state, avoiding the need for the workaround in the knowledge-base article?
We have 2200 hosts and 8300+ services already in Nagios, and it would be a beast to redo the service check intervals.
My suggestion is that when a service goes down, Nagios would internally check the host's status as already reported in Nagios, or wait until the host check has run, before reporting the service as down. I really hope that made sense...
Proudly running:
NagiosXI 5.4.12 2 node Prod Env 2500 hosts, 13,000 services
Nagiosxi 5.5.7(test env) 2500 hosts, 13,000 services
Nagios Logserver 2 node Prod Env 500 objects sending
Nagios Network Analyser
Nagios Fusion
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: feature request

Post by lmiltchev »

You can easily change the check_interval on services in bulk via the "Bulk Modifications" tool. It's part of Nagios XI Enterprise Edition. If you don't want to purchase Enterprise, you could always start a 60-day free (fully functional) trial and get the job done. :)
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: feature request

Post by benhank »

True, but when you use the Bulk Modifications tool, its changes are applied locally, which creates a heap of dependencies as well as a ton of files that have to be manipulated manually, one by one. We use templates and such to manage our cfgs, and this would break that.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: feature request

Post by ssax »

tmcdonald did add host_down_disable_service_checks in Core 4.1.1 (if you're running XI 5+). It stops service checks from even running while the host is down.

You would just set this in your nagios.cfg:

host_down_disable_service_checks=1
Now, if your service check interval were still lower than your host check interval, you would still get the alert, because Nagios doesn't yet know that the host is down. That is why you should always keep your host check interval smaller than your service check interval.
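
To make the timing argument concrete, here is a rough Python sketch (not Nagios code). It assumes both checks run on a fixed schedule starting at minute 0 and ignores retries and soft states; the function names are made up for illustration.

```python
import math


def next_tick(t, interval):
    """First scheduled check time >= t on a fixed grid that starts at 0."""
    return math.ceil(t / interval) * interval


def service_alerts_first(t_down, host_interval, service_interval):
    """Return True if a service check runs before any host check has
    noticed the outage, i.e. the service alert fires even though the
    real problem is the host.

    Simplifying assumption: a tie counts as the host check winning.
    """
    host_detects = next_tick(t_down, host_interval)
    service_runs = next_tick(t_down, service_interval)
    return service_runs < host_detects


# Host fails at minute 1.
print(service_alerts_first(1, host_interval=5, service_interval=1))  # True
print(service_alerts_first(1, host_interval=1, service_interval=5))  # False
```

With the host checked every 5 minutes and the service every minute, the service check runs first and alerts. Flip the intervals and the host check sees the outage first, so host_down_disable_service_checks can do its job.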

Are you saying that if a service is detected as down/unreachable it should automatically perform a host check (reach out and check) to see if it's down before alerting?
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: feature request

Post by benhank »

No, that would blitz the downed host with a lot of network traffic, which would not be ideal.
Perhaps if a service is down, one check could fire off to the host. The service and any others would wait for that result before anything else happens.
The scenario:
5 services are associated with host1.
host1 has a check scheduled for 1 pm.
1 of the 5 services runs a check at 12:55 and is down. The Nagios logic then says, "Hold up, fellas (the other services as well as the downed one), gimme a sec before I report this, I'm gonna check the host." Seeing that the host won't be checked for another 5 minutes, it shoots off one check. If the check comes back as down, the host gets reported as down and everything proceeds as normal. If the check comes back with the host up, the normal process continues as well.
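
In rough Python, the idea would look something like the sketch below. This is only an illustration of the suggestion, not how Nagios works internally; `resolve_failures` and `probe_host` are hypothetical names, and the sketch assumes the single host check returns right away.

```python
def resolve_failures(host_name, failed_services, probe_host):
    """Decide what to report when one or more services on a host fail.

    probe_host is a hypothetical callable that runs ONE on-demand host
    check and returns True if the host is up.
    """
    if not failed_services:
        return []  # nothing failed, so no host check is needed

    # All failed services wait on this single check.
    host_is_up = probe_host(host_name)

    if not host_is_up:
        # Report the host once instead of one alert per service.
        return [f"HOST {host_name} DOWN"]

    # Host is fine, so the service problems are real: report as normal.
    return [f"SERVICE {host_name}/{svc} CRITICAL" for svc in failed_services]


# Example: two services fail, and the one host check says the host is down.
print(resolve_failures("host1", ["http", "ssh"], lambda name: False))
```

However many services fail, the host only gets one extra check.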

Any other scenario I can think of would result in the services waiting for the host to be checked before reporting a problem with the service, which causes Nagios to be late in reporting an outage.

However, my suggestion should be optional in both scenarios. In other words, "if you want this option, it's there."
Sorry for the bad punctuation, I am rushing this post.
Perhaps the option could be called "wait for host status" and/or "check host before notifying" or something.
tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: feature request

Post by tmcdonald »

benhank wrote: No, that would blitz the downed host with a lot of network traffic, which would not be ideal.
Nope. It takes the host status from memory instead of re-checking the host each time. This was done intentionally to save CPU cycles, but also has the nice side-effect of reducing network usage as well. The situation you are describing seems similar to parents and dependencies/on-demand checks:

https://assets.nagios.com/downloads/nag ... ility.html
https://assets.nagios.com/downloads/nag ... hecks.html
Former Nagios employee
benhank
Posts: 1264
Joined: Tue Apr 12, 2011 12:29 pm

Re: feature request

Post by benhank »

OK, thanks. You can lock this.