Service critical on unreachable hosts

tonnag
Posts: 21
Joined: Thu Sep 06, 2018 2:27 am

Service critical on unreachable hosts

Post by tonnag »

Hi!
I have Nagios Core running with parent relations defined. The relations work okay: when a parent goes down, the child hosts become unreachable. The services on those hosts, however, become critical. That sounds a bit wrong. I'd expect them to change to undetermined, retain the last known state, or not be checked at all. How do I change that behaviour?
Thnx, Anton
kg2857
Posts: 288
Joined: Wed Apr 12, 2023 5:48 pm

Re: Service critical on unreachable hosts

Post by kg2857 »

There's a setting in nagios.cfg that should handle this.
tonnag
Posts: 21
Joined: Thu Sep 06, 2018 2:27 am

Re: Service critical on unreachable hosts

Post by tonnag »

Hi, thanks. I already had "host_down_disable_service_checks=1", but the child isn't down (it is unreachable), so that has no impact.
I also tried "service_check_timeout_state=u", but no change there either.
Those were the options I've seen and expected to have an impact (but they did not). Is there a specific setting you are referring to?
bbahn
Posts: 189
Joined: Thu Jan 12, 2023 5:42 pm

Re: Service critical on unreachable hosts

Post by bbahn »

Hello @tonnag,

It looks like this is a known problem with how Nagios Core handles this situation, and an issue has already been filed for it: Nagios Core GitHub Issue: Service notifications despite parent host being down.

I will add a note there and update the weight of the issue.
tonnag
Posts: 21
Joined: Thu Sep 06, 2018 2:27 am

Re: Service critical on unreachable hosts

Post by tonnag »

Thank you, much appreciated
tonnag
Posts: 21
Joined: Thu Sep 06, 2018 2:27 am

Re: Service critical on unreachable hosts

Post by tonnag »

Well... it's working now as expected. The children show up as unreachable and their services are no longer critical.
I will try to reproduce it later, but I THINK the sequence was:

* Added all hosts (two from two subnets; production and lab)
* Turned off the lab network
* Defined Parent/child relations & restarted nagios
-> lab shows unreachable with critical services
* Several nagios restarts/server reboots; no change
* Turned on lab network
-> Everything shows okay
* Turned off the lab network
-> lab devices show unreachable; their services are no longer reported under "services problems". If you go to the host, the services are shown OK (or just the last state, not sure)

So it does work as expected (after all :D )
tonnag
Posts: 21
Joined: Thu Sep 06, 2018 2:27 am

Re: Service critical on unreachable hosts

Post by tonnag »

Well, I re-used a test server I had, and found it's still a bit off.
* Stopped the nagios process
* Deleted all status files in the var folder
* Copied the cfg files with the parent/child config
* Started the lab devices
* 5 min. later started nagios
* 10 min. later everything shows OK/green
* 15 min. later shut down the lab devices

Five hosts show as unreachable which is correct. But ...
Four of those five unreachable hosts are connected identically, but only two of them show up with critical services (one a ping service, the other an application port).
The fifth device shows 3 of its 9 services as critical. After stopping and starting the lab, which services are critical and how many of them vary, but so far it's always the same hosts that have a service as critical.
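
To compare runs more easily, something like the sketch below could list, for each unreachable host, which services are not OK. It is only a rough sketch, assuming the usual status.dat layout of Nagios Core 4 ("hoststatus { ... }" and "servicestatus { ... }" blocks with key=value lines, host state 2 = UNREACHABLE, service state 2 = CRITICAL). The function names, host names and the SAMPLE text are made up for illustration.

```python
import re
from collections import defaultdict

# State codes as assumed for status.dat (Nagios Core 4 layout).
HOST_UNREACHABLE = 2
SERVICE_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Matches a block opener such as "hoststatus {".
BLOCK_RE = re.compile(r"^(\w+)\s*\{$")


def parse_status(text):
    """Yield (block_type, fields) for every 'type { key=value ... }' block."""
    block_type, fields = None, {}
    for raw in text.splitlines():
        line = raw.strip()
        if block_type is None:
            match = BLOCK_RE.match(line)
            if match:
                block_type, fields = match.group(1), {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        elif "=" in line:
            # Split on the first '=' only; plugin output may contain more.
            key, _, value = line.partition("=")
            fields[key] = value


def non_ok_services_on_unreachable_hosts(text):
    """Map each unreachable host to its (service, state) pairs that are not OK."""
    unreachable = set()
    services = defaultdict(list)
    for block_type, f in parse_status(text):
        if block_type == "hoststatus":
            if int(f.get("current_state", 0)) == HOST_UNREACHABLE:
                unreachable.add(f["host_name"])
        elif block_type == "servicestatus":
            state = int(f.get("current_state", 0))
            if state != 0:
                label = SERVICE_STATES.get(state, str(state))
                services[f["host_name"]].append((f["service_description"], label))
    return {host: services.get(host, []) for host in sorted(unreachable)}


# Hypothetical excerpt; a real status.dat has many more fields per block.
SAMPLE = """
hoststatus {
    host_name=lab-sw1
    current_state=2
    }
hoststatus {
    host_name=lab-sw2
    current_state=2
    }
hoststatus {
    host_name=prod-sw1
    current_state=0
    }
servicestatus {
    host_name=lab-sw1
    service_description=PING
    current_state=2
    plugin_output=CRITICAL - Host Unreachable
    }
servicestatus {
    host_name=lab-sw2
    service_description=PING
    current_state=0
    }
servicestatus {
    host_name=prod-sw1
    service_description=HTTP
    current_state=1
    }
"""

for host, problems in non_ok_services_on_unreachable_hosts(SAMPLE).items():
    print(host, problems)
```

To run it against a live system, read the file named by "status_file" in nagios.cfg instead of SAMPLE. Saving the output after each lab stop/start would show whether it really is always the same hosts and which services differ.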