Thanks all - let me know if there is any more info I can feed you. We got a couple of hundred notifications yesterday for the Child hosts after the Parent went down, so in my case the notifications aren't getting blocked, nor are the Child hosts seen as unreachable. The email notifications showed them as down. This is similar to the Notifications screenshot I sent previously, where there was one child behind the parent.
Fred
Parent/Child Blocking issues
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
-
bheden
- Product Development Manager
- Posts: 179
- Joined: Thu Feb 13, 2014 9:50 am
- Location: Nagios Enterprises
Re: Parent/Child Blocking issues
I guess I do have a few questions:
Has this ever happened before?
Have you ever had an outage happen previously where you noticed that the states of child hosts were actually being set to UNREACHABLE?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Nagios Enterprises
Senior Developer
-
bheden
- Product Development Manager
- Posts: 179
- Joined: Thu Feb 13, 2014 9:50 am
- Location: Nagios Enterprises
Re: Parent/Child Blocking issues
Also, looking through the source in Core: if you were to enable debugging and set the verbosity to 2, we'd probably get some useful debugging output, provided you're able to simulate another outage. Is this a possibility?
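For anyone following along, the debugging options mentioned above are set in the main configuration file (nagios.cfg). This is a rough sketch only: the debug_level value of 48 (16 for host/service checks plus 32 for notifications), the file path, and the size limit are assumptions, not settings confirmed in this thread, so adjust them to the install.

```cfg
# nagios.cfg (main configuration file)

# 16 = host/service checks, 32 = notifications; 48 enables both (assumed choice)
debug_level=48

# 2 = most detailed output
debug_verbosity=2

# Assumed path; typical for a source install
debug_file=/usr/local/nagios/var/nagios.debug

# Rotate the debug file once it reaches this size, in bytes (assumed value)
max_debug_file_size=1000000
```

Core needs a restart for these to take effect, and it is worth setting debug_level back to 0 once the test is done, since the debug file grows quickly at this verbosity.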
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
Re: Parent/Child Blocking issues
I'm pretty sure that I tested this in an older version of Nagios some time ago - which is why I went down this path for this particular installation. Since all the checks are run by a Mod Gearman worker at the client's site, it made sense to make all the hosts a child of the Worker, so if we lost connection to the worker then we wouldn't get hit with an alert for every host and service.
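As an illustration of that layout, the relationship might be defined along these lines. The host names, addresses, and the site-host-template template are hypothetical placeholders, not the actual object definitions from this installation.

```cfg
# Mod Gearman worker at the client's site: the parent (hypothetical name and address)
define host {
    use                   site-host-template    ; hypothetical template providing check_command, contacts, etc.
    host_name             site1-gearman-worker
    address               192.0.2.10
}

# A device behind the worker: the child (hypothetical name and address)
define host {
    use                   site-host-template
    host_name             site1-switch-01
    address               192.0.2.20
    parents               site1-gearman-worker  ; only parent, so this host should go UNREACHABLE rather than DOWN when the worker is down
    notification_options  d,r                   ; no "u", so UNREACHABLE states should not send notifications
}
```

With definitions like these, the expected behaviour would be a single DOWN alert for the worker and no host alerts for the children, which is the opposite of what the notifications above show.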
I could schedule this test after Wednesday next week. Let me know what needs to be set and what files you want me to send.
Fred
-
bheden
- Product Development Manager
- Posts: 179
- Joined: Thu Feb 13, 2014 9:50 am
- Location: Nagios Enterprises
Re: Parent/Child Blocking issues
Fred Kroeger wrote: Since all the checks are run by a Mod Gearman worker at the clients site, it made sense to make all the hosts a child of the Worker, so if we lost connection to the worker then we wouldn't get hit with an alert for every host and service.
This still makes sense.
Regarding the diagram in your initial post here:
                 /---- FIREWALL-1A ------\
                /                         \
REMOTE-SITE ---<                           >--- 2x Devices at Remote Site
                \                         /
                 \---- FIREWALL-1B ------/
Can you point out the host names of some of the ModGearman worker parent/child relationships? I only see one obvious one with it set as the parent for only 2 hosts.
Please send me this information so that I can review your object definitions, and then I can give you a detailed list of instructions. Which of the parent/child relationships are you going to simulate a failure for?
Also, if you're not comfortable listing those host names publicly I can accept them in a PM.
Thanks.
-
Fred Kroeger
- Posts: 588
- Joined: Wed Oct 19, 2011 11:36 pm
- Location: Perth, Western Australia
Re: Parent/Child Blocking issues
Yes - the original diagram was basically everything after the Worker.
I have PM'd you the full topology together with the hostnames/IPs so that you can follow the paths.
-
bheden
- Product Development Manager
- Posts: 179
- Joined: Thu Feb 13, 2014 9:50 am
- Location: Nagios Enterprises
Re: Parent/Child Blocking issues
Just to get this off of the support team's dashboard, I'm replying. Fred, I'll respond here or reply to your PM directly when I have some meaningful information.