Service Dependency

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Maxwellb99
Posts: 97
Joined: Tue Jan 26, 2016 5:29 pm

Service Dependency

Post by Maxwellb99 »

Hi,

I'm trying to set up Service Dependencies. I lay out the use case, variables, then present my solutions. Please respond to the questions in red. Further, please let me know if my understanding is incorrect and let me know the best way to implement this, if it is contrary to one of the solutions given below.

I make use of your documentation:
https://assets.nagios.com/downloads/nag ... tions.html
https://assets.nagios.com/downloads/nag ... ricks.html

Motivation:
Case1: OCC wants to be alerted for Mem if Mem & Processor Queue Length are both critical.
Case2: OCC wants to be alerted for Mem_db if Mem_db, Processor Queue Length, & “sqlservr” process CPU are all critical.

Service A := Mem
Service B := Mem_db
Service C := Processor Queue Length (PQL)
Service D := “sqlservr” process CPU (PCPU)

Each of the associated Services has a constituent HostGroup named similarly (e.g. Service A: Mem::Mem-Hg).
Universe := All Windows hosts.

Elementwise, Hostgroup C is equivalent to the universe -- all hosts get the PQL Service.
Elementwise, Hostgroup B is equivalent to Hostgroup D -- all SQL hosts.
Elementwise, Hostgroup A is the complement of Hostgroup B/D -- non-SQL hosts; equivalently, the universe minus all SQL hosts.


Case1 Soln:
1.A Can I make use of the "All Hosts In Multiple Hostgroups" approach?
define servicedependency{
hostgroup_name HOSTGROUP C
service_description Service C
dependent_hostgroup_name HOSTGROUP A
dependent_service_description SERVICE A
notification_failure_criteria o,w,u,p
}

Does this say: for each host in Mem-HG, Mem is dependent on PQL, so an alert for Service A is sent only if Service C is also in a hard Critical state?

This is where my confusion comes in. "Nagios gets the current status* of the service that is being depended upon."

Take generic hosts a0, a1, a2 in Service A-HG and c0, c1, c2 in Service C-HG, with the services listed above. Nagios kicks off the check for Service A, which is arguably three services: Service A::Host_a0, Service A::Host_a1, and Service A::Host_a2. Similarly for Service C.

Basically, can you confirm that this does an element-wise check?

Said another way: imagine that I have hosts x, y, z as members of Mem-Hg, and host y is in a state that should trigger a notification (namely, Service A and Service C are both critical on host y). Even though the HG contains other hosts that are in, say, OK states, only host y would send out an alert. Correct?
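
To make the question concrete, here is a minimal Python sketch of the per-host behavior I am expecting. It is only a model of the intended outcome, not how Nagios evaluates dependencies internally; the function name and the sample states are made up for illustration.

```python
# Hypothetical model of the per-host behavior asked about; not Nagios code.
CRITICAL = "CRITICAL"


def hosts_to_alert(mem_states, pql_states):
    """Return hosts where both Mem (Service A) and PQL (Service C) are critical.

    Each argument maps host name -> hard state of that service on that host.
    A host missing from pql_states has no Service C to compare against,
    so this sketch never alerts for it (an assumption, not confirmed behavior).
    """
    return sorted(
        host
        for host, mem_state in mem_states.items()
        if mem_state == CRITICAL and pql_states.get(host) == CRITICAL
    )


# Hosts x, y, z are members of Mem-Hg; only y is critical for both services.
mem = {"x": "OK", "y": "CRITICAL", "z": "CRITICAL"}
pql = {"x": "CRITICAL", "y": "CRITICAL", "z": "OK"}
print(hosts_to_alert(mem, pql))  # ['y']
```

In this model the states of x and z have no effect on whether y alerts, which is the element-wise behavior I am asking about.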

Case0: host x is a member of Service A-HG & host x is a member of Service C-HG
Is it checking the status of Service A with respect to host x against the status of Service C with respect to host x?

Case1 x is a member of A; x is NOT a member of C
Which service is Service A depending upon? There is no service C with respect to host x

Case2 x is NOT a member of A; x is a member of C
I assume this is OK?

I added both Case1 & Case2, and I was able to apply the configuration with both. However, I was not able to have Service A dependent upon Service C and Service C dependent upon Service A at the same time. Having both dependencies active resulted in an "Error: Circular notification dependency detected for services..."

1.B Can I make use of the "Same Host Dependency"?
define servicedependency{
host_name HOST1,HOST2, ... HostN
service_description SERVICE C
dependent_service_description Service A
notification_failure_criteria o,w,u,p
}

Does this say: for each host, Mem is dependent on PQL, so an alert for Service A is sent only if Service C is also in a hard Critical state?

Please confirm that listing all the hosts in "host_name" would be equivalent to using all the hosts in Mem-HG.
As above, I assume this is on a per-host basis?


Case2 Soln:
2.A Can I make use of the "All Hosts In Multiple Hostgroups" approach?

define servicedependency{
hostgroup_name HOSTGROUP C
service_description Service C
dependent_hostgroup_name HOSTGROUP B
dependent_service_description SERVICE B
notification_failure_criteria o,w,u,p
}

define servicedependency{
hostgroup_name HOSTGROUP D
service_description Service D
dependent_hostgroup_name HOSTGROUP B
dependent_service_description SERVICE B
notification_failure_criteria o,w,u,p
}

Do I need "inherits_parent", or will Nagios understand that Service B is dependent upon Service C & D both being critical?
If I need "inherits_parent", which definition should receive it?

Thanks,
Maxwell Ramirez
mbellerue
Posts: 1403
Joined: Fri Jul 12, 2019 11:10 am

Re: Service Dependency

Post by mbellerue »

This is a lot to take in, so I'm going to focus on this piece right here.
"I added both Case1 & Case2, and I was able to apply the configuration with both. However, I was not able to have Service A dependent upon Service C and Service C dependent upon Service A at the same time. Having both dependencies active resulted in an "Error: Circular notification dependency detected for services...""
Just to clarify, you're trying to use service dependencies so that in order to get an alert and/or notification, both Service A and Service C need to be down. Is that what you're trying to accomplish?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
Maxwellb99
Posts: 97
Joined: Tue Jan 26, 2016 5:29 pm

Re: Service Dependency

Post by Maxwellb99 »

Motivation:
Case1: OCC wants to be alerted for Mem if Mem & Processor Queue Length are both critical.
Case2: OCC wants to be alerted for Mem_db if Mem_db, Processor Queue Length, & “sqlservr” process CPU are all critical.

"Just to clarify, you're trying to use service dependencies so that in order to get an alert and/or notification, both Service A and Service C need to be down. Is that what you're trying to accomplish?"
For Case1, yes.

Case2 appears to be a generalization of case1. Given N arbitrary services, only alert if all N dependent services pass the notification_failure_criteria.
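
Sketched in Python, the generalization I have in mind looks like this. Again, this is a model of the intended logic under my reading of notification_failure_criteria, not Nagios's implementation; `notification_allowed` and the state table are hypothetical names.

```python
# Hypothetical model; the letters mirror the notification_failure_criteria options.
STATE_LETTER = {
    "OK": "o",
    "WARNING": "w",
    "UNKNOWN": "u",
    "CRITICAL": "c",
    "PENDING": "p",
}


def notification_allowed(master_states, failure_criteria=("o", "w", "u", "p")):
    """True if no master service is in a state listed in failure_criteria.

    With the criteria o,w,u,p this is only true when every master is CRITICAL.
    """
    return all(STATE_LETTER[state] not in failure_criteria for state in master_states)


# Case2: Mem_db depends on PQL and the "sqlservr" process CPU service.
print(notification_allowed(["CRITICAL", "CRITICAL"]))  # True
print(notification_allowed(["CRITICAL", "WARNING"]))   # False
```

Case1 is then just the same function called with a single master state.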
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Service Dependency

Post by benjaminsmith »

Hello Maxwell,

Michael has stepped out for the day. As I was looking over the requirements, it seems like using the Business Process Intelligence (BPI) component in Nagios XI may work better in this situation.
Motivation:
Case1: OCC wants to be alerted for Mem if Mem & Processor Queue Length are both critical.
Case2: OCC wants to be alerted for Mem_db if Mem_db, Processor Queue Length, & “sqlservr” process CPU are all critical.

Service A := Mem
Service B := Mem_db
Service C := Processor Queue Length (PQL)
Service D := “sqlservr” process CPU (PCPU)
You could create two separate BPI groups. The first group would contain Service A & C, with the thresholds set to alert only when both are critical (non-OK). Likewise, a similar BPI group can be made for Case 2.

Once you've created these groups, the BPI configuration wizard is used to create notifications to the right contacts when the BPI group changes state.

Take a look at the documentation and let us know if this may work, as BPI is quite powerful and can also provide a visualization of group health.

Using BPI In Nagios XI
Maxwellb99
Posts: 97
Joined: Tue Jan 26, 2016 5:29 pm

Re: Service Dependency

Post by Maxwellb99 »

Thanks, this looks promising.

Is my confusion given your documentation warranted?
- What is a service check; as it relates to a host?
- Is my thinking in terms of service0 as it relates to host x depends upon service1 as it relates to host x correct?
- What's the mechanism that links service0 to service1?
- Would any of the solutions that I laid out work as expected?

Thanks,
Maxwell Ramirez
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Service Dependency

Post by benjaminsmith »

Hi Maxwell,

Glad to hear the BPI information may be helpful. Following up on your questions below.
"Is my confusion given your documentation warranted?"
Service dependencies are an advanced feature of Nagios, and you can build some complex chains of dependencies between services. In my opinion, I would recommend keeping things simple enough for other employees to understand the configurations.
"What is a service check; as it relates to a host?"
A service check is always associated with a host (any device with an IP address) in Nagios. Hosts and services have slightly different states derived from plugin output. A host is either up, down, or unreachable, while services are either ok, warning, critical, or unknown.
"Is my thinking in terms of service0 as it relates to host x depends upon service1 as it relates to host x correct?"
I'm having a little trouble following the notation in this example with respect to your previous example. You have master hosts/services and dependent hosts/services. These can be groups or individual members. Remember that all services in Nagios are associated with a host (a device with an IP address). However, service dependencies allow you to create dependent relationships with one or more other services.

Let's say I have a hostgroup called mailservers that provides IMAP mail service. If I want to make this service dependent upon the Active Directory service, I would create the following service dependency.

Code:

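define servicedependency {
    host_name                       ad-server        ; master host
    service_description             activedirectory  ; master service
    dependent_hostgroup_name        mailservers      ; hosts carrying the dependent service
    dependent_service_description   imap             ; dependent service
    execution_failure_criteria      c,u              ; skip imap checks when master is critical/unknown
    notification_failure_criteria   c,u              ; suppress imap notifications in the same states
}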
In this case, all imap services in the mailservers hostgroup are dependent upon the active directory service. Nagios will not check the services nor send notifications when the active directory service is critical or unknown.
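
As a rough illustration of the effect (a Python model only, not Nagios code; the function name is made up, and the host names mail01 and mail02 are hypothetical members of the mailservers hostgroup):

```python
# Illustrative model of the example dependency; not how Nagios is implemented.
def imap_actions(ad_state, mail_hosts, failure_criteria=("CRITICAL", "UNKNOWN")):
    """Map each mail host to whether its imap checks and notifications proceed.

    ad_state is the state of the activedirectory service on ad-server.
    failure_criteria corresponds to the c,u options in the definition.
    """
    blocked = ad_state in failure_criteria
    return {host: ("suppressed" if blocked else "active") for host in mail_hosts}


print(imap_actions("CRITICAL", ["mail01", "mail02"]))
# {'mail01': 'suppressed', 'mail02': 'suppressed'}
print(imap_actions("OK", ["mail01", "mail02"]))
# {'mail01': 'active', 'mail02': 'active'}
```

Note that a single master service state decides the outcome for every host in the group here, which is the opposite of a per-host comparison.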
"Would any of the solutions that I laid out work as expected?"
Going back to the original requirements, please correct me if I am wrong but I don't see the need for suppressing checks or notifications based on the states of master services. In this respect, I believe the BPI option might be better suited to this application.
Motivation:
Case1: OCC wants to be alerted for Mem if Mem & Processor Queue Length are both critical.
Case2: OCC wants to be alerted for Mem_db if Mem_db, Processor Queue Length, & “sqlservr” process CPU are all critical.
Maxwellb99
Posts: 97
Joined: Tue Jan 26, 2016 5:29 pm

Re: Service Dependency

Post by Maxwellb99 »

"In this case, all imap services in the mailservers hostgroup are dependent upon the active directory service. Nagios will not check the services nor send notifications when the active directory service is critical or unknown."

"A service check is always associated with a host (any device with an IP address) in Nagios. Hosts and services have slightly different states derived from plugin output. A host is either up, down, or unreachable, while services are either ok, warning, critical, or unknown."

These go to the crux of my question, and of whether BPI fits: the evaluation needs to happen at a per-host level.

Please correct me if I'm wrong, but BPI treats them as a group? So if 9/100 servers are melting but the critical threshold is 10%, this won't send out any notifications?
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Service Dependency

Post by benjaminsmith »

Hello,
"Please correct me if I'm wrong, BPI treats them as a group? So if 9/100 servers are melting but the critical threshold is 10% this won't send out any notifications?"
That's correct. BPI uses a group health index in which each member is given equal weighting. So if 9 servers are down out of 100, the group health is 91%. The thresholds are set according to group health. If you wanted to receive a notification in that situation, you would set the threshold above 91%.
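
As a rough worked example of that arithmetic (a sketch only; the function names are hypothetical, and the rule that an alert fires when health drops below the threshold is a simplifying assumption, so check the BPI documentation for the exact comparison):

```python
def group_health(total_members, members_down):
    """Percentage of members that are not down, with equal weighting."""
    if total_members <= 0:
        raise ValueError("total_members must be positive")
    return 100.0 * (total_members - members_down) / total_members


def group_alerts(health_percent, health_threshold):
    """Assumed rule: alert when group health falls below the threshold."""
    return health_percent < health_threshold


health = group_health(100, 9)
print(health)                    # 91.0
print(group_alerts(health, 90))  # False: 9 of 100 down stays above a 90% threshold
print(group_alerts(health, 95))  # True: a threshold above 91% triggers
```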