Hello,
I need some clarification when setting up service dependencies on the same host. My current config is as follows:
define servicedependency {
dependent_hostgroup_name HOSTGROUP1
dependent_service_description SWAP
hostgroup_name HOSTGROUP1
service_description SSH
inherits_parent 1
execution_failure_criteria c,
notification_failure_criteria c,
dependency_period xi_timeperiod_24x7
}
With this configuration, SWAP is still sending notifications when the SSH service is down on any of the hosts that belong to HOSTGROUP1. We should only be getting SSH alerts.
The Nagios documentation says that if the dependent service is on the same host as the service it depends on, no host/hostgroup needs to be provided. However, the Nagios XI Config Manager requires input and will not save without hostgroups selected.
Thanks and if anyone can clarify the proper config that would be great.
--
Liz
Service Dependency
Re: Service Dependency
Are you trying to say
"If ANY of the SSH services on ANY of the hosts in HOSTGROUP1 go down, stop alerting for SWAP on ALL of those hosts"
or
"If the SSH service on ANY of the hosts in HOSTGROUP1 goes down, stop alerting for SWAP on THAT host"?
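To make the difference between the two readings concrete, here is a small Python sketch that lists the (master, dependent) pairs each reading implies. The host names and both function names are hypothetical, and the sketch only spells out the two interpretations; it does not claim which one Nagios applies to a given servicedependency definition.

```python
from itertools import product

# Hypothetical host names standing in for the members of HOSTGROUP1.
hosts = ["host1", "host2", "host3"]

def all_hosts_reading(hosts, master="SSH", dependent="SWAP"):
    """First reading: SWAP on every host depends on SSH on every host."""
    return [((m, master), (d, dependent)) for m, d in product(hosts, hosts)]

def same_host_reading(hosts, master="SSH", dependent="SWAP"):
    """Second reading: SWAP on a host depends only on SSH on that same host."""
    return [((h, master), (h, dependent)) for h in hosts]

print(len(all_hosts_reading(hosts)))  # number of pairs under the first reading
for pair in same_host_reading(hosts):
    print(pair)
```

With three hosts, the first reading gives nine dependency pairs and the second gives three, one per host.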
Former Nagios employee
Re: Service Dependency
"If the SSH service on ANY of the hosts in HOSTGROUP1 goes down, stop alerting for SWAP on THAT host"
I need the SWAP service to be dependent on the SSH service on that particular host.
Thanks,
ffolse
Re: Service Dependency
Can you show us the HOSTGROUP1's definition and the actual email notification that you received for the "swap service"?
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Service Dependency
HOSTGROUP1:
define hostgroup {
hostgroup_name HOSTGROUP1
alias DB Dev Servers
members host1,host2,host3
}
I've edited the IPs in this notification email:
From: wxnagios01@txxxxx
Sent: Tuesday, March 24, 2015 10:15 AM
To: ffolse
Subject: PROBLEM (CRITICAL) host1 SWAP
Importance: High
Nagios has detected a problem with this service.
Connection refused by host
(sanmateo) IP: xxx.xxx.xx.x
2015-03-24 10:14:39
- Nagios Support
- Posts: 36
- Joined: Thu Sep 04, 2014 12:16 pm
Re: Service Dependency
We will try to recreate the issue in-house and will get back to you within the next 24 hours.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Service Dependency
ffolse wrote: With this configuration, SWAP is still sending notifications when SSH service is down on any of the hosts that belongs to HOSTGROUP1. We should only be getting SSH alerts.
I would like to see the service definitions for both SSH and SWAP. Specifically, I want to see the check_interval, retry_interval and max_check_attempts.
Re: Service Dependency
define service {
service_description SWAP
hostgroup_name HOSTGROUP1
check_command check_nrpe!check_swap!!!!!!!
max_check_attempts 3
check_interval 3
retry_interval 1
check_period xi_timeperiod_24x7
notification_interval 30
first_notification_delay 0
notification_period work_hours
notification_options c,r,s,
notifications_enabled 0
contact_groups linuxteam
_xiwizard nrpe
register 1
}
define service {
service_description SSH
hostgroup_name HOSTGROUP1
check_command check_nrpe!check_ssh!!!!!!!
max_check_attempts 3
check_interval 3
retry_interval 1
check_period xi_timeperiod_24x7
notification_interval 30
first_notification_delay 0
notification_period xi_timeperiod_24x7
notification_options c,r,s,
notifications_enabled 1
contact_groups linuxteam
_xiwizard nrpe
register 1
}
Re: Service Dependency
I was not able to recreate this issue. Maybe we will need to move this to the email ticketing system.
Both services failed, but I got notified only about SSH.
BTW, here's what I tried:
define servicedependency {
dependent_hostgroup_name HOSTGROUP1
dependent_service_description Swap Usage HOSTGROUP1
hostgroup_name HOSTGROUP1
service_description SSH Server HOSTGROUP1
inherits_parent 1
execution_failure_criteria c,
notification_failure_criteria c,
dependency_period 24x7
}
define hostgroup {
hostgroup_name HOSTGROUP1
alias HOSTGROUP1
members CentOS6-SNMP
}
define service {
service_description SSH Server HOSTGROUP1
use xiwizard_nrpe_service
hostgroup_name HOSTGROUP1
check_command check_nrpe!check_init_service!-a 'sshd'!!!!!!
max_check_attempts 5
check_interval 5
retry_interval 1
notification_interval 60
contacts nagiosadmin
_xiwizard linux-server
register 1
}
define service {
service_description Swap Usage HOSTGROUP1
use xiwizard_nrpe_service
hostgroup_name HOSTGROUP1
check_command check_nrpe!check_swap!-a '-w 50 -c 20'!!!!!!
max_check_attempts 5
check_interval 5
retry_interval 1
notification_interval 60
contacts nagiosadmin
_xiwizard linux-server
register 1
}
Re: Service Dependency
I think your problem relates to the SSH service having the same intervals and retries as the services that depend on it.
Here's a scenario:
SSH
Check Interval: 3m
Retry Interval: 1m
Max Check Attempts: 3
SWAP
Check Interval: 3m
Retry Interval: 1m
Max Check Attempts: 3
1.00 Nagios checks SSH service, service is OK, next check is 1.03, attempt 1/3
1.01 SSH service breaks somehow, Nagios does not know about it yet
1.02 Nagios checks SWAP service, fails because of SSH service broken, service is w/c/u, SOFT state, next check is 1.03, attempt 1/3
1.03 Nagios checks SSH service, fails, service is w/c/u, SOFT state, next check is 1.04, attempt 1/3
1.03 Nagios checks SWAP service, fails because of SSH service broken, service is w/c/u, SOFT state, next check is 1.04, attempt 2/3
1.04 Nagios checks SSH service, fails, service is w/c/u, SOFT state, next check is 1.05, attempt 2/3
1.04 Nagios checks SWAP service, fails because of SSH service broken, service is w/c/u, HARD state, notifications sent, next check is 1.05, attempt 3/3
1.05 Nagios checks SSH service, fails, service is w/c/u, HARD state, notifications sent, service dependencies now apply, next check is 1.06, attempt 3/3
1.05 Nagios pushes back check of SWAP service because dependencies now apply, however remains in a critical state. service is w/c/u, HARD state, next check is 1.06, attempt 3/3
So what is happening in this scenario is that the SWAP service reaches a HARD critical state BEFORE the SSH service does, and sends out notifications before the dependency can suppress them. To stop this from happening, set the SSH service's check_interval to 1m AND its max_check_attempts to 2. In the scenario above, the SSH service would then have entered the HARD state first, and the service dependencies would have taken effect before SWAP notified.
Does this make sense?
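The timing argument above can be checked with a rough Python sketch. It is a simplified model, not how the Nagios scheduler actually works: it assumes checks run exactly on schedule, that the first check at or after the fault fails, and that retries are spaced exactly retry_interval apart. The function name hard_state_time and its parameters are made up for this illustration; times are minutes after 1.00.

```python
import math

def hard_state_time(last_ok_check, fault_time, check_interval,
                    retry_interval, max_check_attempts):
    """Minute at which a service reaches a HARD problem state (simplified model)."""
    # Regular checks run every check_interval minutes after the last OK check;
    # the first one at or after the fault counts as failed attempt 1.
    checks_until_fail = max(1, math.ceil((fault_time - last_ok_check) / check_interval))
    first_fail = last_ok_check + checks_until_fail * check_interval
    # Every remaining attempt is a retry, retry_interval minutes apart.
    return first_fail + (max_check_attempts - 1) * retry_interval

FAULT = 1  # SSH breaks at 1.01

# Original settings: SSH last checked OK at 1.00, SWAP at 12.59 (so it is due at 1.02).
ssh_original = hard_state_time(0, FAULT, check_interval=3, retry_interval=1, max_check_attempts=3)
swap = hard_state_time(-1, FAULT, check_interval=3, retry_interval=1, max_check_attempts=3)

# Suggested settings for SSH: check_interval 1, max_check_attempts 2.
ssh_tuned = hard_state_time(0, FAULT, check_interval=1, retry_interval=1, max_check_attempts=2)

print("SSH HARD (original):", ssh_original)  # 5, i.e. 1.05
print("SWAP HARD:", swap)                    # 4, i.e. 1.04
print("SSH HARD (tuned):", ssh_tuned)        # 2, i.e. 1.02
```

Under these assumptions, SWAP goes HARD one minute before SSH with the original settings, and two minutes after SSH with the suggested ones, which is the ordering the dependency needs.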