remove host from hostgroup but it still gets service checks
-
tredlightly
- Posts: 8
- Joined: Thu Feb 19, 2015 3:34 pm
remove host from hostgroup but it still gets service checks
Nagios version is Core: 4.0.8-2
I have defined a set of shared service checks using a hostgroup. I want to use the hostgroup so that I can remove a host from it and have the service checks continue for the remaining hosts. However, when I remove the host from the hostgroup, nagios continues to monitor it and triggers my event handlers for specific services that are only defined for the hostgroup. My removal script removes the host from the list of members in the hostgroup and restarts nagios. I've also tried appending the bang '!' to that hostname while leaving it within the hostgroup definition. Same issue, it still monitors and triggers event handlers for that host.
I have tried commenting out and unsetting a variety of settings in the nagios.cfg file in an attempt to make sure that nagios is not caching old information about this host and its hostgroup membership. Those changes are shown here:
custom]$ diff /etc/nagios/nagios.cfg /etc/nagios/nagios.cfg~
67c67
< ##object_cache_file=/u02/nagios/var/nagios/objects.cache
---
> object_cache_file=/u02/nagios/var/nagios/objects.cache
83c83
< ##precached_object_file=/u02/nagios/var/nagios/objects.precache
---
> precached_object_file=/u02/nagios/var/nagios/objects.precache
106c106
< ##status_file=/u02/nagios/var/nagios/status.dat
---
> status_file=/u02/nagios/var/nagios/status.dat
115c115
< ##status_update_interval=10
---
> status_update_interval=10
479c479
< ##cached_host_check_horizon=15
---
> cached_host_check_horizon=15
491c491
< ##cached_service_check_horizon=15
---
> cached_service_check_horizon=15
608c608
< retain_state_information=0
---
> retain_state_information=1
643c643
< use_retained_program_state=0
---
> use_retained_program_state=1
654c654
< use_retained_scheduling_info=0
---
> use_retained_scheduling_info=1
However, regardless of those changes the issue persists. Can I not remove a host from a hostgroup and subsequently have all of the service checks for that hostgroup no longer performed on the removed host?
Any direction here would be greatly appreciated. I have searched for others encountering the same issues and have come up empty.
Re: remove host from hostgroup but it still gets service che
Please post your host and hostgroup definitions for the host that has been removed.
-
tredlightly
- Posts: 8
- Joined: Thu Feb 19, 2015 3:34 pm
Re: remove host from hostgroup but it still gets service che
Host definition, which I do not remove:
define host{
use linux-server ; Name of host template to use
; This host definition will inherit all variables that are defined
; in (or inherited by) the linux-server host template definition.
host_name nss-app2
alias nss-app2
; see if we can use /etc/hosts address 127.0.0.1
check_interval 1
retry_interval 1
}
Hostgroup definition:
define hostgroup{
hostgroup_name nss-app-servers ; The name of the hostgroup
alias Nss Application Servers ; Long name of the group
members nss-app1,nss-app2,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
}
The current removal script would adjust the members such that if we were removing nss-app2, the subsequent hostgroup definition would look like:
define hostgroup{
hostgroup_name nss-app-servers ; The name of the hostgroup
alias Nss Application Servers ; Long name of the group
members nss-app1,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
}
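The member-removal step described above can be sketched as follows. This is not the poster's admin-app-down.sh (which is not shown in the thread); `remove_member` is a hypothetical helper, written under the assumption that the hostgroup keeps all of its hosts on a single `members` line with an optional trailing `;` comment, as in the definition above.

```python
def remove_member(line: str, host: str) -> str:
    """Return a hostgroup 'members' line with `host` removed.

    Indentation, a trailing ';' comment, and a trailing newline are kept.
    Lines that are not a 'members' directive are returned unchanged.
    """
    stripped = line.rstrip("\n")
    newline = line[len(stripped):]

    # Nagios object files introduce inline comments with ';'.
    body, sep, comment = stripped.partition(";")
    parts = body.split(None, 1)
    if len(parts) != 2 or parts[0] != "members":
        return line

    indent = body[: len(body) - len(body.lstrip())]
    hosts = [h.strip() for h in parts[1].split(",")]
    # Compare whole names so that removing nss-app1 never touches nss-app10.
    kept = [h for h in hosts if h and h != host]

    new_body = f"{indent}members {','.join(kept)}"
    if sep:
        new_body = f"{new_body} {sep}{comment}"
    return new_body + newline


if __name__ == "__main__":
    before = "members nss-app1,nss-app2,nss-app3,nss-app4 ; Comma separated list"
    print(remove_member(before, "nss-app2"))
```

Splitting on commas and comparing whole names, rather than doing a text substitution on the line, avoids leaving stray commas or clipping a longer hostname that happens to start with the same characters.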
I've also tried this (the bang 'solution') to no avail:
define hostgroup{
hostgroup_name nss-app-servers ; The name of the hostgroup
alias Nss Application Servers ; Long name of the group
members nss-app1,!nss-app2,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
}
And an example service is defined as:
define service{
use generic-service ; Name of service template to use
hostgroup_name nss-app-servers
service_description Tomcat Processes
check_command check_nrpe!check_tomcat_procs
event_handler sendTomcatAppAlarm
check_interval 1
retry_interval 1
notifications_enabled 0
}
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: remove host from hostgroup but it still gets service che
How are you verifying that Nagios is getting restarted? Look in your nagios.log to verify that it is actually restarting successfully. I suspect your removal script is breaking the syntax of the cfg file and Nagios isn't restarting properly.
Is your script successfully modifying the files?
-
tredlightly
- Posts: 8
- Joined: Thu Feb 19, 2015 3:34 pm
Re: remove host from hostgroup but it still gets service che
It is restarting. It uses /etc/init.d/nagios restart
I've also tried /etc/init.d/nagios reload
and /etc/init.d/nagios force-reload
all to no avail. I grep for nss-app2 in the following files. Here's the egrep for the bang solution:
sudo egrep 'members|nss-app2' /u02/nagios/var/nagios/retention.dat /u02/nagios/var/nagios/status.dat /etc/nagios/objects/custom/nss_app_hosts.cfg
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: host_name nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: alias nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: members !nss-app2,nss-app1,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
Here it is for the removal solution:
The script:
admin-app-down.sh nss-app2
NEW_MEMBER_SET = nss-app1,nss-app3,nss-app4
Running configuration check...
Stopping nagios:No lock file found in /var/nagios/nagios.pid
Starting nagios: done.
NEW_MEMBER_SET = nss-app1,nss-app3,nss-app4
Running configuration check...
Stopping nagios:. done.
Starting nagios: done.
And the egrep, run three times. See how the nss-app2 entries come back in status.dat?
[bmcs@cp8-nss-lb2 scripts]$ sudo egrep 'members|nss-app2' /u02/nagios/var/nagios/retention.dat /u02/nagios/var/nagios/status.dat /etc/nagios/objects/custom/nss_app_hosts.cfg
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: host_name nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: alias nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: members nss-app1,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
[bmcs@cp8-nss-lb2 scripts]$ sudo egrep 'members|nss-app2' /u02/nagios/var/nagios/retention.dat /u02/nagios/var/nagios/status.dat /etc/nagios/objects/custom/nss_app_hosts.cfg
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: host_name nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: alias nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: members nss-app1,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
[bmcs@cp8-nss-lb2 scripts]$ sudo egrep 'members|nss-app2' /u02/nagios/var/nagios/retention.dat /u02/nagios/var/nagios/status.dat /etc/nagios/objects/custom/nss_app_hosts.cfg
/u02/nagios/var/nagios/retention.dat:host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: plugin_output=connect to address nss-app2 and port 8080: Connection refused
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/u02/nagios/var/nagios/status.dat: host_name=nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: host_name nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: alias nss-app2
/etc/nagios/objects/custom/nss_app_hosts.cfg: members nss-app1,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: remove host from hostgroup but it still gets service che
sudo egrep 'members|nss-app2' /u02/nagios/var/nagios/retention.dat /u02/nagios/var/nagios/status.dat /etc/nagios/objects/custom/nss_app_hosts.cfg
There are a number of places where a host can be placed in a hostgroup: the hostgroup definition, the host definition, and a template definition. You need to start at your host and work backwards to find where in those definitions your host is being placed into the hostgroup. This should be a straightforward process. Also make sure you don't have a stray .cfg file somewhere that's getting read.
There is no need to modify nagios.cfg. When you make changes to the nagios configuration files it obeys.
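As a rough aid for that backwards search, here is a minimal sketch that scans object configuration text and reports where a host could be picking up a hostgroup: the hostgroup's `members` list, the host's own `hostgroups` directive, or a template's `hostgroups` directive. The function names are hypothetical, the parser is deliberately naive (it assumes no `}` inside comments), and it does not follow `use` inheritance chains, so any template it reports still has to be checked by hand against the host's `use` line.

```python
import re

# Matches "define <type> { ... }" blocks; naive, assumes no '}' inside a block.
DEFINE_RE = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def parse_objects(text):
    """Yield (object_type, directives_dict) for each define block in text."""
    for obj_type, body in DEFINE_RE.findall(text):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        yield obj_type, directives


def split_list(value):
    """Split a comma-separated directive value, ignoring a leading '+'."""
    return [p.strip().lstrip("+") for p in value.split(",") if p.strip()]


def membership_sources(text, host, group):
    """Return human-readable reasons why `host` may belong to `group`."""
    sources = []
    for obj_type, d in parse_objects(text):
        if obj_type == "hostgroup" and d.get("hostgroup_name") == group:
            if host in split_list(d.get("members", "")):
                sources.append(f"hostgroup '{group}' lists the host in 'members'")
        elif obj_type == "host" and group in split_list(d.get("hostgroups", "")):
            is_template = d.get("register") == "0"
            if is_template or d.get("host_name") == host:
                kind = "template" if is_template else "host"
                label = d.get("host_name") or d.get("name", "?")
                sources.append(f"{kind} '{label}' lists the group in 'hostgroups'")
    return sources
```

Calling `membership_sources(open(path).read(), "nss-app2", "nss-app-servers")` on each .cfg file in turn would show which file, if any, still ties the host to the group.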
-
tredlightly
- Posts: 8
- Joined: Thu Feb 19, 2015 3:34 pm
Re: remove host from hostgroup but it still gets service che
Understood. Our configurations use the cfg_dir directive for this directory: /etc/nagios/objects/custom
We use the 'install' command with its 'backup' option to apply the change, so we do have files like nss_app_hosts.cfg~ (with an appended tilde). My understanding was that nagios would only read files ending in .cfg (and so not .cfg~) when gathering configs from the cfg_dir designated directories.
Toward that end, I removed the nss_app_hosts.cfg~ file and tried again, but no joy. This is all we have in terms of configuration for nss-app2 (we do leave the host definition for nss-app2 in the nss_app_hosts.cfg file, but we remove it from the nss-app-servers hostgroup). Here's all we have on it (all of the services are defined to use the hostgroup):
[bmcs@s4-nss-lb2 custom]$ grep nss-app2 *
nss_app_hosts.cfg: host_name nss-app2
nss_app_hosts.cfg: alias nss-app2
nss_app_hosts.cfg: members nss-app1,nss-app2,nss-app3,nss-app4 ; Comma separated list of hosts that belong to this group
nss_app_hosts.cfg~: host_name nss-app2
nss_app_hosts.cfg~: alias nss-app2
nss_app_hosts.cfg~: members nss-app1,nss-app2,nss-app3 ; Comma separated list of hosts that belong to this group
And when I remove it from the members list in nss_app_hosts.cfg you won't see that line from nss_app_hosts.cfg, but you would see it still in nss_app_hosts.cfg~. In the example above, it is nss-app4 that appears to have been added back to the operational config. Oddly enough, it works for that 1 server, nss-app4, but not for any of the others. Originally, nss-app4 would have been the last host in the members list.
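The cfg_dir rule mentioned above (files with a .cfg extension are processed, including those in subdirectories) can be approximated with a short sketch like the one below. `cfg_files` is a hypothetical helper, not part of Nagios, and it makes no attempt to reproduce Nagios's exact traversal (symlink handling, for instance); it is only meant to confirm that a backup such as nss_app_hosts.cfg~ would be skipped and to surface any stray .cfg file in the tree.

```python
import os


def cfg_files(cfg_dir):
    """Return sorted paths under cfg_dir whose file names end in '.cfg'."""
    found = []
    for root, _dirs, files in os.walk(cfg_dir):
        for name in files:
            # 'nss_app_hosts.cfg~' ends in '~', not '.cfg', so it is skipped.
            if name.endswith(".cfg"):
                found.append(os.path.join(root, name))
    return sorted(found)


if __name__ == "__main__":
    # Directory taken from the post above; adjust for your installation.
    for path in cfg_files("/etc/nagios/objects/custom"):
        print(path)
```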
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: remove host from hostgroup but it still gets service che
Look in objects.cache to see if the hostgroup is defined under the host, or if the host is a member of the hostgroup. I expect it's the former.
-
tredlightly
- Posts: 8
- Joined: Thu Feb 19, 2015 3:34 pm
Re: remove host from hostgroup but it still gets service che
We do have customized locations in case that factors in:
$ pwd
/u02/nagios/var/nagios
nagios]$ grep -n nss-app2 objects.cache
301: members nss-app1,nss-app2,nss-app3,nss-app4
401: host_name nss-app2
Hostgroup first, host afterward.
This is the block for the 301 line reference:
define hostgroup {
hostgroup_name nss-app-servers
alias Nss Application Servers
members nss-app1,nss-app2,nss-app3,nss-app4
}
This is the block for the 401 line reference:
define host {
host_name nss-app2
alias nss-app2
address nss-app2
check_period 24x7
check_command check-host-alive
contact_groups admins
notification_period workhours
initial_state o
importance 0
check_interval 1.000000
retry_interval 1.000000
max_check_attempts 10
active_checks_enabled 1
passive_checks_enabled 1
obsess 1
event_handler_enabled 1
low_flap_threshold 0.000000
high_flap_threshold 0.000000
flap_detection_enabled 1
flap_detection_options a
freshness_threshold 0
check_freshness 0
notification_options r,d,u
notifications_enabled 1
notification_interval 120.000000
first_notification_delay 0.000000
stalking_options n
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
}
-
jdalrymple
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: remove host from hostgroup but it still gets service che
This all adds up to Nagios not reloading properly, or it's finding that hostgroup definition somewhere else in an unexpected config file.
I have no need to replicate your configuration - in the decade that I've worked with Nagios I've never seen it simply disobey a configuration. I suggest creating a simple config like the following to prove the behavior to yourself. Add and remove hosts from the hostgroup at will and you'll see their associated services disappear from Nagios:
define host {
name simple-host
check_command check-host-alive
max_check_attempts 1
check_period 24x7
contacts nagiosadmin
notification_interval 60
notification_period 24x7
register 0
}
define host {
use simple-host
host_name simple-host-a
address 127.0.0.1
}
define host {
use simple-host
host_name simple-host-b
address 127.0.0.1
}
define host {
use simple-host
host_name simple-host-c
address 127.0.0.1
}
define service {
name simple-service
service_description simple-service
hostgroup_name simple-hostgroup
check_command check_dummy!0
max_check_attempts 1
check_interval 1
check_period 24x7
retry_interval 1
notification_interval 60
notification_period 24x7
contacts nagiosadmin
register 1
}
define hostgroup {
hostgroup_name simple-hostgroup
alias simple-hostgroup
members simple-host-a, simple-host-b, simple-host-c