Nagios Core 3.2.1 remembering old services?

Posted: Tue Mar 12, 2013 11:04 am
by redcat
I am running Nagios Core 3.2.1, monitoring several dozen machines. When I inherited this machine from my predecessor the config files had a lot of grunge in them: contacts who were no longer with the company, contact groups that were not associated with any services, etc. I spent a fair bit of time cleaning up the config files, and now things are running just fine and the cfg files are easier to maintain. We had also been running Centreon as an interface to Nagios. Due to circumstances that proved to be quite propitious, we are no longer running Centreon. All changes to the Nagios config are now made by editing the files in /usr/local/nagios/etc and restarting nagios, as the gods of *nix intended :)

We have one service, named "NBF-Memory", that reports a pretty meaningless value and tends to generate notifications pretty often. Since the values reported don't tell us anything of value, I removed that service from servicegroups.cfg. I then validated the overall configuration with "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg", stopped nagios, and started it again. All is running well, except that I'm still seeing NBF-Memory on the Nagios display and am getting periodic alarms from it!

[root@nagios etc]# pwd
/usr/local/nagios/etc
[root@nagios etc]# grep NBF-Memory *.cfg
services.cfg: service_description NBF-Memory
[root@nagios etc]#
[root@nagios etc]# grep -i NBF-Memory servicegroups.cfg
[root@nagios etc]#

My question is this: if I don't have NBF-Memory scheduled through the servicegroups.cfg file (or any other .cfg file), why is it still being run? I am running ndoutils, but that should have no bearing on this. I've verified that Centreon is no longer running, so I don't believe I should have anything other than nagios capable of manipulating nagios' in-memory config. Ideas?

Re: Nagios Core 3.2.1 remembering old services?

Posted: Tue Mar 12, 2013 11:13 am
by abrist
The service could be declared in more than one place, have you done a recursive grep to make sure?

grep -r NBF-Memory /usr/local/nagios/etc

Re: Nagios Core 3.2.1 remembering old services?

Posted: Tue Mar 12, 2013 12:41 pm
by redcat
No, but I did verify that all of the config files getting included were in the current directory:

[root@nagios etc]# grep cfg_file nagios.cfg
cfg_file=/usr/local/nagios/etc/hostTemplates.cfg
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/serviceTemplates.cfg
cfg_file=/usr/local/nagios/etc/services.cfg
cfg_file=/usr/local/nagios/etc/misccommands.cfg
cfg_file=/usr/local/nagios/etc/checkcommands.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/servicegroups.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/escalations.cfg
cfg_file=/usr/local/nagios/etc/dependencies.cfg
cfg_file=/usr/local/nagios/etc/meta_commands.cfg
cfg_file=/usr/local/nagios/etc/meta_contactgroup.cfg
cfg_file=/usr/local/nagios/etc/meta_contact.cfg
cfg_file=/usr/local/nagios/etc/meta_dependencies.cfg
cfg_file=/usr/local/nagios/etc/meta_escalations.cfg
cfg_file=/usr/local/nagios/etc/meta_hostgroup.cfg
cfg_file=/usr/local/nagios/etc/meta_host.cfg
cfg_file=/usr/local/nagios/etc/meta_services.cfg
cfg_file=/usr/local/nagios/etc/meta_timeperiod.cfg


However:

[root@nagios etc]# grep -r -i NBF-Memory *.cfg
services.cfg: service_description NBF-Memory
[root@nagios etc]#

So I'm not seeing any config info anywhere that should be scheduling this service check. I can always disable active checks and notification in the web interface, but I'd really like to understand where they're coming from first.

Re: Nagios Core 3.2.1 remembering old services?

Posted: Tue Mar 12, 2013 1:30 pm
by abrist
Are you using templates or servicegroups? If you included the service in a template or servicegroup that is used by hosts elsewhere, they may inherit the service check.

You could also verify that you only have 1 parent nagios process:

service nagios stop
ps -aef | grep nagios
killall nagios
service nagios start

Re: Nagios Core 3.2.1 remembering old services?

Posted: Wed Mar 13, 2013 8:57 am
by redcat
I'm using servicegroups, defined in servicegroups.cfg. What I did was grep all the .cfg files looking for NBF-Memory, and it's only found in services.cfg, where it's defined as "service_description NBF-Memory". It's not listed anywhere else, and it would have to be in any file that inherited it.

I made sure I only had a single copy of nagios running. When things weren't running exactly as I wanted I did "/etc/init.d/nagios stop", verified that there were no nagios processes showing in "ps", then did "/etc/init.d/nagios start".

I am well and truly mystified.

Re: Nagios Core 3.2.1 remembering old services?

Posted: Wed Mar 13, 2013 9:36 am
by abrist
I presume you are not using nagiosql. Do you have any hosts listed in the "host_name" directive for the NBF-Memory service?
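
Since grep only prints the matching line, one way to answer that is to print the whole "define service" block around the match. A minimal sketch, assuming one directive per line and a closing brace on its own line; show_service_block is a made-up helper name, not something that ships with Nagios:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print every "define service{ ... }"
# block whose service_description equals the given name.
# Assumes one directive per line and "}" on a line of its own.
show_service_block() {
    # $1 = service description to look for, $2 = object config file
    awk -v want="$1" '
        /^[ \t]*define[ \t]+service[ \t]*\{/ { in_block = 1; block = ""; hit = 0 }
        in_block {
            block = block $0 "\n"
            if ($1 == "service_description") {
                # Compare the full value so descriptions with spaces also work.
                value = $0
                sub(/^[ \t]*service_description[ \t]+/, "", value)
                sub(/[ \t]+$/, "", value)
                if (value == want) hit = 1
            }
            if ($0 ~ /^[ \t]*\}/) {
                if (hit) printf "%s", block
                in_block = 0
            }
        }
    ' "$2"
}
```

Something like "show_service_block NBF-Memory /usr/local/nagios/etc/services.cfg" should then show whichever host_name or hostgroup_name directives sit in the same block.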

Re: Nagios Core 3.2.1 remembering old services?

Posted: Wed Mar 13, 2013 2:07 pm
by redcat
In services.cfg I have:

define service{
    hostgroup_name NBF
    service_description NBF-Memory
    _SERVICE_ID 1102
    check_command check_centreon_memory!dGfGcbDIp23t14!80!90
    max_check_attempts 15
    check_interval 15
    retry_interval 5
    check_period 24x7
    notification_interval 15
    notification_period 24x7
    notification_options w,u,c,r
    contact_groups SPSS, SPSS_Pager
}

and in hostgroups.cfg I have:

define hostgroup{
    hostgroup_name NBF
    alias Network-Based-Firewall
}

Ah - here's the problem. In hosts.cfg I've got entries like:

define host{
host_name hostname-0001
use NBF_nagios
alias host-0001
address 172.20.1.36
_HOST_ID 233
hostgroups All,NBF
...

That's where they're coming from. Thanks for pointing out that possibility - I just hadn't dug deep enough into the config files.
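
The chain here is: the service names "hostgroup_name NBF", and each host joins that group through its own "hostgroups" directive. A rough sketch for listing the hosts that join a group this way, assuming one directive per line; hosts_in_group is a made-up helper name, and it deliberately ignores hostgroups inherited through "use" templates and any members listed in the hostgroup definition itself:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): list host_name values of hosts whose
# own "hostgroups" directive includes the given group.
# Does NOT follow "use" templates or "members" in hostgroup definitions.
hosts_in_group() {
    # $1 = hostgroup name, $2 = hosts config file
    awk -v want="$1" '
        /^[ \t]*define[ \t]+host[ \t]*\{/ { in_block = 1; name = ""; hit = 0; next }
        in_block && $1 == "host_name" { name = $2 }
        in_block && $1 == "hostgroups" {
            list = $0
            sub(/^[ \t]*hostgroups[ \t]+/, "", list)
            n = split(list, groups, ",")
            for (i = 1; i <= n; i++) {
                g = groups[i]
                gsub(/^[ \t]+|[ \t]+$/, "", g)   # trim spaces around each name
                if (g == want) hit = 1
            }
        }
        in_block && /^[ \t]*\}/ {
            if (hit && name != "") print name
            in_block = 0
        }
    ' "$2"
}
```

Running "hosts_in_group NBF /usr/local/nagios/etc/hosts.cfg" should print one host name per line for every host that would pick up services defined against that hostgroup.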

Re: Nagios Core 3.2.1 remembering old services?

Posted: Wed Mar 13, 2013 2:21 pm
by slansing
Yeah, there are a couple of ways to add service definitions, glad you found the root of it. You should be able to remove this, verify the configuration and restart Nagios.
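
A small sketch of that verify-then-restart step, so a broken edit never takes the running daemon down. The paths are the ones quoted earlier in this thread; the function name and the NAGIOS_* variables are made up for illustration, and the init script is assumed to accept "stop" and "start" as used above:

```shell
#!/bin/sh
# Paths as quoted in this thread; override via environment if yours differ.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios}

# Hypothetical wrapper: restart only if the pre-flight check (-v) passes.
verify_then_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        "$NAGIOS_INIT" stop && "$NAGIOS_INIT" start
    else
        echo "config check failed; Nagios left as it was" >&2
        return 1
    fi
}
```

Call verify_then_restart after editing the cfg files; if the check fails, rerun the -v command by hand to see the actual error output.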