apply config fail

Posted: Tue Mar 31, 2015 10:11 am
by MichielvM
Hi all,

When running Apply Configuration on a Nagios XI 2014R2.6 / Core 4.0.8 machine I get two messages.

1. Not an error, but strange as the file does not exist and I have no services by that name anymore.
Write host configurations ...
Host configuration files successfully written!

Write service configurations ...
Configuration file: oss-svr-015-oracle.cfg successfully written!
Service configuration files successfully written!
There were oss-svr-015-oracle services linked to oss-svr-015. I renamed them via XI.

2. Error
Error: Could not find any host matching 'til-svr-010' (config file '/usr/local/nagios/etc/services/til-svr-010-oracle.cfg', starting on line 337)
Error: Could not expand hostgroups and/or hosts specified in service (config file '/usr/local/nagios/etc/services/til-svr-010-oracle.cfg', starting on line 337)
There is an active host named til-svr-010
There were also til-svr-010-oracle services linked to oss-svr-010, but I renamed them via XI.

The weird bit is that when I try to locate either til-svr-010-oracle.cfg or oss-svr-015-oracle.cfg, neither file exists at that location.
I tried the following:
manual check across the services and hosts for a mismatch.
updatedb
./repairmysql.sh nagios
write config files, clicked delete - write - verify

Still the same errors.
I did the renaming after I first got these errors, because I thought the names could have something to do with it.
I suspect that a former co-worker did some renaming and editing from the command line, as he was more comfortable with that, but I am not sure.
I am at the point of throwing both hosts and their services out, but then I lose the history and I still have no clue as to what went wrong.
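
Since the files named in the messages cannot be found under services/, one thing worth trying is a search of the whole config tree, both for files with that name and for files of any name that still mention it. The following is only a minimal sketch: `find_leftovers` is a hypothetical helper, not part of Nagios XI, and the default directory is taken from the path in the error message above.

```shell
#!/bin/sh
# Sketch only: find_leftovers is a hypothetical helper, not a Nagios XI tool.
# Usage: find_leftovers NAME [DIR]
find_leftovers() {
    name=$1
    # Default directory is an assumption based on the path in the error message.
    dir=${2:-/usr/local/nagios/etc}

    # Files whose file name contains NAME, anywhere under DIR.
    find "$dir" -type f -name "*${name}*" 2>/dev/null

    # Files of any name whose contents mention NAME.
    grep -rl -- "$name" "$dir" 2>/dev/null || true
}
```

For example, `find_leftovers til-svr-010-oracle` lists every file under /usr/local/nagios/etc that carries the name or refers to it. A file can appear twice if it matches both ways.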

Re: apply config fail

Posted: Tue Mar 31, 2015 12:37 pm
by abrist
What version of CentOS/RHEL are you running?
What version of XI?
What is the output of:

Code:

cat /etc/*release
grep manage_serv /usr/local/nagiosxi/scripts/restart_nagios_with_export.sh
grep manage_serv /etc/sudoers

Re: apply config fail

Posted: Wed Apr 01, 2015 2:22 am
by MichielvM
Xi: 2014 R2.6

Code:

cat /etc/*release
CentOS release 6.6 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
CentOS release 6.6 (Final)
CentOS release 6.6 (Final)

Code:

grep manage_serv /usr/local/nagiosxi/scripts/restart_nagios_with_export.sh
sudo $BASEDIR/manage_services.sh restart nagios

Code:

grep manage_serv /etc/sudoers
NAGIOSXI ALL = NOPASSWD:/usr/local/nagiosxi/scripts/manage_services.sh *
NAGIOSXIWEB ALL = NOPASSWD:/usr/local/nagiosxi/scripts/manage_services.sh *

Re: apply config fail

Posted: Wed Apr 01, 2015 1:56 pm
by abrist
Does a verify from the cli come up clean?

Code:

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Re: apply config fail

Posted: Thu Apr 02, 2015 3:00 am
by MichielvM
Total Warnings: 0
Total Errors: 0

The summary shows the exact number of known hosts and services.
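
A clean verify only says something about the files that nagios.cfg actually pulls in through its `cfg_file=` and `cfg_dir=` directives. As a rough sketch, the list of those sources can be printed and compared with the path named in the Apply Configuration error. `config_sources` is a hypothetical helper name, and the default path is the one used in the verify command above.

```shell
#!/bin/sh
# Sketch only: config_sources is a hypothetical helper, not a Nagios tool.
# Prints the path given on every active cfg_file= / cfg_dir= line.
# Usage: config_sources [NAGIOS_CFG]
config_sources() {
    cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    # Commented-out lines start with '#' and therefore do not match.
    grep -E '^[[:space:]]*cfg_(file|dir)[[:space:]]*=' "$cfg" |
        sed -E 's/^[[:space:]]*cfg_(file|dir)[[:space:]]*=[[:space:]]*//'
}
```

If /usr/local/nagios/etc/services shows up in the output and the verify is clean, the file from the error message is probably not on disk at the moment the CLI verify runs.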

Re: apply config fail

Posted: Thu Apr 02, 2015 11:42 am
by lmiltchev
Go to CCM->Services->type "til-svr-010" in the search bar and hit enter. Do you find any services? If you do, click on the service and examine the config. It seems like this service is not added to a host/hostgroup. Fix this under CCM, save and run the Write Config Tool again to check for config errors. If there are no errors, apply configuration.
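
The same host/service check can be approximated against the written files on disk. This is a minimal sketch under some assumptions: `host_names` and `missing_hosts` are hypothetical helpers, only literal `host_name` values (including comma-separated lists) are compared, and hostgroups, templates, wildcards and `!` exclusions are ignored.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers, not Nagios tools.

# Print every value of a host_name directive found in the given files.
host_names() {
    awk '$1 == "host_name" {
        sub(/;.*/, "")        # drop trailing ; comments
        $1 = ""               # drop the directive name itself
        gsub(/[ \t]/, "")     # drop whitespace around the names
        n = split($0, names, ",")
        for (i = 1; i <= n; i++) if (names[i] != "") print names[i]
    }' "$@" | LC_ALL=C sort -u
}

# Print host names used in SERVICES_DIR that are not defined in HOSTS_DIR.
# Usage: missing_hosts HOSTS_DIR SERVICES_DIR
missing_hosts() {
    hosts_dir=$1
    services_dir=$2
    defined=$(mktemp) || return 1
    used=$(mktemp) || return 1
    host_names "$hosts_dir"/*.cfg > "$defined" 2>/dev/null
    host_names "$services_dir"/*.cfg > "$used" 2>/dev/null
    # comm -13 keeps only the lines that appear in the second file alone.
    LC_ALL=C comm -13 "$defined" "$used"
    rm -f "$defined" "$used"
}
```

Running `missing_hosts /usr/local/nagios/etc/hosts /usr/local/nagios/etc/services` should print nothing when every service file points at a defined host. Any name it does print is a candidate for the "Could not find any host matching" error.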

Re: apply config fail

Posted: Fri Apr 03, 2015 3:49 am
by MichielvM
Yes, there are services named til-svr-010.
All are linked to host til-svr-010, which exists too.

My guess is that we're dealing with a ghost host.
I've seen that before. The solution then was to find and remove the illegal cfg files from the command line and then go to CCM, Write Config, etc. But that fails now.
The til-svr-010-oracle.cfg is nowhere to be found.

Remember: a colleague is known to edit from the command line, although we told him time and again not to. Maybe that's where it originated.

Re: apply config fail

Posted: Fri Apr 03, 2015 9:41 am
by abrist
Just for good measure, try deleting all the configs so that the apply process can rebuild them. Go to --> CCM --> Write Config Files --> Click "Delete" and then "Write" and finally "Verify". Afterwards, try to apply config.

Re: apply config fail

Posted: Fri Apr 03, 2015 10:29 am
by MichielvM
Did that already, doesn't solve it.

Re: apply config fail

Posted: Fri Apr 03, 2015 11:08 am
by lmiltchev
Run the following commands and show us the output in code wraps:

Code:

visudo -c
ls -lt /usr/local/nagios/etc/services | grep 'til-svr-010'
ls -lt /usr/local/nagios/etc/hosts | grep 'til-svr-010'
Also, is it possible to PM me (or anyone on the Nagios Support team) your latest failed snapshot (it will be colored in red)?
Admin->Config Snapshots->Download (the diskette icon)