Configuration error....
Posted: Fri Feb 13, 2015 10:51 am
by JakeHatMacys
So our configuration was broken due to someone putting in a custom command:
Error: Host check command 'check_nrpe - CheckCPU' specified for host 'K4285571' is not defined anywhere!
Error: Host check command 'check_nrpe -c TestScript.bat' specified for host 'K4505308' is not defined anywhere!
These commands have been deleted, but so have the hosts and the hosts' services. They are no longer anywhere in Core, yet they are still being referenced when applying configuration and they're still shown in XI.
So I went into /usr/local/nagios/etc/ as root to delete the hosts' and services' .cfg files manually, but they aren't there either. So I'm a bit lost as to how to address this.
Thoughts? (I even recreated dummy commands with both of the names above to see if that would get me by, but it didn't work.)
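One way to double-check that nothing under the config tree still mentions the deleted hosts is a recursive grep. This is only a minimal sketch: `find_stale_refs` is a made-up helper name (not part of Nagios), and the directory in the example is the /usr/local/nagios/etc path mentioned above.

```shell
# find_stale_refs DIR NAME...
# Lists every file under DIR that contains any of the given names.
# Hypothetical helper for illustration, not a Nagios tool.
find_stale_refs() {
    dir=$1
    shift
    # One name per line; grep -F treats each line as a separate literal pattern.
    patterns=$(printf '%s\n' "$@")
    # -r: recurse into subdirectories, -l: print matching file names only
    grep -rlF -e "$patterns" "$dir" | sort
}

# Example (run as root on the XI server):
# find_stale_refs /usr/local/nagios/etc K4285571 K4505308
```

If this prints nothing, the references are probably not coming from the flat files on disk.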
Re: Configuration error....
Posted: Fri Feb 13, 2015 10:53 am
by scottwilkerson
What version of XI and OS are you using? There was a bug that would cause this a couple versions ago.
Re: Configuration error....
Posted: Fri Feb 13, 2015 12:59 pm
by JakeHatMacys
scottwilkerson wrote:What version of XI and OS are you using? There was a bug that would cause this a couple versions ago.
I actually upgraded to the current XI this morning, so:
Nagios XI 2014R2.6.
OS is:
Red Hat Enterprise Linux Server release 6.4 (Santiago)
Re: Configuration error....
Posted: Fri Feb 13, 2015 1:41 pm
by lmiltchev
Do you see the same errors:
Error: Host check command 'check_nrpe - CheckCPU' specified for host 'K4285571' is not defined anywhere!
Error: Host check command 'check_nrpe -c TestScript.bat' specified for host 'K4505308' is not defined anywhere!
when you run the Write Config Tool?
CCM->Tools->Write Config Files->Write (check output for errors)->Verify (check output for errors)
If you don't see any config errors, click on the "Delete" button under the "Write Database Configs To File" page, click on "Write", "Verify" and apply configuration.
Re: Configuration error....
Posted: Fri Feb 13, 2015 1:54 pm
by JakeHatMacys
Still getting the error... here's the kicker. It appears the upgrade breaks the config.
I rolled back to the previous version and applied config fine. Then upgraded again, changed nothing.... applied config and got the same result: the config was magically broken.
The errors that we were seeing were older commands that I had deleted days ago, so yeah this whole thing is confusing to me.
Re: Configuration error....
Posted: Fri Feb 13, 2015 2:47 pm
by lmiltchev
JakeHatMacys wrote:Still getting the error... here's the kicker. It appears the upgrade breaks the config.
When did you see the error - while running the Write Config Tool or after applying configuration?
JakeHatMacys wrote:I rolled back to the previous version and applied config fine. Then upgraded again, changed nothing....
This doesn't help us troubleshoot the issue. What is the version of XI that you are currently running? If you are running 2014R2.6, run the Write Config Tool and post the output. Also run the following commands from the command line and post the output:
Code: Select all
uname -a
cat /etc/*release
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
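With a couple of thousand hosts the verification output gets long, so it can help to pull out just the error lines before posting. A rough sketch: `only_errors` is a hypothetical helper name, and it assumes error lines start with "Error:" as in the messages quoted earlier in this thread.

```shell
# only_errors: read verification output on stdin and print only the
# lines that start with "Error:". Hypothetical helper for illustration.
only_errors() {
    # '|| true' so that "no error lines found" is not reported as a failure
    grep '^Error:' || true
}

# Example:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg 2>&1 | only_errors
```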
Re: Configuration error....
Posted: Mon Feb 16, 2015 8:34 am
by JakeHatMacys
lmiltchev wrote:Still getting the error... here's the kicker. It appears the upgrade breaks the config.
When did you see the error - while running the Write Config Tool or after applying configuration?
I rolled back to the previous version and applied config fine. Then upgraded again, changed nothing....
This doesn't help us troubleshoot the issue. What is the version of XI that you are currently running? If you are running 2014R2.6, run the Write Config Tool and post the output. Also run the following commands from the command line and post the output:
Code: Select all
uname -a
cat /etc/*release
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Code: Select all
$ cat /etc/*release
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Red Hat Enterprise Linux Server release 6.4 (Santiago)
Red Hat Enterprise Linux Server release 6.4 (Santiago)
Code: Select all
Checked 2423 hosts.
Checked 779 host groups.
Checked 1 service groups.
Checked 9 contacts.
Checked 2 contact groups.
Checked 123 commands.
Checked 16 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 2423 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 16 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 11499
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
[root@esu2v239 ~]#
(We didn't define contacts on our hosts, so we're getting warnings for those.)
The config tool came back clean on 2.6, but when applying the config I'd see the errors I mentioned above. After changing nothing and rolling back to 2.5, I got no errors when applying the config. I then upgraded again to 2.6 (changing nothing in the config) and got the errors again. So there's definitely something in the upgrade that 2.6 doesn't like and 2.5 doesn't seem to mind, or the old commands are somehow ghosting their way back into the config in 2.6.
I can take another backup today and upgrade again (currently on 2.5) if you've got a series of before and after steps you'd like me to run. The errors were real at one time, but we cleaned out all that stuff over a week ago.
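For a before and after comparison between 2.5 and 2.6, the two totals at the end of the pre-flight check are probably the quickest thing to compare. A sketch, assuming the summary lines look like the "Total Warnings:" and "Total Errors:" lines in the output above; `preflight_totals` is a made-up name.

```shell
# preflight_totals: read "nagios -v" output on stdin and print the two
# summary counts on one line. Hypothetical helper for illustration.
preflight_totals() {
    awk -F': *' '
        /^Total Warnings:/ { w = $2 }   # e.g. "Total Warnings: 11499"
        /^Total Errors:/   { e = $2 }   # e.g. "Total Errors: 0"
        END { printf "warnings=%s errors=%s\n", w, e }
    '
}

# Example:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_totals
```

Running it once on 2.5 and once on 2.6 gives two one-line results that are easy to diff.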
Re: Configuration error....
Posted: Mon Feb 16, 2015 11:33 am
by tgriep
Before you do the upgrade, could you check whether there is a service group defined for those checks that has those two hosts in it?
You may also want to check the MySQL log, /var/log/mysqld.log, for any errors.
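A quick way to scan that log, as a sketch only: it simply matches the word "error" in any case, which is an assumption about how the entries are tagged, so adjust the pattern if your log looks different. `mysql_log_errors` is a hypothetical helper name.

```shell
# mysql_log_errors [FILE]
# Prints the last 20 lines of the log that mention "error" (any case).
# Defaults to the /var/log/mysqld.log path mentioned above.
mysql_log_errors() {
    log=${1:-/var/log/mysqld.log}
    # -i: case-insensitive match; tail keeps the most recent matches
    grep -i 'error' "$log" | tail -n 20
}

# Example:
# mysql_log_errors
```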
Also, try this before doing the upgrade to 2.6.
CCM->Tools->Write Config Files->Write (check output for errors)->Verify (check output for errors)
If you don't see any config errors, click on the "Delete" button under the "Write Database Configs To File" page, click on "Write", "Verify" and apply configuration.
If it still fails, could you PM me your system profile?
In addition to this, can you try upgrading to 2.6 again? Next, post the "upgrade.log". It should be located in the "/tmp/nagiosxi/" directory.
Also, run the following commands:
Code: Select all
cd /usr/local/nagiosxi/scripts
./reconfigure_nagios.sh &> /tmp/reconfigure.txt
and post the "/tmp/reconfigure.txt" file.
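Once /tmp/reconfigure.txt exists, the hosts named in any "is not defined anywhere" errors can be listed like this. A sketch only: `error_hosts` is a made-up name, and it assumes the message format matches the two errors quoted at the top of the thread.

```shell
# error_hosts [FILE...]
# Prints the unique host names found in lines of the form:
#   Error: ... specified for host 'NAME' is not defined anywhere!
# Reads stdin when no file is given. Hypothetical helper for illustration.
error_hosts() {
    sed -n "s/^Error: .* specified for host '\([^']*\)' is not defined anywhere!.*/\1/p" "$@" | sort -u
}

# Example:
# error_hosts /tmp/reconfigure.txt
```

The resulting names can then be searched for in CCM to see where they are still defined.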
lmiltchev