Nagios won't start

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
whjsmith
Posts: 24
Joined: Thu Dec 29, 2011 12:34 pm

Post by whjsmith »

We recently noticed an oddity in our Nagios installation, and our attempt to correct it has left Nagios unable to start because of a configuration error.

We noticed that a ping check was running for a host that wasn't configured. I looked and found a configuration text file for it on the server, CNN1059.cfg. I thought that by creating a host cnn1059 in the Core Config Manager, I could apply the configuration, then remove the host, and that might fix the problem. Sadly, it only got worse. My initial apply after creating the host failed and, for whatever reason, it did not create a snapshot that would let me see what the error was. I removed the entry I had just added and applied again. Same problem: the apply did not work and there was no snapshot.

I then decided to restart Nagios, and now it won't start, saying there is a configuration error. When I ran a check on the configuration, it reported an error in a host config file, saying that a template doesn't exist, but I know that it is out to lunch. It complained about the definition at line 79 of a config file, but we double-checked and the template is set up. As the screenshots below show, we use that same template for other definitions that it is not complaining about, and we use it for several other servers, yet the error refers only to this one config file. Any help would be appreciated.
Capture.PNG
Capture1.PNG
Capture2.PNG
Our installation is currently:
CentOS Linux release 6.0 (Final)
32 bit
VM Image
Special Configurations: No
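
One way to cross-check a "template does not exist" error like the one described in the post above is to scan the object configuration files for "use" directives that have no matching template "name". The following is a minimal Python sketch under stated assumptions: it is not part of Nagios XI, the function names and the template name in the example are made up for illustration, and it only handles the basic "define <type> { ... }" syntax. The authoritative check remains the Nagios configuration verification itself.

```python
import re

# Matches the opening line of an object definition, e.g. "define host{".
DEFINE_RE = re.compile(r"^define\s+(\w+)\s*\{")


def parse_objects(text):
    """Yield (object_type, start_line, directives) for each define block."""
    obj_type, start, directives = None, 0, {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(";", 1)[0].strip()  # drop ';' comments and whitespace
        if not line or line.startswith("#"):
            continue
        match = DEFINE_RE.match(line)
        if match:
            obj_type, start, directives = match.group(1), lineno, {}
        elif line.startswith("}"):
            if obj_type is not None:
                yield obj_type, start, directives
            obj_type = None
        elif obj_type is not None:
            parts = line.split(None, 1)  # directive name, then its value
            directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""


def find_missing_templates(cfg_texts):
    """cfg_texts maps a file name to its contents.

    Returns (file, line, object_type, template) for every 'use' entry that
    has no object of the same type defining that 'name'.
    """
    defined, uses = set(), []
    for fname, text in cfg_texts.items():
        for obj_type, lineno, directives in parse_objects(text):
            if "name" in directives:
                defined.add((obj_type, directives["name"]))
            for tmpl in directives.get("use", "").split(","):
                if tmpl.strip():
                    uses.append((fname, lineno, obj_type, tmpl.strip()))
    return [u for u in uses if (u[2], u[3]) not in defined]


# Illustrative input: an empty template file next to a service that uses it.
cfgs = {
    "servicetemplates.cfg": "",
    "CNN1059.cfg": (
        "define service {\n"
        "    host_name cnn1059\n"
        "    use generic_service_template\n"
        "}\n"
    ),
}
missing = find_missing_templates(cfgs)
```

To run it against a live system, one could build the dictionary from the .cfg files under /usr/local/nagios/etc. A template file that exists but is empty would make every object that uses its templates appear in the result.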
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

Are you guys doing any manual maintenance of the configuration files under /usr/local/nagios/etc?

See the following wiki:
http://support.nagios.com/wiki/index.ph ... om_the_CCM
paul.jobb
Posts: 167
Joined: Tue Aug 02, 2011 4:37 pm

Post by paul.jobb »

Hi, this is in addition to what Jason posted about re-adding the ghost host that was in our running config but not in XI (the NagiosQL DB)...

We restored (unzipped) a previous configuration to /usr/local/nagios/etc from a last known good snapshot, and our core Nagios is running fine now. However, we aren't able to apply the XI configuration with changes: if we make a configuration change (e.g. add a host or contact) and apply, it fails; if we remove those changes and apply, it works. No snapshot results are generated when the apply fails. We are adding valid configurations, so I don't believe the error is in what we are adding...
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

paul.jobb wrote: "We restored(unzipped) a previous configuration to /usr/local/nagios/etc from a last known good snap shot"

I have alarm bells going off in my mind about this. You should never have to manually restore XI to a good working snapshot. Nagios XI does this for you every time you Apply Configuration and get a config error: it automatically rolls back to the last good configuration to keep the monitoring engine running.

The fact that new snapshots aren't being created is a concern. Are you interested in a remote session either this afternoon or Monday?
paul.jobb
Posts: 167
Joined: Tue Aug 02, 2011 4:37 pm

Post by paul.jobb »

Yes, if you could do a remote session, that would be great.

As mentioned, Jason added that duplicate host, applied, and got the error. Unfortunately, it didn't generate a snapshot error log. We tried reversing that change and re-applying, to no effect. His apply wiped out (0 bytes) the servicetemplate.cfg, so we had to use the config snapshot to get our monitoring back online. I did back up that previous etc directory.
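
Since the apply left a template file at 0 bytes, a quick scan for empty configuration files can show whether a failed write has happened again. This is a minimal Python sketch, not an XI tool; the function name is hypothetical and the only assumption about the layout is that the files of interest end in .cfg.

```python
from pathlib import Path


def find_empty_configs(root):
    """Return the sorted paths of all zero-byte *.cfg files below root."""
    return sorted(
        path
        for path in Path(root).rglob("*.cfg")
        if path.is_file() and path.stat().st_size == 0
    )
```

Calling find_empty_configs("/usr/local/nagios/etc") right after an apply would list any files that were truncated rather than written.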
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

How's the hard drive space on that machine?
paul.jobb
Posts: 167
Joined: Tue Aug 02, 2011 4:37 pm

Post by paul.jobb »

nagios server vnl64

[root@vnl64 /]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                       99G   52G   42G  56% /
tmpfs                 2.0G     0  2.0G   0% /dev/shm
/dev/sda1              97M   64M   29M  69% /boot
tmpfs                  75M   30M   46M  40% /var/nagiosramdisk
vnl65.gov.ab.ca:/wmictmp
                       35G  5.7G   28G  17% /wmictmp


database server vnl65

[root@vnl65 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_vnl65-lv_root
                       35G  5.7G   28G  17% /
tmpfs                1012M     0 1012M   0% /dev/shm
/dev/sda1             485M   49M  411M  11% /boot
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

Yeah, that appears OK. We're somewhat booked this afternoon for remote sessions, so I'll follow up with you by PM and we'll get a time set up.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

Figured we'll keep working on this in the meantime since this is a BIG clue.
Quote: "I further bumped up my timeout values in the php.ini file:
max_execution=240
max_input_time=480
memory_limit=2048M

I have a couple of service config files that have over 150 service checks in them. Looking at the /usr/local/nagios/etc folders, it appeared that those would be written at 0 bytes at times, but not consistently. I just applied the configuration and it appeared to work and write the changes. When I click configure -> tools -> write config files -> write monitoring data, that screen goes blank. I'm thinking it should show me something?"

Yes, if you get a blank page, that means you have a fatal PHP error, and in this scenario you're hitting either PHP's memory limit or its timeout. Check /var/log/httpd/error_log to see which error is occurring.
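
To tell the two cases apart in the error log, one can count the fatal errors by kind. The sketch below is a minimal Python example, assuming the usual wording of PHP's two messages ("Maximum execution time of N seconds exceeded" and "Allowed memory size of N bytes exhausted"); the function name is hypothetical and the log format on a given system may differ.

```python
import re

# Message fragments PHP writes for the two fatal errors of interest.
PATTERNS = {
    "timeout": re.compile(r"Maximum execution time of \d+ seconds? exceeded"),
    "memory": re.compile(r"Allowed memory size of \d+ bytes exhausted"),
}


def classify_php_fatals(log_lines):
    """Count PHP fatal errors caused by the time limit or the memory limit."""
    counts = {kind: 0 for kind in PATTERNS}
    for line in log_lines:
        if "PHP Fatal error" not in line:
            continue
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[kind] += 1
    return counts
```

With the log from this thread, it could be called as classify_php_fatals(open("/var/log/httpd/error_log", errors="replace")). A non-zero "timeout" count would point at max_execution_time, and a non-zero "memory" count at memory_limit.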

On a side note, I wouldn't recommend a memory limit over 1024M. That setting means each PHP script run can take up that much memory, so if you've got a lot of connections, it's better to keep it lower. Make sure you restart Apache after making changes to the php.ini file so the new settings take effect.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Post by mguthrie »

Just a follow-up on this thread. The issue was found to be caused by the Write Config process timing out. We increased the max_execution_time parameter in the php.ini file and restarted Apache, and the issue appeared to be resolved.
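
For anyone checking the same settings, the sketch below reads the three limits discussed in this thread out of a php.ini file. It is a minimal Python example with hypothetical function names, and it assumes simple "name = value" lines; values that contain a ";" inside quotes would be cut short by the comment handling. A line spelled max_execution, as in the values quoted earlier, would not be reported as max_execution_time by this check.

```python
def read_php_directives(ini_text):
    """Return {directive: value} for uncommented 'name = value' lines."""
    values = {}
    for raw in ini_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # ';' starts a comment in php.ini
        if not line or line.startswith("[") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip()  # a later line overrides an earlier one
    return values


def check_limits(ini_text,
                 wanted=("max_execution_time", "max_input_time", "memory_limit")):
    """Report the wanted directives; None means the name was not found."""
    values = read_php_directives(ini_text)
    return {name: values.get(name) for name in wanted}
```

The path of the php.ini file varies by system, so it is left to the caller: check_limits(open(path_to_php_ini).read()). Apache still has to be restarted before changed values take effect, as noted above.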