URGENT: All linux host checks pending forever after upgrade

Posted: Wed Jun 22, 2011 12:37 am
by fao
The upgrade from 2009R1.3 to 2011R1.4 seemed to go smoothly, but after restarting Nagios, I have encountered a number of problems.

The first problem is that all Windows checks are failing. This is being dealt with in a separate forum post:

http://support.nagios.com/forum/viewtop ... 055#p11055

The second problem is that all of the Linux host checks are pending forever. Oddly, the host check for the Nagios host itself succeeds. The ping service succeeds for all hosts with the status "OK: Nothing to Monitor". I don't remember whether that is the usual status detail for the ping service.

I have attached my log from /var/log/messages.

I am running the CentOS 5.3 VM image provided by Nagios.com.

Re: URGENT: All linux host checks pending forever after upgr

Posted: Wed Jun 22, 2011 10:07 am
by ormsbeec
I'm seeing the same behavior since upgrading, except that the vast majority of my host checks end up orphaned (I have 2200 hosts). In the meantime I have just added a ping service, since that is all my host check does.

Re: URGENT: All linux host checks pending forever after upgr

Posted: Wed Jun 22, 2011 12:49 pm
by ormsbeec
It turns out the service checks were being orphaned as well. I just rolled back to 1.3, and within 10 minutes all my checks were back in working order. I should have kept some log files for the support staff, but I forgot to back them up before starting.

Sorry to hijack the thread as well... but the symptoms appear identical.

Re: URGENT: All linux host checks pending forever after upgr

Posted: Wed Jun 22, 2011 1:14 pm
by mguthrie
Can you run:

Code:

killall -9 nagios
service ndo2db stop
service nagios start
service ndo2db start
and see if the check results start coming in?

Re: URGENT: All linux host checks pending forever after upgr

Posted: Thu Jun 23, 2011 12:12 am
by fao
Guthrie, here is the log after executing the commands you suggested.

I am still having the problem where the linux host checks are pending forever, though all the service checks complete.

Also, none of the open Linux service problems appear under "Open Service Problems".

Re: URGENT: All linux host checks pending forever after upgr

Posted: Thu Jun 23, 2011 12:31 am
by fao
alright, looking at

/usr/local/nagios/var/spool/checkresults/
-rw------- 1 apache apache 0 Jun 21 16:20 cyn2i7W.ok
-rw------- 1 apache apache 0 Jun 21 16:18 cyN9MLW.ok
-rw------- 1 apache apache 0 Jun 21 16:20 cYnLJKg.ok
-rw------- 1 apache apache 0 Jun 21 16:27 cYnoVUy.ok
-rw------- 1 apache apache 0 Jun 21 16:19 cYNWjC7.ok
-rw------- 1 apache apache 0 Jun 21 16:19 cYoF8ki.ok

[root@hqlprnagios1 spool]# ls -al checkresults/ | wc -l
4711

Should all these check result files be owned by nagios?

EDIT: I tried changing the owner of all these files to nagios, but no success; all host checks are still pending.

EDIT: I deleted all the check result files; all host checks are still pending in the Nagios XI UI. Service checks seem OK.
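For anyone repeating this diagnosis, here is a minimal sketch of how the spool directory could be summarised instead of eyeballing the listing. The function names are hypothetical helpers, not part of Nagios, and the sketch assumes a `find` that supports `-maxdepth` and `-mmin` (GNU find on CentOS does). Note that `ls -al | wc -l` also counts the "total" line plus the `.` and `..` entries, so the 4711 above overstates the number of files by three.

```shell
# count_not_owned_by DIR USER  (hypothetical helper)
# Prints how many regular files directly inside DIR are NOT owned by USER.
count_not_owned_by() {
    find "$1" -maxdepth 1 -type f ! -user "$2" | wc -l | tr -d ' '
}

# count_older_than DIR MINUTES  (hypothetical helper)
# Prints how many regular files directly inside DIR were last modified
# more than MINUTES minutes ago, i.e. results that were never picked up.
count_older_than() {
    find "$1" -maxdepth 1 -type f -mmin "+$2" | wc -l | tr -d ' '
}
```

Usage on the box in question might look like `count_not_owned_by /usr/local/nagios/var/spool/checkresults nagios` and `count_older_than /usr/local/nagios/var/spool/checkresults 60`. A large count of old files only suggests that results are piling up unprocessed; it does not by itself say why.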

Re: URGENT: All linux host checks pending forever after upgr

Posted: Thu Jun 23, 2011 2:10 am
by fao
Found the issue.

Somehow, after the update, my custom template "fao_unix_host" lost the host template xiwizard_linuxserver_host.

I added the wizard template back to my custom template and the problem is solved.
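In case it helps someone else check for the same thing after an upgrade, this is a rough sketch that tests whether a host template in a Nagios object config file still lists a given parent on its `use` line. `template_uses` is a hypothetical helper, and the sketch assumes the usual one-directive-per-line `define host { ... }` layout with a comma-separated `use` value that contains no spaces. Where Nagios XI writes the template file on a given install is not covered here, so the file path is left as an argument.

```shell
# template_uses PARENT TEMPLATE FILE  (hypothetical helper)
# Exits 0 if the "define host" block whose "name" is TEMPLATE has a
# "use" directive that lists PARENT; exits 1 otherwise.
template_uses() {
    awk -v parent="$1" -v template="$2" '
        # Start of a host definition: "define host {" or "define host{"
        $1 == "define" && ($2 == "host" || $2 == "host{") {
            in_block = 1; name = ""; use = ""; next
        }
        in_block && $1 == "name" { name = $2 }
        in_block && $1 == "use"  { use = $2 }
        # End of the definition: compare against what was collected
        in_block && index($0, "}") > 0 {
            in_block = 0
            if (name == template) {
                n = split(use, parents, ",")
                for (i = 1; i <= n; i++)
                    if (parents[i] == parent) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$3"
}
```

For example, `template_uses xiwizard_linuxserver_host fao_unix_host FILE && echo "still inherits"`, with FILE being whichever config file holds the custom template.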

Re: URGENT: All linux host checks pending forever after upgr

Posted: Thu Jun 23, 2011 9:21 am
by mguthrie