After Upgrade Nagios XI to 2012R1.2 - defunct Process

quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Post by quental »

Nagios XI standard
CentOS 6.1, 64-bit
Manual installation

Hi,
After upgrading to version 2012R1.2, we are seeing many defunct processes on the machine:

Code:

nagios   13109     1  0 14:51 ?        00:00:02 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios   13156 13109  0 14:52 ?        00:00:00 /usr/local/nagios/libexec/dnxServer -c /usr/local/nagios/etc/dnxServer.cfg -j 5718
nagios   13166 13109  0 14:52 ?        00:00:00 /usr/local/nagios/libexec/dnxServer -c /usr/local/nagios/etc/dnxServer.cfg -j 5718
nagios   13176 13109  0 14:52 ?        00:00:00 /usr/local/nagios/libexec/dnxServer -c /usr/local/nagios/etc/dnxServer.cfg -j 5718
nagios   14832 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14833 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14835 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14836 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14838 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14841 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14842 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14843 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14844 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14845 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14846 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14847 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14850 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14852 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14853 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14856 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14858 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14869 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14870 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14871 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14872 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14873 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14879 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14891 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14893 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14894 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
nagios   14898 13109  0 14:52 ?        00:00:00 [nagios] <defunct>
...
...
There are about 250 defunct processes.
We have to reboot the OS to get things working again.
Is there a solution, or any change you would suggest making on the system?

Thanks.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: After Upgrade Nagios XI to 2012R1.2 - defunct Process

Post by mguthrie »

Do you get any log entries in /usr/local/nagios/var/nagios.log about orphan checks?
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: After Upgrade Nagios XI to 2012R1.2 - defunct Process

Post by quental »

I don't know.

I will look in the file.
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: After Upgrade Nagios XI to 2012R1.2 - defunct Process

Post by mguthrie »

Ok, you can do a quick grep on that file with the following command:

Code:

grep orphan /usr/local/nagios/var/nagios.log
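
If the grep returns a lot of lines, something like the following could summarize them per host. This is only a rough sketch: `count_orphans` is a hypothetical helper name, and it assumes the warning text contains the host name in single quotes after the word "host", as in "The check of host 'NAME' looks like it was orphaned". Adjust the pattern if your log lines look different.

```shell
# Count "orphaned" warnings per host in a Nagios log.
# Assumption: each warning contains  host 'NAME'  somewhere in the message.
count_orphans() {
    # $1: path to the log file (defaults to the path used in this thread)
    log="${1:-/usr/local/nagios/var/nagios.log}"
    grep -i 'orphaned' "$log" \
        | sed -n "s/.*host '\([^']*\)'.*/\1/p" \
        | sort | uniq -c | sort -rn
}
```

Running `count_orphans` with no argument reads the default log path and prints one line per host, with the number of orphan warnings first. Hosts near the top are the ones whose check results most often never came back.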
quental
Posts: 74
Joined: Tue Apr 17, 2012 5:12 am

Re: After Upgrade Nagios XI to 2012R1.2 - defunct Process

Post by quental »

Hi,
This is what I found in nagios.log:

[1354905389] SERVICE ALERT: SRDCDPAUVILA;PING;OK;SOFT;3;PING OK - Packet loss = 0%, RTA = 15.31 ms
[1354905399] SERVICE ALERT: SRDCDPZATEBEO;PING;OK;SOFT;2;PING OK - Packet loss = 0%, RTA = 16.88 ms
[1354905459] Warning: The check of host 'orapru' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[1354905509] SERVICE ALERT: SRDCDSCUGAT;PING;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 16%, RTA = 13.95 ms
[1354905509] SERVICE ALERT: SRDCDGIJON;PING;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 16%, RTA = 20.20 ms

How should I proceed?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: After Upgrade Nagios XI to 2012R1.2 - defunct Process

Post by mguthrie »

Let's start with this and see whether the issue persists:

Code:

service nagios stop
killall -9 nagios
service nagios start
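
To see whether the defunct entries come back after the restart, a small filter like the one below may help. Defunct processes are children that have already exited and are waiting for their parent to collect them, so they should disappear once the old parent nagios process is gone. `count_defunct` is a hypothetical helper, and the sketch assumes Linux `ps` output where the state column starts with `Z` for a zombie.

```shell
# Count zombie (defunct) processes with a given command name.
# Reads "STAT COMMAND" lines on stdin, e.g. from:
#   ps -A -o stat= -o comm=
count_defunct() {
    # $1: command name to match (defaults to nagios)
    name="${1:-nagios}"
    awk -v n="$name" '$1 ~ /^Z/ && $2 == n { c++ } END { print c + 0 }'
}
```

On the Nagios server it could be used as `ps -A -o stat= -o comm= | count_defunct nagios`. If the number keeps climbing a few minutes after the restart, the problem is still there.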