NPCD Error

Satyam
Posts: 63
Joined: Mon Oct 24, 2011 8:14 am

NPCD Error

Post by Satyam »

I often see this error in npcd.log

[02-22-2012 14:49:25] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//service-perfdata.1329901923'

The graphs are updating for most hosts, but for a few hosts they stopped updating at a specific time today. Will they catch up after some time, or are they not updating because of this error?

I have never paid close attention to the npcd log before, so I don't know whether it was logging this error even while everything was working fine. It was only today, after seeing the error, that I checked and found that the graphs for some hosts are not being populated.
Thanks,
Sattanathan.S
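
One way to see which hosts have stopped graphing is to look for RRD files that have not been modified recently. The following is only a rough sketch: the stale_rrds function name and the 60-minute default are illustrative assumptions, and the default directory is the PNP perfdata path reported later in this thread, which may differ on other installs.

```shell
# Sketch: list RRD files that have not been modified in the last N minutes.
# stale_rrds is a hypothetical helper name, not part of Nagios XI or PNP.
stale_rrds() {
    perfdata_dir="${1:-/usr/local/nagios/share/perfdata}"  # assumed default path
    minutes="${2:-60}"                                     # assumed threshold
    # -mmin +N matches files whose last modification is more than N minutes old.
    find "$perfdata_dir" -type f -name '*.rrd' -mmin +"$minutes" -print
}

# Example: stale_rrds /usr/local/nagios/share/perfdata 120
```

Each path printed is an RRD file that has not been written to within the chosen window, so the host directory in that path is a candidate for closer inspection.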
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: NPCD Error

Post by scottwilkerson »

This could very well be a permissions issue. Please run the following procedure:
http://assets.nagios.com/downloads/nagi ... ios_XI.pdf

Also, please run the following commands in order and send back the results:

# ls -l /usr/local/nagios/var
and

# ls -l /usr/local/nagios/libexec/process_perfdata.pl

Re: NPCD Error

Post by Satyam »

Dear Scott,

I ran the NagiosXI-FixPerms.sh script and got a big shock: hundreds of host-down alerts fired, hundreds of tickets were raised in the integrated ticketing tool, and people came running to ask me what had happened. I checked and found that the checks were showing this error:

"Warning: This plugin must be either run as root or setuid root. "

I understood that this was a plugin permission issue and that the check_icmp plugin was failing.

I changed it using chmod u+s check_icmp

and that fixed the problem. But why did this happen after running that script?
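
For context, check_icmp sends raw ICMP packets, so it normally has to be owned by root and carry the setuid bit. chmod u+s on its own only helps when the owner is already root. A minimal sketch for checking both conditions follows; check_setuid_owner is a hypothetical helper name, the check_icmp path in the example is assumed from the libexec directory used in this thread, and stat -c is the GNU form of the command.

```shell
# Sketch: report whether a plugin has the setuid bit and the expected owner.
# check_setuid_owner is a hypothetical helper; the owner defaults to root.
check_setuid_owner() {
    plugin="$1"
    want_owner="${2:-root}"
    if [ ! -f "$plugin" ]; then
        echo "missing: $plugin"
        return 2
    fi
    owner=$(stat -c '%U' "$plugin")   # GNU stat: prints the owning user name
    # [ -u FILE ] is true when the setuid bit is set on FILE.
    if [ -u "$plugin" ] && [ "$owner" = "$want_owner" ]; then
        echo "ok: $plugin is setuid and owned by $owner"
        return 0
    fi
    echo "needs attention: $plugin (owner=$owner)"
    return 1
}

# Example (path is an assumption):
# check_setuid_owner /usr/local/nagios/libexec/check_icmp
```

If the helper reports that the file needs attention, comparing the ls -l output before and after running the permissions script should show whether the owner or the mode was changed.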

I also ran the commands you gave; the output is pasted here:

# ls -l /usr/local/nagios/var

[root@mmkndnagxi etc]# ls -l /usr/local/nagios/var
total 139792
drwxrwxr-x 2 nagios nagios 12288 Mar 13 00:00 archives
-rw-r--r-- 1 apache apache 10409 Dec 18 21:35 graphapi.log
-rw-rw-r-- 1 nagios nagios 0 Mar 13 14:51 host-perfdata
-rw-r--r-- 1 root root 8239873 Oct 18 12:09 messages.old
-rw-rw-r-- 1 nagios nagios 21690 Mar 13 14:51 nagios.debug
-rw-rw-r-- 1 nagios nagios 1000044 Mar 13 14:51 nagios.debug.old
-rw-r--r-- 1 nagios nagios 6 Mar 13 12:42 nagios.lock
-rw-r--r-- 1 nagios root 2985212 Mar 13 14:50 nagios.log
-rw-r--r-- 1 nagios nagios 194978 Feb 22 16:21 ndo2db.debug
-rw-r--r-- 1 nagios nagios 256710 Feb 1 11:27 ndo2db.debug.bkp-22-02-2012
-rw-r--r-- 1 nagios nagios 1003133 Feb 22 16:21 ndo2db.debug.old
-rw-r--r-- 1 nagios nagios 5 Mar 5 00:57 ndo2db.lock
-rw-r--r-- 1 root root 1014678 Jan 21 22:51 ndo2db.migration
-rw-r--r-- 1 root root 4241 Jan 21 22:52 ndo2db.migration.log
-rw-rw-r-- 1 nagios users 0 Mar 13 12:42 ndomod.tmp
srwxr-xr-x 1 nagios nagios 0 Mar 5 00:57 ndo.sock
-rw-r--r-- 1 nagios nagios 1082730 Mar 13 14:50 npcd.log
-rw-r--r-- 1 nagios nagios 10485774 Jan 30 15:14 npcd.log.old
-rw-r--r-- 1 nagios nagios 7646249 Mar 13 12:42 objects.cache
-rwxrwxrwx 1 nagios nagios 70508 Sep 15 13:16 objects.cache.bkp
-rwxrwxrwx 1 nagios nagios 72448 Sep 16 18:26 objects.cache.Sep16
-rw-r--r-- 1 nagios nagios 7592165 Mar 9 15:57 objects.precache
-rw-r--r-- 1 root root 0 Nov 12 16:15 output
-rw-rw-rw- 1 nagios nagios 5147481 Mar 13 14:50 perfdata.log
-rw------- 1 nagios nagios 11971460 Mar 13 14:43 retention.dat
drwxrwsr-x 2 nagios nagcmd 4096 Mar 13 12:42 rw
-rw-r--r-- 1 nagios nagios 37386152 Mar 13 14:49 SD_Integrator.log
-rw-rw-r-- 1 nagios nagios 3840 Mar 13 14:51 service-perfdata
-rw-rw-r-- 1 nagios nagios 157057 Nov 22 16:54 service_service_desk.log
-rw-r--r-- 1 root root 45 Nov 12 16:16 shaji
-rw-r--r-- 1 root root 514098 Jan 19 19:21 shaji.log
-rw-r--r-- 1 root root 4565048 Jan 19 19:21 shaji.nagios
-rw-r--r-- 1 root root 23 Nov 12 16:16 shaji.output
-rw-r--r-- 1 root root 34845 Feb 2 10:26 shaji.test
-rwxrwxrwx 1 nagios users 29697143 Mar 13 14:25 sms.log
drwxr-xr-x 5 nagios nagios 4096 Sep 9 2011 spool
drwxr-xr-x 2 nagios nagios 4096 Mar 13 14:51 stats
-rw-rw-r-- 1 nagios nagios 11664491 Mar 13 14:51 status.dat
You have mail in /var/spool/mail/root
[root@mmkndnagxi etc]#
----------------------------------

# ls -l /usr/local/nagios/libexec/process_perfdata.pl

[root@mmkndnagxi etc]# ls -l /usr/local/nagios/libexec/process_perfdata.pl
-rwxr-xr-x 1 nagios nagios 42724 Sep 9 2011 /usr/local/nagios/libexec/process_perfdata.pl
[root@mmkndnagxi etc]#
Thanks,
Sattanathan.S

Re: NPCD Error

Post by scottwilkerson »

I'm going to have to look into what the fix permissions script does to check_icmp.

As for the permissions listed, everything looks good, except let's run:

chmod u+w /usr/local/nagios/var/spool

Re: NPCD Error

Post by Satyam »

Hi Scott,

I am facing one more issue related to npcd. It keeps hogging system memory, and I frequently get the Nagios system memory critical alert. I then usually restart the npcd service, which frees a lot of memory. What is the solution to this? Is there a bug in npcd?
Thanks,
Sattanathan.S
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: NPCD Error

Post by mguthrie »

What version of XI are you running? There was a memory leak in npcd in a previous release that could cause the process to crash periodically.

Re: NPCD Error

Post by Satyam »

Hi Mguthrie,

I am running NagiosXI 2011R1.7. Is there any way I can fix this issue temporarily, without having to upgrade NagiosXI to the latest version for now?
Thanks,
Sattanathan.S

Re: NPCD Error

Post by Satyam »

Hi,

When I came into the office today, I saw that npcd had stopped again, at around 1:00 in the morning while I was sleeping peacefully at home. This is now becoming a big issue. What can be done?

When I checked the log:
[03-16-2012 01:07:35] NPCD: Error while get file list from spooldir (/usr/local/nagios/var/spool/perfdata/) - Cannot allocate memory
[03-16-2012 01:07:35] NPCD: Exiting...
[03-16-2012 01:07:35] NPCD: Daemon ended. PID was '23387'
[03-16-2012 10:25:57] NPCD: npcd Daemon (0.4.14) started with PID=15217
[03-16-2012 10:25:57] NPCD: Please have a look at 'npcd -V' to get license information
[03-16-2012 10:25:57] NPCD: HINT: load_threshold is enabled - ('50.000000')

My npcd version:
[root@mmkndnagxi nagios]# /usr/local/nagios/bin/npcd -V
npcd 0.4.14 - $Revision: 605 $

I also constantly see errors like these in my npcd log. What are they about?

[03-16-2012 01:07:35] NPCD: Error while get file list from spooldir (/usr/local/nagios/var/spool/perfdata/) - Cannot allocate memory
[03-16-2012 01:07:35] NPCD: Exiting...
[03-16-2012 01:07:35] NPCD: Daemon ended. PID was '23387'
[03-16-2012 10:25:57] NPCD: npcd Daemon (0.4.14) started with PID=15217
[03-16-2012 10:25:57] NPCD: Please have a look at 'npcd -V' to get license information
[03-16-2012 10:25:57] NPCD: HINT: load_threshold is enabled - ('50.000000')
[03-16-2012 10:25:59] NPCD: ERROR: Executed command exits with return code '6'
[03-16-2012 10:25:59] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//host-perfdata.1331840262'
[03-16-2012 10:25:59] NPCD: ERROR: Executed command exits with return code '6'
[03-16-2012 10:25:59] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//host-perfdata.1331840248'
[03-16-2012 10:26:00] NPCD: ERROR: Executed command exits with return code '6'
[03-16-2012 10:26:00] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//host-perfdata.1331840307'
[03-16-2012 10:26:00] NPCD: ERROR: Executed command exits with return code '6'
[03-16-2012 10:26:00] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//host-perfdata.1331840292'
Thanks,
Sattanathan.S
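
To see which spool files npcd is complaining about, the "Command line was" entries can be pulled out of the log. This is a sketch under assumptions: failed_spool_files is a hypothetical helper name, the default log path is taken from the directory listing earlier in this thread, and the pattern only matches the log format shown above.

```shell
# Sketch: list the spool files that npcd reported as failing.
# failed_spool_files is a hypothetical helper, not part of npcd or PNP.
failed_spool_files() {
    log="${1:-/usr/local/nagios/var/npcd.log}"   # assumed default log path
    # Keep only the file path that follows "-b" on "Command line was" lines,
    # then de-duplicate the result.
    sed -n "s/.*NPCD: ERROR: Command line was '.* -b \(.*\)'\$/\1/p" "$log" | sort -u
}

# Example: failed_spool_files /usr/local/nagios/var/npcd.log
```

Checking whether the listed files still exist in the spool directory, and whether the nagios user can read them, would be a reasonable next step.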

Re: NPCD Error

Post by Satyam »

Hi,

I am also not able to see the performance graphs in the UI, even though npcd is running. I am running NagiosXI 2011R1.7.

The page title shown is 'PNP Debugger'.

The URL visited is http://xiserverip/nagios/pnp/index.php

And the output I am seeing is:
Initalising

Using /usr/local/nagios/share/perfdata/

RRDTool /usr/bin/rrdtool found.

RRDTool /usr/bin/rrdtool is executable

PHP Function proc_open is enabled

PHP Function fpassthru is enabled

PHP Function xml_parser_create is enabled

PHP zlib Support found.

PHP GD Support found.

RRD Base Directory /usr/local/nagios/share/perfdata/ found.

Hostname xiprddb is set.

Directory /usr/local/nagios/share/perfdata/xiprddb not found.


What could be the issue? I was able to see the graphs earlier.
Thanks,
Sattanathan.S

Re: NPCD Error

Post by mguthrie »

Satyam wrote: I am running NagiosXI 2011R1.7. Is there anyway I can fix this issue temporarily without having to upgrade NagiosXI to the latest version for now.

We fixed the memory leak issue as of 2011R1.8. Unfortunately, fixing this issue requires recompiling the npcd daemon. You can delay the problem by running a cron job that restarts npcd once an hour. I would also change the following settings in your nagios.cfg file:

enable_embedded_perl=0
use_embedded_perl_implicitly=0
The following errors confirm the issue, since the memory leak was related to the directory scan:

NPCD: Error while get file list from spooldir (/usr/local/nagios/var/spool/perfdata/) - Cannot allocate memory
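
The hourly restart mentioned above could be scheduled with a cron.d entry along these lines. This is a sketch under assumptions: npcd_restart_cron_line is a hypothetical helper name, the /etc/cron.d/npcd-restart file name is made up for illustration, and the default command assumes npcd is managed by an init script that responds to "service npcd restart".

```shell
# Sketch: print a cron.d line that restarts npcd at minute 0 of every hour.
# The restart command is an assumption; pass a different one as the first argument.
npcd_restart_cron_line() {
    restart_cmd="${1:-/sbin/service npcd restart}"
    # cron.d format: minute hour day-of-month month day-of-week user command
    printf '0 * * * * root %s >/dev/null 2>&1\n' "$restart_cmd"
}

# As root, the line could be installed with (file name is illustrative):
# npcd_restart_cron_line > /etc/cron.d/npcd-restart
```

This only delays the problem, as noted above; the entry can be removed once the system is upgraded to a release that contains the fix.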