TIMEOUT message in /usr/local/nagios/var/perfdata.log

cskang
Posts: 68
Joined: Sat Mar 05, 2011 4:13 pm

Post by cskang »

Yesterday, 26 Feb 2012, we experienced very slow response.
So I browsed the /var/log and /usr/local/nagios/var folders to see if there was any indication of a problem.

I found repeated TIMEOUT messages in perfdata.log in the /usr/local/nagios/var directory, as follows:
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Please check your npcd.cfg
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//host-perfdata.1330284761-PID-31473 deleted
2012-02-26 11:53:42 [31473] [0] *** Timeout while processing Host: "B56-Sec-RR" Service: "_HOST_"
2012-02-26 11:53:42 [31473] [0] *** process_perfdata.pl terminated on signal ALRM
2012-02-26 11:53:55 [31556] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-02-26 11:53:57 [31556] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-02-26 11:53:57 [31556] [0] *** TIMEOUT: Please check your npcd.cfg
2012-02-26 11:53:57 [31556] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//host-perfdata.1330285522-PID-31556 deleted
2012-02-26 11:53:57 [31556] [0] *** Timeout while processing Host: "B44-Sec-RR" Service: "_HOST_"
2012-02-26 11:53:57 [31556] [0] *** process_perfdata.pl terminated on signal ALRM

I have seen similar TIMEOUT messages once in a while, but they occurred much more frequently yesterday.
Is this somehow related to the slow response of Nagios XI?
If so, what would cause the problem, and how can we mitigate it?
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: TIMEOUT message in /usr/local/nagios/var/perfdata.log

Post by mguthrie »

I would start by editing the following files:

/usr/local/nagios/etc/pnp/process_perfdata.cfg

Code:

TIMEOUT = 15

/usr/local/nagios/etc/pnp/npcd.cfg

Code:

sleep_time = 10

If you set logging to 0 in both files, you'll also notice a performance increase.
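
Before editing, it can help to see what the two files currently contain. The following is a minimal sketch, not an official Nagios tool: it assumes both files use plain `key = value` lines with `#` comments, and the helper names `parse_settings` and `report` are made up for illustration. It only reads the files and prints the current value next to the value suggested above.

```python
from pathlib import Path

# Files and values taken from the suggestion above.
SUGGESTED = {
    "/usr/local/nagios/etc/pnp/process_perfdata.cfg": ("TIMEOUT", "15"),
    "/usr/local/nagios/etc/pnp/npcd.cfg": ("sleep_time", "10"),
}


def parse_settings(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments.

    Assumption: the config files follow this simple layout.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def report():
    """Print the current value of each setting beside the suggested one."""
    for path, (key, wanted) in SUGGESTED.items():
        cfg = Path(path)
        if not cfg.is_file():
            print(f"{path}: not found")
            continue
        current = parse_settings(cfg.read_text()).get(key, "(not set)")
        print(f"{path}: {key} = {current} (suggested: {wanted})")


if __name__ == "__main__":
    report()
```

The script changes nothing on disk, so it is safe to run before and after editing to confirm the new values are in place.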

This can sometimes happen if there are a lot of files in the /usr/local/nagios/var/spool/perfdata directory. The directory scan for results can back up the processing queue, and then things can snowball from there. Changing the configs above should prevent it in the future, but if you notice the issue persisting, you may need to clear the contents of the /usr/local/nagios/var/spool/perfdata directory so the system can catch up.
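
To judge whether the issue is persisting, one option is to watch two numbers: how many files are waiting in the spool directory, and how many timeouts the log records per hour. Below is a rough sketch under a couple of assumptions: the log lines look like the ones quoted in the first post, and the function names `timeouts_per_hour` and `spool_backlog` are hypothetical helpers, not part of Nagios or PNP. It is read-only and deletes nothing.

```python
import re
from collections import Counter
from pathlib import Path

# Paths as given in this thread.
SPOOL_DIR = Path("/usr/local/nagios/var/spool/perfdata")
LOG_FILE = Path("/usr/local/nagios/var/perfdata.log")

# Matches the first line of each timeout report, e.g.
# 2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Timeout after 5 secs. ***
# and captures the date plus the hour.
TIMEOUT_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2} .*TIMEOUT: Timeout after"
)


def timeouts_per_hour(lines):
    """Count timeout reports per 'YYYY-MM-DD HH' bucket."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def spool_backlog(directory):
    """Count regular files currently waiting in the spool directory."""
    return sum(1 for entry in directory.iterdir() if entry.is_file())


if __name__ == "__main__":
    if SPOOL_DIR.is_dir():
        print(f"{spool_backlog(SPOOL_DIR)} files waiting in {SPOOL_DIR}")
    if LOG_FILE.is_file():
        with LOG_FILE.open(errors="replace") as fh:
            for hour, count in sorted(timeouts_per_hour(fh).items()):
                print(f"{hour}:00  {count} timeouts")
```

If the file count keeps growing between runs, or the hourly timeout count does not drop after the config changes, that would suggest the queue is still backed up.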