
TIMEOUT message in /usr/local/nagios/var/perfdata.log

Posted: Mon Feb 27, 2012 2:30 pm
by cskang
Yesterday, 26 Feb 2012, we experienced very slow response times.
So I browsed the /var/log and /usr/local/nagios/var folders to see if there was any indication of issues.

I found repeated TIMEOUT messages in perfdata.log in the /usr/local/nagios/var directory, as follows:
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Please check your npcd.cfg
2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//host-perfdata.1330284761-PID-31473 deleted
2012-02-26 11:53:42 [31473] [0] *** Timeout while processing Host: "B56-Sec-RR" Service: "_HOST_"
2012-02-26 11:53:42 [31473] [0] *** process_perfdata.pl terminated on signal ALRM
2012-02-26 11:53:55 [31556] [0] *** TIMEOUT: Timeout after 5 secs. ***
2012-02-26 11:53:57 [31556] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2012-02-26 11:53:57 [31556] [0] *** TIMEOUT: Please check your npcd.cfg
2012-02-26 11:53:57 [31556] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//host-perfdata.1330285522-PID-31556 deleted
2012-02-26 11:53:57 [31556] [0] *** Timeout while processing Host: "B44-Sec-RR" Service: "_HOST_"
2012-02-26 11:53:57 [31556] [0] *** process_perfdata.pl terminated on signal ALRM

I have seen similar TIMEOUT messages once in a while, but they occurred much more frequently yesterday.
Is this related to the slow response of Nagios XI somehow?
If so, what would cause the problem, and how can it be mitigated?
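
To put a number on "much more frequently", a rough sketch like the one below could tally timeout events per hour. It assumes the log lines keep the exact format shown in the excerpt above. The function name count_timeouts_per_hour is made up for illustration and is not part of Nagios or PNP.

```python
import os
import re
from collections import Counter

# Matches the first line of each timeout event as it appears in the excerpt,
# e.g. "2012-02-26 11:53:42 [31473] [0] *** TIMEOUT: Timeout after 5 secs. ***"
TIMEOUT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2} .*TIMEOUT: Timeout after \d+ secs"
)

def count_timeouts_per_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of timeout events."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[f"{match.group(1)} {match.group(2)}"] += 1
    return counts

if __name__ == "__main__":
    # Path taken from this thread; adjust it if your install differs.
    log_path = "/usr/local/nagios/var/perfdata.log"
    if os.path.isfile(log_path):
        with open(log_path, errors="replace") as handle:
            for hour, total in sorted(count_timeouts_per_hour(handle).items()):
                print(hour, total)
```

Comparing the hourly totals for a normal day against 26 Feb would show whether the timeouts line up with the slow period.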

Re: TIMEOUT message in /usr/local/nagios/var/perfdata.log

Posted: Mon Feb 27, 2012 5:22 pm
by mguthrie
I would start by editing the following files:

/usr/local/nagios/etc/pnp/process_perfdata.cfg

TIMEOUT = 15

/usr/local/nagios/etc/pnp/npcd.cfg

sleep_time = 10

If you set logging to 0 in both files, you'll also notice a performance increase.

This can sometimes happen if there are a lot of files in the /usr/local/nagios/var/spool/perfdata directory. Scanning that directory for results can back up the processing queue, and then things can snowball from there. Changing the configs above should prevent it in the future, but if the issue persists, you may need to clear the contents of the /usr/local/nagios/var/spool/perfdata directory so the system can catch up.
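
To check whether the spool directory is actually backing up before clearing it, something like the following sketch could help. It counts the regular files in the directory and reports the age of the oldest one. The function name spool_backlog is hypothetical, and what counts as "a lot of files" depends on your system, so no threshold is assumed here.

```python
import os
import time

def spool_backlog(spool_dir):
    """Return (file_count, oldest_age_seconds) for regular files in spool_dir."""
    now = time.time()
    file_count = 0
    oldest_age = 0.0
    with os.scandir(spool_dir) as entries:
        for entry in entries:
            # Only count regular files; skip subdirectories and symlinks.
            if entry.is_file(follow_symlinks=False):
                file_count += 1
                age = now - entry.stat(follow_symlinks=False).st_mtime
                oldest_age = max(oldest_age, age)
    return file_count, oldest_age

if __name__ == "__main__":
    # Spool path taken from this thread; adjust it if your install differs.
    spool_dir = "/usr/local/nagios/var/spool/perfdata"
    if os.path.isdir(spool_dir):
        count, age = spool_backlog(spool_dir)
        print(f"{count} files waiting, oldest is {age:.0f} seconds old")
```

If the count keeps growing between runs, or the oldest file is many minutes old, processing is probably not keeping up with incoming perfdata.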