Errors Experienced

Posted: Wed Mar 28, 2018 11:02 pm
by drstaind1
Hi,

We are currently seeing errors in the system status for:
- Performance Grapher
- Database Backend

No performance graphs are being written.
I have attempted to restart both services and npcd.
I have turned up the logging levels, but still nothing new is being written.
The npcd service is running.
The number of spooled files in /usr/local/nagios/var/spool/perfdata/ is increasing (approx. 900).

perfdata.log

2018-03-07 12:31:54 [11029] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-03-07 12:31:56 [11029] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-03-07 12:31:56 [11029] [0] *** TIMEOUT: Please check your npcd.cfg
2018-03-07 12:31:56 [11029] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1520386294.perfdata.service-PID-11029 deleted
2018-03-07 12:31:56 [11029] [0] *** Timeout while processing Host: "mbp-baulkhamhills-eof" Service: "GigabitEthernet0_0_3_Bandwidth"
2018-03-07 12:31:56 [11029] [0] *** process_perfdata.pl terminated on signal ALRM

npcd.log

[02-07-2018 14:26:53] NPCD: WARN: MAX load reached: load 10.450000/10.000000 at i=1
[02-07-2018 14:27:24] NPCD: ERROR: Executed command exits with return code '6'
[02-07-2018 14:27:24] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1517973945.perfdata.host'
[02-07-2018 14:27:27] NPCD: WARN: MAX load reached: load 10.210000/10.000000 at i=18
[02-07-2018 14:35:38] NPCD: WARN: MAX load reached: load 10.060000/10.000000 at i=0
[03-07-2018 12:31:56] NPCD: ERROR: Executed command exits with return code '7'
[03-07-2018 12:31:56] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1520386294.perfdata.service'
[03-12-2018 11:35:54] NPCD: WARN: MAX load reached: load 10.280000/10.000000 at i=0
[03-12-2018 11:36:09] NPCD: WARN: MAX load reached: load 10.470000/10.000000 at i=1
[03-12-2018 12:22:34] NPCD: Caught Termination Signal - Hasta la vista... baby

Any assistance would be appreciated. I am happy to provide any further information.

Re: Errors Experienced

Posted: Thu Mar 29, 2018 10:14 am
by scottwilkerson
Based on the log you showed, I believe you are hitting the load threshold.

The lines to look for in the npcd.log above are the "MAX load reached" warnings, which show NPCD hitting its load threshold. This is common when either more performance data is arriving than NPCD can process at the current system load, or NPCD is trying to work through a backlog of stacked-up performance data.
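
To get a feel for how often the threshold is being hit, the warnings can be pulled out of the log with a few lines of Python. This is only a rough sketch: the pattern is based on the log lines posted above, and the function name load_warnings and the sample list are made up for illustration.

```python
import re

# Matches lines such as:
# [02-07-2018 14:26:53] NPCD: WARN: MAX load reached: load 10.450000/10.000000 at i=1
MAX_LOAD = re.compile(r"MAX load reached: load (\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)")

def load_warnings(lines):
    """Return (observed_load, threshold) pairs for every MAX load warning."""
    hits = []
    for line in lines:
        match = MAX_LOAD.search(line)
        if match:
            hits.append((float(match.group(1)), float(match.group(2))))
    return hits

# Sample lines copied from the npcd.log excerpt in this thread.
sample = [
    "[02-07-2018 14:26:53] NPCD: WARN: MAX load reached: load 10.450000/10.000000 at i=1",
    "[02-07-2018 14:27:24] NPCD: ERROR: Executed command exits with return code '6'",
    "[03-12-2018 11:36:09] NPCD: WARN: MAX load reached: load 10.470000/10.000000 at i=1",
]

hits = load_warnings(sample)
print(len(hits), "warnings; peak load", max(load for load, _ in hits))
```

To run it against the real file, pass an open file handle instead of the sample list. The location of npcd.log is not given in this thread, so check your own npcd.cfg for the path.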

You can increase this threshold by editing the following file:

/usr/local/nagios/etc/pnp/npcd.cfg

Change:

load_threshold = 10.0

to a value greater than your system's current load. Use this with caution, however: the NPCD process will consume as much load as you allow it, so watch your resources!
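
Before picking a new value, it can help to compare the configured threshold with what the box is actually doing. The following is a minimal sketch, assuming a Unix host (os.getloadavg is not available on Windows) and that load_threshold appears as a plain "key = value" line as shown above. The helper name read_load_threshold and the inline sample config are hypothetical.

```python
import os
import re

def read_load_threshold(cfg_text):
    """Return load_threshold from npcd.cfg-style text, or None if it is absent."""
    for line in cfg_text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and whitespace
        match = re.fullmatch(r"load_threshold\s*=\s*(\d+(?:\.\d+)?)", line)
        if match:
            return float(match.group(1))
    return None

# Stand-in for the contents of /usr/local/nagios/etc/pnp/npcd.cfg.
sample_cfg = "# sample fragment\nload_threshold = 10.0\n"

threshold = read_load_threshold(sample_cfg)
load_1min = os.getloadavg()[0]  # 1-minute load average of this machine

if load_1min >= threshold:
    print(f"load {load_1min:.2f} is at or above the threshold {threshold}")
else:
    print(f"load {load_1min:.2f} is below the threshold {threshold}")
```

To check the real setting, read the file contents from /usr/local/nagios/etc/pnp/npcd.cfg and pass them to read_load_threshold instead of sample_cfg.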

https://support.nagios.com/kb/article.php?id=9