Nagios XI Performance Graphs Not Processing
Posted: Mon Oct 06, 2014 1:13 pm
I'm having some issues with performance graphs on my Nagios XI 2012R2.9 installation. At the end of 2013, a lot of the host graphs just stopped processing. I've raised the logging level for npcd.log and perfdata.log to the maximum.
The npcd.log file is updating, but the perfdata.log file isn't. The last few lines of both are at the end of this post.
The other thing I've noticed is that the host-perfdata and service-perfdata files are each 2GB in size. I can't seem to move or rename them (I'm still new to Linux), even with the Nagios service stopped. Those files haven't been modified since January and February, respectively.
I'm sure this is why the graphs aren't showing, although I'm hoping the data is still being collected (I see RRD files that look up to date and valid). What should I look at next to troubleshoot this?
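To pin down the state of the two perfdata files, a small Python sketch like the one below could report each file's exact size and how long ago it was last modified. It also flags files that have reached 2**31 - 1 bytes, the largest size a signed 32-bit file offset can address. Whether that limit is actually involved here is only an assumption; nothing in the logs confirms it. The function name `describe` is made up for illustration, and the file paths are taken from the command line because their location isn't stated above.

```python
import os
import sys
import time

# Largest file size addressable with a signed 32-bit offset (just under 2 GiB).
LIMIT_32BIT = 2**31 - 1


def describe(path, limit=LIMIT_32BIT):
    """Return (size_bytes, days_since_modified, at_limit) for one file."""
    st = os.stat(path)
    age_days = (time.time() - st.st_mtime) / 86400
    return st.st_size, age_days, st.st_size >= limit


if __name__ == "__main__":
    # Pass the perfdata file paths as arguments.
    for path in sys.argv[1:]:
        try:
            size, age, at_limit = describe(path)
            print(f"{path}: {size} bytes, last modified {age:.0f} days ago, "
                  f"at 32-bit limit: {at_limit}")
        except OSError as exc:
            print(f"{path}: {exc}")
```

Run it with the two file paths as arguments. A size of exactly 2147483647 bytes would at least be consistent with the files having stopped growing at that ceiling.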
Thanks!
npcd.log
[10-06-2014 14:05:41] NPCD: No more files to process... waiting for 15 seconds
[10-06-2014 14:05:56] NPCD: Found 2 files in /usr/local/nagios/var/spool/perfdata/
[10-06-2014 14:05:56] NPCD: DEBUG: load 1.800000/25.000000
[10-06-2014 14:05:56] NPCD: ThreadCounter 0/5 File is .
[10-06-2014 14:05:56] NPCD: DEBUG: load 1.800000/25.000000
[10-06-2014 14:05:56] NPCD: ThreadCounter 0/5 File is ..
[10-06-2014 14:05:56] NPCD: No more files to process... waiting for 15 seconds
[10-06-2014 14:06:11] NPCD: Found 2 files in /usr/local/nagios/var/spool/perfdata/
[10-06-2014 14:06:11] NPCD: DEBUG: load 1.980000/25.000000
[10-06-2014 14:06:11] NPCD: ThreadCounter 0/5 File is .
[10-06-2014 14:06:11] NPCD: DEBUG: load 1.980000/25.000000
[10-06-2014 14:06:11] NPCD: ThreadCounter 0/5 File is ..
[10-06-2014 14:06:11] NPCD: No more files to process... waiting for 15 seconds
[10-06-2014 14:06:26] NPCD: Found 2 files in /usr/local/nagios/var/spool/perfdata/
[10-06-2014 14:06:26] NPCD: DEBUG: load 1.920000/25.000000
[10-06-2014 14:06:26] NPCD: ThreadCounter 0/5 File is .
[10-06-2014 14:06:26] NPCD: DEBUG: load 1.920000/25.000000
[10-06-2014 14:06:26] NPCD: ThreadCounter 0/5 File is ..
[10-06-2014 14:06:26] NPCD: No more files to process... waiting for 15 seconds
[10-06-2014 14:06:41] NPCD: Found 2 files in /usr/local/nagios/var/spool/perfdata/
[10-06-2014 14:06:41] NPCD: DEBUG: load 1.720000/25.000000
[10-06-2014 14:06:41] NPCD: ThreadCounter 0/5 File is .
[10-06-2014 14:06:41] NPCD: DEBUG: load 1.720000/25.000000
[10-06-2014 14:06:41] NPCD: ThreadCounter 0/5 File is ..
[10-06-2014 14:06:41] NPCD: No more files to process... waiting for 15 seconds
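In the excerpt above, every scan reports "Found 2 files", but the only entries named are `.` and `..` (the directory itself and its parent), so the spool directory appears to be empty each time. To check whether that holds across the whole log, a minimal Python sketch along these lines could group the lines by scan and list any real file names. The regular expressions are based only on the lines shown here, and the function name is made up for illustration.

```python
import re

# Patterns derived from the npcd.log lines shown in this post.
FOUND_RE = re.compile(r"NPCD: Found (\d+) files in (\S+)")
FILE_RE = re.compile(r"NPCD: ThreadCounter \d+/\d+ File is (\S+)")


def real_files_per_scan(lines):
    """Return one (reported_count, real_entries) tuple per NPCD scan.

    real_entries excludes the '.' and '..' directory entries.
    """
    scans = []
    for line in lines:
        found = FOUND_RE.search(line)
        if found:
            scans.append((int(found.group(1)), []))
            continue
        entry = FILE_RE.search(line)
        if entry and scans and entry.group(1) not in (".", ".."):
            scans[-1][1].append(entry.group(1))
    return scans
```

Feeding it the open log file, for example `real_files_per_scan(open(path))`, gives one tuple per scan. An empty list in every tuple would mean no perfdata files are reaching the spool directory at all.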
perfdata.log
2013-12-18 17:02:51 [3021] [0] *** process_perfdata.pl terminated on signal ALRM
2013-12-19 10:20:36 [15435] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-12-19 10:20:36 [15435] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-12-19 10:20:36 [15435] [0] *** TIMEOUT: Please check your npcd.cfg
2013-12-19 10:20:36 [15435] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1387466421.perfdata.service-PID-15435 deleted
2013-12-19 10:20:36 [15435] [0] *** Timeout while processing Host: "lvsclshdc1dn018" Service: "CentOS_Memory_Usage"
2013-12-19 10:20:36 [15435] [0] *** process_perfdata.pl terminated on signal ALRM
2013-12-24 07:01:31 [30347] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-12-24 07:01:31 [30347] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-12-24 07:01:31 [30347] [0] *** TIMEOUT: Please check your npcd.cfg
2013-12-24 07:01:31 [30347] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1387886481.perfdata.service-PID-30347 deleted
2013-12-24 07:01:31 [30347] [0] *** Timeout while processing Host: "VCSCDEVSQL01" Service: "Drive_D__Disk_Transfers_Per_Second"
2013-12-24 07:01:31 [30347] [0] *** process_perfdata.pl terminated on signal ALRM
2013-12-24 07:04:32 [2261] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-12-24 07:04:32 [2261] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-12-24 07:04:32 [2261] [0] *** TIMEOUT: Please check your npcd.cfg
2013-12-24 07:04:32 [2261] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1387886601.perfdata.service-PID-2261 deleted
2013-12-24 07:04:32 [2261] [0] *** Timeout while processing Host: "SCSTPRODSQL03_LVS" Service: "MSSQL_Free_Pages_Per_Sec"
2013-12-24 07:04:32 [2261] [0] *** process_perfdata.pl terminated on signal ALRM
2013-12-28 15:47:56 [6470] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-12-28 15:47:56 [6470] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-12-28 15:47:56 [6470] [0] *** TIMEOUT: Please check your npcd.cfg
2013-12-28 15:47:56 [6470] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1388263641.perfdata.service-PID-6470 deleted
2013-12-28 15:47:56 [6470] [0] *** Timeout while processing Host: "SCSTPRODSQL02_LVS" Service: "Drive_Z__Bytes_Per_Second"
2013-12-28 15:47:56 [6470] [0] *** process_perfdata.pl terminated on signal ALRM
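The entries above all follow the same pattern: a 5-second timeout, the spool file being deleted, and the host and service that were being processed at the time. The last one is dated 2013-12-28, which lines up with when the graphs stopped. To see whether particular hosts or services account for most of the timeouts over the full log, a rough Python sketch like this could tally them. The regular expression is based only on the lines shown here, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the "Timeout while processing" lines shown in this post.
TIMEOUT_RE = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[\d+\] \[\d+\] '
    r'\*\*\* Timeout while processing Host: "([^"]*)" Service: "([^"]*)"'
)


def timeouts(lines):
    """Return a list of (timestamp, host, service) for each timeout entry."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            events.append(match.groups())
    return events


def count_by_host(events):
    """Count timeout events per host."""
    return Counter(host for _, host, _ in events)
```

Something like `count_by_host(timeouts(open(path)))` would then show which hosts time out most often.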