performance graph can't show recent days' data

Posted: Mon Oct 26, 2015 9:50 am
by akoei
Our Nagios suddenly stopped showing performance graphs and data for dates after Oct 16th. Performance data from before Oct 16th still displays, but everything after that date is blank. I have tried rebooting the machine, which did not help. The system status shows OK, with 6 green check marks.

I checked the file system, and it looks like a lot of performance data is stuck in the spool folder. Does anyone have any idea? Our version is 2012R2.8c.

Thanks a lot

Re: performance graph can't show recent days' data

Posted: Mon Oct 26, 2015 4:48 pm
by lmiltchev
akoei wrote:
I checked the file system, and it looks like a lot of performance data is stuck in the spool folder.
Where exactly: in xidpe or in perfdata?

Can you run the following commands and show us the output?

ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
Too many files in the "/usr/local/nagios/var/spool/xidpe" directory may indicate that cron is or was not running; when cron runs, it moves these files to the perfdata directory. Too many files in the "/usr/local/nagios/var/spool/perfdata" directory means that npcd is or was not running and so is not processing these files. Perhaps the load on the server exceeded the "load_threshold" value in "/usr/local/nagios/etc/pnp/npcd.cfg".
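
The reasoning above can be sketched as a small check. This is only an illustration, assuming the default spool paths named in this thread; the function names `spool_count` and `spool_hint`, the `SPOOL_BASE` variable, and the 1000-file cutoff are made up for the example and are not Nagios-defined values.

```shell
#!/bin/sh
# Hypothetical helper: count regular files directly inside a spool directory.
# find is used instead of `ls | wc -l` so very large directories are handled.
spool_count() {
    find "$1" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical helper: print a hint based on which directory is backed up.
# The cutoff (default 1000) is an arbitrary illustration value.
spool_hint() {
    base=$1
    cutoff=${2:-1000}
    xidpe=$(spool_count "$base/xidpe")
    perfdata=$(spool_count "$base/perfdata")
    if [ "$xidpe" -gt "$cutoff" ]; then
        echo "xidpe backlog ($xidpe files): check that cron is running"
    elif [ "$perfdata" -gt "$cutoff" ]; then
        echo "perfdata backlog ($perfdata files): check npcd and load_threshold"
    else
        echo "no backlog"
    fi
}

# Default path taken from the commands in this thread.
spool_hint "${SPOOL_BASE:-/usr/local/nagios/var/spool}"
```

Run it as root on the Nagios server; a missing directory is simply counted as empty.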

Re: performance graph can't show recent days' data

Posted: Tue Oct 27, 2015 8:29 am
by akoei
lmiltchev wrote:
Too many files in the "/usr/local/nagios/var/spool/xidpe" directory may indicate that cron is or was not running; when cron runs, it moves these files to the perfdata directory. Too many files in the "/usr/local/nagios/var/spool/perfdata" directory means that npcd is or was not running and so is not processing these files. Perhaps the load on the server exceeded the "load_threshold" value in "/usr/local/nagios/etc/pnp/npcd.cfg".
See below; it looks like an npcd issue. What I don't understand is that our problem started on Oct 16th, which matches the npcd log entries from 10-15-2015, yet similar errors were logged on 07-10-2015 and we had no problem then.

Any idea how to fix it?

[root@bolnxi01 ~]# ls -l /usr/local/nagios/var/spool/xidpe | wc -l
1
[root@bolnxi01 ~]# ls -l /usr/local/nagios/var/spool/perfdata | wc -l
80409
[root@bolnxi01 ~]# ls -l /usr/local/nagios/var/spool/checkresults | wc -l
73
[root@bolnxi01 ~]# service npcd status
NPCD running (pid 1738).
[root@bolnxi01 ~]# tail /usr/local/nagios/var/npcd.log
[07-10-2015 12:57:43] NPCD: ERROR: Executed command exits with return code '7'
[07-10-2015 12:57:43] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1436547428.perfdata.host'
[07-10-2015 12:57:43] NPCD: ERROR: Executed command exits with return code '7'
[07-10-2015 12:57:43] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1436547428.perfdata.service'
[07-10-2015 13:10:46] NPCD: WARN: MAX load reached: load 12.160000/10.000000 at i=0
[07-10-2015 17:27:11] NPCD: Caught Termination Signal - Hasta la vista... baby
[07-10-2015 19:54:04] NPCD: npcd Daemon (0.4.14) started with PID=1738
[07-10-2015 19:54:04] NPCD: Please have a look at 'npcd -V' to get license information
[07-10-2015 19:54:04] NPCD: HINT: load_threshold is enabled - ('10.000000')
[10-15-2015 09:05:49] NPCD: ERROR: Executed command exits with return code '7'
[10-15-2015 09:05:49] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1444914303.perfdata.service'
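
The "MAX load reached: load 12.160000/10.000000" line in the log above compares the system load with load_threshold. A rough way to repeat that comparison by hand is sketched here. It assumes npcd.cfg stores the setting as a `load_threshold = <number>` line and that the 1-minute load can be read from /proc/loadavg (Linux only); `load_exceeds`, `cfg_value` and `NPCD_CFG` are hypothetical names used for illustration.

```shell
#!/bin/sh
# Succeed (exit 0) when the load is strictly greater than the threshold.
# awk is used because sh has no floating-point comparison.
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !(l + 0 > t + 0) }'
}

# Print the value of an uncommented "key = value" line (assumed file format).
cfg_value() {
    awk -F'=' -v k="$1" '$1 ~ "^[ \t]*" k "[ \t]*$" { gsub(/[ \t]/, "", $2); print $2; exit }' "$2"
}

cfg=${NPCD_CFG:-/usr/local/nagios/etc/pnp/npcd.cfg}
if [ -r "$cfg" ] && [ -r /proc/loadavg ]; then
    threshold=$(cfg_value load_threshold "$cfg")
    read -r load1 rest < /proc/loadavg
    if [ -n "$threshold" ] && load_exceeds "$load1" "$threshold"; then
        echo "load $load1 is above load_threshold $threshold"
    else
        echo "load $load1, load_threshold ${threshold:-not found}"
    fi
fi
```

This only looks at the load at the moment it runs, so a single quiet reading does not rule out earlier spikes.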

Re: performance graph can't show recent days' data

Posted: Tue Oct 27, 2015 4:11 pm
by rkennedy
Can you post the output of the following?
grep "TIMEOUT =" /usr/local/nagios/etc/pnp/process_perfdata.cfg

Re: performance graph can't show recent days' data

Posted: Wed Oct 28, 2015 7:31 am
by akoei
TIMEOUT = 5
rkennedy wrote:
Can you post the output of the following?
grep "TIMEOUT =" /usr/local/nagios/etc/pnp/process_perfdata.cfg

Re: performance graph can't show recent days' data

Posted: Wed Oct 28, 2015 3:07 pm
by rkennedy
Try following this guide to change two things: your perfdata timeout and the NPCD load threshold.
https://support.nagios.com/wiki/index.p ... ta_Timeout

Set your TIMEOUT to 20 and the load_threshold to 10x the number of cores your system has.
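
As a rough sketch of those two changes, assuming both files use plain `key = value` lines (TIMEOUT in process_perfdata.cfg, load_threshold in npcd.cfg, the paths mentioned earlier in this thread): `suggested_threshold` and `set_cfg_value` are hypothetical helper names, and the commented commands at the end are a suggestion to review, not an official procedure.

```shell
#!/bin/sh
# Suggested load_threshold: 10 x the number of CPU cores.
suggested_threshold() {
    echo $(( $1 * 10 ))
}

# Rewrite an existing "key = value" line in a config file,
# keeping the previous version as <file>.bak.
set_cfg_value() {
    key=$1
    value=$2
    file=$3
    cp "$file" "$file.bak" &&
    sed "s/^[[:space:]]*$key[[:space:]]*=.*/$key = $value/" "$file.bak" > "$file"
}

# Number of online cores; falls back to 1 if getconf cannot report it.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "TIMEOUT = 20"
echo "load_threshold = $(suggested_threshold "$cores").0"

# To apply as root, after reviewing the values printed above:
#   set_cfg_value TIMEOUT 20 /usr/local/nagios/etc/pnp/process_perfdata.cfg
#   set_cfg_value load_threshold "$(suggested_threshold "$cores").0" /usr/local/nagios/etc/pnp/npcd.cfg
#   service npcd restart
```

On a 4-core server this prints a load_threshold of 40.0. The restart is included on the assumption that npcd reads its config only at startup.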

Re: performance graph can't show recent days' data

Posted: Thu Oct 29, 2015 11:29 am
by akoei
rkennedy wrote:
Try following this guide to change two things: your perfdata timeout and the NPCD load threshold.
https://support.nagios.com/wiki/index.p ... ta_Timeout

Set your TIMEOUT to 20 and the load_threshold to 10x the number of cores your system has.
It works now, thanks a lot!

Re: performance graph can't show recent days' data

Posted: Thu Oct 29, 2015 12:22 pm
by rkennedy
Glad to see this worked! I will now close this thread. Feel free to open another if you need more assistance.