performance graph can't show recent days' data

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
akoei
Posts: 8
Joined: Fri Jan 31, 2014 12:04 pm

performance graph can't show recent days' data

Post by akoei »

Our Nagios suddenly stopped showing performance graphs/data for anything after Oct 16th. Performance data from before Oct 16th still displays, but everything after that date is blank. I have tried rebooting the machine, which did not help. The system status shows OK, with 6 green check marks.

I checked the file system, and it looks like a lot of performance data is stuck in the spool folder. Does anyone have any idea? Our version is 2012R2.8c.

Thanks a lot
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: performance graph can't show recent days' data

Post by lmiltchev »

akoei wrote:
I check the file system, it looks a lot performance data are stuck in spool folder.
Where exactly? In xidpe or in perfdata?

Can you run the following commands and show us the output?


ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
Having too many files in the "/usr/local/nagios/var/spool/xidpe" directory may indicate that cron is/was not running. After cron runs, it moves these files to the perfdata directory. Having too many files in the "/usr/local/nagios/var/spool/perfdata" directory means that npcd is/was not running and is not processing these files. Perhaps the load on the server exceeded the "load_threshold" value in "/usr/local/nagios/etc/pnp/npcd.cfg".
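
As a rough sketch of the check described above, the three counts can be gathered in one pass. The function names `count_files` and `report_spool` and the `SPOOL_BASE` variable are made up for illustration; only the spool paths come from this thread. It assumes a `find` that supports `-maxdepth`.

```shell
#!/bin/sh
# Count regular files directly inside a directory (not recursive).
# Prints 0 if the directory does not exist.
count_files() {
    dir=$1
    if [ -d "$dir" ]; then
        find "$dir" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Default path is the one quoted in this thread; SPOOL_BASE is a
# hypothetical override so the sketch can be pointed elsewhere.
SPOOL_BASE=${SPOOL_BASE:-/usr/local/nagios/var/spool}

# Print one "name: count" line per spool directory.
report_spool() {
    for name in xidpe perfdata checkresults; do
        printf '%s: %s\n' "$name" "$(count_files "$SPOOL_BASE/$name")"
    done
}

report_spool
```

A large xidpe count would point toward cron, and a large perfdata count toward npcd, as explained above. Unlike `ls -l | wc -l`, this counts only files and does not include the "total" line.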
akoei
Posts: 8
Joined: Fri Jan 31, 2014 12:04 pm

Re: performance graph can't show recent days' data

Post by akoei »

See below. It looks like an npcd issue. What I don't understand is that our problem started on Oct 16th, which corresponds with the npcd log entries from 10-15-2015, but a similar issue was logged on 07-10-2015 and we didn't have the problem back then...

Any idea how to fix it?

[root@bolnxi01 ~]# ls -l /usr/local/nagios/var/spool/xidpe | wc -l
1
[root@bolnxi01 ~]# ls -l /usr/local/nagios/var/spool/perfdata | wc -l
80409
[root@bolnxi01 ~]# ls -l /usr/local/nagios/var/spool/checkresults | wc -l
73
[root@bolnxi01 ~]# service npcd status
NPCD running (pid 1738).
[root@bolnxi01 ~]# tail /usr/local/nagios/var/npcd.log
[07-10-2015 12:57:43] NPCD: ERROR: Executed command exits with return code '7'
[07-10-2015 12:57:43] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1436547428.perfdata.host'
[07-10-2015 12:57:43] NPCD: ERROR: Executed command exits with return code '7'
[07-10-2015 12:57:43] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1436547428.perfdata.service'
[07-10-2015 13:10:46] NPCD: WARN: MAX load reached: load 12.160000/10.000000 at i=0[07-10-2015 17:27:11] NPCD: Caught Termination Signal - Hasta la vista... baby
[07-10-2015 19:54:04] NPCD: npcd Daemon (0.4.14) started with PID=1738
[07-10-2015 19:54:04] NPCD: Please have a look at 'npcd -V' to get license information
[07-10-2015 19:54:04] NPCD: HINT: load_threshold is enabled - ('10.000000')
[10-15-2015 09:05:49] NPCD: ERROR: Executed command exits with return code '7'
[10-15-2015 09:05:49] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata//1444914303.perfdata.service'
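
To compare dates like this across the whole log rather than just the tail, the error lines can be tallied per day. This is only a sketch: `summarize_npcd_errors` is a hypothetical helper name, and it assumes every error line has the same shape as the entries shown above.

```shell
#!/bin/sh
# Tally "return code" errors in an npcd log by date and code.
# Usage: summarize_npcd_errors /usr/local/nagios/var/npcd.log
summarize_npcd_errors() {
    awk '
        /Executed command exits with return code/ {
            date = substr($1, 2)        # drop the leading "["
            code = $NF                  # last field, e.g. a quoted 7
            gsub("\047", "", code)      # strip the single quotes
            count[date " code=" code]++
        }
        END { for (k in count) print k, count[k] }
    ' "$1" | sort
}
```

Each output line has the form `date code=N count`, which makes it easier to see whether the errors on a given day were isolated or sustained.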
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: performance graph can't show recent days' data

Post by rkennedy »

Can you post the output of the following command?
grep "TIMEOUT =" /usr/local/nagios/etc/pnp/process_perfdata.cfg
Former Nagios Employee
akoei
Posts: 8
Joined: Fri Jan 31, 2014 12:04 pm

Re: performance graph can't show recent days' data

Post by akoei »

TIMEOUT = 5
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: performance graph can't show recent days' data

Post by rkennedy »

Try following this guide to change two things: your perfdata timeout and the NPCD load threshold.
https://support.nagios.com/wiki/index.p ... ta_Timeout

Set TIMEOUT to 20, and load_threshold to 10 times the number of CPU cores your system has.
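
For anyone who prefers to script the change, here is a minimal sketch. The helper names `suggested_load_threshold` and `set_option` are invented for illustration, and the sketch assumes both files use plain `KEY = VALUE` lines like the `TIMEOUT = 5` line shown earlier in this thread. Check the linked guide for the authoritative steps.

```shell
#!/bin/sh
# Suggested load_threshold: 10 x the number of online CPU cores.
suggested_load_threshold() {
    cores=$(getconf _NPROCESSORS_ONLN)
    echo $((cores * 10))
}

# set_option FILE KEY VALUE
# Rewrites an existing "KEY = VALUE" line, keeping a copy in FILE.bak.
# Assumption: the key already exists in the file and is not commented out.
set_option() {
    file=$1 key=$2 value=$3
    cp "$file" "$file.bak" &&
    sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file.bak" > "$file"
}

# Example calls (run as root, then restart npcd so it rereads its config):
#   set_option /usr/local/nagios/etc/pnp/process_perfdata.cfg TIMEOUT 20
#   set_option /usr/local/nagios/etc/pnp/npcd.cfg load_threshold "$(suggested_load_threshold)"
#   service npcd restart
```

The copy-then-`sed` approach is used instead of `sed -i` so a backup of each file is left behind in case the edit needs to be reverted.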
akoei
Posts: 8
Joined: Fri Jan 31, 2014 12:04 pm

Re: performance graph can't show recent days' data

Post by akoei »

It works now, thanks a lot!
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: performance graph can't show recent days' data

Post by rkennedy »

Glad to see this worked! I will now close this thread. Feel free to open another if you need more assistance.