Performance Graphs are not updating properly

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Performance Graphs are not updating properly

Post by acentek »

It was reported that Nagios XI is not updating service graph data.

I have rebooted the Nagios XI CentOS server, and the graphs are still not updating.

I was told I am licensed for support, but apparently I cannot call in for support.

So I am posting here first while I wait for the Nagios rep to email me where to post my trouble.

All of my services are green in System Status and Monitoring Engine Status.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Performance Graphs are not updating properly

Post by scottwilkerson »

Here is a good place to start for performance graph problems:
https://support.nagios.com/kb/article.php?id=9

If you could run the commands in the doc and post back the results, it will help us collect the information we need.

Additionally, verify that the npcd service is running:

service npcd status
Also, please provide your /usr/local/nagios/etc/nagios.cfg

Did the problem just start? Have you or another administrator recently adjusted any configuration files or installed any new packages?
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Performance Graphs are not updating properly

Post by acentek »

This trouble just started this afternoon. Let me go through that link now.

Count The Amount Of Spooled Files


[root@nagios log]# ls /usr/local/nagios/var/spool/perfdata/ | wc -l
1105
[root@nagios log]# ls /usr/local/nagios/var/spool/xidpe/ | wc -l
1

It looks like perfdata is being processed and the count is fewer than 20,000 files, so I am going to keep moving on. I looked at this page before posting here earlier, and the following is what I found 20 minutes ago.

[root@nagios log]# ls /usr/local/nagios/var/spool/perfdata/ | wc -l
1487
[root@nagios log]# ls /usr/local/nagios/var/spool/xidpe/ | wc -l
0

So that shows that perfdata is being processed.
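
The comparison above (1487 files earlier, 1105 about 20 minutes later) can be expressed as a small check. This is only a minimal Python sketch: the function names `count_spool_files` and `drain_rate` are hypothetical helpers, not part of Nagios XI, and the result is a net figure, since new perfdata files keep arriving while old ones are processed.

```python
import os


def count_spool_files(path):
    """Count visible entries in a spool directory, roughly like `ls path | wc -l`."""
    return len([name for name in os.listdir(path) if not name.startswith(".")])


def drain_rate(earlier_count, later_count, minutes_apart):
    """Net files cleared per minute between two samples.

    A negative value means the backlog grew between the samples.
    """
    return (earlier_count - later_count) / minutes_apart


if __name__ == "__main__":
    # The two counts quoted in this post, taken roughly 20 minutes apart.
    print(drain_rate(1487, 1105, 20))
```

To sample a live system, `count_spool_files("/usr/local/nagios/var/spool/perfdata/")` could be called twice with a pause in between and the two results passed to `drain_rate`.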

Now, like I said, graphs are populating, but they are behind by 3 hours. If you wait 5 minutes, you see new data appear, as if the data is queued.

Increase Performance Data Logging Verbosity

I changed it from 0 to 2, and here are the last several lines.

2018-02-28 16:10:50 [26355] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:50 [26355] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:50 [26355] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:50 [26355] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844799.perfdata.host-PID-26355 deleted
2018-02-28 16:10:50 [26355] [0] *** Timeout while processing Host: "HOKH-SMS500-AFC" Service: "_HOST_"
2018-02-28 16:10:50 [26355] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:50 [26353] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:50 [26353] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:50 [26353] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:50 [26353] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844770.perfdata.service-PID-26353 deleted
2018-02-28 16:10:50 [26353] [0] *** Timeout while processing Host: "BCKL00BAS02" Service: "Uptime_under_20_minutes"
2018-02-28 16:10:50 [26353] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:50 [26358] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:50 [26358] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:50 [26358] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:50 [26358] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844785.perfdata.host-PID-26358 deleted
2018-02-28 16:10:50 [26358] [0] *** Timeout while processing Host: "MSCK15BAS02" Service: "_HOST_"
2018-02-28 16:10:50 [26358] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:50 [26356] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:50 [26356] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:50 [26356] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:50 [26356] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844785.perfdata.service-PID-26356 deleted
2018-02-28 16:10:50 [26356] [0] *** Timeout while processing Host: "BWVCX0102" Service: "Uptime_under_20_minutes"
2018-02-28 16:10:50 [26356] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:50 [26359] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:50 [26359] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:50 [26359] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:50 [26359] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844799.perfdata.service-PID-26359 deleted
2018-02-28 16:10:50 [26359] [0] *** Timeout while processing Host: "DT_BAS-2" Service: "Uptime_under_20_minutes"
2018-02-28 16:10:50 [26359] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:57 [26802] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:57 [26802] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:57 [26802] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:57 [26802] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844815.perfdata.service-PID-26802 deleted
2018-02-28 16:10:57 [26802] [0] *** Timeout while processing Host: "96_BAS-1" Service: "Uptime_under_20_minutes"
2018-02-28 16:10:57 [26802] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:57 [26805] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:57 [26805] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:57 [26805] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:57 [26809] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:57 [26804] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:10:57 [26809] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:57 [26805] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844829.perfdata.service-PID-26805 deleted
2018-02-28 16:10:57 [26804] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:10:57 [26809] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:57 [26804] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:10:57 [26805] [0] *** Timeout while processing Host: "DAKCX0404" Service: "Uptime_under_20_minutes"
2018-02-28 16:10:57 [26805] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:57 [26809] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844829.perfdata.host-PID-26809 deleted
2018-02-28 16:10:57 [26804] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519844815.perfdata.host-PID-26804 deleted
2018-02-28 16:10:57 [26809] [0] *** Timeout while processing Host: "SBRD02_Valere" Service: "_HOST_"
2018-02-28 16:10:57 [26804] [0] *** Timeout while processing Host: "HOKH-Valere-4" Service: "_HOST_"
2018-02-28 16:10:57 [26809] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:10:57 [26804] [0] *** process_perfdata.pl terminated on signal ALRM
2018-02-28 16:20:37 [27022] [0] *** TIMEOUT: Timeout after 5 secs. ***
2018-02-28 16:20:37 [27022] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2018-02-28 16:20:37 [27022] [0] *** TIMEOUT: Please check your npcd.cfg
2018-02-28 16:20:37 [27022] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1519847590.perfdata.service-PID-27022 deleted
2018-02-28 16:20:37 [27022] [0] *** Timeout while processing Host: "Acenet-HXVL-c3650" Service: "Gi1_1_1-_HXVL3800_Gi0_5_Mgmt_Trunk_Bandwidth"
2018-02-28 16:20:37 [27022] [0] *** process_perfdata.pl terminated on signal ALRM

I do not see any errors, but I do see timeouts being exceeded.
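
Because the log excerpt above is long and interleaved across PIDs, it can help to tally which host and service pairs are timing out. The following is a minimal Python sketch that relies only on the "Timeout while processing" line format shown above; `count_timeouts` is a hypothetical helper name, and reading the log file itself is left to the caller.

```python
import re
from collections import Counter

# Matches lines such as:
# ... *** Timeout while processing Host: "BCKL00BAS02" Service: "Uptime_under_20_minutes"
TIMEOUT_RE = re.compile(r'Timeout while processing Host: "([^"]*)" Service: "([^"]*)"')


def count_timeouts(lines):
    """Count timeout events per (host, service) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Three lines copied from the excerpt above.
    sample = [
        '2018-02-28 16:10:50 [26355] [0] *** Timeout while processing Host: "HOKH-SMS500-AFC" Service: "_HOST_"',
        '2018-02-28 16:10:50 [26355] [0] *** process_perfdata.pl terminated on signal ALRM',
        '2018-02-28 16:10:50 [26353] [0] *** Timeout while processing Host: "BCKL00BAS02" Service: "Uptime_under_20_minutes"',
    ]
    for (host, service), hits in count_timeouts(sample).most_common():
        print(host, service, hits)
```

If the tally is spread across many unrelated hosts rather than concentrated on one, that would point toward a server-wide slowdown instead of a single bad check.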

Increase NPCD Logging Verbosity

[02-28-2018 16:20:37] NPCD: WARN: MAX load reached: load 10.130000/10.000000 at i=793
[02-28-2018 16:20:52] NPCD: WARN: MAX load reached: load 12.120000/10.000000 at i=793
[02-28-2018 16:21:07] NPCD: WARN: MAX load reached: load 18.000000/10.000000 at i=793
[02-28-2018 16:21:22] NPCD: WARN: MAX load reached: load 16.360000/10.000000 at i=793
[02-28-2018 16:21:37] NPCD: WARN: MAX load reached: load 18.960000/10.000000 at i=793
[02-28-2018 16:21:52] NPCD: WARN: MAX load reached: load 17.090000/10.000000 at i=793
[02-28-2018 16:22:07] NPCD: WARN: MAX load reached: load 15.660000/10.000000 at i=793
[02-28-2018 16:22:22] NPCD: WARN: MAX load reached: load 13.290000/10.000000 at i=793
[02-28-2018 16:22:37] NPCD: WARN: MAX load reached: load 11.080000/10.000000 at i=793
[02-28-2018 16:22:52] NPCD: WARN: MAX load reached: load 13.100000/10.000000 at i=793
[02-28-2018 16:23:07] NPCD: WARN: MAX load reached: load 10.650000/10.000000 at i=793
[02-28-2018 16:23:22] NPCD: WARN: MAX load reached: load 10.350000/10.000000 at i=793
[02-28-2018 16:23:37] NPCD: WARN: MAX load reached: load 11.370000/10.000000 at i=793
[02-28-2018 16:24:09] NPCD: WARN: MAX load reached: load 11.510000/10.000000 at i=928
[02-28-2018 16:24:24] NPCD: WARN: MAX load reached: load 12.270000/10.000000 at i=928
[02-28-2018 16:24:39] NPCD: WARN: MAX load reached: load 11.380000/10.000000 at i=928
[02-28-2018 16:24:54] NPCD: WARN: MAX load reached: load 12.090000/10.000000 at i=928
[02-28-2018 16:25:09] NPCD: WARN: MAX load reached: load 12.150000/10.000000 at i=928
[02-28-2018 16:25:24] NPCD: WARN: MAX load reached: load 11.450000/10.000000 at i=928
[02-28-2018 16:25:39] NPCD: WARN: MAX load reached: load 14.440000/10.000000 at i=928
[02-28-2018 16:25:54] NPCD: WARN: MAX load reached: load 14.720000/10.000000 at i=928
[02-28-2018 16:26:09] NPCD: WARN: MAX load reached: load 12.070000/10.000000 at i=928
[02-28-2018 16:26:24] NPCD: WARN: MAX load reached: load 16.310000/10.000000 at i=928
[02-28-2018 16:26:39] NPCD: WARN: MAX load reached: load 13.360000/10.000000 at i=928
[02-28-2018 16:26:54] NPCD: WARN: MAX load reached: load 16.510000/10.000000 at i=928
[02-28-2018 16:27:09] NPCD: WARN: MAX load reached: load 13.860000/10.000000 at i=928
[02-28-2018 16:27:24] NPCD: WARN: MAX load reached: load 12.420000/10.000000 at i=928
[02-28-2018 16:27:39] NPCD: WARN: MAX load reached: load 12.820000/10.000000 at i=928
[02-28-2018 16:27:54] NPCD: WARN: MAX load reached: load 17.960000/10.000000 at i=928
[02-28-2018 16:28:05] NPCD: Caught Termination Signal - Hasta la vista... baby
[02-28-2018 16:28:05] NPCD: npcd Daemon (0.4.14) started with PID=15357
[02-28-2018 16:28:05] NPCD: Please have a look at 'npcd -V' to get license information
[02-28-2018 16:28:05] NPCD: HINT: load_threshold is enabled - ('10.000000')
[02-28-2018 16:28:05] NPCD: WARN: MAX load reached: load 16.790000/10.000000 at i=0
[02-28-2018 16:28:20] NPCD: WARN: MAX load reached: load 14.820000/10.000000 at i=1
[02-28-2018 16:28:35] NPCD: WARN: MAX load reached: load 13.460000/10.000000 at i=1


Now I am seeing "NPCD: WARN: MAX load reached", but I don't think that is the issue, as it shows up in the log as far back as yesterday.
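
The warnings above share a fixed "load X/Y" format, so the reported load and the configured threshold can be pulled out and summarized. This is a minimal Python sketch based only on the lines shown in this post; `summarize_load` is a hypothetical helper name.

```python
import re

# Matches: NPCD: WARN: MAX load reached: load 18.960000/10.000000 at i=793
LOAD_RE = re.compile(r"MAX load reached: load ([0-9.]+)/([0-9.]+)")


def summarize_load(lines):
    """Summarize NPCD 'MAX load reached' warnings.

    Returns None when no warning is found, otherwise a dict with the number
    of warnings, the highest reported load, and the last threshold seen.
    """
    samples = []
    for line in lines:
        match = LOAD_RE.search(line)
        if match:
            samples.append((float(match.group(1)), float(match.group(2))))
    if not samples:
        return None
    return {
        "count": len(samples),
        "peak": max(load for load, _ in samples),
        "threshold": samples[-1][1],
    }


if __name__ == "__main__":
    # Two warnings copied from the excerpt above.
    sample = [
        "[02-28-2018 16:20:37] NPCD: WARN: MAX load reached: load 10.130000/10.000000 at i=793",
        "[02-28-2018 16:21:37] NPCD: WARN: MAX load reached: load 18.960000/10.000000 at i=793",
    ]
    print(summarize_load(sample))
```

Running the same summary over yesterday's portion of the log and today's would show whether the reported load has changed, rather than only whether the warning is present.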
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Performance Graphs are not updating properly

Post by scottwilkerson »

That actually can be the issue.

You can increase this threshold by editing the following file:

/usr/local/nagios/etc/pnp/npcd.cfg
Change:

load_threshold = 10.0
To a value greater than your system's current load. Use this with caution, however: the NPCD process will consume as much load as you allow it, so watch your resources!

Then restart npcd:

service npcd restart
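
To choose a value, it may help to compare the configured threshold with the load the server is actually reporting. The following is a minimal Python sketch under two assumptions that this thread does not confirm: that NPCD compares the threshold against the 1-minute load average, and that the setting appears in npcd.cfg as an uncommented `load_threshold = <number>` line. All function names are hypothetical.

```python
import os
import re

# Matches an uncommented "load_threshold = 10.0" line in the config text.
THRESHOLD_RE = re.compile(r"^\s*load_threshold\s*=\s*([0-9.]+)", re.MULTILINE)


def parse_load_threshold(cfg_text):
    """Return the configured load_threshold as a float, or None if not set."""
    match = THRESHOLD_RE.search(cfg_text)
    return float(match.group(1)) if match else None


def exceeds_threshold(load, threshold):
    """True when the given load is above the threshold (the 'MAX load reached' case)."""
    return load > threshold


def current_load():
    """1-minute load average of this machine (Unix only). Assumed to be what NPCD checks."""
    return os.getloadavg()[0]


if __name__ == "__main__":
    threshold = parse_load_threshold("load_threshold = 10.0\n")
    # 18.96 is the highest load reported in the NPCD log earlier in this thread.
    print(exceeds_threshold(18.96, threshold))
```

On the server itself, the text of /usr/local/nagios/etc/pnp/npcd.cfg could be passed to `parse_load_threshold` and `current_load()` used in place of the fixed sample value.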
acentek
Posts: 123
Joined: Thu Jul 27, 2017 2:00 pm

Re: Performance Graphs are not updating properly

Post by acentek »

Thank you for the response yesterday.

I increased it to 20.0 last night and graphing caught up to real time overnight.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Performance Graphs are not updating properly

Post by scottwilkerson »

acentek wrote:
Thank you for the response yesterday.

I increased it to 20.0 last night and graphing caught up to real time overnight.

Excellent, glad to be of assistance.