Performance Graph broken

vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Performance Graph broken

Post by vmesquita »

Hi,

Lately the performance graphs have stopped working. Every service with a graph shows a blank graph with only the thresholds, as in the attached file. Any ideas on how to fix this?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graph broken

Post by lmiltchev »

Have you tried following the steps, outlined on our wiki page?

http://support.nagios.com/wiki/index.ph ... h_Problems
Be sure to check out our Knowledgebase for helpful articles and solutions!
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: Performance Graph broken

Post by vmesquita »

I hadn't seen that page before, but I just tried the steps and they didn't fix it. The data seems to be collected, but somehow it doesn't make it to the graph:

Code: Select all

[root@nagios libexec]# ./check_rrdtraf -f '/var/lib/mrtg/172.27.134.1_10140.rrd' -w 200,200 -c 500,500 -l M
OK - Current BW in: .36Mbps Out: .29Mbps|in=.360971Mb/s;200;500 out=.293609Mb/s;200;500
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graph broken

Post by lmiltchev »

Run the following commands and show the output:

Code: Select all

service npcd status
ll /usr/local/nagios/share/perfdata
ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: Performance Graph broken

Post by vmesquita »

Code: Select all

[root@nagios /]# service npcd status
NPCD running (pid 10255).
[root@nagios /]#

Code: Select all

[root@nagios /]# ll /usr/local/nagios/share/perfdata
.....
drwxrwxrwx 2 nagios nagios  4096 Dec  2  2011 ********
drwxrwxrwx 2 nagios nagios  4096 Dec  2  2011 *********
drwxrwxrwx 2 nagios nagios  4096 Sep  9 12:02 *********
drwxrwxrwx 2 nagios nagios  4096 Sep  9 12:02 ********
drwxrwxrwx 2 nagios nagios  4096 Sep 11 00:01 *********
Note: name of the hosts have been replaced by *****.

Code: Select all

[root@nagios /]# ls /usr/local/nagios/var/spool/xidpe | wc -l
1

Code: Select all

[root@nagios /]# ls /usr/local/nagios/var/spool/perfdata | wc -l
24845

Code: Select all

[root@nagios /]# ls /usr/local/nagios/var/spool/checkresults | wc -l
98062
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graph broken

Post by lmiltchev »

Make sure logging is enabled in the process_perfdata.cfg and npcd.cfg (log level is set to "1"):

Code: Select all

grep -i "log_level =" /usr/local/nagios/etc/pnp/process_perfdata.cfg
grep -i "log_level =" /usr/local/nagios/etc/pnp/npcd.cfg
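A minimal sketch for reading the configured value directly rather than eyeballing the grep output. The helper name `get_log_level` is hypothetical, and it assumes the plain `log_level = N` key/value form that the grep commands above search for (matched case-insensitively, as they do):

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first uncommented
# "log_level = N" line (any letter case) in the given config file.
get_log_level() {
    # Usage: get_log_level FILE
    grep -i '^[[:space:]]*log_level[[:space:]]*=' "$1" \
        | head -n 1 \
        | sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]]*$//'
}

# Example (paths taken from the commands above):
# get_log_level /usr/local/nagios/etc/pnp/npcd.cfg
# get_log_level /usr/local/nagios/etc/pnp/process_perfdata.cfg
```

Commented-out lines are skipped because the pattern only allows whitespace before the key.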
After you have modified the configs, restart npcd:

Code: Select all

service npcd restart
Tail the logs and post the output:

Code: Select all

tail -n 30 /usr/local/nagios/var/perfdata.log
tail -n 30 /usr/local/nagios/var/npcd.log
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: Performance Graph broken

Post by vmesquita »

Both were 0, so I changed them to 1 as suggested.

The last entries in the log seem to date back to Aug 30:

Code: Select all

==> /usr/local/nagios/var/perfdata.log <==
2013-08-29 17:21:39 [20265] [0] *** TIMEOUT: Please check your npcd.cfg
2013-08-29 17:21:39 [20265] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1377716242.perfdata.service-PID-20265 deleted
2013-08-29 17:21:39 [20265] [0] *** Timeout while processing Host: "*******" Service: "CPU_Stats"
2013-08-29 17:21:39 [20265] [0] *** process_perfdata.pl terminated on signal ALRM
2013-08-30 09:55:01 [31084] [0] *** TIMEOUT: Timeout after 5 secs. ***
2013-08-30 09:55:01 [31084] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2013-08-30 09:55:01 [31084] [0] *** TIMEOUT: Please check your npcd.cfg
2013-08-30 09:55:01 [31084] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata//1377864081.perfdata.service-PID-31084 deleted
2013-08-30 09:55:01 [31084] [0] *** Timeout while processing Host: "******" Service: "Ping"
2013-08-30 09:55:01 [31084] [0] *** process_perfdata.pl terminated on signal ALRM
tail 30 /usr/local/nagios/var/npcd.log

Code: Select all

reached: load 27.720000/10.000000 at i=1
[09-12-2013 14:28:42] NPCD: WARN: MAX load reached: load 26.700000/10.000000 at i=1
[09-12-2013 14:28:57] NPCD: WARN: MAX load reached: load 25.940000/10.000000 at i=1
[09-12-2013 14:29:12] NPCD: WARN: MAX load reached: load 25.300000/10.000000 at i=1
[09-12-2013 14:29:27] NPCD: WARN: MAX load reached: load 25.240000/10.000000 at i=1
[09-12-2013 14:29:42] NPCD: WARN: MAX load reached: load 24.720000/10.000000 at i=1
[09-12-2013 14:29:57] NPCD: WARN: MAX load reached: load 26.570000/10.000000 at i=1
[09-12-2013 14:30:12] NPCD: WARN: MAX load reached: load 25.310000/10.000000 at i=1
[09-12-2013 14:30:27] NPCD: WARN: MAX load reached: load 26.140000/10.000000 at i=1
[09-12-2013 14:30:42] NPCD: WARN: MAX load reached: load 25.070000/10.000000 at i=1
[09-12-2013 14:30:57] NPCD: WARN: MAX load reached: load 26.270000/10.000000 at i=1
[09-12-2013 14:31:12] NPCD: WARN: MAX load reached: load 25.180000/10.000000 at i=1
[09-12-2013 14:31:27] NPCD: WARN: MAX load reached: load 26.730000/10.000000 at i=1
[09-12-2013 14:31:42] NPCD: WARN: MAX load reached: load 25.060000/10.000000 at i=1
[09-12-2013 14:31:57] NPCD: WARN: MAX load reached: load 25.030000/10.000000 at i=1
[09-12-2013 14:32:12] NPCD: WARN: MAX load reached: load 24.920000/10.000000 at i=1
[09-12-2013 14:32:27] NPCD: WARN: MAX load reached: load 24.920000/10.000000 at i=1
[09-12-2013 14:32:42] NPCD: WARN: MAX load reached: load 24.550000/10.000000 at i=1
[09-12-2013 14:32:57] NPCD: WARN: MAX load reached: load 24.710000/10.000000 at i=1
[09-12-2013 14:33:12] NPCD: WARN: MAX load reached: load 24.290000/10.000000 at i=1
[09-12-2013 14:33:27] NPCD: WARN: MAX load reached: load 21.650000/10.000000 at i=1
[09-12-2013 14:33:42] NPCD: WARN: MAX load reached: load 21.080000/10.000000 at i=1
[09-12-2013 14:33:57] NPCD: WARN: MAX load reached: load 21.460000/10.000000 at i=1
[09-12-2013 14:34:12] NPCD: WARN: MAX load reached: load 21.590000/10.000000 at i=1
[09-12-2013 14:34:27] NPCD: WARN: MAX load reached: load 21.530000/10.000000 at i=1
[09-12-2013 14:34:42] NPCD: WARN: MAX load reached: load 21.730000/10.000000 at i=1
[09-12-2013 14:34:57] NPCD: WARN: MAX load reached: load 22.300000/10.000000 at i=1
[09-12-2013 14:35:12] NPCD: WARN: MAX load reached: load 23.170000/10.000000 at i=1
[09-12-2013 14:35:27] NPCD: WARN: MAX load reached: load 24.560000/10.000000 at i=1
[09-12-2013 14:35:43] NPCD: WARN: MAX load reached: load 24.070000/10.000000 at i=1
[09-12-2013 14:35:58] NPCD: WARN: MAX load reached: load 23.400000/10.000000 at i=1
[09-12-2013 14:36:13] NPCD: WARN: MAX load reached: load 23.080000/10.000000 at i=1
[09-12-2013 14:36:28] NPCD: WARN: MAX load reached: load 23.030000/10.000000 at i=1
[09-12-2013 14:36:43] NPCD: WARN: MAX load reached: load 24.670000/10.000000 at i=1
[09-12-2013 14:36:58] NPCD: WARN: MAX load reached: load 24.080000/10.000000 at i=1
[09-12-2013 14:37:13] NPCD: WARN: MAX load reached: load 23.560000/10.000000 at i=1
[09-12-2013 14:37:25] NPCD: Caught Termination Signal - Hasta la vista... baby
[09-12-2013 14:37:26] NPCD: npcd Daemon (0.4.14) started with PID=16225
[09-12-2013 14:37:26] NPCD: Please have a look at 'npcd -V' to get license information
[09-12-2013 14:37:26] NPCD: HINT: load_threshold is enabled - ('10.000000')
[09-12-2013 14:37:26] NPCD: WARN: MAX load reached: load 22.910000/10.000000 at i=0
[09-12-2013 14:37:41] NPCD: WARN: MAX load reached: load 21.240000/10.000000 at i=1
[09-12-2013 14:37:56] NPCD: WARN: MAX load reached: load 23.050000/10.000000 at i=1
[09-12-2013 14:38:11] NPCD: WARN: MAX load reached: load 23.160000/10.000000 at i=1
[09-12-2013 14:38:26] NPCD: WARN: MAX load reached: load 24.440000/10.000000 at i=1
[09-12-2013 14:38:41] NPCD: WARN: MAX load reached: load 24.350000/10.000000 at i=1
[09-12-2013 14:38:56] NPCD: WARN: MAX load reached: load 24.370000/10.000000 at i=1
[09-12-2013 14:39:11] NPCD: WARN: MAX load reached: load 24.250000/10.000000 at i=1
[09-12-2013 14:39:26] NPCD: WARN: MAX load reached: load 25.630000/10.000000 at i=1
[09-12-2013 14:39:41] NPCD: WARN: MAX load reached: load 29.220000/10.000000 at i=1
[09-12-2013 14:39:56] NPCD: WARN: MAX load reached: load 26.470000/10.000000 at i=1
[09-12-2013 14:40:12] NPCD: WARN: MAX load reached: load 27.450000/10.000000 at i=1
[09-12-2013 14:40:27] NPCD: WARN: MAX load reached: load 26.960000/10.000000 at i=1
[09-12-2013 14:40:42] NPCD: WARN: MAX load reached: load 27.350000/10.000000 at i=1
[09-12-2013 14:40:57] NPCD: WARN: MAX load reached: load 27.590000/10.000000 at i=1
[09-12-2013 14:41:12] NPCD: WARN: MAX load reached: load 27.590000/10.000000 at i=1
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Performance Graph broken

Post by sreinhardt »

Looks like you are hitting the max load threshold on your system, so the perfdata is not actually getting processed. What is the current load of your system? About how many hosts and service checks are you running?

Code: Select all

top -n 1
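To see how far above the threshold the load has been, the "MAX load reached" warnings in npcd.log can be summarized. This is only a sketch: the function name `peak_npcd_load` is hypothetical, and it assumes the warning format shown in the log excerpt earlier in this thread. It relies on `grep -o` (available in GNU and BSD grep), which also copes with entries that run together on one line:

```shell
#!/bin/sh
# Sketch: print the highest load value found in npcd
# "MAX load reached: load X/Y" warnings. Reads log text on stdin.
peak_npcd_load() {
    # Each match looks like "load 26.700000/"; strip the wrapper,
    # sort numerically, and keep the largest value.
    grep -o 'load [0-9][0-9.]*/' \
        | sed 's/^load //; s/\/$//' \
        | LC_ALL=C sort -n \
        | tail -n 1
}

# Example (log path as used elsewhere in this thread):
# peak_npcd_load < /usr/local/nagios/var/npcd.log
```

If the input contains no warnings, the function prints nothing.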
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
vmesquita
Posts: 315
Joined: Fri Aug 10, 2012 12:52 pm

Re: Performance Graph broken

Post by vmesquita »

Code: Select all

Cpu(s): 26.6%us, 69.7%sy,  0.1%ni,  3.4%id,  0.1%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   3115156k total,  2791908k used,   323248k free,   288468k buffers
Swap:  4194296k total,        8k used,  4194288k free,   809184k cached
We have 123 hosts and 1619 checks.
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graph broken

Post by lmiltchev »

The default load threshold in the "/usr/local/nagios/etc/pnp/npcd.cfg" file is set to 10, which assumes a single-core processor. Depending on your hardware, you can increase it (dual core: x 2, quad core: x 4, etc.), for example:

Code: Select all

load_threshold = 20.0
or

Code: Select all

load_threshold = 40.0
then restart npcd:

Code: Select all

service npcd restart
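The scaling rule above (10 per core) can be sketched as a small helper. The function name `suggest_load_threshold` is hypothetical, and the per-core base of 10 is just the rule of thumb from this post, not a value read from npcd itself:

```shell
#!/bin/sh
# Sketch: suggest an npcd load_threshold as (core count) x (per-core base).
# The default base of 10 is the rule of thumb from the post above.
suggest_load_threshold() {
    # Usage: suggest_load_threshold CORES [BASE]
    cores=$1
    base=${2:-10}
    echo "$((cores * base)).0"
}

# Example: use the core count reported by nproc (GNU coreutils).
# echo "load_threshold = $(suggest_load_threshold "$(nproc)")"
```

The result still has to be written into npcd.cfg by hand, followed by the npcd restart shown above.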
You have quite a lot of files piled up in the "/usr/local/nagios/var/spool/perfdata" and "/usr/local/nagios/var/spool/checkresults" directories. You will probably have to delete these files:

Code: Select all

cd /usr/local/nagios/var/spool
rm -rf perfdata
mkdir perfdata
chown nagios:nagios perfdata
chmod 755 perfdata
rm -rf checkresults
mkdir checkresults
chown nagios:nagios checkresults
chmod 755 checkresults
service npcd restart
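As an alternative sketch (not the procedure above, just one possible variation), the spool directories can be emptied in place so that ownership and permissions do not need to be recreated. The function name `purge_spool_dir` is hypothetical, and `-maxdepth` and `-delete` are GNU/BSD find extensions rather than POSIX:

```shell
#!/bin/sh
# Sketch: remove the regular files directly inside a spool directory,
# keep the directory itself, and print how many files were counted.
# The count is taken just before deleting, so it is approximate if
# files are still being written.
purge_spool_dir() {
    # Usage: purge_spool_dir DIR
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    count=$(find "$dir" -maxdepth 1 -type f | wc -l)
    find "$dir" -maxdepth 1 -type f -delete
    echo "$((count))"
}

# Example (directories named in the post above):
# purge_spool_dir /usr/local/nagios/var/spool/perfdata
# purge_spool_dir /usr/local/nagios/var/spool/checkresults
```

Using find avoids the "argument list too long" error that a plain `rm *` can hit with tens of thousands of files.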
What's your hardware like on your nagios server (CPU, RAM, HDD)?