Perfdata graphs empty

vhoover
Posts: 123
Joined: Mon Sep 09, 2013 12:17 pm

Post by vhoover »

This is an important production issue. The performance data graphs are empty even though performance data is being collected and populated. I have tried the suggestions in every forum post I could find about this kind of issue and have had no luck. Please help.
[root@nagiosxi ~]# tail -25 /usr/local/nagios/var/perfdata.log
2015-09-11 21:52:03 [1789] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-09-11 21:52:03 [1788] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata/service-perfdata.1442029862-PID-1788 deleted
2015-09-11 21:52:03 [1789] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata/service-perfdata.1442029860-PID-1789 deleted
2015-09-11 21:52:03 [1788] [0] *** process_perfdata.pl terminated on signal ALRM
2015-09-11 21:52:03 [1789] [0] *** process_perfdata.pl terminated on signal ALRM
2015-09-11 21:52:03 [1785] [0] *** TIMEOUT: Timeout after 40 Sec. ****
2015-09-11 21:52:03 [1785] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-09-11 21:52:03 [1785] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-09-11 21:52:03 [1785] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata/host-perfdata.1442029860-PID-1785 deleted
2015-09-11 21:52:03 [1785] [0] *** process_perfdata.pl terminated on signal ALRM
2015-09-11 21:52:03 [1786] [0] *** TIMEOUT: Timeout after 40 Sec. ****
2015-09-11 21:52:03 [1786] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-09-11 21:52:03 [1786] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-09-11 21:52:03 [1786] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata/host-perfdata.1442029862-PID-1786 deleted
2015-09-11 21:52:03 [1786] [0] *** process_perfdata.pl terminated on signal ALRM
2015-09-11 21:55:18 [16841] [0] *** TIMEOUT: Timeout after 40 Sec. ****
2015-09-11 21:55:18 [16841] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-09-11 21:55:18 [16841] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-09-11 21:55:18 [16841] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata/service-perfdata.1442030059-PID-16841 deleted
2015-09-11 21:55:18 [16841] [0] *** process_perfdata.pl terminated on signal ALRM
2015-09-11 21:58:05 [21682] [0] *** TIMEOUT: Timeout after 40 Sec. ****
2015-09-11 21:58:05 [21682] [0] *** TIMEOUT: Deleting current file to avoid NPCD loops
2015-09-11 21:58:05 [21682] [0] *** TIMEOUT: Please check your process_perfdata.cfg
2015-09-11 21:58:05 [21682] [0] *** TIMEOUT: /usr/local/nagios/var/spool/perfdata/service-perfdata.1442030123-PID-21682 deleted
2015-09-11 21:58:05 [21682] [0] *** process_perfdata.pl terminated on signal ALRM
[root@nagiosxi ~]# tail -25 /usr/local/nagios/var/npcd.log
[09-11-2015 21:52:03] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata/host-perfdata.1442029862'
[09-11-2015 21:55:18] NPCD: ERROR: Executed command exits with return code '1'
[09-11-2015 21:55:18] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata/service-perfdata.1442030059'
[09-11-2015 21:58:05] NPCD: ERROR: Executed command exits with return code '1'
[09-11-2015 21:58:05] NPCD: ERROR: Command line was '/usr/local/nagios/libexec/process_perfdata.pl -n -b /usr/local/nagios/var/spool/perfdata/service-perfdata.1442030123'
[09-11-2015 21:58:20] NPCD: WARN: MAX load reached: load 33.140000/20.000000 at i=0
[09-11-2015 21:58:35] NPCD: WARN: MAX load reached: load 27.330000/20.000000 at i=1
[09-11-2015 21:59:38] NPCD: WARN: MAX load reached: load 38.010000/20.000000 at i=1
[09-11-2015 21:59:53] NPCD: WARN: MAX load reached: load 30.580000/20.000000 at i=1
[09-11-2015 22:00:08] NPCD: WARN: MAX load reached: load 27.250000/20.000000 at i=1
[09-25-2015 03:14:32] NPCD: Caught Termination Signal - Hasta la vista... baby
[09-25-2015 04:05:42] NPCD: npcd Daemon (0.4.14) started with PID=23254
[09-25-2015 04:05:42] NPCD: Please have a look at 'npcd -V' to get license information
[09-25-2015 04:05:42] NPCD: HINT: load_threshold is enabled - ('20.000000')
[09-29-2015 04:56:47] NPCD: Caught Termination Signal - Hasta la vista... baby
[09-29-2015 04:56:47] NPCD: npcd Daemon (0.4.14) started with PID=31396
[09-29-2015 04:56:47] NPCD: Please have a look at 'npcd -V' to get license information
[09-29-2015 04:56:47] NPCD: HINT: load_threshold is enabled - ('40.000000')
[10-01-2015 00:02:00] NPCD: npcd Daemon (0.4.14) started with PID=1667
[10-01-2015 00:02:00] NPCD: Please have a look at 'npcd -V' to get license information
[10-01-2015 00:02:00] NPCD: HINT: load_threshold is enabled - ('40.000000')
[10-11-2015 22:04:47] NPCD: Caught Termination Signal - Hasta la vista... baby
[10-11-2015 22:56:05] NPCD: npcd Daemon (0.4.14) started with PID=12628
[10-11-2015 22:56:05] NPCD: Please have a look at 'npcd -V' to get license information
[10-11-2015 22:56:05] NPCD: HINT: load_threshold is enabled - ('40.000000')
tgriep
Madmin
Posts: 9190
Joined: Thu Oct 30, 2014 9:02 am

Re: Perfdata graphs empty

Post by tgriep »

Can you run the following on your Nagios system to check whether performance data files are backing up in the spool directories? That could be the cause of the issue. Please post the output.

ls /usr/local/nagios/var/spool/xidpe | wc -l
ls /usr/local/nagios/var/spool/perfdata | wc -l
ls /usr/local/nagios/var/spool/checkresults | wc -l
vhoover
Posts: 123
Joined: Mon Sep 09, 2013 12:17 pm

Re: Perfdata graphs empty

Post by vhoover »

Looks like they are not spooling.
[root@nagiosxi ~]# ls /usr/local/nagios/var/spool/xidpe | wc -l
0
[root@nagiosxi ~]# ls /usr/local/nagios/var/spool/perfdata | wc -l
0
[root@nagiosxi ~]# ls /usr/local/nagios/var/spool/checkresults | wc -l
0
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Perfdata graphs empty

Post by ssax »

Are you seeing anything in your /var/log/cron?

Is it only one check where this is happening or all of them?
vhoover
Posts: 123
Joined: Mon Sep 09, 2013 12:17 pm

Re: Perfdata graphs empty

Post by vhoover »

All of them
SteveBeauchemin
Posts: 524
Joined: Mon Oct 14, 2013 7:19 pm

Re: Perfdata graphs empty

Post by SteveBeauchemin »

I believe I have seen similar errors in my log file in the past.

The timeout in my process_perfdata.cfg file needed to be longer. Yours does too.
The file to edit is /usr/local/nagios/etc/pnp/process_perfdata.cfg

At my site, I increased the timeout. It is now set to 60.
I see from your log that your timeout is set to 40.
The system is throwing away your data because it takes
longer than 40 seconds to process your files.
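
As a rough way to confirm this on your own system, the shell sketch below counts the timeout entries in perfdata.log and prints the active timeout line from process_perfdata.cfg. The function names are made up for illustration, and it assumes the setting appears as an uncommented `TIMEOUT = <seconds>` line, so adjust the pattern if your file looks different.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of Nagios XI or PNP).

# Count log lines that record a perfdata run hitting the timeout.
count_timeouts() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'TIMEOUT: Timeout after' "$1"
}

# Print the uncommented TIMEOUT line from a process_perfdata.cfg.
# Assumption: the setting is written as "TIMEOUT = <seconds>".
show_timeout() {
    grep -E '^[[:space:]]*TIMEOUT[[:space:]]*=' "$1"
}

# Paths taken from this thread; the checks are skipped if the files are absent.
LOG=/usr/local/nagios/var/perfdata.log
CFG=/usr/local/nagios/etc/pnp/process_perfdata.cfg
if [ -r "$LOG" ]; then count_timeouts "$LOG"; fi
if [ -r "$CFG" ]; then show_timeout "$CFG"; fi
```

If the count keeps climbing after you raise the value, the timeout alone is probably not the whole story.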

This may not be your complete answer; there could be more to it.
But your log reports a timeout and shows the file being deleted before it is processed.
That's pretty clear.

Next log file...

The npcd log file shows that it wants to process files,
but they were deleted before it could get to them.

I would make changes to the npcd.cfg file, which is in the same directory as process_perfdata.cfg.
I would increase npcd_max_threads. I have mine set to 15.
I also decreased sleep_time to 6.

I am not suggesting that you should use those numbers. I worked at this until
my settings were right for my site. Those numbers are where I ended up after trial and error.
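
Before changing anything, it can help to record where you are starting from. The sketch below prints the current tuning lines from an npcd.cfg. The function name is hypothetical, the path is inferred from the directory mentioned above, and load_threshold is included only because the npcd log earlier in this thread reports it as enabled.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): list the npcd tuning settings
# discussed in this thread, ignoring commented-out lines.
show_npcd_tuning() {
    grep -E '^[[:space:]]*(npcd_max_threads|sleep_time|load_threshold)[[:space:]]*=' "$1"
}

# Assumed location, based on the config directory named in this post.
NPCD_CFG=/usr/local/nagios/etc/pnp/npcd.cfg
if [ -r "$NPCD_CFG" ]; then show_npcd_tuning "$NPCD_CFG"; fi
```

Saving that output before each change makes it easier to back out a setting that made things worse.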

Try changing those numbers gradually: increase threads, reduce sleep.
Use "service npcd restart" after each change, then wait and see if the system starts working better.

The Timeout set to 60 should make the most difference, but npcd needs more
parallel processes so it can get the job done faster.

One last thought. Have you considered setting up a ram disk for these files?
You will still need the changes I suggested, ram disk or no ram disk.
It is much easier to set up than I thought it would be. Nagios has
instructions in pdf somewhere. If you do use the ram disk... You just need
to keep an eye on the space used and make sure you know early before it fills
up if there is a problem. I have mine set to 500MB at this time. I have noticed it
filling up a couple times. One time I needed to restart npcd. Once I needed to
restart nagios. Lessons learned...
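
For keeping an eye on the space used, something along these lines could run from cron. It is only a sketch: the function names, the 80 percent limit, and the mount point are assumptions, so substitute wherever your ram disk is actually mounted.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names) for watching ram disk usage.

# Print the used-space percentage of the filesystem holding $1, without "%".
usage_pct() {
    # df -P prints a header plus one POSIX-format line; column 5 is e.g. "42%"
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Print a status line and return 1 if usage is at or above the limit.
check_ramdisk() {
    mount_point=$1
    limit=$2
    used=$(usage_pct "$mount_point")
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $mount_point is ${used}% full"
        return 1
    fi
    echo "OK: $mount_point is ${used}% full"
}

# Assumed mount point and threshold; the check is skipped if the path is absent.
RAMDISK=/var/nagiosramdisk
if [ -d "$RAMDISK" ]; then check_ramdisk "$RAMDISK" 80; fi
```

The non-zero return on a warning makes it easy to hook into a cron mail or a Nagios check of its own.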

Good Luck.

Steve B
XI 5.7.3 / Core 4.4.6 / NagVis 1.9.8 / LiveStatus 1.5.0p11 / RRDCached 1.7.0 / Redis 3.2.8 /
SNMPTT / Gearman 0.33-7 / Mod_Gearman 3.0.7 / NLS 2.0.8 / NNA 2.3.1 /
NSClient 0.5.0 / NRPE Solaris 3.2.1 Linux 3.2.1 HPUX 3.2.1
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Perfdata graphs empty

Post by Box293 »

SteveBeauchemin wrote: One last thought. Have you considered setting up a ram disk for these files?
You will still need the changes I suggested ram disk or no ram disk.
It is much easier to setup than I thought it would be. Nagios has
instructions in pdf somewhere. If you do use the ram disk... You just need
to keep an eye on the space used and make sure you know early before it fills
up if there is a problem. I have mine set to 500MB at this time. I have noticed it
filling up a couple times. One time I needed to restart npcd. Once I needed to
restart nagios. Lessons learned...
Great advice @SteveBeauchemin, the RAM Disk is an invaluable performance enhancement for Nagios XI. Here is the official procedure for it:
https://assets.nagios.com/downloads/nag ... giosXI.pdf
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
vhoover
Posts: 123
Joined: Mon Sep 09, 2013 12:17 pm

Re: Perfdata graphs empty

Post by vhoover »

I have a ram disk set up and I have changed the perfdata timeout, but the graphs still do not populate.
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: Perfdata graphs empty

Post by Box293 »

Let's increase the logging verbosity and then take a deeper look at the logs. Follow the FAQ entry below to increase the log level of process_perfdata and npcd:

http://support.nagios.com/wiki/index.ph ... leshooting

Wait 15 - 20 minutes and then get a tail of the logs:

tail -250 /usr/local/nagios/var/perfdata.log > /tmp/perfdata.txt
tail -250 /usr/local/nagios/var/npcd.log > /tmp/npcd.txt
Send us a copy of /tmp/perfdata.txt and /tmp/npcd.txt

Don't forget to turn down the log level as per the FAQ when you are done!
vhoover
Posts: 123
Joined: Mon Sep 09, 2013 12:17 pm

Re: Perfdata graphs empty

Post by vhoover »

Here are the files.