no more performance graphs

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
ktservices
Posts: 19
Joined: Mon Mar 26, 2012 6:20 am
Location: Germany

Post by ktservices »

Hello everybody,

we have a problem with our performance graphs. Right after installing Nagios XI everything worked fine, including the performance graphs. Nagios XI is running on a bare-metal server with RHEL 6.x. We install our servers with kickstart, and the default password lifetime is one year, foolishly even for the nagios user (I missed that) :oops: . Now, after a year, the password for the nagios user expired and many things in Nagios stopped working. After fixing my mistake everything works fine in Nagios XI again, but the performance graphs are no longer drawn.
In Nagios XI the "Performance Grapher" is shown as running, and in "/usr/local/nagios/var" the files "host-perfdata" and "service-perfdata" are updated periodically. In "/usr/local/nagios/share/perfdata/.." no more data is written. Of course I have rebooted the server. In the meantime an update for Nagios XI appeared; the installation of the update was successful, but the performance graphs are still not being drawn.
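
To confirm that nothing under the perfdata tree is being written any more, one option is to count recently modified files. This is only a sketch: the function name count_recent_perfdata and the 30-minute window are illustrative choices, not part of Nagios XI, and it relies on the -mmin option of GNU find as shipped with RHEL.

```shell
# Count regular files below DIR that were modified within the last MINUTES minutes.
# Usage: count_recent_perfdata DIR MINUTES
# (hypothetical helper, not a Nagios XI command)
count_recent_perfdata() {
    dir=$1
    minutes=$2
    # -mmin -N matches files changed less than N minutes ago
    find "$dir" -type f -mmin "-$minutes" | wc -l | tr -d ' '
}
```

For example, `count_recent_perfdata /usr/local/nagios/share/perfdata 30` printing 0 while checks are running would match the symptom described above.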

I tried the solutions shown in the Nagios XI FAQs already, but nothing changed.

I hope somebody can help me.
Best Regards
Reinhold Krinninger
ewilliams
Posts: 9
Joined: Thu May 09, 2013 6:37 pm

Re: no more performance graphs

Post by ewilliams »

Hi,

I went through a similar issue and also followed all of the FAQ guides. From what I read, this is usually caused by permissions, so if you have triple-checked all permissions and are certain they are correct, the other possible cause is high load on the Nagios server.

If top shows a load above 10 on your server, you may need to look into why, but in the meantime you can raise the threshold so PNP still runs.

I changed load_threshold in npcd.cfg to ## and restarted npcd (it can be restarted via the UI):
http://support.nagios.com/forum/viewtop ... n&start=20
/usr/local/nagios/etc/pnp/npcd.cfg
load_threshold = 10.0
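
A rough way to compare the current load against the configured threshold is sketched below. The helper names get_load_threshold and load_exceeds are made up for illustration, and the parsing assumes the simple "load_threshold = N" form shown above.

```shell
# Print the value of the first uncommented "load_threshold = N" line in FILE.
# (hypothetical helper; assumes the "key = value" layout quoted above)
get_load_threshold() {
    sed -n 's/^[[:space:]]*load_threshold[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p' "$1" | head -n 1
}

# Succeed (exit status 0) if LOAD is numerically greater than THRESHOLD.
# Usage: load_exceeds LOAD THRESHOLD
load_exceeds() {
    awk -v l="$1" -v t="$2" 'BEGIN { exit !((l + 0) > (t + 0)) }'
}
```

On the server this could be used as `load_exceeds "$(cut -d' ' -f1 /proc/loadavg)" "$(get_load_threshold /usr/local/nagios/etc/pnp/npcd.cfg)" && echo "load is above the threshold"`.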

Hope that helps.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: no more performance graphs

Post by slansing »

Thanks for the tips ewilliams, let us know if you need further help ktservices!
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: no more performance graphs

Post by abrist »

ewilliams is correct if you are experiencing load threshold issues. Check your npcd and perfdata logs:

tail -25 /usr/local/nagios/var/perfdata.log 
tail -25 /usr/local/nagios/var/npcd.log 
Remember, you will need to restart npcd after any changes to npcd.cfg or process_perfdata.cfg.

service npcd restart
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
ktservices
Posts: 19
Joined: Mon Mar 26, 2012 6:20 am
Location: Germany

Re: no more performance graphs

Post by ktservices »

Hello,

The directory "/usr/local/nagios/var" contains no file named "perfdata.log".

"tail -25 /usr/local/nagios/var/npcd.log" outputs this:

[01-11-2013 15:10:37] NPCD: HINT: load_threshold is enabled - ('10.000000')
[02-07-2013 11:31:37] NPCD: Caught Termination Signal - Hasta la vista... baby
[02-07-2013 11:35:25] NPCD: npcd Daemon (0.4.14) started with PID=3973
[02-07-2013 11:35:25] NPCD: Please have a look at 'npcd -V' to get license information
[02-07-2013 11:35:25] NPCD: HINT: load_threshold is enabled - ('10.000000')
[03-04-2013 08:55:28] NPCD: Caught Termination Signal - Hasta la vista... baby
[03-04-2013 08:59:10] NPCD: npcd Daemon (0.4.14) started with PID=4118
[03-04-2013 08:59:10] NPCD: Please have a look at 'npcd -V' to get license information
[03-04-2013 08:59:10] NPCD: HINT: load_threshold is enabled - ('10.000000')
[03-22-2013 13:31:50] NPCD: Caught Termination Signal - Hasta la vista... baby
[03-22-2013 13:35:44] NPCD: npcd Daemon (0.4.14) started with PID=4007
[03-22-2013 13:35:44] NPCD: Please have a look at 'npcd -V' to get license information
[03-22-2013 13:35:44] NPCD: HINT: load_threshold is enabled - ('10.000000')
[04-09-2013 17:24:27] NPCD: Caught Termination Signal - Hasta la vista... baby
[04-09-2013 17:28:20] NPCD: npcd Daemon (0.4.14) started with PID=4028
[04-09-2013 17:28:20] NPCD: Please have a look at 'npcd -V' to get license information
[04-09-2013 17:28:20] NPCD: HINT: load_threshold is enabled - ('10.000000')
[05-07-2013 16:08:38] NPCD: Caught Termination Signal - Hasta la vista... baby
[05-07-2013 16:12:29] NPCD: npcd Daemon (0.4.14) started with PID=4023
[05-07-2013 16:12:29] NPCD: Please have a look at 'npcd -V' to get license information
[05-07-2013 16:12:29] NPCD: HINT: load_threshold is enabled - ('10.000000')
[05-13-2013 11:41:29] NPCD: Caught Termination Signal - Hasta la vista... baby
[05-13-2013 11:45:19] NPCD: npcd Daemon (0.4.14) started with PID=4190
[05-13-2013 11:45:19] NPCD: Please have a look at 'npcd -V' to get license information
[05-13-2013 11:45:19] NPCD: HINT: load_threshold is enabled - ('10.000000')

Nagios XI is running on a powerful server (24 cores, 16 GB RAM, lots of local disk space) with only a few checks (<1000). I have never seen the server busy. Current state:
top - 13:32:15 up 1 day, 1:47, 1 user, load average: 0.94, 0.77, 0.64
Tasks: 624 total, 2 running, 620 sleeping, 0 stopped, 2 zombie
Cpu(s): 3.4%us, 0.8%sy, 0.0%ni, 95.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16327580k total, 4366096k used, 11961484k free, 246284k buffers
Swap: 4194296k total, 0k used, 4194296k free, 680864k cached

I have now rechecked the file and directory permissions as shown in the FAQs. The user "nagios" is able to change into the directory "/usr/local/nagios/share/perfdata" and can modify and write files in its subdirectories. I also ran the command "/usr/local/nagiosxi/scripts/reset_config_perms" again.
So far nothing has changed.
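
As a cross-check on ownership, a sketch like the following lists anything under a directory that does not belong to the expected user. The function name list_foreign_owned is hypothetical; it only wraps find's standard -user test.

```shell
# List files and directories below DIR that are NOT owned by USER.
# USER may be a user name or a numeric UID.
# Usage: list_foreign_owned DIR USER
# (hypothetical helper, not part of Nagios XI)
list_foreign_owned() {
    find "$1" ! -user "$2"
}
```

Running `list_foreign_owned /usr/local/nagios/share/perfdata nagios` and getting no output would suggest that ownership is not the problem there. It does not check group or mode bits.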

Best Regards
Reinhold Krinninger
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: no more performance graphs

Post by abrist »

Well, you are not experiencing load issues. You will need to turn on logging for perfdata. Edit the file:

/usr/local/nagios/etc/pnp/process_perfdata.cfg 
Change:

LOG_LEVEL = 0
To:

LOG_LEVEL = 1
Restart npcd:

service npcd restart
Wait 20 - 30 minutes and then tail the perfdata.log file:

tail -50 /usr/local/nagios/var/perfdata.log 
Post the output here in code wraps.
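
The LOG_LEVEL change described in this post could also be scripted, roughly as below. This is a sketch, not an official procedure: set_log_level is a made-up helper name, it assumes the "LOG_LEVEL = N" line format shown above, and it keeps a .bak copy next to the file.

```shell
# Set LOG_LEVEL in FILE to LEVEL, keeping a backup as FILE.bak.
# Usage: set_log_level FILE LEVEL
# (hypothetical helper; assumes a "LOG_LEVEL = N" line already exists in FILE)
set_log_level() {
    # Copy first, then write back through a redirect so the original
    # file keeps its owner and permissions.
    cp -p "$1" "$1.bak" &&
    sed "s/^[[:space:]]*LOG_LEVEL[[:space:]]*=.*/LOG_LEVEL = $2/" "$1.bak" > "$1"
}
```

On the server that would be `set_log_level /usr/local/nagios/etc/pnp/process_perfdata.cfg 1`, followed by `service npcd restart` as in the steps above.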
ktservices
Posts: 19
Joined: Mon Mar 26, 2012 6:20 am
Location: Germany

Re: no more performance graphs

Post by ktservices »

Hello,

It seems that my problem is solved. Both subdirectories "xidpe" and "perfdata" in "/usr/local/nagios/var/spool/" seem to have been corrupted. The subdirectory "xidpe" contained a very large number of files, although under normal conditions it should be empty(?). The subdirectory "perfdata" was always empty, but the ls command showed an abnormal size for the directory itself. I deleted both directories and recreated them with the same permissions, user and group. After a restart of Nagios the performance graphs are drawn again. :P
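
For anyone hitting the same symptom, the check and the fix could be sketched as follows. The helper names spool_report and recreate_dir are invented for this example, stat -c and find -mindepth/-maxdepth are the GNU variants found on RHEL, and unlike the deletion described above this version moves the old directory aside to DIR.old, which is a deliberate, more cautious substitution.

```shell
# Print how many entries DIR holds and the size of the directory entry itself.
# Usage: spool_report DIR
spool_report() {
    entries=$(find "$1" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')
    dirsize=$(stat -c %s "$1")
    echo "$1 entries=$entries dirsize=$dirsize"
}

# Move DIR aside to DIR.old and recreate it empty with the same
# owner, group and mode. Pass DIR without a trailing slash.
# Usage: recreate_dir DIR
recreate_dir() {
    owner=$(stat -c %u "$1") &&
    group=$(stat -c %g "$1") &&
    mode=$(stat -c %a "$1") &&
    mv "$1" "$1.old" &&
    mkdir "$1" &&
    chown "$owner:$group" "$1" &&
    chmod "$mode" "$1"
}
```

A typical sequence would be `spool_report /usr/local/nagios/var/spool/xidpe`, then `recreate_dir /usr/local/nagios/var/spool/xidpe`, and the same for the "perfdata" spool directory, followed by a restart of Nagios. Stopping Nagios and npcd before moving the directories is probably wise, though the post above does not say whether that was done.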
I would like to thank everybody who has replied to my post.

Reinhold Krinninger