'No data to display' for some hosts in Graph Explorer

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
crystal.then
Posts: 57
Joined: Mon Oct 27, 2014 12:05 am

Post by crystal.then »

Hi,

We have recently begun experiencing an issue where some services show "no data to display" when trying to graph them using Graph Explorer -> Scalable Performance Graph.

There are two main hosts for which we have noticed this behaviour (so far), and they have the following characteristics:

- Windows machines; monitored using NRPE Agent
- Run MS SQL and are clustered together
- Some services are able to be graphed correctly (CPU usage, memory usage)
- Drive usage services show 'No data to display' (C:, E: & F: Drive Usage)

For the erroring services we can see that data is being collected correctly; notifications are sent out when thresholds are breached and show the current status/usage of the service. So the data is there, but is not being picked up by the graphing tool for some services. The RRD performance graphs for the erroring services also do not show data.

Our Nagios setup:

- Nagios XI v5.4.8
- Running on a CentOS VM
- 461 monitored hosts
- 3100 monitored services

Let me know if you need any more information for troubleshooting. Thanks in advance!

Regards,
Matt
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: 'No data to display' for some hosts in Graph Explorer

Post by npolovenko »

Hello, @crystal.then. I know you said that the data is being collected correctly, but I just want to clarify: if you click on a "broken" service and navigate to the Advanced tab, do you see RRD output next to the performance data table? Can you check all "broken" services this way and let us know?
You may delete the corresponding RRD and XML files for the broken services from /usr/local/nagios/share/perfdata/, or move them to a different directory for now. This will force Nagios to recreate the RRDs. You may also clean out the contents of /var/lib/mrtg/ entirely; it's all temporary files.
After that please give the system up to 30 min and check back on the services in question. If the problem is still there, please share your system profile with us so we can go over every major log file.
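For anyone following along, moving the files aside rather than deleting them could be sketched roughly as below. This assumes the usual PNP layout, with one subdirectory per host under /usr/local/nagios/share/perfdata/ and files named after the service. The function name, the backup directory, and the example host and service names are placeholders for illustration, not values from this thread.

```shell
#!/bin/sh
# Sketch: move the RRD and XML files for one service out of the perfdata
# directory so they can be recreated. Paths and names are assumptions.

PERFDATA_DIR="${PERFDATA_DIR:-/usr/local/nagios/share/perfdata}"
BACKUP_DIR="${BACKUP_DIR:-/tmp/perfdata_backup}"   # hypothetical backup location

# move_perfdata HOST SERVICE_FILE_STEM
# Moves HOST/SERVICE_FILE_STEM.rrd and .xml (if present) into BACKUP_DIR/HOST/.
move_perfdata() {
    host="$1"
    stem="$2"
    mkdir -p "$BACKUP_DIR/$host" || return 1
    for ext in rrd xml; do
        src="$PERFDATA_DIR/$host/$stem.$ext"
        if [ -f "$src" ]; then
            mv "$src" "$BACKUP_DIR/$host/" && echo "moved $src"
        else
            echo "not found: $src"
        fi
    done
}

# Example (hypothetical host and service file names):
# move_perfdata sqlhost01 C_Drive_Usage
```

Keeping the originals in a backup directory means the old history can be restored if recreating the files does not help.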
To send us your system profile, log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file and attach it to your next post, or you could upload it to the cloud storage of your choice and share a link with me in a pm.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
crystal.then
Posts: 57
Joined: Mon Oct 27, 2014 12:05 am

Re: 'No data to display' for some hosts in Graph Explorer

Post by crystal.then »

Hi @npolovenko, thank you for your prompt response!

I've gone through the steps you mentioned, and have found that the performance data is not showing in the Advanced section of affected services. For one of the services I removed the xml and rrd files from the /usr/local/nagios/share/perfdata/ directory, but it has not been recreated (it has been well over 30 minutes since removing the file). I guess this is because performance data is not being correctly ingested. I reviewed the /var/lib/mrtg folder but the files do not seem relevant to the host and services in question, so I did not remove any files.
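A quick way to see which RRD files for a host are still being written could look like the sketch below. It assumes the per-host subdirectory layout under /usr/local/nagios/share/perfdata/ and a `find` that supports `-mmin` (GNU find on CentOS does). The function name and the example host name are hypothetical.

```shell
#!/bin/sh
# Sketch: list RRD files for a host that have not been written recently.
# The directory layout is an assumption about a typical PNP setup.

PERFDATA_DIR="${PERFDATA_DIR:-/usr/local/nagios/share/perfdata}"

# stale_rrds HOST [MINUTES]
# Prints RRD files under PERFDATA_DIR/HOST last modified more than MINUTES ago.
# MINUTES defaults to 30.
stale_rrds() {
    host="$1"
    minutes="${2:-30}"
    find "$PERFDATA_DIR/$host" -type f -name '*.rrd' -mmin +"$minutes" -print
}

# Example (hypothetical host name):
# stale_rrds sqlhost01 30
```

A file that was removed and never recreated will not appear in this output at all, so comparing the listing against the expected services is still needed.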

Please see attached my system profile. I hope this can shed some light on the issue.

Thanks again for your help so far. Let me know if you need any more information.

Regards,

Matt
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: 'No data to display' for some hosts in Graph Explorer

Post by npolovenko »

Hi, @crystal.then.
1. Let's increase the NPCD timeout value:
Open the following config file:

nano /usr/local/nagios/etc/pnp/process_perfdata.cfg
and change:

TIMEOUT = 5
to

TIMEOUT = 40
2. Let's increase the load threshold:
Open the following config file:

nano /usr/local/nagios/etc/pnp/npcd.cfg
and change:

load_threshold = 10.0
to

load_threshold = 30.0
3. You can delete everything from /var/lib/mrtg/. Those are all temporary files.
4. I've seen some log entries indicating that you have a few crashed DB tables. You can run the database repair script:

cd /usr/local/nagiosxi/scripts
./repair_databases.sh
*It may take a while for this script to finish since you have a large system.

5. Please run the following commands:

service nagios stop
killall -9 nagios
service nagios start
service crond restart
service npcd restart
6. (optional) Please increase the size of the root partition; it is reported as 86% used. This is not critical at this point, but it may cause problems in the future.
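If you would rather script the two config changes from steps 1 and 2 than edit the files by hand, something like the sketch below might work. The helper name `set_cfg_value` is made up for illustration. It assumes each setting sits on its own line in `KEY = VALUE` form, and that the key and value contain no characters special to sed (such as `|` or `&`).

```shell
#!/bin/sh
# Sketch: change a "KEY = VALUE" setting in a config file, keeping a backup.
# set_cfg_value is a hypothetical helper, not part of Nagios XI.

# set_cfg_value FILE KEY VALUE
# Rewrites the line starting with KEY in FILE to "KEY = VALUE".
# The original file is copied to FILE.bak first.
set_cfg_value() {
    file="$1"
    key="$2"
    value="$3"
    cp -p "$file" "$file.bak" || return 1
    # Match the key at the start of a line, with optional spaces around "=".
    sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file.bak" > "$file"
}

# Usage matching steps 1 and 2 above:
# set_cfg_value /usr/local/nagios/etc/pnp/process_perfdata.cfg TIMEOUT 40
# set_cfg_value /usr/local/nagios/etc/pnp/npcd.cfg load_threshold 30.0
```

The changes only take effect after the service restarts in step 5, and the .bak copy makes it easy to revert.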