Re: performance graphs stopped working

Posted: Mon Jun 06, 2016 3:44 pm
by jeephigh
Also, which backups should I try to restore? I have figured out which days the performance graphs and the state history were working, but which database do I restore, and in what order? There are MySQL, Nagios, nagiosql, nagiosxi and test backups in /store/backups/MySQL that I could restore from.

Thanks

Re: performance graphs stopped working

Posted: Mon Jun 06, 2016 4:00 pm
by jeephigh
One more piece of information that might be useful: I created a new server key to install a new security certificate so we could use the name and domain we wanted in the URL over 443. These were configured and installed on the same day that the last perf graphs showed up. I'm not sure why that would have anything to do with it.

Re: performance graphs stopped working

Posted: Mon Jun 06, 2016 4:04 pm
by lmiltchev
If you follow this document, you won't have to worry about which DB to restore or in which order: the script restores all of them. However, I would recommend that you review our "Nagios XI - Performance Graph Problems" KB article first (before restoring XI):

https://support.nagios.com/kb/article.php?id=9

Have you made many changes since the last backup (when performance graphs were working)?

Re: performance graphs stopped working

Posted: Mon Jun 06, 2016 5:19 pm
by jeephigh
Thanks lmiltchev for the document.

So I checked my other host groups that have performance data enabled and lo and behold all of their perf graphs are showing AND their state history shows up.

Only the one host group has stopped showing this data. I did change the check_ping argument and check intervals for this group around the time they stopped working.
Here is the command that I changed for the hostgroup using bulk modifications, followed by the two arguments ($ARG1$ and $ARG2$). I had their check interval at one minute.
$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
500.0,50%
700.0,100%
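
As a rough illustration of how those two argument lines end up in the final check, the sketch below substitutes the macros by hand. Only the command template and the two argument values come from the post above; the plugin directory and host address are assumptions chosen for the example, and Nagios performs this expansion internally rather than with sed.

```shell
# Command template and arguments as posted above.
template='$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5'
arg1='500.0,50%'    # warning:  round-trip average 500.0 ms or 50% packet loss
arg2='700.0,100%'   # critical: round-trip average 700.0 ms or 100% packet loss

# Assumed values, for illustration only.
user1='/usr/local/nagios/libexec'   # assumed plugin directory
hostaddress='192.0.2.10'            # hypothetical host address

# Replace each $MACRO$ with its value (\$ matches a literal dollar sign).
expanded=$(printf '%s\n' "$template" | sed \
    -e 's|\$USER1\$|'"$user1"'|' \
    -e 's|\$HOSTADDRESS\$|'"$hostaddress"'|' \
    -e 's|\$ARG1\$|'"$arg1"'|' \
    -e 's|\$ARG2\$|'"$arg2"'|')

echo "$expanded"
```

With these assumed values the check sends 5 packets and compares the round-trip average and packet loss against each threshold pair.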

Re: performance graphs stopped working

Posted: Mon Jun 06, 2016 5:31 pm
by jeephigh
I also did a tail on perfdata and npcd. I am not getting even close to threshold and I did not see any errors in the perfdata.

Unfortunately I did not have any nagiosxi backups before I took one today.

Re: performance graphs stopped working

Posted: Mon Jun 06, 2016 6:15 pm
by Box293
jeephigh wrote:I also did a tail on perfdata and npcd. I am not getting even close to threshold and I did not see any errors in the perfdata.
You'll need to enable debugging (as outlined in that KB article) to see further information.
jeephigh wrote: I did change the check_ping argument and check intervals for this group around the time they stopped working.
For one of those hosts, please post a screenshot of the Advanced tab on the Host Status details page.

Then run this command (replace localhost with the name of your host in the screenshot provided ... it is case sensitive):

rrdtool info /usr/local/nagios/share/perfdata/localhost/_HOST_.rrd | grep type
Please post the output.

Re: performance graphs stopped working

Posted: Tue Jun 07, 2016 10:17 am
by jeephigh
Thanks for the reply Box293-

I did follow the KB article to enable debugging on both perfdata and npcd. I was expecting to see recurring errors or timeouts or something. I didn't.

Here is a screenshot of the Advanced tab for one of the hosts.
Mason,Host.PNG
I ran the command you suggested. Here is the output:
ds[1].type = "GAUGE"
ds[2].type = "GAUGE"
ds[3].type = "GAUGE"
ds[4].type = "GAUGE"

Thanks for your help.

Re: performance graphs stopped working

Posted: Tue Jun 07, 2016 3:14 pm
by jeephigh
It took a lot of digging through the perfdata debugging logs, but I did find an error that seems to appear for each of the hosts in the one hostgroup that does not display perf graphs or state history.
Here it is.
error.PNG
I see this error only on the hosts in the hostgroup that is having trouble with state history and perf graphs.

Any pointers would be great. Thanks for the suggestions so far.

Re: performance graphs stopped working

Posted: Tue Jun 07, 2016 4:27 pm
by tgriep
You may want to go through this article, which has a solution for the error you posted:
https://support.nagios.com/kb/article.php?id=149
If the number of data sources changes, the graphs can stop updating. Try that and see if it works for you.
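
A minimal sketch of checking for such a mismatch: count the `ds[...]` entries that `rrdtool info` reports and compare the result with the number of values the plugin returns. The function names are hypothetical, the sample text is the `rrdtool info` output posted earlier in this thread, and the plugin count of 2 is the figure given later in the thread for check_ping.

```shell
# Count the data sources in `rrdtool info` output (read from stdin).
count_rrd_ds() {
    grep -c '^ds\[[^]]*\]\.type = '
}

# Compare the RRD count ($1) with the plugin's perfdata count ($2).
compare_ds() {
    if [ "$1" -eq "$2" ]; then
        echo "match: $1 data sources"
    else
        echo "mismatch: RRD has $1, plugin returns $2"
    fi
}

# Sample `rrdtool info ... | grep type` output, copied from this thread.
sample='ds[1].type = "GAUGE"
ds[2].type = "GAUGE"
ds[3].type = "GAUGE"
ds[4].type = "GAUGE"'

rrd_count=$(printf '%s\n' "$sample" | count_rrd_ds)
compare_ds "$rrd_count" 2
```

On a live system the sample text would be replaced by piping `rrdtool info` for the host's .rrd file into `count_rrd_ds`.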

Re: performance graphs stopped working

Posted: Tue Jun 07, 2016 4:43 pm
by Box293
tgriep wrote:You may want to go through this article, which has a solution for the error you posted:
https://support.nagios.com/kb/article.php?id=149
If the number of data sources changes, the graphs can stop updating. Try that and see if it works for you.
Please do this.

However ...
jeephigh wrote:Thanks for the reply Box293-

I did follow the KB article to enable debugging on both perfdata and npcd. I was expecting to see recurring errors or timeouts or something. I didn't.

Here is a screenshot of the Advanced tab for one of the hosts.

Image

I ran the command you suggested. Here is the output
ds[1].type = "GAUGE"
ds[2].type = "GAUGE"
ds[3].type = "GAUGE"
ds[4].type = "GAUGE"

Thanks for your help.
For this host and this output, it's the opposite: the RRD file has 4 data sources, but the plugin result is returning 2.

From your earlier post:
jeephigh wrote:$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
500.0,50%
700.0,100%
check_ping returns 2 data sources.
check_icmp returns 4 data sources.

Please update your host objects or host template to use check_icmp.
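
To see the difference for yourself, the sketch below counts the performance data values in a line of plugin output. The two sample lines are hand-written in the usual perfdata layout (`label=value;warn;crit;min;max` after the `|`), not captured from a real system, and the function name is hypothetical. It also assumes labels contain no spaces.

```shell
# Count the data sources in one line of plugin output.
# Simplification: assumes perfdata labels contain no spaces.
count_perfdata_ds() {
    perf=${1#*"|"}                  # text after the first "|"
    if [ "$perf" = "$1" ]; then     # no "|" means no perfdata at all
        echo 0
        return
    fi
    # Each space-separated label=value token is one data source.
    printf '%s\n' "$perf" | tr ' ' '\n' | grep -c '='
}

# Hand-written sample lines (illustrative, not real captures).
ping_out='PING OK - Packet loss = 0%, RTA = 0.40 ms|rta=0.400000ms;500.000000;700.000000;0.000000 pl=0%;50;100;0'
icmp_out='OK - 192.0.2.10: rta 0.400ms, lost 0%|rta=0.400ms;500.000;700.000;0; pl=0%;50;100;; rtmax=0.600ms;;;; rtmin=0.300ms;;;;'

echo "check_ping: $(count_perfdata_ds "$ping_out") data sources"
echo "check_icmp: $(count_perfdata_ds "$icmp_out") data sources"
```

Running the real plugin by hand against one of the affected hosts and passing its output to the function should show whether the count lines up with the 4 data sources in the existing RRD file.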