
check_alive command does not output performance graph

Posted: Mon Oct 19, 2020 4:04 am
by axvaster
Hello,


Problem description:

We have two hosts that use the same host check command, "check_alive" (ping).

Let's call them host A & host B.

Host B shows the ping performance graph, but host A does not: whichever period is selected, it shows no data.
(The graph can be found in "Host Status Detail > Performance Graphs")

Yet both use the same host check command, "check_alive", with the same default parameters.



The settings in CCM differ slightly:

- Host B:

◊ Host B sets the check command "check_alive" directly as its host check command;

◊ Host B also uses the default generic template, which has no check command.

- Host A:

◊ Host A uses only a customized template, and gets its host check command "check_alive" from that template.

◊ The customized template itself uses only the same default generic template.


This means the check command could not have been overridden.

Could you help us find out how this happened?



Note:

1. The structure looks like this:

Host A ← N/A ← customized template with host check command: check_alive ← default generic_host

Host B ← check_alive ← default generic_host


2. The other performance graphs seem OK.

So it should not be a performance issue.

Re: check_alive command does not output performance graph

Posted: Mon Oct 19, 2020 4:39 pm
by ssax
Please attach the XML and RRD files for this service from /usr/local/nagios/share/perfdata/HOSTNAME.

Please send me a copy of your profile as well, you can download it from Admin > System Profile by clicking the Download Profile button.

Thank you!

Re: check_alive command does not output performance graph

Posted: Thu Oct 22, 2020 11:39 pm
by axvaster
Please check the attachments

thank you!

Moderator's Note: The profile has been shared with the support team but has been removed from the public forum.

Re: check_alive command does not output performance graph

Posted: Fri Oct 23, 2020 5:13 pm
by ssax
Looking at the XML file for the FW2 host (DMZ-FW2), we see this:

Code:

  <RRD>
    <RC>1</RC>
    <TXT>/usr/local/nagios/share/perfdata/DMZ-FW2/_HOST_.rrd: found extra data on update argument: 1.208:0.363</TXT>
  </RRD>

And this:

Code:

<NAGIOS_HOSTPERFDATA>rta=0.649ms;3000.000;5000.000;0; pl=0%;80;100;; rtmax=1.208ms;;;; rtmin=0.363ms;;;;</NAGIOS_HOSTPERFDATA>

So the current check is returning 4 datasources, but if we check the RRD file with rrdtool info we see there are only 2 datasources:

Code:

[root@xid ~]# rrdtool info _HOST_.rrd |grep ds
ds[1].index = 0
ds[1].type = "GAUGE"
ds[1].minimal_heartbeat = 8460
ds[1].min = NaN
ds[1].max = NaN
ds[1].last_ds = "0.747000"
ds[1].value = 2.9133000000e+01
ds[1].unknown_sec = 0
ds[2].index = 1
ds[2].type = "GAUGE"
ds[2].minimal_heartbeat = 8460
ds[2].min = NaN
ds[2].max = NaN
ds[2].last_ds = "0"
ds[2].value = 0.0000000000e+00
ds[2].unknown_sec = 0

When a plugin changes the number of datasources, the RRD needs to be renamed (or deleted) so that it can be recreated with the proper number of datasources.
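
If you want to spot this kind of mismatch without reading the files by eye, something like the following rough Python sketch could compare the two counts. It is only an illustration: the function names are made up here, the perfdata regex assumes the usual "label=value;warn;crit;min;max" layout with optional single-quoted labels, and the RRD side simply scans the text printed by rrdtool info for "ds[...].index" lines.

```python
import re
from typing import List


def perfdata_labels(perfdata: str) -> List[str]:
    """Return the datasource labels found in a Nagios perfdata string.

    Assumes space-separated label=value entries; labels may be single-quoted.
    """
    pattern = re.compile(r"(?:'([^']*)'|([^\s=']+))=")
    return [quoted or bare for quoted, bare in pattern.findall(perfdata)]


def rrd_datasource_names(info_text: str) -> List[str]:
    """Return the datasource names listed in the text output of `rrdtool info`."""
    return re.findall(r"^ds\[([^\]]+)\]\.index\s*=", info_text, re.M)


# Values copied from the posts above.
perfdata = ("rta=0.649ms;3000.000;5000.000;0; pl=0%;80;100;; "
            "rtmax=1.208ms;;;; rtmin=0.363ms;;;;")
info_text = "ds[1].index = 0\nds[1].type = \"GAUGE\"\nds[2].index = 1\n"

labels = perfdata_labels(perfdata)
sources = rrd_datasource_names(info_text)
if len(labels) != len(sources):
    print(f"mismatch: check returns {len(labels)} datasources, RRD has {len(sources)}")
```

With the data from this host the check returns four labels (rta, pl, rtmax, rtmin) and the RRD has two datasources, which fits the "found extra data on update argument: 1.208:0.363" error: the two extra values are the rtmax and rtmin readings.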

You can try this first; it will automatically add the missing datasources to the RRD file (then it should start graphing again):

https://support.nagios.com/kb/article/n ... g-149.html

But if you have issues, you can just rename /usr/local/nagios/share/perfdata/HOSTNAME/_HOST_.rrd; the next check result will automatically create a new RRD, and after 10-15 minutes you should see it start graphing.
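
For anyone who prefers to script the rename, here is a minimal Python sketch of that step. The helper name and the ".bak-<timestamp>" suffix are just choices made for this example; the perfdata path is the one quoted in this thread, and the script would need to run as a user with write access to that directory.

```python
import time
from pathlib import Path

# Perfdata location as quoted in this thread; adjust if yours differs.
PERFDATA_ROOT = Path("/usr/local/nagios/share/perfdata")


def set_aside_host_rrd(hostname: str, root: Path = PERFDATA_ROOT) -> Path:
    """Rename HOSTNAME/_HOST_.rrd to a timestamped backup and return the new path.

    The old file is kept rather than deleted, so it can still be inspected later.
    """
    rrd = Path(root) / hostname / "_HOST_.rrd"
    if not rrd.is_file():
        raise FileNotFoundError(rrd)
    # Hypothetical backup naming scheme, chosen for this example only.
    backup = rrd.with_name(f"{rrd.name}.bak-{time.strftime('%Y%m%d-%H%M%S')}")
    rrd.rename(backup)
    return backup


# Example (replace HOSTNAME with the affected host's directory name):
# set_aside_host_rrd("HOSTNAME")
```

Renaming instead of deleting keeps the old data around in case you want to look at it again.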

Re: check_alive command does not output performance graph

Posted: Wed Oct 28, 2020 1:36 am
by axvaster
I am still not sure why or how the plugin changed the datasources it returns,

but it is all good after I removed the "_HOST_.rrd" file.

Thanks for your help!!

Re: check_alive command does not output performance graph

Posted: Wed Oct 28, 2020 7:56 am
by scottwilkerson
axvaster wrote:I am still not sure why or how the plugin changed the datasources it returns,

but it is all good after I removed the "_HOST_.rrd" file.

Thanks for your help!!
Great!

Closing case