Performance Graphs

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
operations_asavie
Posts: 33
Joined: Tue Dec 22, 2015 7:07 am

Performance Graphs

Post by operations_asavie »

Hi,

I am running the standard VMware wizard against an ESXi host. For the CPU usage check, I was able to see the data graphed in the Performance Graph, but since I added warning and critical threshold % values for this service, the Performance Graph is no longer being populated.

See images attached.

I am looking to have the % CPU usage graphed.
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Performance Graphs

Post by rkennedy »

Can you navigate to the advanced tab for the service in question, and post a screenshot of it? I'd like to see what perfdata it's returning.

Also, can you run the check over the CLI, and post the full input / output for the check? Do this once without the warning / critical defined, and once again with them defined.
operations_asavie
Posts: 33
Joined: Tue Dec 22, 2015 7:07 am

Re: Performance Graphs

Post by operations_asavie »

Hi,

See below.

[root@NAG-IXDUB-02 libexec]# ./check_esx3.pl -H 172.17.4.39 -f /usr/local/nagiosxi/etc/components/vmware/4F0BC5J_mgmt_auth.txt -l CPU
ESX3 OK - cpu usage=1160.00 MHz (3.62%) | cpu_usagemhz=1160.00Mhz;; cpu_usage=3.62%;;
[root@NAG-IXDUB-02 libexec]# ./check_esx3.pl -H 172.17.4.39 -f /usr/local/nagiosxi/etc/components/vmware/4F0BC5J_mgmt_auth.txt -l CPU -s usage -w 80 -c 90
ESX3 OK - cpu usage=3.12 % | cpu_usage=3.12%;80;90
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graphs

Post by lmiltchev »

What happens if you move the RRD and the XML file for this service out of the "/usr/local/nagios/share/perfdata/<hostname>/" directory (to let's say "/tmp/"), and wait for 15-20 min? Do the graphs show up? The RRD/XML files should get recreated.
Be sure to check out our Knowledgebase for helpful articles and solutions!
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Performance Graphs

Post by ssax »

This is not working because you changed the command, which changed the perfdata output.

The RRD file is expecting two values (called datasources) because that is what it was originally built with:

Code: Select all

cpu_usagemhz=1160.00Mhz;; cpu_usage=3.62%;;
When you modified the check command, it dropped the cpu_usagemhz datasource, so only one datasource value is now being inserted where the RRD expects two. That causes an error, and the data is not inserted.
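To make the mismatch easier to see, here is a minimal Python sketch (not part of Nagios XI or the plugin) that pulls the perfdata labels out of the two plugin outputs posted earlier in this thread and compares the counts. The regular expression and the function name `datasource_labels` are my own assumptions for illustration; they only cover the simple `label=value;warn;crit` form seen here.

```python
import re

# Matches one perfdata item: label=value[UOM][;warn[;crit[;min[;max]]]]
# Labels containing spaces may be single-quoted.
PERF_ITEM = re.compile(r"('[^']+'|[^\s=]+)=(\S+)")


def datasource_labels(plugin_output):
    """Return the perfdata labels, in order, from one line of plugin output."""
    # Everything after the first "|" is performance data.
    _, _, perfdata = plugin_output.partition("|")
    return [m.group(1).strip("'") for m in PERF_ITEM.finditer(perfdata)]


# The two outputs posted earlier in this thread.
before = "ESX3 OK - cpu usage=1160.00 MHz (3.62%) | cpu_usagemhz=1160.00Mhz;; cpu_usage=3.62%;;"
after = "ESX3 OK - cpu usage=3.12 % | cpu_usage=3.12%;80;90"

old_labels = datasource_labels(before)
new_labels = datasource_labels(after)

print(old_labels)
print(new_labels)
if len(old_labels) != len(new_labels):
    print("Datasource count changed from %d to %d"
          % (len(old_labels), len(new_labels)))
```

Running a check by hand before and after a command change and comparing the labels like this is a quick way to spot that an existing RRD will no longer match.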

You can remove the RRD and XML files for this service in /usr/local/nagios/share/perfdata/HOSTNAME/ so that it can rebuild them if you don't care about the historical data.

Note: You will lose all historical performance graph information for this service if you delete those files.

The only ways to get it graphing again WITHOUT losing the historical data would be to change the command back to what it was before OR follow this sweet guide that I wrote:

How to delete a datasource from an RRD

Here is how I did it:

First, open up the /usr/local/nagios/share/perfdata/HOSTNAME/SERVICE.xml file and find the <DS>NUMBER</DS> entry (this means data source number) for the one you want to remove.

Code: Select all

  <DATASOURCE>
    <TEMPLATE>check_isis</TEMPLATE>
    <RRDFILE>/usr/local/nagios/share/perfdata/UVN-DCD-SD02/Performance_Data.rrd</RRDFILE>
    <RRD_STORAGE_TYPE>SINGLE</RRD_STORAGE_TYPE>
    <RRD_HEARTBEAT>8460</RRD_HEARTBEAT>
    <IS_MULTI>0</IS_MULTI>
    <DS>4</DS>
    <NAME>MESSAGESPS</NAME>
    <LABEL>MESSAGESPS</LABEL>
    <UNIT></UNIT>
    <ACT>2324</ACT>
    <WARN></WARN>
    <WARN_MIN></WARN_MIN>
    <WARN_MAX></WARN_MAX>
    <WARN_RANGE_TYPE></WARN_RANGE_TYPE>
    <CRIT></CRIT>
    <CRIT_MIN></CRIT_MIN>
    <CRIT_MAX></CRIT_MAX>
    <CRIT_RANGE_TYPE></CRIT_RANGE_TYPE>
    <MIN></MIN>
    <MAX></MAX>
  </DATASOURCE>
Since you want to remove MESSAGESPS we can see that the DS number is 4.
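If the XML file has many datasources, the lookup can be scripted. The following is a rough Python sketch using only the standard library. The sample below is trimmed: the MESSAGESPS entry comes from the listing above, while the QUEUEDEPTH entry, the `<NAGIOS>` wrapper element, and the function name `ds_number` are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed sample. The MESSAGESPS entry is from the post above; the
# QUEUEDEPTH entry and the <NAGIOS> wrapper are illustrative assumptions.
SAMPLE_XML = """
<NAGIOS>
  <DATASOURCE>
    <DS>3</DS>
    <NAME>QUEUEDEPTH</NAME>
  </DATASOURCE>
  <DATASOURCE>
    <DS>4</DS>
    <NAME>MESSAGESPS</NAME>
  </DATASOURCE>
</NAGIOS>
"""


def ds_number(xml_text, name):
    """Return the <DS> number for the datasource called `name`, or None."""
    root = ET.fromstring(xml_text)
    # iter() finds DATASOURCE elements at any depth, so the name of the
    # wrapper element does not matter.
    for ds in root.iter("DATASOURCE"):
        if ds.findtext("NAME") == name:
            return int(ds.findtext("DS"))
    return None


print(ds_number(SAMPLE_XML, "MESSAGESPS"))
```

For a real file you would read the contents of /usr/local/nagios/share/perfdata/HOSTNAME/SERVICE.xml and pass that text in instead of the sample.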

Then remove MESSAGESPS from the script/plugin/perfdata output.

Run these commands to install the tool that we will use to delete the data source:

Code: Select all

cd /tmp
wget "http://downloads.sourceforge.net/project/pnp4nagios/PNP-0.6/pnp4nagios-0.6.25.tar.gz?r=&ts=1452788875&use_mirror=iweb" -O /tmp/pnp4nagios-0.6.25.tar.gz
tar zxf /tmp/pnp4nagios-0.6.25.tar.gz
cd /tmp/pnp4nagios-0.6.25
./configure
make all
mkdir -p /root/scripts
cp /tmp/pnp4nagios-0.6.25/scripts/rrd_modify.pl /root/scripts/
chmod +x /root/scripts/rrd_modify.pl
Run these commands to delete the data source:
*** NOTE: Make sure to change DATASOURCENUM, HOSTNAME, and SERVICENAME to the proper values.
*** NOTE: SERVICENAME should be changed to the actual service name from the filename (validate what it should be, it may differ from what you have set in Nagios)

Code: Select all

/root/scripts/rrd_modify.pl /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd delete DATASOURCENUM
mv /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd.bak
mv /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd.chg /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd
chown nagios:nagios /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd
chmod 775 /usr/local/nagios/share/perfdata/HOSTNAME/SERVICENAME.rrd
Now it should start graphing properly when the new checks come in (may take 15 to 20 minutes).
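If you have to repeat the backup-and-swap step for several services, a small guard can help avoid moving the original away when the modified copy was never written. This is only a sketch in Python, assuming (as in the commands above) that the modified file sits next to the original with a .chg suffix. The function name `swap_in_modified_rrd` is hypothetical, and the demonstration uses placeholder text files in a temporary directory, not real RRDs.

```python
import os
import shutil
import tempfile


def swap_in_modified_rrd(rrd_path):
    """Back up rrd_path to rrd_path + '.bak', then move rrd_path + '.chg' into place."""
    changed = rrd_path + ".chg"
    backup = rrd_path + ".bak"
    # Refuse to touch the original unless the modified copy actually exists.
    if not os.path.isfile(changed):
        raise FileNotFoundError(changed)
    shutil.move(rrd_path, backup)
    shutil.move(changed, rrd_path)
    return backup


# Demonstration in a throwaway directory with placeholder files.
workdir = tempfile.mkdtemp()
rrd = os.path.join(workdir, "SERVICENAME.rrd")
with open(rrd, "w") as f:
    f.write("original")
with open(rrd + ".chg", "w") as f:
    f.write("modified")

swap_in_modified_rrd(rrd)
print(sorted(os.listdir(workdir)))
```

Ownership and permissions still need to be set afterwards with the chown and chmod commands shown above.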
operations_asavie
Posts: 33
Joined: Tue Dec 22, 2015 7:07 am

Re: Performance Graphs

Post by operations_asavie »

Thank you for your help. I was able to just remove the RRD and XML file as the history was not important. I have this service graphing the percentage now.

Another quick question relating to the same script/wizard.

I'm running a service check on a datastore on an ESXi host. I can get the value without a problem, and the check behaves correctly when I add warning and critical thresholds (see attached). My problem is that the 90.63% shown for the datastore in the attached image is the % space free, not the % space used, so the warning and critical values I need are -w 10% -c 5%. This won't work as-is, because the check alerts when the value is greater than (>) the thresholds, not less than (<). Can you tell me how to change this in the script?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graphs

Post by lmiltchev »

You could use ":" after the threshold for "less than" logic. See the "Threshold and ranges" section in the "Nagios Plugins Development Guidelines" here:

https://nagios-plugins.org/doc/guidelin ... HOLDFORMAT

Here are some examples with and without ":" after the threshold.

"Greater than" examples:

Code: Select all

[root@localhost ~]# /usr/local/nagios/libexec/check_esx3.pl -H "x.x.x.x" -f "/usr/local/nagiosxi/etc/components/vmware/MyHost_auth.txt" -l "VMFS" -s Datastore1 -w 90% -c 95%
ESX3 OK - Datastore1=10102235.00 MB (82.21%) | Datastore1=82.21%;90;95
[root@localhost ~]# /usr/local/nagios/libexec/check_esx3.pl -H "x.x.x.x" -f "/usr/local/nagiosxi/etc/components/vmware/MyHost_auth.txt" -l "VMFS" -s Datastore1 -w 80% -c 95%
ESX3 WARNING - Datastore1=10102235.00 MB (82.21%) | Datastore1=82.21%;80;95
[root@localhost ~]# /usr/local/nagios/libexec/check_esx3.pl -H "x.x.x.x" -f "/usr/local/nagiosxi/etc/components/vmware/MyHost_auth.txt" -l "VMFS" -s Datastore1 -w 70% -c 80%
ESX3 CRITICAL - Datastore1=10102235.00 MB (82.21%) | Datastore1=82.21%;70;80
"Less than" examples:

Code: Select all

[root@localhost ~]# /usr/local/nagios/libexec/check_esx3.pl -H "x.x.x.x" -f "/usr/local/nagiosxi/etc/components/vmware/MyHost_auth.txt" -l "VMFS" -s Datastore1 -w 80%: -c 70%:
ESX3 OK - Datastore1=10102235.00 MB (82.21%) | Datastore1=82.21%;80:;70:
[root@localhost ~]# /usr/local/nagios/libexec/check_esx3.pl -H "x.x.x.x" -f "/usr/local/nagiosxi/etc/components/vmware/MyHost_auth.txt" -l "VMFS" -s Datastore1 -w 90%: -c 80%:
ESX3 WARNING - Datastore1=10102235.00 MB (82.21%) | Datastore1=82.21%;90:;80:
[root@localhost ~]# /usr/local/nagios/libexec/check_esx3.pl -H "x.x.x.x" -f "/usr/local/nagiosxi/etc/components/vmware/MyHost_auth.txt" -l "VMFS" -s Datastore1 -w 95%: -c 90%:
ESX3 CRITICAL - Datastore1=10102235.00 MB (82.21%) | Datastore1=82.21%;95:;90:
Hope this helps.
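For anyone who wants to reason about these thresholds without running the plugin, here is a minimal Python sketch of the range format as described in the Nagios Plugins Development Guidelines ("10" alerts outside 0 to 10, "10:" alerts below 10, "~:10" alerts above 10, a leading "@" inverts the test). It is not the plugin's own code. The function names `alerts` and `status` are made up for the example, and stripping the "%" sign before parsing is an assumption about how the value is treated.

```python
import math


def alerts(value, spec):
    """Return True if `value` should raise an alert for threshold `spec`."""
    spec = spec.replace("%", "")  # assumption: "%" is decoration only
    inside = spec.startswith("@")  # "@" means alert when INSIDE the range
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
    else:
        low_text, high_text = "0", spec  # a bare "10" means the range 0..10
    low = -math.inf if low_text == "~" else float(low_text or 0)
    high = math.inf if high_text == "" else float(high_text)
    in_range = low <= value <= high
    return in_range if inside else not in_range


def status(value, warn, crit):
    """Critical is evaluated first, then warning."""
    if alerts(value, crit):
        return "CRITICAL"
    if alerts(value, warn):
        return "WARNING"
    return "OK"


# The six threshold pairs from the examples above, with the value 82.21.
for warn, crit in [("90%", "95%"), ("80%", "95%"), ("70%", "80%"),
                   ("80%:", "70%:"), ("90%:", "80%:"), ("95%:", "90%:")]:
    print(warn, crit, status(82.21, warn, crit))
```

With the value 82.21, the six pairs should come out in the same order as the plugin results above: OK, WARNING, CRITICAL for the "greater than" set and OK, WARNING, CRITICAL for the "less than" set.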
operations_asavie
Posts: 33
Joined: Tue Dec 22, 2015 7:07 am

Re: Performance Graphs

Post by operations_asavie »

Thank you for your help with this. The thresholds of -w 20%: -c 10%: are now working fine, but the command checks every datastore and I am only concerned with 9f0bc5j.datastore=791910.00 MB (93.29%). Is there a way to modify the command so that only this datastore is checked against the defined thresholds, and only it triggers alerts?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: Performance Graphs

Post by lmiltchev »

You can modify the existing check (or create a new one), where in the "$ARG3$" field you will have:

Code: Select all

-s 9f0bc5j.datastore -w 20%: -c 10%:
When you pass only "vmfs" to the plugin, it reports on all datastores. If you want to check a specific datastore, you can pass a sub-command: "-s <datastore name>".

Hope this helps.