Issue with false critical values on Virtual Machines
Posted: Thu Jun 07, 2018 2:50 am
by Ivajlo911
Hi,
we have a problem with false critical values coming from Nagios on a few virtual machines. Here is one example:
Service: L3CSGHVI CPU Usage
Host: vCSSGH
Address: ......
State: CRITICAL
Info:
ESX3 CRITICAL - L3CSGHVI cpu usage=-0.01 %
Date/Time: 2018-06-07 08:25:1
Service: L3CSGHVI Memory
Host: vCSSGH
Address: ......
State: CRITICAL
Info:
ESX3 CRITICAL - L3CSGHVI mem usage=-0.01 %
Date/Time: 2018-05-25 05:43:17
Details of our implementation:
CentOS Linux release 7.4.1708 (Core)
Manual Install of Nagios XI
No special configurations on our system, ie; is Gnome installed
We are not using a proxy
We are using Nagios XI 5.4.13.
Re: Issue with false critical values on Virtual Machines
Posted: Thu Jun 07, 2018 1:00 pm
by scottwilkerson
Can you show the full command for your services?
I think this may be an anomaly in the ESX API, which you can probably get around by setting your warning and critical thresholds so that negative numbers are treated as OK.
For example, if you had a warning/critical threshold set to 80, change it to ~:80. This means values from negative infinity to 80 are OK and anything above is not.
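To illustrate why a reading of -0.01 trips a plain threshold of 80 but not ~:80, here is a minimal Python sketch of the range rules from the Nagios plugin development guidelines. It assumes check_esx3.pl follows those standard rules; the function names parse_range and alerts are hypothetical and are not part of the plugin, which is written in Perl.

```python
import math

def parse_range(spec):
    """Parse a Nagios-style range string into (low, high, alert_inside).

    Hypothetical helper for illustration only.
    "80"    -> 0..80, alert when outside
    "~:80"  -> -inf..80, alert when outside
    "10:"   -> 10..+inf, alert when outside
    "@10:20" -> alert when inside 10..20
    """
    alert_inside = spec.startswith("@")
    if alert_inside:
        spec = spec[1:]
    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
    else:
        # A bare number means the range starts at 0.
        low_s, high_s = "0", spec
    low = -math.inf if low_s == "~" else float(low_s or 0)
    high = math.inf if high_s == "" else float(high_s)
    return low, high, alert_inside

def alerts(value, spec):
    """Return True if value should raise an alert for the given range."""
    low, high, alert_inside = parse_range(spec)
    in_range = low <= value <= high
    return in_range if alert_inside else not in_range

# A plain "80" implies a lower bound of 0, so a negative reading alerts.
print(alerts(-0.01, "80"))
# "~:80" removes the lower bound, so the same reading is accepted.
print(alerts(-0.01, "~:80"))
```

Under these rules a value above 80 still alerts with ~:80, so only the spurious negative readings are suppressed.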
Re: Issue with false critical values on Virtual Machines
Posted: Fri Jun 08, 2018 2:51 am
by Ivajlo911
Hi,
I would avoid such workarounds if possible.
These are the commands:
/usr/local/nagios/libexec/check_esx3.pl -H "10.10.7.220" -f "/usr/local/nagiosxi/etc/components/vmware/vCSSGH_auth.txt" -N "L3CSGHVI" -l "CPU" -s usage -w 80% -c 90%
ESX3 OK - "L3CSGHVI" cpu usage=18.50 % | cpu_usage=18.50%;80;90+
/usr/local/nagios/libexec/check_esx3.pl -H "10.10.7.220" -f "/usr/local/nagiosxi/etc/components/vmware/vCSSGH_auth.txt" -N "L3CSGHVI" -l "MEM" -s usage -w 80% -c 90%
ESX3 OK - "L3CSGHVI" mem usage=71.99 % | mem_usage=71.99%;80;90
Re: Issue with false critical values on Virtual Machines
Posted: Fri Jun 08, 2018 1:07 pm
by cdienger
Is the behavior frequent or consistent? The output in the last post looks normal. I tested with the 6.7.0 VMware SDK and haven't been able to reproduce it yet. Run "/usr/bin/vmware-cmd --version" to find the version that is installed on the XI system.
Also, here is a screenshot of my check settings just to make sure you're running something similar.
Re: Issue with false critical values on Virtual Machines
Posted: Thu Jun 14, 2018 3:54 am
by Ivajlo911
Hi,
the behavior is more consistent than frequent. It happens from time to time, every two or three days, for two particular VMs.
Also, we use the 6.5.0 VMware SDK. Do you think we should update?
Our settings are the same as yours.
Re: Issue with false critical values on Virtual Machines
Posted: Thu Jun 14, 2018 9:50 am
by cdienger
Yes, I think it would be worth it to update. As was pointed out, this is likely something related to the API and is hopefully addressed in the update.
Re: Issue with false critical values on Virtual Machines
Posted: Mon Jun 18, 2018 2:13 am
by Ivajlo911
Hi,
after the update the problem continues:
***** Nagios XI Alert *****
Nagios has detected a problem with this service.
Notification Type: PROBLEM
Service: L3CNS02 CPU Usage
Host: vCSSGH
Address:
State: CRITICAL
Info:
ESX3 CRITICAL - L3CNS02 cpu usage=-0.01 %
Date/Time: 2018-06-18 04:57:47
Re: Issue with false critical values on Virtual Machines
Posted: Mon Jun 18, 2018 12:43 pm
by cdienger
Please confirm the plugin and wizard version. The plugin version can be seen with:
/usr/local/nagios/libexec/check_esx3.pl --version
and the wizard version can be found under Admin > System Extensions > Manage Config Wizards > VMware. The latest wizard version is 1.7.1, and you can upgrade to it by clicking the Check for Updates and Install Updates buttons at the top of the page.
Re: Issue with false critical values on Virtual Machines
Posted: Tue Jun 26, 2018 1:43 am
by Ivajlo911
Hi,
Wizard version is: 1.6.9
The version of the plugin is:
[root@l3cnagint ~]# /usr/local/nagios/libexec/check_esx3.pl --version
check_esx3.pl 0.2.1
Re: Issue with false critical values on Virtual Machines
Posted: Tue Jun 26, 2018 1:31 pm
by cdienger
Is that a typo, or does it actually show 0.2.1? My machine shows 0.7.1. In either case, try upgrading the wizard, which should include the latest plugin as well.