VMware config wizard issue

Posted: Mon Jul 24, 2017 1:49 pm
by atremblay
I've used the XI built-in config wizard for VMware to monitor the connected VMs, and all the stats pull fine except for the I/O stat, which is the disk space. This is one of the crucial ones I would like to monitor so I know if a VM is running low on disk space. I just get "VM IO Unknown error". I tracked down the block of code that seems to be executing this and began writing syslog messages at various points to see what it thought it was evaluating, but haven't gotten much further.

I wanted to check with you guys whether there is a known issue with this plugin and VMware 6.0. Could we troubleshoot together? You probably have a better idea than I do of where all the logs are kept. Thanks.

Re: VMware config wizard issue

Posted: Mon Jul 24, 2017 3:43 pm
by tgriep
There could be a version incompatibility between the plugin and the VMware Perl SDK that is causing the issue you are having, so make sure you upgrade to the latest Perl SDK from VMware.
Also, try upgrading to this version of the check_esx3 plugin. It has resolved intermittent issues for some customers.

First, log in to the XI server as root and make a backup of the existing plugin by running the following.

Code:

cp /usr/local/nagios/libexec/check_esx3.pl /usr/local/nagios/libexec/check_esx3.pl.old

Then run the following to install a required Perl module for the new plugin.

Code:

yum install perl-Nagios-Plugin

Then download this updated plugin to your PC.
https://github.com/shinken-monitoring/p ... ck_esx3.pl

After it is downloaded, log in to the XI GUI and go to the Admin > Manage Plugins menu.
Browse to the new plugin and then click the Upload button.

Then rerun the wizard and see if this fixes the issue.
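
The backup step above could also be scripted so that a second run never overwrites the first backup. This is only a minimal sketch: backup_plugin is a hypothetical helper name, and the timestamped fallback name is an assumption, not something the plugin or XI requires.

```shell
#!/bin/sh
# Sketch: back up a plugin file before replacing it.
# backup_plugin is a hypothetical helper, not part of Nagios XI.
backup_plugin() {
    src=$1
    dest="$src.old"
    if [ ! -f "$src" ]; then
        echo "missing: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        # A .old copy already exists; keep it and write a timestamped copy instead.
        dest="$src.$(date +%Y%m%d%H%M%S).old"
    fi
    # -p preserves mode and timestamps of the original file.
    cp -p "$src" "$dest" && echo "$dest"
}
```

Called as `backup_plugin /usr/local/nagios/libexec/check_esx3.pl`, it prints the path of the backup it wrote.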

Re: VMware config wizard issue

Posted: Tue Jul 25, 2017 4:25 pm
by atremblay
OK, cool. That fixed the networking issue for the vCenter host itself. Not really something I was too worried about, but that's a plus. The I/O check has changed its output message from:

ESX3 CRITICAL - HOST-VM IO Unknown error
to
CHECK_ESX3.PL CRITICAL - HOST-VM IO Unknown error

So no real change except listing the script name instead of just brief ESX3...

Re: VMware config wizard issue

Posted: Tue Jul 25, 2017 4:43 pm
by tgriep
Can you run the check from the command line in verbose mode and post the output here?
Run this example, replacing the host, username, and password with your own values.

Code:

/usr/local/nagios/libexec/check_esx3.pl -H xxx.xxx.xxx.xxx -u username -p password -l IO -v
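
When running a check by hand like this, the exit code is as useful as the text it prints. The following is a minimal sketch, assuming the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). nagios_state and run_check are hypothetical helper names and are not part of check_esx3.pl.

```shell
#!/bin/sh
# Map a Nagios plugin exit code to its state name.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNEXPECTED($1)" ;;
    esac
}

# Run any check command, print its output, then print the decoded exit code.
run_check() {
    if output=$("$@" 2>&1); then code=0; else code=$?; fi
    printf '%s\n' "$output"
    echo "exit code $code => $(nagios_state "$code")"
    return "$code"
}
```

For example: `run_check /usr/local/nagios/libexec/check_esx3.pl -H xxx.xxx.xxx.xxx -u username -p password -l IO -v`.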

Re: VMware config wizard issue

Posted: Tue Jul 25, 2017 4:55 pm
by atremblay
It's using the sub at line 3187 called vm_disk_io_info.

First the function defines the variable $output, and the default text in there is what I'm getting back out of it: 'HOST-VM IO Unknown error'.

The function's first IF statement asks whether there are any subcommands, which I'm not providing, so it skips down to the ELSE.

Then it calls a sub named return_host_vmware_performance_values (a common function in this script). I inserted code to syslog the returned $values, and it's empty, so something is going wrong there. After that it runs an IF statement to make sure there is something in $values; otherwise it returns the default $output I'm getting.

return_host_vmware_performance_values requires $vmname, which is valid and correct because it works when the script is run for other things like memory.
The next variable is $defperfargs. I don't know exactly what this one is; I just haven't taken enough time to dig that deep yet.
The rest of the arguments it requires are hard-coded, along with the format of the returned values.

The function return_host_vmware_performance_values is at line 955.
It uses the $vmname variable to query VMware via the SDK that was downloaded to make this work.
From here it calls the generic_performance_values function just above it, at line 842.

The function generic_performance_values gets a bit more in depth about what it's querying from the VIM provider, but I lose it here.
Running another syslog on the returned values here, I can see that generic_performance_values is not returning anything either, but it's not returning an error either.

Hope this helps. Sorry, I don't really know Perl (I've been doing more PHP for a long time), but I can kind of gather what's going on in here.

Let me add the output here since it won't let me post two times in a short period of time.
CHECK_ESX3.PL OK - io commands aborted=0, io bus resets=0, io read latency=0 ms, write latency=0 ms, kernel latency=0 ms, device latency=0 ms, queue latency=0 ms | io_aborted=0;; io_busresets=0;; io_read=0ms;; io_write=0ms;; io_kernel=0ms;; io_device=0ms;; io_queue=0ms;;

Re: VMware config wizard issue

Posted: Tue Jul 25, 2017 5:05 pm
by atremblay
I tested the same command but with the -N option to go after a specific VM. I tested it with -l MEM (success) and with -l IO (failed), so as to rule out the whole "Is the VM name correct?" question.

[root@------ ~]# /usr/local/nagios/libexec/check_esx3.pl -H X.X.X.X -N "<server_name>" -u <username> -p <password> -l MEM -v
CHECK_ESX3.PL OK - "<server_name>" mem usage=4095.47 MB(25.99%), overhead=53.34 MB, active=1064.96 MB, swapped=0.00 MB, swapin=0.00 MB, swapout=0.00 MB, memctl=0.00 MB | mem_usagemb=4095.47MB;; mem_usage=25.99%;; mem_overhead=53.34MB;; mem_active=1064.96MB;; mem_swap=0.00MB;; mem_swapin=0.00MB;; mem_swapout=0.00MB;; mem_memctl=0.00MB;;
[root@------ ~]# /usr/local/nagios/libexec/check_esx3.pl -H X.X.X.X -N "<server_name>" -u <username> -p <password> -l IO -v
CHECK_ESX3.PL CRITICAL - HOST-VM IO Unknown error
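
The side-by-side test above (same VM, different -l command) can be repeated for several commands in one go. This is a rough sketch under stated assumptions: compare_commands is a hypothetical helper, the PLUGIN variable is introduced here only so the path can be overridden, and only the -H, -N, -u, -p and -l options already shown in this thread are used.

```shell
#!/bin/sh
# PLUGIN defaults to the path used in this thread; override it if yours differs.
PLUGIN=${PLUGIN:-/usr/local/nagios/libexec/check_esx3.pl}

# compare_commands HOST VM USER PASS CMD...
# Runs the plugin once per -l command and prints one summary line for each.
compare_commands() {
    host=$1 vm=$2 user=$3 pass=$4
    shift 4
    for cmd in "$@"; do
        if out=$("$PLUGIN" -H "$host" -N "$vm" -u "$user" -p "$pass" -l "$cmd" 2>&1); then
            rc=0
        else
            rc=$?
        fi
        # Keep only the first line of plugin output for the summary.
        first=$(printf '%s\n' "$out" | head -n 1)
        printf '%-4s rc=%s %s\n' "$cmd" "$rc" "$first"
    done
}
```

For example, `compare_commands X.X.X.X "<server_name>" <username> <password> MEM IO` would print one line for MEM and one for IO, which makes it easy to see that only the IO command fails for that VM.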

Re: VMware config wizard issue

Posted: Wed Jul 26, 2017 8:55 am
by tgriep
You could try adding the -T timeshift option to the command and see if that helps.
"-T, --timestamp=<timeshift>"
Timeshift in seconds that could fix issues with "Unknown error". Use values like 5, 10, 20, etc
Also try installing a newer version of the VMware Perl SDK; the version you are running may have bugs.
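
Trying several -T values by hand gets tedious, so the suggestion above could be looped. A minimal sketch, assuming the plugin keeps printing "Unknown error" whenever a timeshift value does not help; try_timeshifts and the PLUGIN variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# PLUGIN defaults to the path used in this thread; override it if yours differs.
PLUGIN=${PLUGIN:-/usr/local/nagios/libexec/check_esx3.pl}

# try_timeshifts "5 10 20" PLUGIN_ARGS...
# Re-runs the check with each -T value and stops at the first one whose
# output no longer contains "Unknown error". Returns 1 if none helped.
try_timeshifts() {
    shifts=$1
    shift
    for t in $shifts; do
        if out=$("$PLUGIN" "$@" -T "$t" 2>&1); then rc=0; else rc=$?; fi
        case "$out" in
            *"Unknown error"*)
                printf '%s\n' "-T $t: still Unknown error (rc=$rc)" ;;
            *)
                printf '%s\n' "-T $t: $out"
                return 0 ;;
        esac
    done
    return 1
}
```

For example: `try_timeshifts "5 10 20" -H xxx.xxx.xxx.xxx -u username -p password -l IO`.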

Re: VMware config wizard issue

Posted: Thu Jul 27, 2017 8:26 am
by atremblay
Sorry, I'm not too sure I understand what the -T is expected to do. I provided values ranging from 1-1000 for it, but no change.

I installed the SDK about 2 months ago when I was first setting up the server, and I double-checked my notes: I had installed the correct version, 6.0.0-2503617. Should I be trying to install 6.5 even though we're running VMware 6.0?

Re: VMware config wizard issue

Posted: Thu Jul 27, 2017 11:41 am
by tgriep
The timeshift option gives the check more time to run, and sometimes that helps with the unknown errors.
Yes, try installing the latest 6.5 version of the SDK. It should still work with the older version of VMware the servers are running.

Re: VMware config wizard issue

Posted: Thu Jul 27, 2017 1:42 pm
by atremblay
Testing that, I noticed that quite a few modules are now flagged as too old to work. But the 6.5 SDK had no effect once installed.

The following Perl modules were found on the system but may be too old to work
with vSphere CLI:

MIME::Base64 3.14 or newer
Compress::Zlib 2.037 or newer
Compress::Raw::Zlib 2.037 or newer
version 0.78 or newer
IO::Compress::Base 2.037 or newer
IO::Compress::Zlib::Constants 2.061 or newer
LWP 6.15 or newer
LWP::Protocol::https 6.04 or newer
Socket6 0.23 or newer
IO::Socket::INET6 2.71 or newer
Net::HTTP 6.09 or newer
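
To see which of the modules in a warning like this are really below the stated minimum, the list can be parsed and each module checked with Perl itself. This is a sketch under assumptions: parse_requirements and check_module are hypothetical helpers, the warning text is assumed to keep the "Module VERSION or newer" shape shown above, and the version comparison relies on Perl's built-in `Module->VERSION(minimum)` check, which dies when the installed version is lower.

```shell
#!/bin/sh
# parse_requirements: read "Module 1.23 or newer" lines on stdin and
# print "Module 1.23" pairs, ignoring every other line.
parse_requirements() {
    awk 'NF == 4 && $3 == "or" && $4 == "newer" { print $1, $2 }'
}

# check_module MODULE MINVER
# Exit status: 0 = new enough, 1 = too old, 2 = module could not be loaded.
check_module() {
    perl -e '
        my ($m, $v) = @ARGV;
        eval "require $m; 1" or exit 2;
        eval { $m->VERSION($v); 1 } or exit 1;
    ' "$1" "$2"
}

# report_modules: read the warning text on stdin, print one status line per module.
report_modules() {
    parse_requirements | while read -r mod ver; do
        if check_module "$mod" "$ver"; then
            echo "ok       $mod >= $ver"
        else
            echo "problem  $mod (need $ver)"
        fi
    done
}
```

Saving the installer warning to a file and running `report_modules < warning.txt` would then list each module with its status.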