ESX Monitoring error
Posted: Tue Nov 01, 2016 6:23 am
by sgoffar
Hi All,
For multiple ESX hosts we are getting: CRITICAL: communication with the VMware API failed after 2 retries.
My current plugin version is: my $current_box293_version = '2016-05-10';
Can you please help with this?
Re: ESX Monitoring error
Posted: Tue Nov 01, 2016 8:44 am
by sgoffar
On further analysis, we found the following types of errors in the state history as well:
UNKNOWN: Error: Cannot complete login due to an incorrect user name or password.
UNKNOWN: Error connecting to server at 'https://172.22.200.61/sdk/webService': Connection refused
Host Status Total number of concurrent checks exceeds 15, aborting!
Re: ESX Monitoring error
Posted: Tue Nov 01, 2016 3:33 pm
by tgriep
Are the checks failing for all ESX host/guest checks on the server, or only certain ones?
It looks like one of the login accounts has expired, so go to the following page and download the Manual.pdf file.
https://exchange.nagios.org/directory/P ... re/details
Start at the beginning and make sure the account settings are correct.
If they are, take a look at the Plugin Test on page 11. Run that test while logged in as the nagios user, adding -v to the end of the command for verbose output, and post the output here.
Re: ESX Monitoring error
Posted: Tue Nov 01, 2016 9:36 pm
by sgoffar
This is failing for only a few of the ESX servers. The ESX servers are monitored through the VMware server using the --host option, and we are able to get the VMware server details successfully.
Re: ESX Monitoring error
Posted: Wed Nov 02, 2016 9:30 am
by tgriep
You need to add the --concurrent_checks option to your command(s) for the plugin. For example:
--concurrent_checks 20
/usr/local/nagios/libexec/check_by_ssh -E 1 -l vi-admin -H 192.168.1.231 -C "~/box293_check_vmware.pl --concurrent_checks 20 --server 192.168.1.211 --check vCenter_Name_Version"
Increase the value to whatever you feel is appropriate.
NOTE: You will need to monitor the vMA CPU and Memory usage and size it appropriately as the more checks that run ... the more resources are consumed on the vMA.
Try that out and see if this helps the concurrent issue you are having.
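If you want to try several limits without retyping the whole command, here is a minimal sketch. The function name build_vmware_check is hypothetical and not part of the plugin; the addresses, the vi-admin user and the plugin path are simply copied from the example above and will differ in your environment. It only prints the command line, it does not run the check.

```shell
#!/bin/sh
# Hypothetical helper (not part of box293_check_vmware): prints the
# check_by_ssh command from the example above with a configurable
# --concurrent_checks value, so the limit can be tuned in one place.
build_vmware_check() {
    vma_host="$1"      # address of the vMA that runs the plugin
    vcenter="$2"       # value passed to --server
    check="$3"         # value passed to --check
    limit="${4:-20}"   # --concurrent_checks, defaults to 20 if omitted
    printf '%s -E 1 -l vi-admin -H %s -C "~/box293_check_vmware.pl --concurrent_checks %s --server %s --check %s"\n' \
        /usr/local/nagios/libexec/check_by_ssh "$vma_host" "$limit" "$vcenter" "$check"
}

# Print the command with the limit of 20 used in the example above.
build_vmware_check 192.168.1.231 192.168.1.211 vCenter_Name_Version 20
```

Run the printed command as the nagios user, and watch the vMA CPU and memory while you raise the fourth argument.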