box293_check_vmware plugin issue with Cluster HA Status
Hi,
I'm having an issue with box293's check_vmware plugin.
We have 2 gearman workers dedicated to vmware monitoring with the vmware SDK.
The Cluster HA Status check works fine on all of our VCSAs except two.
I have this error : [Undefined subroutine &ClusterFailoverLevelAdmissionControlPolicy::cpuFailoverResourcesPercent called at /usr/local/nagios/libexec/check_vmware.pl line 1912.]
The nagios service accounts for vmware have the same rights and are in the same groups.
The other Cluster checks from box293's plugin are working fine on those 2 VCSAs (Cluster CPU Usage, Cluster Memory Usage ...).
Do you have any idea how I could fix this ?
Re: box293_check_vmware plugin issue with Cluster HA Status
Try running the command directly on the command line of the two gearman workers as well as on the XI command line. Do they all fail? What is the full command?
Are the machines that fail on a different version from the rest?
Please provide a copy of /usr/local/nagios/libexec/check_vmware.pl.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: box293_check_vmware plugin issue with Cluster HA Status
I ran the commands as you asked, and they all failed.
See attachments for details.
All machines are VMware VCSA 6.5.
The only thing that has been modified in the check_vmware.pl file is the number of concurrent checks.
You do not have the required permissions to view the files attached to this post.
Re: box293_check_vmware plugin issue with Cluster HA Status
Compare the working and non-working cluster settings. What is defined for "Define host failover capacity by"? Is the non-working one set to something other than "Cluster resource percentage"?
https://docs.vmware.com/en/VMware-vSphe ... B060A.html
Re: box293_check_vmware plugin issue with Cluster HA Status
The working ones either have admission control disabled (with the --ha_admission_control disabled option in the command so that the check returns OK) or use the Cluster resource percentage policy set to 25% for CPU and memory.
The non working ones are both using Dedicated failover hosts set at 1.
Re: box293_check_vmware plugin issue with Cluster HA Status
Try running the commands again with the --ha_state, --ha_host_monitoring, and --ha_admission_control options one at a time. It could be that one of them is causing the failure and can be excluded.
Re: box293_check_vmware plugin issue with Cluster HA Status
I tried the commands with each option for both hosts on both workers, and I still get the same error whichever option I use.
Example with one host on one worker :
You do not have the required permissions to view the files attached to this post.
Re: box293_check_vmware plugin issue with Cluster HA Status
Run the commands with the --debug option. This should create a box293_check_vmware_debug_log.txt file in /home/nagios/ or /root/ depending on which account you run it with.
Re: box293_check_vmware plugin issue with Cluster HA Status
Alright, so I ran the command with the debug option.
Here is the output:
Code:
sudo -u naemon ./check_vmware.pl --server GV1-FF-VXR-VCSA-01 --check Cluster_HA_Status --cluster VXMA-Cluster --debug
I get the exact same output for both of the non-working hosts.
Here is the output of a host where the check works as intended:
Code:
sudo -u naemon ./check_vmware.pl --server GV2-FF-VXR-VCSA-02 --check Cluster_HA_Status --cluster GV2-FF-CL01 --debug
You do not have the required permissions to view the files attached to this post.
Re: box293_check_vmware plugin issue with Cluster HA Status
It defaults to using code meant for the cpuFailoverResourcesPercent policy if the statements above it don't match. In this case, the policy being returned is resourceReductionToToleratePercent, which doesn't match:
Code:
if ($cluster_ha_config_admission_control_policy_key eq 'failoverLevel') {
Code:
elsif ($cluster_ha_config_admission_control_policy_key eq 'slotPolicy')
or
Code:
elsif ($cluster_ha_config_admission_control_policy_key eq 'failoverHosts') {
To get it to match against the last if statement, change line 1872 to:
Code:
elsif ($cluster_ha_config_admission_control_policy_key eq 'failoverHosts' || $cluster_ha_config_admission_control_policy_key eq 'resourceReductionToToleratePercent' ) {
Hopefully this should do it. Let us know your results.
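To illustrate the branch selection described in this reply, here is a minimal standalone sketch. It is not taken from check_vmware.pl: the policy_branch helper and the branch labels are hypothetical, and only the key strings and the widened 'failoverHosts' condition come from the thread. It only shows which branch each key would reach once the suggested change is in place.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of check_vmware.pl): returns a label for the
# branch that would handle a given admission control policy key.
sub policy_branch {
    my ($key) = @_;
    if ($key eq 'failoverLevel') {
        return 'failoverLevel branch';
    }
    elsif ($key eq 'slotPolicy') {
        return 'slotPolicy branch';
    }
    elsif ($key eq 'failoverHosts' || $key eq 'resourceReductionToToleratePercent') {
        # Widened condition from the suggested fix for line 1872
        return 'failoverHosts branch';
    }
    else {
        # Fallthrough: code written for the cpuFailoverResourcesPercent policy
        return 'cpuFailoverResourcesPercent branch';
    }
}

# Print the branch each key would reach.
for my $key (qw(failoverLevel slotPolicy failoverHosts
                resourceReductionToToleratePercent cpuFailoverResourcesPercent)) {
    printf "%-36s -> %s\n", $key, policy_branch($key);
}
```

Without the extra comparison, the resourceReductionToToleratePercent key would fall through to the final branch, which is the situation the reply describes.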