check_vmware_api.pl timeout after 60 seconds

Post by **coreymanshack** » Wed Dec 21, 2016 6:01 pm

My check_vmware_api.pl plugin (formerly check_esx3) on Nagios is taking a VERY long, variable amount of time to monitor my ESXi 6.0 host, which is pretty much idle. Most of the time the plugin times out. I followed instructions online and installed other versions of Net-HTTP and libwww-perl, but the issue persists.

Code: Select all

perl -MCPAN -e shell
install GAAS/Net-HTTP-6.03.tar.gz 
install GAAS/libwww-perl-5.837.tar.gz

I'm at a loss and do not know how to troubleshoot this further. I'm using VMware-vSphere-Perl-SDK-6.0.0-3561779.x86_64.tar.gz and have also tried VMware-vSphere-Perl-SDK-6.0.0-2503617.x86_64.tar.gz.
How should I proceed to get this working?

Edit: I've also run these check commands over ssh with 'time' and I get the same results. I've tried caching the session, with no difference in time.
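
For anyone repeating the manual timing test, here is a minimal sketch of a wrapper that runs a command under a hard time limit and reports either the elapsed time or a timeout. The function name `time_check` is hypothetical and not part of the plugin; it assumes a shell with `local` support and the coreutils `timeout` command.

```shell
#!/usr/bin/env bash
# time_check is a hypothetical helper, not part of check_vmware_api.pl.
# Usage: time_check <limit_seconds> <command> [args...]
time_check() {
    local limit="$1"
    shift
    local start end rc
    start=$(date +%s)
    # coreutils timeout kills the command once the limit is reached
    timeout "$limit" "$@"
    rc=$?
    end=$(date +%s)
    if [ "$rc" -eq 124 ]; then
        # timeout(1) exits with status 124 when the limit was hit
        echo "TIMED OUT after ${limit}s"
    else
        echo "exit=${rc} elapsed=$((end - start))s"
    fi
    return "$rc"
}
```

Called as `time_check 60 ./check_vmware_api.pl -H <host> -u <user> -p <password> -l cpu -s usage` (host and credentials are placeholders), it helps separate a check that is merely slow from one that never returns.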

Post by **mcapra** » Thu Dec 22, 2016 12:54 pm

The vSphere SDKs are pretty heavy in terms of the resources required to leverage them. Are you running these checks directly from your Nagios Core machine? Can you show us the check command definitions for your VMware checks?

You might also consider using the box293_check_vmware plugin, as its documentation is incredibly comprehensive and @box293 is a swell guy in general.

Post by **coreymanshack** » Thu Dec 22, 2016 2:28 pm

mcapra wrote: The vSphere SDKs are pretty heavy in terms of the resources required to leverage them. Are you running these checks directly from your Nagios Core machine? Can you show us the check command definitions for your VMware checks?

You might also consider using the box293_check_vmware plugin, as its documentation is incredibly comprehensive and @box293 is a swell guy in general.

Yes, the checks are running directly from the Nagios Core machine. While they are running, CPU usage is almost 0; this box is nearly idle with a load of 0.00.
From commands.cfg:

Code: Select all

# check vmware esxi machine
# check cpu
define command{
        command_name check_esx_cpu
        command_line $USER1$/check_vmware_api.pl -t 300 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l cpu -s usage -w $ARG1$ -c $ARG2$
        }

# check memory usage
define command{
        command_name check_esx_mem
        command_line $USER1$/check_vmware_api.pl -t 300 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l mem -s usage -w $ARG1$ -c $ARG2$
        }

# check net usage
define command{
        command_name check_esx_net
        command_line $USER1$/check_vmware_api.pl -t 300 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l net -s usage -w $ARG1$ -c $ARG2$
        }

# check runtime status
define command{
        command_name check_esx_runtime
        command_line $USER1$/check_vmware_api.pl -t 300 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l runtime -s status
        }

# check io read
define command{
        command_name check_esx_ioread
        command_line $USER1$/check_vmware_api.pl -t 300 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l io -s read -w $ARG1$ -c $ARG2$
        }

# check io write
define command{
        command_name check_esx_iowrite
        command_line $USER1$/check_vmware_api.pl -t 300 -H $HOSTADDRESS$ -u $USER11$ -p $USER12$ -l io -s write -w $ARG1$ -c $ARG2$
        }

My service definitions:

Code: Select all

# check cpu
define service{
        use                             generic-service
        host_name                       testLab
#vmServer1, buildServer, backupServer
        service_description             ESXi CPU Load
        check_command                   check_esx_cpu!80!90
}

# check memory usage
define service{
        use                             generic-service
        host_name                       testLab
        service_description             ESXi Memory usage
        check_command                   check_esx_mem!80!90
}

# check net
define service{
        use                             generic-service
        host_name                       testLab
        service_description             ESXi Network usage
        check_command                   check_esx_net!102400!204800
}

# check runtime status
define service{
        use                             generic-service
        host_name                       testLab
        service_description             ESXi Runtime status
        check_command                   check_esx_runtime
}

# check io read
define service{
        use                             generic-service
        host_name                       testLab
        service_description             ESXi IO read
        check_command                   check_esx_ioread!40!90
}

# check io write
define service{
        use                             generic-service
        host_name                       testLab
        service_description             ESXi IO write
        check_command                   check_esx_iowrite!40!90
}

I sure would like to get this one working, but if I can't, I'll check out the other add-on. I increased the execution time on these checks to 300 seconds today and they are still timing out.

Post by **avandemore** » Thu Dec 22, 2016 4:02 pm

We don't support 3rd party plugins. If you have a slow plugin, contact the author; they may be able to assist.

If you run the plugin from another system, do you get any better results?

You might have better luck having a passive check sent from VMware to the Nagios server.

Post by **coreymanshack** » Thu Dec 22, 2016 4:38 pm

avandemore wrote: We don't support 3rd party plugins. If you have a slow plugin, contact the author; they may be able to assist.

If you run the plugin from another system, do you get any better results?

You might have better luck having a passive check sent from VMware to the Nagios server.

That's a good stance to have, but I'm looking at Nagios XI and you guys seem to offer VMware monitoring out of the box with it: "To monitor VMware systems in Nagios XI is as easy as running the VMware Wizard." What plugin are you using there?

avandemore wrote: If you run the plugin from another system, do you get any better results?

How should I run the plugin from another system? What are you suggesting?

avandemore wrote: You might have better luck having a passive check sent from VMware to the Nagios server.

Do you have any good resources you can point me to on this?

Post by **avandemore** » Thu Dec 22, 2016 4:46 pm

This is our documentation:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf

You would run the plugin from another system the same way you would from Nagios. There is nothing magical about a Nagios plugin; it's simply an executable. The executable will work from any system, provided the dependencies are met.
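
To illustrate the point that a plugin is just an executable: Nagios reads the plugin's output line and its exit status, where 0 means OK, 1 WARNING, 2 CRITICAL, and 3 UNKNOWN. The sketch below is a hypothetical threshold check; the name `check_threshold` and its integer-only comparison are assumptions for illustration, not how check_vmware_api.pl works internally.

```shell
#!/usr/bin/env bash
# Hypothetical example plugin logic; only the exit-code convention
# (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) comes from Nagios itself.
# Usage: check_threshold <value> <warn> <crit>   (integers only)
check_threshold() {
    if [ "$#" -ne 3 ]; then
        echo "UNKNOWN - usage: check_threshold <value> <warn> <crit>"
        return 3
    fi
    local value="$1" warn="$2" crit="$3"
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value=${value}"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value=${value}"
        return 1
    fi
    echo "OK - value=${value}"
    return 0
}
```

Because the contract is only output plus exit status, the same script behaves identically whether Nagios runs it or you run it by hand on another machine.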

Here are some docs on our passive checks:

https://assets.nagios.com/downloads/nag ... ios-XI.pdf
https://github.com/NagiosEnterprises/ncpa
https://assets.nagios.com/downloads/nag ... h_NRDS.pdf

NCPA is the newest; NSCA has also received a recent update.
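
As a rough sketch of what a passive result looks like with NSCA: the `send_nsca` client reads one tab-separated line per service result from standard input, in the order host name, service description, return code, plugin output. The helper name `nsca_line` is hypothetical, and the host and service names in the usage note are borrowed from the service definitions earlier in the thread.

```shell
#!/usr/bin/env bash
# nsca_line is a hypothetical helper that formats one passive service
# result the way send_nsca expects it on stdin (tab-separated fields).
# Usage: nsca_line <host_name> <service_description> <return_code> <output>
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}
```

The line would then be piped to the client, for example `nsca_line testLab "ESXi CPU Load" 0 "OK - cpu usage low" | send_nsca -H <nagios-server> -c <path-to-send_nsca.cfg>`. The server address and config path are placeholders, and the host and service names must match what is defined on the Nagios side.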

Nagios Support Forum
