Service check timed out after 60.01 seconds

Posted: Fri Aug 19, 2016 4:11 am
by jyoti22
Hi Team,
We have onboarded a total of 85 servers on XI, using 2 vMAs. Both vMAs have sufficient memory and disk space, but the servers on which we monitor datastore latency still show the error "Service check timed out after 60.01 seconds".

The memory and disk usage output of both vMAs is below. Kindly let us know what could be causing this timeout error and what workaround would fix it.

vi-admin@AUSHOSVMAPRD01:~> free -m
             total       used       free     shared    buffers     cached
Mem:         16081       2138      13942          0        133       1676
-/+ buffers/cache:        328      15753
Swap:          133          0        133

vi-admin@AUSHOSVMAPRD01:~> df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       2.7G  1.5G  1.1G  58% /
udev            7.9G   88K  7.9G   1% /dev
tmpfs           7.9G     0  7.9G   0% /dev/shm
/dev/sda1       128M   37M   85M  31% /boot


vi-admin@aushosvmaprd00:~> free -m
             total       used       free     shared    buffers     cached
Mem:         16081       2138      13942          0          7        205
-/+ buffers/cache:       1924      14156
Swap:          133          0        133

vi-admin@aushosvmaprd00:~> df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       2.7G  1.6G  1.1G  60% /
udev            7.9G   88K  7.9G   1% /dev
tmpfs           7.9G     0  7.9G   0% /dev/shm
/dev/sda1       128M   37M   85M  31% /boot
Thanks,
Jyoti

Re: Service check timed out after 60.01 seconds

Posted: Fri Aug 19, 2016 7:23 am
by eloyd
Can you please provide the portion of your configuration that shows what the service check actually is? A screenshot of the Core Config screen showing the "common settings" and "check settings" would be perfect.

Re: Service check timed out after 60.01 seconds

Posted: Fri Aug 19, 2016 10:20 am
by lmiltchev
Thanks @eloyd!

@jyoti22 any updates?

Re: Service check timed out after 60.01 seconds

Posted: Mon Aug 22, 2016 2:26 am
by jyoti22
Please find attached a screenshot of the common and check settings.

Re: Service check timed out after 60.01 seconds

Posted: Mon Aug 22, 2016 7:30 am
by eloyd
I see @box293's check_vmware in there. These are potentially complex checks that can consume a lot of resources to initiate. I'll see if I can get him to chime in and take a look.

Re: Service check timed out after 60.01 seconds

Posted: Mon Aug 22, 2016 2:41 pm
by tgriep
Are you running the latest version of the box293_check_vmware.pl script on your vMA server?
If not, try upgrading to the latest version, as it includes some performance enhancements that may help with this issue.
Here is the link to the latest version.
https://exchange.nagios.org/directory/P ... re/details

Re: Service check timed out after 60.01 seconds

Posted: Mon Aug 22, 2016 4:57 pm
by Box293
If you are using the latest version and the timeout is occurring then there may be a bug.

What happens when you execute the check on the vMA directly? Does it time out?
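One rough way to answer that is to time the run by hand. The sketch below is only an illustration: `time_check` is a hypothetical helper name, the 60-second limit is taken from the error message above, and it assumes GNU `timeout` and `date +%s` are available on the vMA, which may not hold on every image.

```shell
# Hypothetical helper: run a command under a time limit and report how long it took.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the limit is hit.
time_check() {
    limit="$1"; shift
    start=$(date +%s)
    timeout "$limit" "$@"
    status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    if [ "$status" -eq 124 ]; then
        echo "TIMEOUT after ${elapsed}s (limit ${limit}s)"
    else
        echo "finished in ${elapsed}s with exit status ${status}"
    fi
    return "$status"
}

# Example call, using the same arguments the service check passes to the plugin:
#   time_check 60 ~/box293_check_vmware.pl xxxxxxxxxxxxx
```

If the direct run finishes well inside the limit, the plugin itself is probably not the bottleneck; if it reports TIMEOUT, the check really does take longer than 60 seconds on that vMA.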

If it does time out, can you run the command again on the vMA, this time adding the --debug argument at the end?

~/box293_check_vmware.pl xxxxxxxxxxxxx --debug

This will create the file /home/vi-admin/box293_check_vmware_debug_log.txt
Please email/PM me that file and I'll investigate some more.