Timeout in check_wmi_plus.pl

Posted: Wed Apr 29, 2015 7:34 am
by mon-team
Hi guys,

It seems that the timeout set in the plugin check_wmi_plus.pl is ignored when the check is executed by mod_gearman.

When the check runs through a gearman queue, we get a CRITICAL status with the output "(Service Check Timed Out On Worker: ...)" and execution_time >= 60s, whereas without gearman the same service goes into UNKNOWN status with "UNKNOWN - Plugin Timed out (15 sec)".
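To make the two outcomes above concrete, here is a minimal Python sketch of a check run under an outer timeout. It is only an illustration, not how mod_gearman is implemented: `run_check` and the stand-in "plugins" are hypothetical, and mapping an outer-timeout kill to CRITICAL is an assumption taken from the output quoted above. If the plugin honours its own timeout, it exits with code 3 (UNKNOWN) before the outer layer intervenes. If it does not, the outer layer gives up and reports its own state.

```python
import subprocess
import sys
import time

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(argv, outer_timeout):
    """Run a plugin command under an outer (worker-side) timeout.

    Returns (state, elapsed_seconds, killed_by_outer_timeout).
    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=outer_timeout)
        # The plugin returned on its own; trust its exit code.
        state = STATES.get(proc.returncode, "UNKNOWN")
        killed = False
    except subprocess.TimeoutExpired:
        # The plugin never returned, so the outer layer gave up.
        # Assumption: reported as CRITICAL, as in the thread's output.
        state = "CRITICAL"
        killed = True
    return state, time.monotonic() - start, killed


if __name__ == "__main__":
    # Stand-in for a plugin that enforces its own timeout and exits UNKNOWN.
    honours = [sys.executable, "-c", "import sys; sys.exit(3)"]
    # Stand-in for a plugin whose internal timeout never fires.
    ignores = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_check(honours, outer_timeout=2))
    print(run_check(ignores, outer_timeout=0.5))
```

Comparing the elapsed time against the plugin's own `-t` value shows which layer ended the check: an elapsed time close to the outer limit means the plugin's internal timeout did not take effect.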

Any idea what causes this behaviour? We would like to execute those services through gearman with timeout=15s.

We are running Nagios XI 2012R2.9 and mod_gearman-1.4.14.

Thanks

Best Regards

Re: Timeout in check_wmi_plus.pl

Posted: Wed Apr 29, 2015 9:36 am
by lmiltchev
Can you post the service config along with the other relevant configs (templates, commands, etc.)? Also, post the "mod_gearman_worker.conf". Hide sensitive info.

Re: Timeout in check_wmi_plus.pl

Posted: Thu Apr 30, 2015 3:33 am
by mon-team
See our config files in the attachments.

Re: Timeout in check_wmi_plus.pl

Posted: Thu Apr 30, 2015 11:46 am
by rseiwert
I might be wrong here, but it looks like you have your check_wmi_plus command defined with a hardcoded timeout of 15 secs:

Code:

define command {
       command_name                             check_wmi_plus
       command_line                             $USER1$/check_wmi_plus.pl -H $HOSTADDRESS$ -u username -p password -t 15 $ARG1$
}
I would just try changing the -t 15 to -t 60 in your command def and see if that solves your problem.

Re: Timeout in check_wmi_plus.pl

Posted: Thu Apr 30, 2015 12:28 pm
by jdalrymple
I would think 15 seconds should be plenty.

I'm assuming this works from the Nagios server proper, just not from gearman? Is it on the same host or a different host? If different, I'd look at your firewall configuration.

Re: Timeout in check_wmi_plus.pl

Posted: Mon May 04, 2015 6:20 am
by mon-team
"I'm assuming this works from the Nagios server proper, just not from gearman?"
Yes, with gearman the script's timeout is ignored.
We experience this issue on all services that use check_wmi_plus.pl as the plugin.

There is no problem when running the check from the command line of the Nagios XI server or the worker servers.
Our mod_gearman workers and the Nagios XI server are on the same subnet.

Re: Timeout in check_wmi_plus.pl

Posted: Mon May 04, 2015 9:33 am
by tgriep
On the worker that is failing, can you increase the debug log level and send in the log file?

Edit the mod_gearman_worker.conf file on the remote worker.

Change this from

Code:

debug=0
to

Code:

debug=3
Restart the worker:

Code:

service mod_gearman_worker restart
Then post the following file after the check is run.

Code:

/var/log/mod_gearman/mod_gearman_worker.log
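
At debug level 3 the worker log can get large, so it may help to narrow it down before posting. The following is a rough Python sketch that keeps only the lines mentioning the plugin or a timeout. The function names and search terms are assumptions chosen for illustration, and it makes no assumption about the log's line format beyond plain text.

```python
def matching_lines(lines, needles=("check_wmi_plus", "timeout", "timed out")):
    """Yield lines containing any of the needles, case-insensitively."""
    lowered = [needle.lower() for needle in needles]
    for line in lines:
        text = line.lower()
        if any(needle in text for needle in lowered):
            yield line.rstrip("\n")


def scan_log(path="/var/log/mod_gearman/mod_gearman_worker.log"):
    """Return the matching lines from the worker log at the given path."""
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as handle:
        return list(matching_lines(handle))
```

Running `scan_log()` on the worker after the check has executed returns the filtered lines. Remember to hide sensitive info such as credentials in command lines before posting them.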