Re: Timeout issue
Posted: Sun Aug 24, 2014 11:35 pm
by cg28oh
Installed from source, Nagios Core 4.0.8 and plugins 2.0.3
Re: Timeout issue
Posted: Mon Aug 25, 2014 10:01 am
by sreinhardt
If I gave you build instructions, would you be willing to pull down, build, and test the timeout branch to see if it resolves the strange times in your testing? I can certainly set up internal test systems instead, but it seems like you already happen to have a pretty good setup for testing this.
Re: Timeout issue
Posted: Tue Aug 26, 2014 1:38 pm
by cg28oh
Sure thing!
Re: Timeout issue
Posted: Tue Aug 26, 2014 5:20 pm
by abrist
The old math from check_snmp timeout looks like:
Code:
alarm(timeout_interval * retries + 5);
(with a default of 5 retries)
As you can see, the "actual" timeout gets very large very quickly and ends up far beyond the value you set: with -t 10 and the default 5 retries, the alarm only fires after 10 * 5 + 5 = 55 seconds.
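To put numbers on the old formula, here is a quick shell sketch that just redoes the arithmetic from the alarm() line above. The function name old_alarm_seconds is made up for this example and is not part of the plugin.

```shell
# Hypothetical helper: reproduces the old alarm() arithmetic quoted above.
# $1 = the -t value in seconds, $2 = retries (defaults to 5, as stated above)
old_alarm_seconds() {
    echo $(( $1 * ${2:-5} + 5 ))
}

# Show how far the effective timeout drifts from the requested one.
for t in 5 10 30; do
    echo "-t $t -> alarm fires after $(old_alarm_seconds "$t") seconds"
done
```

So a check that was asked to give up after 10 seconds could run for most of a minute before the old code's alarm fired.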
The new code looks like:
(with the retries computed as a fraction of the total timeout)
To build the branch, make sure you have the necessary deps for building nagios plugins and then run the following:
Code:
cd /tmp
wget https://github.com/nagios-plugins/nagios-plugins/archive/timeout_state.zip
unzip timeout_state
cd nagios-plugins-timeout_state/
./tools/setup
./configure
make
The new plugin bin should be located at:
Code:
/tmp/nagios-plugins-timeout_state/plugins/check_snmp
If you wish to install all the plugins from the branch, run:
Code:
cd /tmp/nagios-plugins-timeout_state
make install
Re: Timeout issue
Posted: Wed Sep 03, 2014 9:27 am
by cg28oh
Now the state is "CRITICAL - Plugin timed out while executing system call" with the default settings.
Re: Timeout issue
Posted: Fri Sep 05, 2014 10:23 am
by sreinhardt
What arguments are you passing to the newly built binaries?
Re: Timeout issue
Posted: Tue Sep 09, 2014 6:41 am
by cg28oh
This is what I had set:
Code:
define command{
command_name check_snmp
command_line $USER1$/check_snmp -e 1 -t 10 -H $HOSTADDRESS$ $ARG1$
}
Once that produced the "System call timeout" message, I tried the default settings:
Code:
define command{
command_name check_snmp
command_line $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
}
which produced the same message.
Re: Timeout issue
Posted: Tue Sep 09, 2014 5:21 pm
by abrist
Does the remote device support SNMP, is the firewall open, and is the device listening for requests? Let's do a walk to find out:
Code:
snmpwalk -c <community> -v1 <remote device ip address>
Or:
Code:
snmpwalk -c <community> -v2c <remote device ip address>
Re: Timeout issue
Posted: Thu Sep 11, 2014 4:17 pm
by cg28oh
Yes, they do; however, they are on a satellite connection. Depending on the number of sites online, the response time can range from 700 ms to 8-10 seconds. Only SNMP v1 is supported.
Code:
snmpget -v 1 -c X 10.0.0.1 sysUpTime.0
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (113517114) 13 days, 3:19:31.14
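If it helps to see where a given site falls in that 700 ms to 10 second range, a rough way is to time a single snmpget with retries switched off. This is only a sketch: the function name time_snmp_get and the 15 second default are choices made for this example, and it relies on the Net-SNMP -t (timeout in seconds) and -r (retries) options.

```shell
# Hypothetical helper: times one SNMP v1 GET of sysUpTime.0 with no retries,
# so the elapsed time reflects a single round trip (whole seconds only).
# $1 = community, $2 = host, $3 = per-request timeout in seconds (default 15)
time_snmp_get() {
    start=$(date +%s)
    snmpget -v 1 -c "$1" -t "${3:-15}" -r 0 "$2" sysUpTime.0
    status=$?
    end=$(date +%s)
    echo "snmpget exit status $status after $(( end - start )) s"
    return "$status"
}

# Example call, using the values from the post above:
#   time_snmp_get X 10.0.0.1
```

Running it a few times while the link is busy should give a feel for how large the check's timeout needs to be.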
Re: Timeout issue
Posted: Fri Sep 12, 2014 2:49 pm
by abrist
Because the timeout value is now divided across the retries, setting -t 10 (3 seconds or so per attempt) may not be enough for a link that can take 8-10 seconds to respond. Try setting the timeout to a higher number, like 30 seconds.
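As a rough illustration of why a larger -t helps, the sketch below splits a total timeout evenly across a number of attempts. The exact divisor the branch uses is not shown in this thread; 3 attempts is assumed here only because it reproduces the "3 seconds or so" figure for -t 10, and per_attempt_seconds is a made-up name.

```shell
# Hypothetical helper: per-attempt budget if the total timeout is split evenly.
# $1 = total -t value in seconds, $2 = number of attempts (an assumption)
per_attempt_seconds() {
    echo $(( $1 / $2 ))   # integer division, so the result is rounded down
}

attempts=3   # assumed for illustration, not taken from the plugin source
for t in 10 30 60; do
    echo "-t $t over $attempts attempts -> about $(per_attempt_seconds "$t" "$attempts") s each"
done
```

Under that assumption, -t 30 leaves roughly 10 seconds per attempt, which is right at the edge of the 8-10 second responses reported above, so an even larger value may be worth trying if timeouts persist.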