service check timed out after 60s
Posted: Thu Nov 05, 2015 11:44 pm
by devnully
Hi
I have an SNMP service check (check_snmp) that queries a script on a host. If I run the script from the command line it works:
COMMAND: /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
OUTPUT: SNMP OK - 1 WRA detector running | iso.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1=1WRA detector running;1;1;
However, status information for that service is "(Service check timed out after 60.01 seconds)".
I have quite a few more check_snmp services running (>100) and this is the only one that gives me trouble. Obviously I'm doing something wrong, but I can't put my finger on it.
Cheers,
hw
Re: service check timed out after 60s
Posted: Fri Nov 06, 2015 10:45 am
by rkennedy
What is the command that XI is running for this - can you check under the CCM?
Does it work if you add -t 120 to it?
Re: service check timed out after 60s
Posted: Sun Nov 08, 2015 5:16 pm
by devnully
Hi,
CCM command (Testing check from command line...)
COMMAND: /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
-t 100 = 100 seconds, however I'll try -t 120
Thanks,
hw
Re: service check timed out after 60s
Posted: Sun Nov 08, 2015 6:24 pm
by devnully
CCM:
Command view
$USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.$ARG2$ -r "$ARG3$" -t $ARG4$ -w1 -c1 -u "$ARG3$ detector running"
ARG1 = public
ARG2 = 1
ARG3 = WRA
ARG4 = 120
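For reference, a rough sketch of how the command definition and the ARG values above combine into one command line. The expand_macros function is a hypothetical helper written for this illustration, not part of Nagios, and the value used for $USER1$ is taken from the path in the earlier posts.

```shell
#!/bin/sh
# expand_macros is a hypothetical helper (not part of Nagios). It only imitates
# the macro substitution that the engine performs on a command definition.
# Usage: expand_macros TEMPLATE USER1 HOSTADDRESS ARG1 ARG2 ARG3 ARG4
expand_macros() {
    template=$1 user1=$2 host=$3 arg1=$4 arg2=$5 arg3=$6 arg4=$7
    # Replace each $NAME$ macro with its value; \$ matches a literal dollar sign.
    printf '%s\n' "$template" | sed \
        -e 's|\$USER1\$|'"$user1"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$host"'|g' \
        -e 's|\$ARG1\$|'"$arg1"'|g' \
        -e 's|\$ARG2\$|'"$arg2"'|g' \
        -e 's|\$ARG3\$|'"$arg3"'|g' \
        -e 's|\$ARG4\$|'"$arg4"'|g'
}

# Command definition and ARG values as posted above.
expand_macros \
    '$USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.$ARG2$ -r "$ARG3$" -t $ARG4$ -w1 -c1 -u "$ARG3$ detector running"' \
    /usr/local/nagios/libexec seis-nuc-dev public 1 WRA 120
```

The printed line should match the command tested from the command line earlier, except for -t 120 in place of -t 100.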
Re: service check timed out after 60s
Posted: Sun Nov 08, 2015 8:56 pm
by Box293
I can see you are using the "Test Check Command" button, which can lead to the issues you are having. Due to some issues with how PHP escapes characters, "Test Check Command" does not always work in these situations and its result should be ignored.
So for all further testing of this service you need to:
Make the changes to the service
Save the Service
Apply Configuration
Go back to the home screen and find the Service
When viewing the Service Status Details page click the Schedule a forced immediate check link
Just to reiterate: for all further testing of this service, DO NOT use the "Test Check Command" button; follow the steps above.
I see you are using double quotes. Are these defined in the command definition OR in the $ARGx$ fields?
Re: service check timed out after 60s
Posted: Mon Nov 09, 2015 12:19 am
by devnully
I have no problems with the [TEST CHECK COMMAND] button. I get the expected result.
However, following your advice I still get the same issue. See attached picture.
Oh and yes I have increased the timeout from 100 to 120 seconds, but the result is the same.
Thanks,
hw
Re: service check timed out after 60s
Posted: Mon Nov 09, 2015 12:59 am
by Box293
devnully wrote:I have no problems with the [TEST CHECK COMMAND] button. I get the expected result.
I understand what you are saying, but unfortunately the check is not working outside of CCM. That means the way CCM tests things makes the check appear to work when it actually fails once the monitoring engine executes it.
Can you run these commands at the command line please:
Code:
su nagios
/usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
time /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
What output is generated? The command with time at the front of it will time how long it takes for the plugin to execute.
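A minimal sketch for reproducing the 60-second cutoff by hand, assuming GNU coreutils timeout is installed. The run_with_limit function is a hypothetical helper, not part of Nagios, and the 60 is taken from the "timed out after 60.01 seconds" status message.

```shell
#!/bin/sh
# run_with_limit is a hypothetical helper (not part of Nagios).
# It runs a command under GNU coreutils `timeout` and reports whether the
# limit was hit; `timeout` exits with status 124 when it had to stop the command.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s"
    else
        echo "finished with exit status $status"
    fi
    return "$status"
}

# Example with the check from this thread (run as the nagios user):
# run_with_limit 60 /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
```

If the plugin finishes normally its own output appears first, followed by the exit status line; if it hangs, the helper reports the timeout instead.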
Re: service check timed out after 60s
Posted: Mon Nov 09, 2015 5:44 pm
by devnully
Thanks for all your help.
Nothing wrong with Nagios. (Everything wrong with my fault finding ...)
Issue is our network (DNS, multihomed host).
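A rough sketch of the kind of name-resolution check that can expose this sort of problem, assuming a glibc system where getent ahosts is available. The count_addresses function is a hypothetical helper, not part of Nagios.

```shell
#!/bin/sh
# count_addresses is a hypothetical helper (not part of Nagios).
# It prints how many distinct addresses the system resolver returns for a name.
# More than one address can mean a multihomed or dual-stack host, in which case
# a check may be sent to an interface that does not answer SNMP.
count_addresses() {
    getent ahosts "$1" | awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Example with the host from this thread (run as the nagios user):
# count_addresses seis-nuc-dev
# time getent ahosts seis-nuc-dev    # also shows how long the lookup itself takes
```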
Again, thanks for your help, and I'm going to wash the egg off my face now.
hw
Re: service check timed out after 60s
Posted: Mon Nov 09, 2015 7:10 pm
by Box293
Excellent, I'm glad you discovered the problem now and not 20 more posts of troubleshooting later.
