service check timed out after 60s

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
devnully
Posts: 10
Joined: Mon Jan 12, 2015 4:33 pm

Post by devnully »

Hi

I have an SNMP service check (check_snmp) that queries a script on a host. If I run the check from the command line, it works:

COMMAND: /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
OUTPUT: SNMP OK - 1 WRA detector running | iso.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1=1WRA detector running;1;1;
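For readers puzzling over the long OID above: the run of numbers after `.4.1.2.` looks like a length-prefixed string index (a length, then one ASCII code per character), which is how Net-SNMP typically encodes the name of an extended script. That reading is an assumption, since the thread never shows the snmpd configuration. A minimal bash sketch that decodes such a suffix (`decode_oid_index` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# decode_oid_index SUFFIX
# SUFFIX is "<length>.<ascii>.<ascii>...", optionally followed by further
# sub-identifiers, which are ignored.
decode_oid_index() {
    local IFS=.
    local -a parts
    # Split the dotted suffix into an array of sub-identifiers.
    read -r -a parts <<< "$1"
    local len=${parts[0]}
    local out="" i
    for (( i = 1; i <= len; i++ )); do
        # Turn each decimal ASCII code into its character via an octal escape.
        out+=$(printf "\\$(printf '%03o' "${parts[i]}")")
    done
    printf '%s\n' "$out"
}

# The index portion of the OID used in this thread:
decode_oid_index "21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1"
# prints: check_array_detectors
```

The trailing `.1` is not part of the name; under the same assumption it would select the first line of the script's output.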

However, status information for that service is "(Service check timed out after 60.01 seconds)".

I have quite a few more check_snmp services running (>100), and this is the only one that gives me trouble. Obviously I'm doing something wrong, but I cannot put my finger on it.

Cheers,
hw
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: service check timed out after 60s

Post by rkennedy »

What is the command that XI is running for this - can you check under the CCM?

Does it work if you add -t 120 to it?
Former Nagios Employee
devnully
Posts: 10
Joined: Mon Jan 12, 2015 4:33 pm

Re: service check timed out after 60s

Post by devnully »

Hi,

CCM command (Testing check from command line...)
COMMAND: /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"

-t 100 means 100 seconds; however, I'll try -t 120.

Thanks,
hw
devnully
Posts: 10
Joined: Mon Jan 12, 2015 4:33 pm

Re: service check timed out after 60s

Post by devnully »

CCM:
Command view

$USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.$ARG2$ -r "$ARG3$" -t $ARG4$ -w1 -c1 -u "$ARG3$ detector running"

ARG1 = public
ARG2 = 1
ARG3 = WRA
ARG4 = 120
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: service check timed out after 60s

Post by Box293 »

I can see you are using the "Test Check Command" button, which can lead to the issues you are having. Due to some issues with how PHP escapes characters, the "Test Check Command" button does not always work in these situations, and its results should be ignored.

So for all further testing of this service you need to:

Make the changes to the service
Save the Service
Apply Configuration
Go back to the home screen and find the Service
When viewing the Service Status Details page click the Schedule a forced immediate check link

Just to reiterate: for all further testing of this service, DO NOT use the "Test Check Command" button; follow the steps above.

I see you are using double quotes. Are these defined in the command definition OR in the $ARGx$ fields?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
devnully
Posts: 10
Joined: Mon Jan 12, 2015 4:33 pm

Re: service check timed out after 60s

Post by devnully »

I have no problems with the [TEST CHECK COMMAND] button. I get the expected result.
However, following your advice, I still get the same issue. See attached picture.

Oh and yes I have increased the timeout from 100 to 120 seconds, but the result is the same.

Thanks,
hw
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: service check timed out after 60s

Post by Box293 »

devnully wrote:I have no problems with the [TEST CHECK COMMAND] button. I get the expected result.
I understand what you are saying, but unfortunately the check is not working outside of CCM. That means the way CCM tests things makes the check appear to work when it actually fails once the monitoring engine executes it.

Can you run these commands at the command line please:

su nagios

/usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"

time /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o .1.3.6.1.4.1.2021.53.4.1.2.21.99.104.101.99.107.95.97.114.114.97.121.95.100.101.116.101.99.116.111.114.115.1 -r "WRA" -t 100 -w1 -c1 -u "WRA detector running"
What output is generated? The command with time at the front of it will time how long it takes for the plugin to execute.
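The status message reports a timeout at 60.01 seconds even though the plugin was given -t 100 (later 120), which suggests the limit is enforced by the monitoring engine rather than by the plugin. In Nagios Core that limit is normally the service_check_timeout setting in nagios.cfg; treat that as an assumption about this install, since the thread never shows the config. A minimal bash sketch that compares a run against such a limit (`run_timed` is a hypothetical helper, not part of Nagios):

```shell
#!/usr/bin/env bash
# run_timed LIMIT CMD [ARGS...]
# Runs CMD, reports elapsed wall-clock seconds, and returns non-zero
# when the run took longer than LIMIT seconds.
run_timed() {
    local limit=$1
    shift
    local start=$SECONDS      # bash counter of whole seconds since shell start
    "$@"                      # run the plugin; its own exit code is ignored here
    local elapsed=$(( SECONDS - start ))
    echo "elapsed=${elapsed}s limit=${limit}s"
    (( elapsed <= limit ))
}

# Example, run as the nagios user (substitute the full OID from the command above):
# run_timed 60 /usr/local/nagios/libexec/check_snmp -H seis-nuc-dev -C public -o <OID> -r "WRA" -t 100 -w1 -c1
```

If the helper returns non-zero for a limit of 60, the plugin itself is slow when run as the nagios user, and raising -t alone would not be expected to help.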
devnully
Posts: 10
Joined: Mon Jan 12, 2015 4:33 pm

Re: service check timed out after 60s

Post by devnully »

Thanks for all your help.

:oops:

Nothing wrong with Nagios. (Everything wrong with my fault finding ...)
Issue is our network (DNS, multihomed host).
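The thread does not say how the DNS problem was tracked down. One way to check a multihomed host is to list every address its name resolves to and see whether some of them are unreachable from the Nagios server. A minimal sketch, assuming a glibc system where `getent ahosts` is available (`list_addresses` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# list_addresses HOST
# Prints each distinct address that HOST resolves to, one per line.
list_addresses() {
    # getent ahosts prints "address socktype [name]" rows; keep column 1 only.
    getent ahosts "$1" | awk '{ print $1 }' | sort -u
}

# Example with the host from this thread:
# list_addresses seis-nuc-dev
```

More than one address in the output means the resolver may hand back an interface that does not answer SNMP, which would be consistent with a check that works in one context and times out in another.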

Again, thanks for your help. I'm going to wash the egg off my face now.

hw
Box293
Too Basu
Posts: 5126
Joined: Sun Feb 07, 2010 10:55 pm
Location: Deniliquin, Australia

Re: service check timed out after 60s

Post by Box293 »

Excellent, I'm glad you discovered the problem now and not after 20 more posts of troubleshooting :lol: