Nagios Support Forum

Posted: **Tue Jun 18, 2013 11:06 am**

Today I saw that one of my services was giving an error with the message: "CHECK_NRPE: Socket timeout after 10 seconds."

I figured the service was down so I started checking. The service was up, so Nagios was making a mistake. So I went to the command line on the Nagios server (the one making the check_nrpe call, not the server being probed) and did this:

$ time /usr/lib/nagios/plugins/check_nrpe -H my.hostname -c check_my_nrpe_service

PING OK - Packet loss = 0%, RTA = 88.35 ms|rta=88.345001ms;100.000000;1000.000000;0.000000 pl=0%;10;10;0

real 0m4.129s
user 0m0.008s
sys 0m0.000s

(The "service" to be checked is basically running check_ping).

So the probed server responds in about 4 seconds when I run the check by hand, but the check_nrpe run by Nagios reports a 10-second timeout.
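When comparing a manual run against the scheduled check, it can help to capture the elapsed time, exit code, and first output line together. The following is a minimal Python sketch of that idea, not part of the original post. The `timed_run` helper is hypothetical, and a harmless stand-in command replaces the real check_nrpe call so the sketch runs anywhere.

```python
import subprocess
import sys
import time


def timed_run(argv, timeout_seconds):
    """Run argv and return (elapsed_seconds, exit_code, first_output_line).

    exit_code is None if the command did not finish within timeout_seconds.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return time.monotonic() - start, None, "timed out"
    elapsed = time.monotonic() - start
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return elapsed, result.returncode, first_line


# The real invocation from the post would be:
#   ["/usr/lib/nagios/plugins/check_nrpe", "-H", "my.hostname",
#    "-c", "check_my_nrpe_service"]
# Assumption: a stand-in command is used here instead.
elapsed, code, output = timed_run(
    [sys.executable, "-c", "print('PING OK')"], timeout_seconds=10
)
print(f"exit={code} elapsed={elapsed:.3f}s output={output}")
```

Nagios plugins signal state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so recording it alongside the timing shows whether a slow run is also a failing one.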

I have other services on this same server being checked via NRPE (e.g. system load, user load, disk space, etc.) and they all seem to work without a problem.

I searched around and the only promising lead was a stale cached IP address lookup (which *has* happened to me when configuring iptables and a host's IP address changed). However, I double-checked the hostname in the monitor's config file (it's correct), DNS resolves correctly, and I restarted Nagios entirely in case it was holding an incorrect cached DNS lookup. No change in behavior.

Any suggestions?

Posted: **Tue Jun 18, 2013 11:46 am**

Any reason why you are using check_nrpe to do a ping check? (Are you checking a separate network node?)

Posted: **Tue Jun 18, 2013 11:53 am**

Yes, I'm using check_ping from the remote host because I have to check to see whether a VPN tunnel is available from that host. I can't check it from anywhere else.

Posted: **Tue Jun 18, 2013 1:51 pm**

Try increasing the timeout, just in case the scheduler is a bit behind or the server is under load.
Was this check working at one point? Or has it been failing since deployment?

Posted: **Thu Jun 20, 2013 10:58 am**

It /was/ working for a while. I have tried increasing the timeout, but I may be doing it incorrectly:

define command {
    command_name check_nrpe_with_timeout
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t $ARG2$
}

define service {
    use local-service
    host_name hostname
    service_description VPN:[Client Name]
    check_command check_nrpe_with_timeout!check_VPN_[client_name]!30
}

I still get this error:
CHECK_NRPE: Socket timeout after 10 seconds.

I would have expected "socket timeout after 30 seconds" when specifying the timeout. (I definitely restarted Nagios after making those changes, and I have only one Nagios server running -- no intermediaries.)
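For reference, here is a minimal Python sketch of how Nagios would expand the command definition in this post. Nagios splits `check_command` on `!`: the first field is the command name, and the remaining fields become `$ARG1$`, `$ARG2$`, and so on. The `expand_command` helper is hypothetical, and the `$USER1$` and `$HOSTADDRESS$` values are assumptions borrowed from the command-line test earlier in the thread.

```python
def expand_command(check_command, command_lines, macros):
    """Expand a Nagios check_command string into a full command line."""
    # "name!arg1!arg2" -> command name plus positional arguments
    name, *args = check_command.split("!")
    line = command_lines[name]

    # Positional arguments become $ARG1$, $ARG2$, ...
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values[f"$ARG{index}$"] = arg

    for macro, value in values.items():
        line = line.replace(macro, value)
    return line


command_lines = {
    "check_nrpe_with_timeout":
        "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t $ARG2$",
}
macros = {
    "$USER1$": "/usr/lib/nagios/plugins",  # assumption: path from the CLI test
    "$HOSTADDRESS$": "my.hostname",        # assumption: placeholder host
}

expanded = expand_command(
    "check_nrpe_with_timeout!check_VPN_[client_name]!30",
    command_lines,
    macros,
)
print(expanded)
```

With these inputs the expansion ends in `-t 30`, so the definition looks right on paper. Ten seconds is also what check_nrpe uses when no `-t` option is passed, so a persistent "after 10 seconds" message may mean the running check is not going through this definition.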

Posted: **Thu Jun 20, 2013 1:03 pm**

You may have the directive:

command_timeout=10

OR

connection_timeout=10

declared in the remote host's nrpe.cfg

Posted: **Fri Jun 21, 2013 11:06 am**

This is all I have:

$ grep _timeout `find . -type f`
./conf.d/my_vpn_host.cfg: command_name check_nrpe_with_timeout
./conf.d/my_vpn_host.cfg: check_command check_nrpe_with_timeout!check_VPN_client_name!30
./nagios.cfg:service_check_timeout=60
./nagios.cfg:host_check_timeout=30
./nagios.cfg:event_handler_timeout=30
./nagios.cfg:notification_timeout=30
./nagios.cfg:ocsp_timeout=5
./nagios.cfg:perfdata_timeout=5

Any other suggestions?

Posted: **Fri Jun 21, 2013 11:35 am**

Whoops, I just realized that you might have meant the server being monitored -- seeing as how you suggested looking at nrpe.cfg.

I only have /etc/nagios/nrpe.cfg -- no other configuration files on the server.

$ grep _timeout nrpe.cfg
command_timeout=60
connection_timeout=300

So the 10-second timeout is still a mystery to me.
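The timeout values reported in the thread so far can be collected with a short Python sketch to confirm that none of them is 10. The `find_timeouts` helper is hypothetical, and the config text is copied from the grep output quoted in the posts rather than read from real files.

```python
import re

# Matches "some_timeout=NN" directives; comment lines are skipped separately.
TIMEOUT_RE = re.compile(r"^\s*(\w*_timeout)\s*=\s*(\d+)\s*$")


def find_timeouts(config_text):
    """Return {directive: seconds} for every *_timeout=N line."""
    found = {}
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = TIMEOUT_RE.match(line)
        if match:
            found[match.group(1)] = int(match.group(2))
    return found


# Values as reported in the thread (monitoring server).
nagios_cfg = """\
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
"""

# Values as reported in the thread (monitored host).
nrpe_cfg = """\
command_timeout=60
connection_timeout=300
"""

timeouts = {**find_timeouts(nagios_cfg), **find_timeouts(nrpe_cfg)}
ten_second = [name for name, value in timeouts.items() if value == 10]
print(timeouts)
print("directives set to 10 seconds:", ten_second)
```

No directive in either file is set to 10, which suggests the 10-second figure comes from somewhere other than these config files.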

Posted: **Fri Jun 21, 2013 12:36 pm**

Also check your nagios.cfg on the core server for:

service_check_timeout=10

Posted: **Mon Jun 24, 2013 11:06 am**

You can see above that it is already set to 60:

> ./nagios.cfg:service_check_timeout=60

check_nrpe works from CLI, fails from server with timeout
