check_nrpe socket timeout alerts

Posted: Wed Dec 29, 2010 1:59 pm
by dwalli
I'm hoping someone can help me dig into the errors I'm seeing from Nagios 3 (server running on Ubuntu 8.10) talking to an NRPE 2.12 client on RHEL 5.5.

The data I'm getting isn't terribly descriptive. Here's the email I get regarding an alert:
***** Nagios *****

Notification Type: PROBLEM

Service: PROC
Host: <hostname>
Address: <ipaddress>
State: CRITICAL

Date/Time: Wed Dec 29 01:35:29 EST 2010

Additional Info:

CHECK_NRPE: Socket timeout after 10 seconds.
The check_nrpe socket timeout would seem to indicate that the Nagios server can't reach the nrpe agent on the monitored client, but the monitoring seems to work just fine (the Service State information page shows active check type, when the last check was done, everything green, etc.).

The service definition from services.cfg is thus:

define service{
        host_name               <hostname>
        service_description     PROC
        check_command           check_nrpe!check_cpu!"-n -w 95 -c 100"
        use                     generic-service
        contact_groups          Unix Admins
        }
Note the -n argument in the check_command; I just added that today because a manual check was failing with the same error:
/etc/nagios3/conf.d# /usr/lib/nagios/plugins/check_nrpe -H <ipaddress> -c check_cpu
CHECK_NRPE: Socket timeout after 10 seconds.
When I added the -n argument, the check worked:
/etc/nagios3/conf.d# /usr/lib/nagios/plugins/check_nrpe -H <ipaddress> -c check_cpu -n
CPU Usage normal: CPU7: 8.08% CPU6: 7.86% CPU5: 8.13% CPU4: 15.16% CPU3: 7.17% CPU2: 7.51% CPU1: 7.40% CPU0: 35.51%
Any tips as to what I may have misconfigured such that the alert still reports this "socket timeout"?

Regards,
Don

Re: check_nrpe socket timeout alerts

Posted: Sat Feb 05, 2011 5:56 am
by sandyspatil
Change the command definition of check_nrpe and increase the timeout to 30 seconds.
For example: -t 30
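
A rough sketch of what that could look like, assuming the command lives in a commands.cfg-style file and that $USER1$ points at the plugin directory. The name check_nrpe_t30 is made up for illustration, and the -n is only carried over from the manual test in the first post; adjust both to match the existing definition.

```
# Hypothetical command definition: check_nrpe with a 30 second timeout.
# -t 30 raises the socket timeout from the default 10 seconds.
# -n disables SSL, matching the manual test that worked.
define command{
        command_name    check_nrpe_t30
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -n -t 30 -c $ARG1$
        }
```

Putting -n and -t in the command_line itself, rather than inside a quoted argument string in the service definition, should make sure check_nrpe sees them as its own options instead of passing them along to the remote command.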

Re: check_nrpe socket timeout alerts

Posted: Sat Feb 12, 2011 12:57 pm
by mguthrie
The command is still being blocked, I think. Check that SSL is installed on both machines. Also make sure the firewall allows traffic on port 5666 on both machines, and that "allowed_hosts" in the nrpe.cfg files is defined correctly.
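
For the allowed_hosts part, something along these lines in nrpe.cfg on the monitored client might be a starting point. This is only a sketch: <nagios-server-ip> is a placeholder for the Nagios server's address, and the file location varies by install.

```
# nrpe.cfg on the monitored client (sketch, not a complete file)
# Port the NRPE daemon listens on; 5666 is the usual default.
server_port=5666
# Comma-separated list of addresses allowed to talk to the daemon.
allowed_hosts=127.0.0.1,<nagios-server-ip>
```

Note that allowed_hosts only applies when NRPE runs as a standalone daemon; if it is started from inetd or xinetd, access is controlled there instead. NRPE also needs a restart after the file changes.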