CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf

Post by **inas.labib** » Mon Aug 15, 2016 2:54 am

Hi,

We received the Nagios notification "CHECK_NRPE: Socket timeout after 30 seconds." from our backup server.
The server received 50 NRPE checks at one point in time (Aug 10 04:58:13), and NRPE stopped responding, as per the "cps" value in /etc/xinetd.conf:

# cat /etc/xinetd.conf |grep -i cps
cps = 50 10

Our question: we have only 15 services configured for this server, so how could we have received 50 NRPE check requests?
The service configuration file and /var/log/messages are attached. Kindly assist.
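
One way to see where the connections came from is to count NRPE connection starts per second, and per source address, in /var/log/messages. The busiest second can then be compared with the first `cps` number (the connections-per-second limit; the second number is how many seconds xinetd keeps the service disabled once the limit is exceeded). A minimal Python sketch, assuming xinetd logs accepted connections as `START: nrpe ... from=<address>` lines (this depends on the `log_on_success` setting) and using made-up sample lines, not the attached log:

```python
from collections import Counter

# Illustrative lines only; the real entries are in the attached /var/log/messages.
SAMPLE_LOG = """\
Aug 10 04:58:13 backup xinetd[1201]: START: nrpe pid=4101 from=::ffff:192.0.2.10
Aug 10 04:58:13 backup xinetd[1201]: START: nrpe pid=4102 from=::ffff:192.0.2.10
Aug 10 04:58:13 backup xinetd[1201]: START: nrpe pid=4103 from=::ffff:192.0.2.11
Aug 10 04:58:14 backup xinetd[1201]: START: nrpe pid=4104 from=::ffff:192.0.2.10
"""


def count_nrpe_starts(log_text):
    """Count 'START: nrpe' lines per timestamp and per source address."""
    per_second = Counter()
    per_source = Counter()
    for line in log_text.splitlines():
        if "START: nrpe" not in line:
            continue
        # Classic syslog prefix is 15 characters, e.g. "Aug 10 04:58:13".
        per_second[line[:15]] += 1
        if "from=" in line:
            per_source[line.rsplit("from=", 1)[-1].strip()] += 1
    return per_second, per_source


per_second, per_source = count_nrpe_starts(SAMPLE_LOG)
for timestamp, count in per_second.most_common(3):
    print(timestamp, count)
for source, count in per_source.most_common(3):
    print(source, count)
```

If the per-source count shows more addresses than the one Nagios server, that would suggest another host is also connecting to NRPE on this client.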

Post by **bwallace** » Mon Aug 15, 2016 9:29 am

What are the following values set to for each of the 15 service checks on that server?

    check_interval 
    retry_interval 
    max_check_attempts 
    notification_interval

Also, what notification options are enabled?

Host - notification options

This directive is used to determine when notifications for the host should be sent out. Valid options are a combination of one or more of the following:
d = send notifications on a DOWN state,
u = send notifications on an UNREACHABLE state,
r = send notifications on recoveries (OK state),
f = send notifications when the host starts and stops flapping, and
s = send notifications when scheduled downtime starts and ends.

If you do not specify any notification options, Nagios will assume that you want notifications to be sent out for all possible states.
Example: If you specify d,r in this field, notifications will only be sent out when the host goes DOWN and when it recovers from a DOWN state.

Service - notification options

This directive is used to determine when notifications for the service should be sent out. Valid options are a combination of one or more of the following:

w = send notifications on a WARNING state,
u = send notifications on an UNKNOWN state,
c = send notifications on a CRITICAL state,
r = send notifications on recoveries (OK state),
f = send notifications when the service starts and stops flapping, and
s = send notifications when scheduled downtime starts and ends.

If you do not specify any notification options, Nagios will assume that you want notifications to be sent out for all possible states.
Example: If you specify w,r in this field, notifications will only be sent out when the service goes into a WARNING state and when it recovers from a WARNING state.

Post by **inas.labib** » Thu Aug 18, 2016 6:57 am

Please find the service check values below. Kindly help us find out why the backup_server receives multiple NRPE requests.
The service configuration file is attached for reference.

check_interval : 5
retry_interval : 1
max_check_attempts : 5
notification_interval : 1440

No notification options are specified.
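
For scale, the values above imply a very low average request rate compared with the `cps` limit. A rough worked calculation in Python, assuming `interval_length = 60` (the Nagios default; not confirmed in this thread) and one NRPE connection per service check:

```python
SERVICES = 15
CHECK_INTERVAL = 5       # time units between normal checks
RETRY_INTERVAL = 1       # time units between rechecks while a problem is unconfirmed
INTERVAL_LENGTH_S = 60   # assumption: seconds per time unit (Nagios default)
CPS_LIMIT = 50           # first number of "cps = 50 10"


def average_checks_per_second(services, interval_units, interval_length_s):
    """Average rate if every service is checked once per interval."""
    return services / (interval_units * interval_length_s)


normal_rate = average_checks_per_second(SERVICES, CHECK_INTERVAL, INTERVAL_LENGTH_S)
retry_rate = average_checks_per_second(SERVICES, RETRY_INTERVAL, INTERVAL_LENGTH_S)
print(f"normal: {normal_rate:.2f}/s, all retrying: {retry_rate:.2f}/s, cps limit: {CPS_LIMIT}/s")
```

Under these assumptions the average is about 0.05 checks per second, or 0.25 per second if all 15 services are retrying. These are averages only; they say nothing about short bursts when several checks are scheduled in the same second.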

Post by **lmiltchev** » Thu Aug 18, 2016 9:41 am

Open the "/etc/xinetd.d/nrpe" file on the client in a text editor, and add the following lines inside the service block, just before the closing "}":

    per_source = UNLIMITED
    instances = UNLIMITED

Restart xinetd:

    service xinetd restart

Let us know if this solved the issue.
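
For anyone scripting the edit described above, here is a minimal Python sketch that inserts the two lines before the closing brace of a service block. The stanza text is an illustrative sample, not the actual file from this client, and `add_attributes` is a hypothetical helper name:

```python
# Illustrative stanza; the real /etc/xinetd.d/nrpe on the client may differ.
SAMPLE_STANZA = """\
service nrpe
{
    socket_type = stream
    wait        = no
    user        = nagios
    disable     = no
}
"""


def add_attributes(stanza, attributes):
    """Insert 'name = value' lines just before the last closing brace."""
    lines = stanza.splitlines()
    brace_positions = [i for i, line in enumerate(lines) if line.strip() == "}"]
    if not brace_positions:
        raise ValueError("no closing brace found in stanza")
    close_index = brace_positions[-1]
    new_lines = [f"    {name} = {value}" for name, value in attributes.items()]
    return "\n".join(lines[:close_index] + new_lines + lines[close_index:]) + "\n"


updated = add_attributes(
    SAMPLE_STANZA, {"per_source": "UNLIMITED", "instances": "UNLIMITED"}
)
print(updated)
```

The sketch only transforms text. Writing the result back to the file and restarting xinetd are still separate steps.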

Post by **inas.labib** » Thu Aug 25, 2016 9:05 am

Hi,

Thanks for the solution. We have only 15 services configured for monitoring, and we need to know why the client receives more than 50 NRPE requests from Nagios. Kindly assist.

Post by **lmiltchev** » Thu Aug 25, 2016 11:09 am

What is the NRPE version that you are running on the server, and on the client? I just spoke to one of our developers about this issue, and I was told (quote):

If check_nrpe is version 3.0 and it can't connect to the host ("Could not complete SSL handshake" or kicked out by xinetd), it can send a 2.x packet as a fall-back. That could double the number of requests.

15*2 is not 50..., but this is probably something you can look at. Are you running NRPE as a "standalone" daemon or under xinetd? Does restarting the agent/daemon fix the issue?

Post by **inas.labib** » Mon Aug 29, 2016 3:49 am

Hi,

Thank you for your response.

The NRPE version on both the client and the server is 2.15.

We would like an update on the root cause of this issue, so that we can avoid such a situation in future without compromising on monitoring.

Please let me know if you need any further information from our side.

Regards

Post by **ssax** » Mon Aug 29, 2016 12:43 pm

The root cause is that per_source is set to 10 by default, so if all of the checks ran at once, you would receive that message. You can review the defaults in your /etc/xinetd.conf.
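
To illustrate the per_source limit, here is a simplified Python model (not xinetd's actual logic) in which all connections are open at the same moment and all come from a single Nagios server. The address is hypothetical, and `None` stands in for UNLIMITED:

```python
def admit_connections(sources, per_source_limit):
    """Accept connections until a source reaches its concurrent limit."""
    open_by_source = {}
    accepted, refused = [], []
    for source in sources:
        current = open_by_source.get(source, 0)
        if per_source_limit is not None and current >= per_source_limit:
            refused.append(source)
        else:
            open_by_source[source] = current + 1
            accepted.append(source)
    return accepted, refused


NAGIOS_SERVER = "192.0.2.10"   # hypothetical address
burst = [NAGIOS_SERVER] * 15   # all 15 service checks arriving together

accepted, refused = admit_connections(burst, per_source_limit=10)
print(len(accepted), len(refused))  # prints "10 5"

accepted_unlimited, refused_unlimited = admit_connections(burst, per_source_limit=None)
print(len(accepted_unlimited), len(refused_unlimited))  # prints "15 0"
```

In this model, 5 of the 15 simultaneous checks are turned away with the default limit, and none are once the limit is lifted.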

Let us know if you have any questions.

Thank you

Nagios Support Forum
