Hi,
We received the Nagios notification "CHECK_NRPE: Socket timeout after 30 seconds." from our backup server.
The server received 50 NRPE check requests at one point in time (Aug 10 04:58:13), and NRPE stopped responding, in line with the "cps" value in /etc/xinetd.conf:
# cat /etc/xinetd.conf |grep -i cps
cps = 50 10
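For readers unfamiliar with the directive: xinetd's cps takes two numbers, the maximum rate of incoming connections per second and the number of seconds the service stays disabled once that rate is exceeded. A minimal sketch that spells out a cps line read from standard input; the function name describe_cps is made up for illustration:

```shell
# Hypothetical helper: explain the "cps" line of an xinetd config read from stdin.
describe_cps() {
    awk -F= '/^[ \t]*cps[ \t]*=/ {
        # $2 holds the text after "=", e.g. " 50 10"
        split($2, v, " ")
        printf "up to %s connections/second; service disabled for %s seconds when exceeded\n", v[1], v[2]
    }'
}

# Usage (path taken from the post above):
#   describe_cps < /etc/xinetd.conf
```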
Our question is: we have only 15 services configured for this server, so how could we have received 50 NRPE check requests?
The service configuration file and /var/log/messages are attached. Kindly assist.
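One way to see where the 50 requests came from is to count xinetd's connection log entries per second and per source address. The sketch below assumes xinetd writes lines of the form "START: nrpe pid=... from=..." to syslog, which depends on its log_on_success setting; the attached log is not visible here, so that line format and the function name are assumptions.

```shell
# Hypothetical helper: count "START: nrpe" log lines per second and per source.
# Assumes syslog-style lines that begin with "Mon DD HH:MM:SS" and contain
# a "from=<address>" field; adjust the pattern if your log looks different.
nrpe_starts_per_second() {
    awk '/START: nrpe/ {
        from = "unknown"
        for (i = 1; i <= NF; i++) if ($i ~ /^from=/) from = substr($i, 6)
        count[$1 " " $2 " " $3 " " from]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}

# Usage: busiest seconds first
#   nrpe_starts_per_second < /var/log/messages | head
```

If more than one source address shows up, or the count per second is well above the number of configured services, the extra requests are probably not coming from the 15 scheduled checks alone.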
CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
-
inas.labib
- Posts: 170
- Joined: Tue Sep 11, 2012 3:48 am
CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
What are the following values set to for each of the 15 service checks on that server?
Code:
check_interval
retry_interval
max_check_attempts
notification_interval
Also, what notification options are enabled?
Host - notification options
This directive is used to determine when notifications for the host should be sent out. Valid options are a combination of one or more of the following:
d = send notifications on a DOWN state,
u = send notifications on an UNREACHABLE state,
r = send notifications on recoveries (OK state),
f = send notifications when the host starts and stops flapping, and
s = send notifications when scheduled downtime starts and ends.
If you do not specify any notification options, Nagios will assume that you want notifications to be sent out for all possible states.
*Example: If you specify d,r in this field, notifications will only be sent out when the host goes DOWN and when it recovers from a DOWN state.
Service - notification options
This directive is used to determine when notifications for the service should be sent out. Valid options are a combination of one or more of the following:
w = send notifications on a WARNING state,
u = send notifications on an UNKNOWN state,
c = send notifications on a CRITICAL state,
r = send notifications on recoveries (OK state),
f = send notifications when the service starts and stops flapping, and
s = send notifications when scheduled downtime starts and ends.
If you do not specify any notification options, Nagios will assume that you want notifications to be sent out for all possible states.
*Example: If you specify w,r in this field, notifications will only be sent out when the service goes into a WARNING state and when it recovers from a WARNING state.
Be sure to check out the Knowledgebase for helpful articles and solutions!
-
inas.labib
- Posts: 170
- Joined: Tue Sep 11, 2012 3:48 am
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
Please find the service check values below. Kindly help us find out why the backup_server receives multiple NRPE requests.
The service configuration file is attached for reference.
check_interval : 5
retry_interval : 1
max_check_attempts : 5
notification_interval : 1440
No notification options are specified.
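As a rough sanity check on these values: assuming the Nagios default interval_length of 60 seconds (so the intervals above are in minutes), 15 services checked every 5 minutes average far fewer than 50 requests per second, even if every service were retrying at the 1-minute retry_interval. A small sketch of that arithmetic; checks_per_second is an illustrative name, not a Nagios tool:

```shell
# Hypothetical helper: average checks per second for N services that are
# each checked once every M minutes (assumes interval_length = 60).
checks_per_second() {
    awk -v services="$1" -v minutes="$2" \
        'BEGIN { printf "%.2f\n", services / (minutes * 60) }'
}

# Usage:
#   checks_per_second 15 5   # normal check_interval
#   checks_per_second 15 1   # every service on retry_interval
```

Both averages are well below one request per second, so a burst of 50 in a single second points to something other than the regular schedule.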
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
Open the "/etc/xinetd.d/nrpe" file on the client in a text editor, and add the following lines inside the service block, before the closing "}":
Code:
per_source = UNLIMITED
instances = UNLIMITED
Restart xinetd:
Code:
service xinetd restart
Let us know if this solved the issue.
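To confirm which limits are actually in effect after the change, it may help to list the relevant lines from both files. A minimal sketch; the function name xinetd_limits is invented for this example, and the paths are the ones mentioned in this thread:

```shell
# Hypothetical helper: print the cps, instances and per_source lines
# from the xinetd files given as arguments.
xinetd_limits() {
    grep -E '^[[:space:]]*(cps|instances|per_source)[[:space:]]*=' "$@"
}

# Usage:
#   xinetd_limits /etc/xinetd.conf /etc/xinetd.d/nrpe
```

Settings in the service file override the defaults block of /etc/xinetd.conf, so a value missing from /etc/xinetd.d/nrpe falls back to whatever the defaults specify.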
-
inas.labib
- Posts: 170
- Joined: Tue Sep 11, 2012 3:48 am
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
Hi,
Thanks for the solution. We have only 15 services configured for monitoring, and we need to know why the client receives more than 50 NRPE requests from Nagios. Kindly assist.
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
What NRPE version are you running on the server and on the client? I just spoke to one of our developers about this issue, and I was told (quote):
If check_nrpe is version 3.0 and it can't connect to the host ("Could not complete SSL handshake" or kicked out by xinetd), it can send a 2.x packet as a fall-back. That could double the number of requests.
15*2 is not 50, but this is probably something you can look at. Are you running NRPE as a "standalone" daemon or under xinetd? Does restarting the agent/daemon fix the issue?
-
inas.labib
- Posts: 170
- Joined: Tue Sep 11, 2012 3:48 am
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
Hi,
Thank you for your response.
The NRPE version on both the client and the server is 2.15.
We would like to know the root cause of this issue so that we can avoid such a situation in future without compromising on monitoring.
Please let me know if you need any further information from our side.
Regards
Re: CHECK_NRPE: Socket timeout - CPS @ /etc/xinetd.conf
The root cause is that per_source is set to 10 by default, so if all 15 checks ran at once, the connections from the Nagios server would exceed that limit and you would receive that message. You can review the defaults in your /etc/xinetd.conf.
Let us know if you have any questions.
Thank you
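To illustrate the explanation above: per_source caps the number of simultaneous connections xinetd accepts from one source address, so if all 15 checks from the Nagios server overlap, the ones beyond the limit are turned away. A minimal sketch of that arithmetic, assuming every check arrives at the same moment; the function name is illustrative:

```shell
# Hypothetical helper: how many of N simultaneous connections from one
# source exceed a per_source limit (a number or UNLIMITED).
refused_by_per_source() {
    concurrent=$1
    limit=$2
    if [ "$limit" = "UNLIMITED" ] || [ "$concurrent" -le "$limit" ]; then
        echo 0
    else
        echo $(( concurrent - limit ))
    fi
}

# Usage:
#   refused_by_per_source 15 10          # default limit
#   refused_by_per_source 15 UNLIMITED   # after the suggested change
```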