Hi,
I am trying to disable warning/criticals (essentially the email alerts) that are sent when the following events happen:
- random 'CHECK_NRPE: Error - Could not complete SSL handshake.' on servers that most definitely work. Recovery happens a few moments later. Very sporadic and random (and annoying).
- sporadic 'CHECK_NRPE: Socket timeout after 30 seconds.' Also recovers quickly.
Are there ways to do the following:
- make these queries NOT be warn/criticals?
or
- disable alerts for such events
or
- disable whatever possible to stop these checks from failing in this manner
Thank you.
Cleaning out False Positive Alerts (ssl handshake/timeouts)
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
There are a couple ways this could be done:
- You should be able to raise the max_check_attempts, which will help to counteract the false positive.
- Increase your notification_interval to a longer amount of time so that it has time to resolve itself, thus preventing false positives.
Will either of those solutions work for you?
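To get a feel for how raising max_check_attempts absorbs a short blip: Nagios only sends the first notification once a service reaches a HARD state, which takes max_check_attempts consecutive failed checks, with the re-checks spaced retry_interval apart. Below is a rough sketch of that arithmetic, assuming the default interval_length of 60 seconds (so intervals are in minutes). The function name and the sample values are made up for illustration.

```shell
#!/bin/sh
# Rough estimate of how long a failing service stays in a SOFT state
# (no notification yet) before Nagios declares a HARD problem.
# Assumes interval_length=60, i.e. retry_interval is given in minutes.
soft_window_minutes() {
    max_check_attempts=$1
    retry_interval=$2
    # The first failed check counts as attempt 1; each remaining attempt
    # is spaced retry_interval apart.
    echo $(( (max_check_attempts - 1) * retry_interval ))
}

# Hypothetical values: 5 attempts, re-check every minute.
echo "soft window: $(soft_window_minutes 5 1) minute(s)"
```

A glitch that recovers inside that window should never reach the HARD state, so no email goes out for it.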
Former Nagios Employee
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
- Contact:
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
and1100 wrote: make these queries NOT be warn/criticals?
NRPE v3 has the ability to do this (check_nrpe):
NEW TIMEOUT SYNTAX
-t <interval>:<state>
<interval> = Number of seconds before connection times out (default=10)
<state> = Check state to exit with in the event of a timeout (default=CRITICAL)
Timeout state must be a valid state name (case-insensitive) or integer:
(OK, WARNING, CRITICAL, UNKNOWN) or integer (0-3)
/usr/local/nagios/libexec/check_nrpe -H centos18 -t 2:3
CHECK_NRPE STATE UNKNOWN: Socket timeout after 2 seconds.
echo $?
3
https://support.nagios.com/kb/article.php?id=520
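To make the <interval>:<state> format concrete, here is a small sketch that splits such a value and maps the state to a plugin exit code, following the syntax description above. It is an illustration only and not NRPE's own parsing code; the function name parse_timeout is made up.

```shell
#!/bin/sh
# Illustration only: mimics the documented "-t <interval>:<state>" format.
parse_timeout() {
    spec=$1
    interval=${spec%%:*}            # text before the first colon
    case $spec in
        *:*) state=${spec#*:} ;;    # text after the first colon
        *)   state=CRITICAL ;;      # documented default when no state is given
    esac
    # Map a state name (any case) or an integer to the plugin exit code.
    case $(printf '%s' "$state" | tr '[:lower:]' '[:upper:]') in
        0|OK)       code=0 ;;
        1|WARNING)  code=1 ;;
        2|CRITICAL) code=2 ;;
        3|UNKNOWN)  code=3 ;;
        *) echo "invalid state: $state" >&2; return 1 ;;
    esac
    echo "$interval $code"
}

parse_timeout 2:3        # 2 seconds, UNKNOWN  -> prints "2 3"
parse_timeout 30:ok      # 30 seconds, OK      -> prints "30 0"
parse_timeout 10         # 10 seconds, default -> prints "10 2"
```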
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
rkennedy wrote:
There are a couple ways this could be done:
- You should be able to raise the max_check_attempts, which will help to counteract the false positive.
- Increase your notification_interval to a longer amount of time so that it has time to resolve itself, thus preventing false positives.
Will either of those solutions work for you?

Hi, I think this is doable. I am upping the max_check_attempts on each individual service, correct?
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
Box293 wrote:
and1100 wrote: make these queries NOT be warn/criticals?
NRPE v3 has the ability to do this (check_nrpe):
NEW TIMEOUT SYNTAX
-t <interval>:<state>
<interval> = Number of seconds before connection times out (default=10)
<state> = Check state to exit with in the event of a timeout (default=CRITICAL)
Timeout state must be a valid state name (case-insensitive) or integer:
(OK, WARNING, CRITICAL, UNKNOWN) or integer (0-3)
https://support.nagios.com/kb/article.php?id=515
/usr/local/nagios/libexec/check_nrpe -H centos18 -t 2:3
CHECK_NRPE STATE UNKNOWN: Socket timeout after 2 seconds.
echo $?
3
https://support.nagios.com/kb/article.php?id=520

Hi, this sounds great. However, wouldn't this be a change I would have to make on each client? I guess my goal is to be able to make the changes on the Nagios server itself. I should have specified that. Is there a way to disable SSL handshaking on the Nagios server as a whole, rather than for nrpe checks individually?
Thank you.
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
Nope, you would just need to adjust your check_nrpe command on the Nagios server to use the -t parameter, as in the -t 2:3 example. In your case, it would probably be -t 30:0 (time out after 30 seconds, return OK on a timeout).
Keep in mind this is a feature of NRPE v3. SSL needs to be configured on the client side, as that's where it's specified what needs to be used to talk to the client.
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
Hi Guys,
I am back now and giving this a shot. Recently, we had this scenario happen:
A host randomly threw out this email alert:
Hyperlink: TBD;
Additional Info:
CHECK_NRPE: Error - Could not complete SSL handshake.
The max_check_attempts for the HOST seems to be pretty high already (it's 10).
max_check_attempts 10
The notification interval seems to be standard:
notification_interval 30
Here's an example of the service that threw out this alert:
define service{
    use                     generic-service
    host_name               ahost
    service_description     Process: sssd
    check_command           check_nrpe_linux!check_sssd
}
I'm not seeing where to change check_nrpe so that it uses the parameters you mentioned. Just to confirm, do I need to recompile NRPE to v3 on the Nagios server? If so, I'm doing that right now.
Thanks for all of your help.
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
It really depends on how your check_nrpe_linux command is defined, but I imagine you could hard-code the option there so that it applies to every service check. Keep in mind, though, that this is only applicable to NRPE v3, nothing prior. What you'd want to pass is probably -t 60:0.
Yes, you need to upgrade to NRPE v3 for this.
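For reference, a command definition with the timeout hard-coded might look something like what the sketch below prints. The actual check_nrpe_linux definition was not posted, so the command_line layout here is a guess based on the common check_nrpe pattern. $USER1$, $HOSTADDRESS$ and $ARG1$ are standard Nagios macros, and the helper name render_nrpe_command is made up.

```shell
#!/bin/sh
# Prints a Nagios command definition with -t <timeout>:<state> baked in.
# The command_line layout is an assumption, not the poster's real config.
render_nrpe_command() {
    name=$1
    timeout=$2
    state=$3
    cat <<EOF
define command{
    command_name    ${name}
    command_line    \$USER1\$/check_nrpe -H \$HOSTADDRESS\$ -t ${timeout}:${state} -c \$ARG1\$
}
EOF
}

render_nrpe_command check_nrpe_linux 60 0
```

With the option in the command definition, the service definition shown earlier would not need to change, since it already calls check_nrpe_linux!check_sssd. Note that mapping timeouts to OK (0) also hides an agent that is genuinely down, so UNKNOWN (3) may be the safer state.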
Re: Cleaning out False Positive Alerts (ssl handshake/timeou
and1100 wrote: I need to recompile nrpe to v3 on the Nagios server? If so, I'm performing that right now.
Just check_nrpe, which is outlined at the bottom of this KB article:
https://support.nagios.com/kb/article.php?id=515