Hi,
We've recently been getting a lot of Nagios alerts (mainly flapping), and they seem to be check_nrpe related. The alerts report either connection refused or a timeout, followed eventually by a service recovery. We changed the timeout setting on the check_nrpe command in Nagios from 30 seconds to 60 seconds, and that may have done the trick, as we haven't received those alerts since.
Just wanted some other opinions on whether this was the right course of action, or whether there are any other suggestions.
Thanks
Getting Too Many Nagios Alerts
Re: Getting Too Many Nagios Alerts
Hi
Here's a pretty in-depth explanation around flapping detection:
https://assets.nagios.com/downloads/nag ... pping.html
As you have seen, there is a trade-off between missing real alerts and getting false ones.
My advice is to read the document above and then make changes incrementally until you are comfortable with the frequency of alerts you are getting.
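To make the trade-off a bit more concrete, here is a rough Python sketch of the idea behind flap detection: look at a history of recent check results, work out what percentage of them were state changes, and compare that against a high and a low threshold. This is only an illustration. The function names and the threshold values in the usage note are made up for the example, and the calculation is unweighted, whereas Nagios itself weights newer state changes more heavily than older ones.

```python
from typing import Sequence


def percent_state_change(states: Sequence[str]) -> float:
    """Return the unweighted percentage of transitions in a check history.

    Simplification: Nagios gives newer state changes more weight than
    older ones; this sketch counts every change equally.
    """
    if len(states) < 2:
        return 0.0
    # Count positions where a result differs from the one before it.
    changes = sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)
    return 100.0 * changes / (len(states) - 1)


def update_flapping(is_flapping: bool, pct: float, low: float, high: float) -> bool:
    """Apply two-threshold hysteresis (hypothetical helper, not a Nagios API).

    Flapping starts once pct rises above `high` and only stops once pct
    drops below `low`; anywhere in between, the previous state is kept.
    """
    if not is_flapping and pct > high:
        return True
    if is_flapping and pct < low:
        return False
    return is_flapping
```

For example, with assumed thresholds of low=5.0 and high=20.0, a history of OK, CRITICAL, OK, OK, OK has two changes out of four possible, which is enough to start flapping. Raising the high threshold makes flapping harder to trigger, and that is the kind of change worth making one small step at a time.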
Thanks