Do not notify when an alarm is generated for a short period

Post by **sanagios** » Thu Apr 09, 2020 1:28 pm

lmiltchev wrote: In your first post you said:
"This usually only occurs for a few minutes, and soon afterwards, the loads decrease."
You could try increasing the max_check_attempts value from 5 to something higher, e.g. 10. This way, Nagios will keep retrying the service a bit longer before determining the state. Hopefully, this will give the service enough time to recover.

Service - max check attempts

This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.

Parameter name: max_check_attempts
Required: yes
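As a rough illustration of why raising max_check_attempts helps with short spikes, the sketch below estimates how long a service stays in a soft (retrying) state before Nagios settles on a hard state. It assumes retries are spaced by retry_interval and that interval_length is the default 60 seconds. The function name and the retry_interval value of 1 are assumptions for illustration and do not come from this thread.

```python
def minutes_until_hard_state(max_check_attempts, retry_interval=1, interval_length=60):
    """Estimate the minutes between the first non-OK result and the hard state.

    The first non-OK result counts as attempt 1, so only
    (max_check_attempts - 1) retries remain, each spaced by retry_interval.
    """
    retries = max(max_check_attempts - 1, 0)
    return retries * retry_interval * interval_length / 60


print(minutes_until_hard_state(5))   # 4.0
print(minutes_until_hard_state(10))  # 9.0
```

Under these assumptions, going from 5 to 10 attempts roughly doubles the window in which the load can drop back before an alert is raised.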

As I understand from the previous comments, the "First Notification Delay" already meets my need. However, I still have some doubts about the example I attached, and I would like to have them clarified.

If my understanding is incorrect, please tell me.

Post by **scottwilkerson** » Thu Apr 09, 2020 3:57 pm

sanagios wrote: As I understand from the previous comments, the "First Notification Delay" already meets my need. However, I still have some doubts about the example I attached, and I would like to have them clarified.

The First Notification Delay only affects the first notification that would be sent out after the host/service has gone into a non-OK state.

To get the results you want, you may need to set the "Notification Interval" to an extremely large number, such as 999999.
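To make the interaction between the two settings easier to picture, here is a deliberately simplified model, not how Nagios schedules notifications internally. It ignores soft states and check timing, and assumes the first notification fires exactly when first_notification_delay elapses, with repeats every notification_interval after that. The function name and the sample values are hypothetical.

```python
def notification_times(problem_minutes, first_notification_delay, notification_interval):
    """Return the minutes (since the problem began) at which this
    simplified model would send a problem notification."""
    if problem_minutes <= first_notification_delay:
        # The service recovered before the delay elapsed: nothing is sent.
        return []
    times = [first_notification_delay]
    if notification_interval > 0:
        # Re-notify at a fixed interval while the problem persists.
        t = first_notification_delay + notification_interval
        while t < problem_minutes:
            times.append(t)
            t += notification_interval
    return times


print(notification_times(3, 10, 999999))   # []
print(notification_times(45, 10, 999999))  # [10]
print(notification_times(45, 10, 15))      # [10, 25, 40]
```

In this model a 3-minute spike never notifies, and a very large interval leaves only the first notification for a longer outage.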

Post by **sanagios** » Fri May 29, 2020 2:38 pm

If the Notification Interval is set to 0, Nagios will alert when a service changes from OK to Warning. But when the service then changes from Warning to Critical, or from Critical to Warning, will it alert once for every change between these states?

Post by **scottwilkerson** » Fri May 29, 2020 4:46 pm

From the Help in XI for notification_interval:

Host - notification interval

This directive is used to define the number of "time units" to wait before re-notifying a contact that this service is still down or unreachable. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. If you set this value to 0, Nagios will not re-notify contacts about problems for this host - only one problem notification will be sent out.

So, once the service goes into a non-OK state, only 1 notification should go out. This would exclude flapping notifications.
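For a sense of scale, the "time units" in the help text can be converted to wall-clock time. This is a small sketch that assumes interval_length is left at its default of 60 seconds; the helper name is made up for illustration.

```python
def time_units_to_seconds(units, interval_length=60):
    """Convert Nagios "time units" to seconds for a given interval_length."""
    return units * interval_length


seconds = time_units_to_seconds(999999)
print(seconds)                    # 59999940
print(round(seconds / 86400, 1))  # 694.4 (days)
```

So with the default interval_length, a notification interval of 999999 means a re-notification would be due only after roughly 694 days, which in practice behaves much like never re-notifying.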

Nagios Support Forum
