Do not notify when an alarm is generated for a short period
Posted: Thu Apr 02, 2020 2:02 pm
by sanagios
Hello team,
I would like your support regarding a question.
We monitor many servers in our infrastructure, and we noticed that some of them show high RAM and CPU usage at different times. This usually lasts only a few minutes, after which the load decreases. Because the spikes occur at different times, I cannot use a time-based notification exception. I would like to avoid being notified when a service goes into WARNING or CRITICAL and returns to its normal state a few minutes later.
From what I read here, "https://assets.nagios.com/downloads/nag ... .html#host", it was not very clear which directive would best suit this situation.
Can you support us?
Re: Do not notify when an alarm is generated for a short per
Posted: Thu Apr 02, 2020 4:05 pm
by scottwilkerson
If you go to
Configure -> CCM -> Services -> Edit Service -> Alert Settings Tab
You can set a value in the "First notification delay" field.
When this is set, the service will only send a notification if the service remains in a non-OK state longer than the "First notification delay" in minutes
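As a rough illustration of the behavior described above (not Nagios's actual code), here is a small Python sketch. The function name and the check timeline are made up for the example, and the model is deliberately simplified: it ignores soft/hard states and retries, and only tracks how long the service has been continuously non-OK.

```python
from typing import Optional, Sequence, Tuple


def first_notification_minute(
    checks: Sequence[Tuple[int, str]],
    first_notification_delay: int,
) -> Optional[int]:
    """Return the minute of the first problem notification, or None.

    checks is a list of (minute, state) results in time order.
    Simplified assumption: the delay is measured from the first
    non-OK result, and any OK result resets the timer.
    """
    problem_start: Optional[int] = None
    for minute, state in checks:
        if state == "OK":
            # Recovery before the delay expired: nothing is sent.
            problem_start = None
            continue
        if problem_start is None:
            problem_start = minute
        if minute - problem_start >= first_notification_delay:
            return minute
    return None


# A short spike that recovers after 5 minutes: no notification.
spike = [(0, "OK"), (5, "CRITICAL"), (10, "OK")]
print(first_notification_minute(spike, 20))

# A problem that stays non-OK for 20 minutes or more: notification is sent.
sustained = [(0, "OK"), (5, "WARNING"), (15, "CRITICAL"), (25, "CRITICAL")]
print(first_notification_minute(sustained, 20))
```

With a 20-minute delay, the short spike never notifies, while the sustained problem notifies at the first check that is at least 20 minutes after the problem began.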
Re: Do not notify when an alarm is generated for a short per
Posted: Thu Apr 02, 2020 5:22 pm
by sanagios
And how will this setting behave in the situations below?
- If there are several state changes that last longer than the time set in "First notification delay", is the delay applied again to each state change? That is, will a notification be sent only if a state lasts longer than the time defined in "First notification delay"?
- What if the state changes from WARNING to CRITICAL and/or from CRITICAL to WARNING?
I ask because I set "First notification delay" to 20 minutes, and under Reports > Available Reports > Notifications I saw the following:
screenshot.407.jpg
screenshot.406.jpg
screenshot.405.jpg
Re: Do not notify when an alarm is generated for a short per
Posted: Fri Apr 03, 2020 7:21 am
by scottwilkerson
Are these all the same service?
Was this set to 20 minutes back on 3/31 before these were sent?
Re: Do not notify when an alarm is generated for a short per
Posted: Fri Apr 03, 2020 8:31 am
by sanagios
scottwilkerson wrote: Are these all the same service?
Was this set to 20 minutes back on 3/31 before these were sent?
Yes, they are the same service.
The 20-minute value was set before 3/31. But I noticed that these notifications were still sent, so I wanted to understand this better with your help.
Re: Do not notify when an alarm is generated for a short per
Posted: Fri Apr 03, 2020 8:55 am
by scottwilkerson
Looking at your screenshots again, I noticed that it is never recovering, and the "first notification delay" is only going to affect the first non-OK notification.
Can you show a screenshot of the Check Settings tab? Is "Is Volatile" enabled?
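To make the point about "only the first non-OK notification" concrete, here is a hedged Python sketch. It is a simplified model under stated assumptions, not Nagios's implementation: the function name and timeline are hypothetical, recovery notifications and re-notification intervals are ignored, and soft/hard states are not modeled. It assumes that once the first problem notification has gone out, a later change between WARNING and CRITICAL notifies right away as long as the service has not recovered to OK.

```python
from typing import List, Optional, Sequence, Tuple


def problem_notifications(
    checks: Sequence[Tuple[int, str]],
    first_notification_delay: int,
) -> List[Tuple[int, str]]:
    """List (minute, state) problem notifications for a check timeline."""
    sent: List[Tuple[int, str]] = []
    problem_start: Optional[int] = None
    last_notified_state: Optional[str] = None
    for minute, state in checks:
        if state == "OK":
            # A recovery resets everything, so the delay applies again.
            problem_start = None
            last_notified_state = None
            continue
        if problem_start is None:
            problem_start = minute
        if last_notified_state is None:
            # First notification of this problem: the delay applies.
            if minute - problem_start >= first_notification_delay:
                sent.append((minute, state))
                last_notified_state = state
        elif state != last_notified_state:
            # Later WARNING <-> CRITICAL change: no delay in this model.
            sent.append((minute, state))
            last_notified_state = state
    return sent


timeline = [
    (0, "OK"), (5, "WARNING"), (25, "WARNING"),
    (30, "CRITICAL"), (35, "WARNING"), (40, "WARNING"),
]
print(problem_notifications(timeline, 20))
```

In this model the first notification waits the full 20 minutes, but the later flips between WARNING and CRITICAL each produce a notification right away, which is similar to the pattern reported from the Notifications report.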
Re: Do not notify when an alarm is generated for a short per
Posted: Tue Apr 07, 2020 7:47 am
by sanagios
scottwilkerson wrote: Looking at your screenshots again, I noticed that it is never recovering, and the "first notification delay" is only going to affect the first non-OK notification.
Exactly, it did not recover to OK; it varied between WARNING and CRITICAL. That is why I asked the questions below:
- If there are several state changes that last longer than the time set in "First notification delay", is the delay applied again to each state change? That is, will a notification be sent only if a state lasts longer than the time defined in "First notification delay"?
- What if the state changes from WARNING to CRITICAL and/or from CRITICAL to WARNING?
scottwilkerson wrote: Can you show a screenshot of the Check Settings tab? Is "Is Volatile" enabled?
Following is the requested screenshot:
screenshot.423.jpg
Re: Do not notify when an alarm is generated for a short per
Posted: Tue Apr 07, 2020 4:54 pm
by scottwilkerson
What version of Nagios XI are you running?
Re: Do not notify when an alarm is generated for a short per
Posted: Wed Apr 08, 2020 11:33 am
by sanagios
scottwilkerson wrote: What version of Nagios XI are you running?
Nagios XI 5.6.8
Re: Do not notify when an alarm is generated for a short per
Posted: Wed Apr 08, 2020 4:09 pm
by lmiltchev
In your first post you said:
This usually only occurs for a few minutes, and soon afterwards, the loads decrease.
You could try increasing the max_check_attempts value from 5 to something higher, e.g. 10. This way, Nagios will keep retrying the service check a bit longer before determining the state. Hopefully, this will give the service enough time to recover.
Service - max check attempts
This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
Parameter name: max_check_attempts
Required: yes
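To get a feel for how much extra time a higher max_check_attempts buys, here is a small Python sketch of the arithmetic. It assumes the retries are spaced by the service's retry_interval directive, and the 1-minute value used below is only an assumption for illustration, since the actual retry interval was not posted in this thread. The function name is hypothetical.

```python
def minutes_until_hard_state(max_check_attempts: int, retry_interval: float) -> float:
    """Approximate minutes between the first non-OK result and the final retry.

    The first non-OK result counts as attempt 1, so the remaining
    (max_check_attempts - 1) attempts are each retry_interval apart.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval


# Assumed retry_interval of 1 minute (not taken from the thread).
print(minutes_until_hard_state(5, 1))   # 4
print(minutes_until_hard_state(10, 1))  # 9
```

Under that assumption, going from 5 to 10 attempts gives the service roughly 9 minutes instead of 4 to recover before the problem is confirmed. With a longer retry interval the window grows proportionally.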