Do not notify when an alarm is generated for a short period
Hello team,
I would like your support regarding a question.
We monitor many servers in our infrastructure, and we noticed that some of them show high RAM and CPU usage at different times. This usually lasts only a few minutes, and the load decreases soon afterwards. Because the spikes occur at different times, I cannot use a time-based notification exception. I would like to avoid being notified when a service goes into WARNING or CRITICAL and returns to its normal state a few minutes later.
From what I read here "https://assets.nagios.com/downloads/nag ... .html#host", it was not clear which directive would best suit this situation.
Can you support us?
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Do not notify when an alarm is generated for a short per
If you go to
Configure -> CCM -> Services -> Edit Service -> Alert Settings Tab
You can set a value in the "First notification delay" field.
When this is set, a notification is sent only if the service remains in a non-OK state for longer than the "First notification delay" value, which is given in minutes.
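As a rough illustration of the rule described above, the following Python sketch models the delay on a timeline of check results. It is a simplified model, not Nagios XI's implementation: the function name, the (minute, state) sample format, and the inclusive boundary at exactly the delay value are assumptions made for illustration. In this sketch a switch between WARNING and CRITICAL does not restart the timer; whether Nagios XI behaves the same way is not confirmed here.

```python
from typing import List, Optional, Tuple


def first_notification_minute(
    samples: List[Tuple[int, str]], delay_minutes: int
) -> Optional[int]:
    """Return the minute at which a first problem notification would go out.

    Hypothetical helper: `samples` is a time-ordered list of (minute, state)
    pairs, where state is "OK", "WARNING" or "CRITICAL".
    Returns None if no notification would be sent.
    """
    problem_start = None  # minute at which the current non-OK period began
    for minute, state in samples:
        if state == "OK":
            problem_start = None  # recovery resets the timer
            continue
        if problem_start is None:
            problem_start = minute
        # Assumption: notify once the problem has lasted at least the delay.
        if minute - problem_start >= delay_minutes:
            return minute
    return None


# A short spike that recovers before the 20-minute delay: no notification.
spike = [(0, "OK"), (1, "WARNING"), (3, "CRITICAL"), (5, "OK")]
print(first_notification_minute(spike, 20))

# A problem that persists: the first notification is due at minute 20.
sustained = [(m, "CRITICAL") for m in range(0, 31)]
print(first_notification_minute(sustained, 20))
```

With a 20-minute delay, the spike produces no notification, while the sustained problem produces one once it has lasted 20 minutes.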
Re: Do not notify when an alarm is generated for a short per
And how will this setting behave in the situations below?
- If there are several state changes, each lasting longer than the number of minutes set in "First notification delay", will the delay be applied to each state change? That is, will a notification be sent only when a state lasts longer than the "First notification delay"?
- What happens if the status changes WARNING > CRITICAL and/or CRITICAL > WARNING?
I ask because I set "First notification delay" to 20 minutes, and in Reports > Available Reports > Notifications I saw the following.
[Attached screenshot not available.]
scottwilkerson
Re: Do not notify when an alarm is generated for a short per
Are these all the same service?
Was this set to 20 minutes back on 3/31 before these were sent?
Re: Do not notify when an alarm is generated for a short per
scottwilkerson wrote: Are these all the same service? Was this set to 20 minutes back on 3/31 before these were sent?
Yes, they are the same service.
The 20-minute value was set before 3/31. However, I noticed that these notifications were still sent, so I wanted to understand the behavior better.
scottwilkerson
Re: Do not notify when an alarm is generated for a short per
Looking at your screenshots again, I noticed that the service never recovers, and the "first notification delay" only affects the first non-OK notification.
Can you show a screenshot of the Check Settings tab? Is "Is Volatile" enabled?
Re: Do not notify when an alarm is generated for a short per
scottwilkerson wrote: Looking at your screenshots again, I noticed that it is never recovering and the "first notification delay" is only going to affect the first non-OK notification.
Exactly, it did not recover to OK; it varied between WARNING and CRITICAL. That is why I asked the questions below:
- If there are several state changes, each lasting longer than the number of minutes set in "First notification delay", will the delay be applied to each state change? That is, will a notification be sent only when a state lasts longer than the "First notification delay"?
- What happens if the status changes WARNING > CRITICAL and/or CRITICAL > WARNING?
scottwilkerson wrote: Can you show a screenshot of the Check Settings tab? Is "Is Volatile" enabled?
Here is the requested screenshot:
[Attached screenshot not available.]
scottwilkerson
Re: Do not notify when an alarm is generated for a short per
What version of Nagios XI are you running?
Re: Do not notify when an alarm is generated for a short per
scottwilkerson wrote: What version of Nagios XI are you running?
Nagios XI 5.6.8
Re: Do not notify when an alarm is generated for a short per
In your first post you said:
"This usually only occurs for a few minutes, and soon afterwards, the loads decrease."
You could try increasing the max_check_attempts value from 5 to something higher, e.g. 10. This way, Nagios will keep retrying the service a bit longer before determining the state. Hopefully, this will give the service enough time to recover.
Service - max check attempts
This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
Parameter name: max_check_attempts
Required: yes
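To make the effect of a higher max_check_attempts concrete, the arithmetic can be sketched in Python. This assumes that rechecks of a non-OK service are spaced by the retry interval, and it uses a retry interval of 1 minute purely as an illustrative value; the actual retry interval of this service is not stated in the thread. The function name is hypothetical.

```python
def minutes_until_hard_state(
    max_check_attempts: int, retry_interval_minutes: float
) -> float:
    """Approximate time from the first non-OK result to the final retry.

    The first non-OK result counts as attempt 1, so only
    (max_check_attempts - 1) retries remain, one per retry interval.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval_minutes


RETRY_INTERVAL = 1  # minutes; assumed value for illustration only

for attempts in (5, 10):
    window = minutes_until_hard_state(attempts, RETRY_INTERVAL)
    print(f"max_check_attempts={attempts}: about {window} minute(s) of retries")
```

Under these assumptions, raising the value from 5 to 10 stretches the retry window from about 4 to about 9 minutes, so a spike shorter than that window would recover before an alert is generated.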