Host DOWN doesn't send alert after escalation

Post by **emi65** » Tue Mar 09, 2021 5:50 am

Hi experts,

I'm using Nagios Core 4.4.5 in a RedHat 7 environment.

I configured a host alert to check that the host is alive every 2 minutes:
Check Interval 2 min
Retry Check 1 min
Max Check Attempts 3

So in case of a problem, Nagios sends an email alert after 5 minutes.

I also set the Notification Interval to 20 minutes.
The expected behavior is that an email is sent every 20 minutes if the host status doesn't change.

I set up a host escalation with these parameters:
First notification * 2
Last notification * 0
Notification interval * 0

This way an SMS alert is sent with the 2nd notification.

This works correctly.

The email notification and the SMS notification (escalation) are sent, but
after these 2 notifications nothing is sent again.
I expected an email every 20 minutes, since the Notification Interval is set to 20 minutes.

I'm attaching the host.cfg file for this host.

It seems that the host notification_interval doesn't work.

Could someone help me?

Thanks
Emilio

Post by **emi65** » Thu Apr 01, 2021 7:25 am

Does no one have a suggestion for this situation?

After the host goes down I receive one email and no others until the host comes back up.

Why don't I receive any emails after the time set in the notification interval?

Thanks
Emilio

Post by **mcapra** » Thu Apr 01, 2021 11:10 am

emi65 wrote: I configured an host alert to check host alive each 2 minutes
Check Interval 2 min
Retry Check 1 min
Max Check Attempts 3

I think this would have Nagios dispatch an alert no more than about 4 minutes after the problem begins, assuming the worst case where the problem starts immediately after the last regular ~2 min check execution. Based on my interpretation of the documentation, it sounds like max_check_attempts is inclusive of the first problematic check (before the retry_interval checks start):
https://assets.nagios.com/downloads/nag ... tions.html

This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.

There was a point in my life when I knew definitively whether or not max_check_attempts was inclusive of that initial problematic check, but I can't remember anymore.
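For reference, here is a rough sketch of how the values from the first post might look in a host definition. This is only a fragment, not the poster's actual host.cfg: the host_name, address, check_command, notification_period, and contact group are hypothetical placeholders, other required directives are left out, and the minute values assume the default interval_length of 60 seconds.

```
# Hypothetical host definition fragment using the intervals quoted above.
define host {
    host_name               example-host        ; placeholder name
    address                 192.0.2.10          ; placeholder address
    check_command           check-host-alive    ; assumed check command
    check_interval          2                   ; regular check every 2 minutes
    retry_interval          1                   ; recheck every 1 minute while the problem is still soft
    max_check_attempts      3                   ; hard DOWN on the 3rd failed check, if the first failure counts as attempt 1
    notification_interval   20                  ; re-notify every 20 minutes while the host stays DOWN
    notification_period     24x7                ; assumed time period
    contact_groups          email-admins        ; placeholder contact group
}
```

Under that reading, the first failed check is attempt 1 and two retries follow at 1-minute spacing, so the hard DOWN state would arrive about 2 minutes after the first failure is seen. Adding up to 2 minutes of waiting for the next regular check gives the roughly 4 minute worst case.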

emi65 wrote: Seems that the host notification_interval doesn't work

I believe your understanding of how the first_notification and last_notification directives work is correct.

However, when notification_interval is set to 0, per the docs:

If you specify a value of 0 for the interval, Nagios will send the first notification when this escalation definition is valid, but will then prevent any more problem notifications from being sent out for the host. Notifications are only sent out when the host recovers.

It sounds like you actually want your notification_interval for the hostescalation definition to be 20, if you want that escalation to repeat every 20 minutes forever until the problem is solved. The notification_interval for the hostescalation is probably superseding the notification_interval for the host in this case.
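A minimal sketch of what that escalation might look like, assuming the settings described in the first post. The host_name and contact group names are hypothetical placeholders; only first_notification, last_notification, and notification_interval come from the thread.

```
# Hypothetical escalation: from the 2nd notification onward, repeat every 20 minutes.
define hostescalation {
    host_name               example-host              ; placeholder, must match the host definition
    contact_groups          email-admins,sms-admins   ; placeholder contact groups
    first_notification      2                         ; escalation becomes valid at the 2nd notification
    last_notification       0                         ; 0 means the escalation never expires
    notification_interval   20                        ; was 0, which suppresses further problem notifications
}
```

One thing to watch, as far as I understand the escalation docs: while an escalation is valid, its contacts are notified in place of the host's own, so the email contact group probably needs to be listed in the escalation as well if the emails should keep coming.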

As an aside, I'm not 100% sure why that directive is required. I'd think for the particular hostescalation/serviceescalation, if the notification_interval directive is not defined, you could just inherit whatever the associated host/service triggering the escalation has defined.

Nagios Support Forum
