
Need help to understand delay notification Flapping/Crit

Posted: Mon Mar 12, 2018 8:30 am
by bennyboy
Hi,

First, we chose not to notify on flapping detection. I have a host check that was flagged as flapping from March 9, 20:01 to March 10, 01:26.
From 01:27 to 01:30 the host stayed down, and I saw the state type change from soft to hard. But I only received the critical notification at 02:54.

Can you help me understand this, please?

I understand that if I had chosen to send notifications on flapping I would have been notified earlier, but I don't understand the delay between the real Host Down Critical at 01:31 and receiving the notification at 02:54.

If the flapping percentage needs to cool down first, can you send me detailed information about how Nagios calculates that?

Thank you!

Re: Need help to understand delay notificiation Flapping/Cri

Posted: Mon Mar 12, 2018 10:27 am
by lmiltchev
"I don't understand the delay between the real Host Down Critical at 01:31 and receiving the notification at 02:54."
It takes some time for the host/service to "stop flapping". Nagios stores the last 21 check results and counts how many state changes (transitions) occurred among them. When calculating the percent state change, newer state changes are given more weight than older ones. While the host is flagged as flapping, Nagios suppresses its problem notifications.
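
To make the weighting concrete, here is a rough Python sketch of the percent state change calculation. It is only an illustration, not the actual Nagios code. It assumes the weight grows linearly from 0.75 for the oldest possible state change to 1.25 for the newest; the function and constant names are made up for this example.

```python
# Sketch of a weighted percent-state-change calculation.
# ASSUMPTION: weights rise linearly from 0.75 (oldest) to 1.25 (newest).
# All names here are hypothetical, not taken from Nagios itself.

MAX_HISTORY = 21     # number of stored check results, as described above
LOW_WEIGHT = 0.75    # assumed weight of the oldest possible state change
HIGH_WEIGHT = 1.25   # assumed weight of the newest possible state change


def percent_state_change(states):
    """Return the weighted percent state change.

    `states` is a list of 21 check results, ordered oldest to newest.
    """
    if len(states) != MAX_HISTORY:
        raise ValueError("expected exactly 21 check results")

    slots = MAX_HISTORY - 1  # 21 results allow 20 possible transitions
    weighted_changes = 0.0
    for i in range(1, MAX_HISTORY):
        if states[i] != states[i - 1]:
            # Linear interpolation between the two assumed weights.
            step = (HIGH_WEIGHT - LOW_WEIGHT) / (slots - 1)
            weighted_changes += LOW_WEIGHT + (i - 1) * step
    return weighted_changes * 100.0 / slots


stable = ["DOWN"] * 21
recent_change = ["UP"] * 20 + ["DOWN"]   # one change, in the newest slot
old_change = ["UP"] + ["DOWN"] * 20      # one change, in the oldest slot

print(percent_state_change(stable))         # no transitions at all
print(percent_state_change(recent_change))  # weighted more heavily
print(percent_state_change(old_change))     # weighted less heavily
```

Under these assumptions, a single state change counts for more when it is recent than when it is about to drop out of the 21-result window, which is why the percentage falls only gradually after the host settles.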

For more information on the issue, please review our official documentation on flapping here:

https://assets.nagios.com/downloads/nag ... pping.html

Can you show us screenshots of the "State History" and "Notifications" reports for the service in question for the relevant time period? Also, show us the service definition, along with the definition of all relevant templates, being used by this service.

Run the following command, and show the output:

Code:

grep flap /usr/local/nagios/etc/nagios.cfg

Re: Need help to understand delay notificiation Flapping/Cri

Posted: Mon Mar 12, 2018 10:37 am
by bennyboy
enable_flap_detection=1
high_host_flap_threshold=20.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
low_service_flap_threshold=5.0

Re: Need help to understand delay notificiation Flapping/Cri

Posted: Mon Mar 12, 2018 10:38 am
by bennyboy
State history

Re: Need help to understand delay notificiation Flapping/Cri

Posted: Mon Mar 12, 2018 4:11 pm
by lmiltchev
We don't know exactly when you enabled flap detection, or how many state changes had already been recorded for the "enedfc" host at that point. We can only speculate that the host stopped flapping at 2:54am, at which point you received the notification (as expected).

We would recommend that you keep an eye on the host (now that it exited the flapping state) to make sure that the future notifications are sent out with no delay.
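
To illustrate why a host can stay in the flapping state for a while after it settles, here is a small Python simulation using the thresholds you posted (high 20.0, low 5.0). It is a sketch under assumptions, not the real Nagios logic: the 0.75 to 1.25 linear weighting, the exact threshold comparisons, the function names, and the sample history are all hypothetical.

```python
from collections import deque

HISTORY = 21                 # stored check results
LOW_W, HIGH_W = 0.75, 1.25   # ASSUMED weights for oldest/newest change
HIGH_THRESHOLD = 20.0        # high_host_flap_threshold from the posted config
LOW_THRESHOLD = 5.0          # low_host_flap_threshold from the posted config


def percent_state_change(states):
    """Weighted percent state change over results ordered oldest to newest."""
    slots = len(states) - 1
    total = 0.0
    for i in range(1, len(states)):
        if states[i] != states[i - 1]:
            total += LOW_W + (i - 1) * (HIGH_W - LOW_W) / (slots - 1)
    return total * 100.0 / slots


def checks_until_flapping_stops(history):
    """Count consecutive DOWN results needed before flapping clears.

    Flapping is assumed to start above the high threshold and to stop
    once the percentage drops below the low threshold.
    """
    window = deque(history, maxlen=HISTORY)
    is_flapping = percent_state_change(list(window)) > HIGH_THRESHOLD
    checks = 0
    while is_flapping:
        window.append("DOWN")  # the host now stays down on every check
        checks += 1
        if percent_state_change(list(window)) < LOW_THRESHOLD:
            is_flapping = False
    return checks


# Hypothetical worst case: the host alternated on every one of the last
# 21 checks and ended in the DOWN state.
flapping_history = ["DOWN" if i % 2 == 0 else "UP" for i in range(21)]
print(checks_until_flapping_stops(flapping_history))
```

In this made-up worst case, 19 consecutive DOWN results are needed before the percentage drops below 5%, because almost every old transition has to age out of the 21-result window. How long that takes in wall-clock time depends on the check interval and on the host's real history, neither of which is shown in this thread.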