Flapping_How does it work?
Posted: Wed Jun 26, 2013 12:26 am
by justine
Hi...
I've been searching the forums about flapping, but there seems to be limited information about it.
How does flapping work? Does Nagios XI rely on ping checks to determine the host status?
Does this mean that flapping is a problem with the connection between the Nagios XI server and the monitored host, with no problem detected on the host itself?
How would I know if the flapping reported by Nagios is correct? Are there parameters or logs in Windows that I could check?
Re: Flapping_How does it work?
Posted: Wed Jun 26, 2013 9:35 am
by abrist
justine wrote:How does flapping work?
When a host or service changes state too often over its recent checks, the object is flagged as flapping.
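To make "changes too often" a bit more concrete, here is a minimal Python sketch of the kind of calculation involved. It is not Nagios code. As far as I understand the Nagios Core flap detection documentation, Nagios keeps the last 21 check results, counts the state changes among them, and weights newer changes more heavily than older ones. The function name and the 0.75 to 1.25 weight range are assumptions made for illustration.

```python
def percent_state_change(history, low_weight=0.75, high_weight=1.25):
    """Weighted percent state change for a list of states, oldest first.

    Illustrative only: the weight range is an assumed value, not a
    setting read from any Nagios configuration.
    """
    if len(history) < 2:
        return 0.0
    slots = len(history) - 1  # possible transitions (20 for 21 results)
    weighted = 0.0
    for i in range(slots):
        if history[i] != history[i + 1]:
            # Ramp the weight linearly from low_weight (oldest
            # transition) to high_weight (newest transition).
            frac = i / (slots - 1) if slots > 1 else 1.0
            weighted += low_weight + frac * (high_weight - low_weight)
    return 100.0 * weighted / slots


steady = ["OK"] * 21
alternating = ["OK", "CRITICAL"] * 10 + ["OK"]
print(percent_state_change(steady))
print(percent_state_change(alternating))
```

A steady history gives 0% and a history that changes on every check gives roughly 100%. The object is treated as flapping once this value climbs past the high flap threshold, and as stable again once it falls below the low one.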
justine wrote:Does Nagios XI rely on ping checks to determine the host status?
Usually, though it relies on whatever command is configured as the host check.
justine wrote:Does this mean that flapping is a problem with the connection between the Nagios XI server and the monitored host, with no problem detected on the host itself?
Flapping can be caused by a bad connection between the Nagios server and the remote host, or by an issue with the host itself.
justine wrote:How would I know if the flapping reported by Nagios is correct? Are there parameters or logs in Windows that I could check?
If the host is flapping but all services on the remote host report OK, then you may just have issues with ping. If all services, including ping, are flapping, you have either a network problem or a problem with the host. If only select services are flapping, there is a chance that the remote host is having an issue with load, or that those services in particular have a problem.
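The rules of thumb above can be written down as a small decision helper. This is only a sketch in Python to show the logic. The function name `triage` and its inputs are hypothetical and nothing here talks to Nagios; you would fill in the flapping flags by hand from what the interface shows.

```python
def triage(host_flapping, service_flapping):
    """Suggest where to look first.

    host_flapping: True if the host check itself is flapping.
    service_flapping: dict mapping a service name to True if flapping.
    Both inputs are hypothetical, filled in by hand from the status view.
    """
    flapping = sorted(name for name, bad in service_flapping.items() if bad)
    if not host_flapping and not flapping:
        return "nothing is flapping"
    if host_flapping and not flapping:
        return "host check only: look at ping or the host check command"
    if service_flapping and len(flapping) == len(service_flapping):
        return "everything: suspect the network path or the host itself"
    return "selected services: suspect load or those services: " + ", ".join(flapping)


print(triage(True, {"PING": False, "CPU Load": False}))
print(triage(False, {"PING": False, "CPU Load": True}))
```

It only narrows down where to start looking. The actual cause still has to be confirmed on the host or the network.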
The cause of flapping can be hard to hunt down sometimes. If you are experiencing many flapping objects, make sure your checks do not have unreasonably low timeouts, and consider tuning your alert thresholds higher. In these circumstances you should also check your networking hardware.
Re: Flapping_How does it work?
Posted: Wed Jun 26, 2013 5:50 pm
by Box293
Another example of flapping can be with a service check for free disk space.
Say you want to receive alerts when free disk space on a host drops below 10 GB.
Your host currently has 10.01 GB of free disk space.
A user logs into the server and the temporary files on the server increase, reducing the free disk space to 9.99 GB.
So the next Nagios check detects that the server has passed the threshold and triggers an alert.
Then the user logs off, the temporary files get deleted and the server returns to 10.01 GB of free disk space.
So the next Nagios check detects that the server is no longer past the threshold and returns to an OK state.
So if you were checking the free disk space every 5 minutes, and over an hour these conditions occurred multiple times (user logs on, free disk space reduces, user logs off, free disk space increases), Nagios can detect that this service is flapping and hence stops sending alerts until it stabilises again (within the flapping thresholds you defined).
It's a way of stopping Nagios from sending too many alerts. If your support team is flooded with alerts all the time, the alerts can end up being ignored and a critical problem can potentially be overlooked.
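Here is a rough Python simulation of the disk space example, just to show the suppression effect. It is a simplified sketch and not how Nagios is implemented: it uses an unweighted percent state change, and the 20% and 5% flap thresholds are assumed values for illustration (your configured thresholds may differ). The name `simulate` is made up.

```python
from collections import deque

THRESHOLD_GB = 10.0  # alert threshold from the example above
HIGH_FLAP = 20.0     # assumed high flap threshold, in percent
LOW_FLAP = 5.0       # assumed low flap threshold, in percent
HISTORY = 21         # number of recent check results kept


def simulate(readings_gb):
    """Return (alerts_sent, alerts_suppressed, flapping_at_end)."""
    history = deque(maxlen=HISTORY)
    flapping = False
    sent = suppressed = 0
    for free_gb in readings_gb:
        state = "OK" if free_gb >= THRESHOLD_GB else "WARNING"
        changed = bool(history) and history[-1] != state
        history.append(state)
        # Unweighted percent state change over the kept history.
        states = list(history)
        changes = sum(a != b for a, b in zip(states, states[1:]))
        pct = 100.0 * changes / (HISTORY - 1)
        # Hysteresis: start flapping at the high threshold, stop at the
        # low one, and keep the previous flag in between.
        if pct >= HIGH_FLAP:
            flapping = True
        elif pct <= LOW_FLAP:
            flapping = False
        if changed:
            if flapping:
                suppressed += 1
            else:
                sent += 1
    return sent, suppressed, flapping


# One hour of 5-minute checks, bouncing around the 10 GB threshold.
print(simulate([10.01, 9.99] * 6))  # prints (3, 8, True)
```

With these assumed numbers, the first three state changes each produce an alert, then the service is marked as flapping and the remaining eight changes are suppressed.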
Does this help you understand flapping at all?
Re: Flapping_How does it work?
Posted: Thu Jun 27, 2013 1:31 am
by justine
Yes. Thank you.
Really helpful! Thank you guys!
Re: Flapping_How does it work?
Posted: Thu Jun 27, 2013 9:43 am
by slansing
Glad this helped! Closing as resolved.