Nagios XI Detecting flapping that isn't there
Posted: Tue Jul 12, 2011 2:11 pm
by BanditBBS
We cover a large geographic area up and down the Ohio River Valley, but all sites are connected via MPLS and have great response times. We have only one XI server, located at the main office. At two of the sites, devices keep being reported as down and then show as back up on the subsequent check. However, we can ping them continuously from our desktops, which are in the same location as the XI server, and no pings are ever dropped. We had to turn off alerting for flap detection because there are just too many hosts appearing as flapping. We only have a few hundred hosts and services, so the install isn't that large.
We are running the official VM image upgraded to 1.5.
Re: Nagios XI Detecting flapping that isn't there
Posted: Wed Jul 13, 2011 10:08 am
by mguthrie
BanditBBS wrote:At two of the sites we have devices that keep reporting as down and then show as back up on the subsequent check.
What do you have set for the "max_check_attempts" for these hosts?
Here's some good info on soft vs hard states:
http://nagios.sourceforge.net/docs/3_0/statetypes.html
The default "check-host-alive" command uses the check_icmp plugin (/usr/local/nagios/libexec/check_icmp). You could try running it from the command line with a larger number of packets and see what it reports. That plugin is very flexible, so if you need the "check-host-alive" command to be a little more forgiving, you can tweak its thresholds and packet count as well.
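As a rough illustration of that kind of manual test, here is a minimal Python sketch that builds a check_icmp command line with a larger packet count and runs it if the plugin is present. The plugin path, the host address 192.0.2.10, and the helper name build_check_icmp are assumptions for illustration. The thresholds mirror the commonly shipped sample "check-host-alive" definition and may differ on your system.

```python
import os
import shlex
import subprocess

# Assumption: standard plugin location on a source-based Nagios install.
PLUGIN = "/usr/local/nagios/libexec/check_icmp"


def build_check_icmp(host, packets=20, warn="3000.0,80%", crit="5000.0,100%"):
    """Return the argv list for a check_icmp run against `host`.

    -n sets the number of packets to send; -w and -c take
    "round-trip-ms,packet-loss%" pairs for WARNING and CRITICAL.
    """
    return [
        PLUGIN,
        "-H", host,
        "-n", str(packets),
        "-w", warn,
        "-c", crit,
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; substitute one of the affected hosts.
    cmd = build_check_icmp("192.0.2.10")
    print(shlex.join(cmd))  # show the equivalent shell command

    if os.access(PLUGIN, os.X_OK):
        result = subprocess.run(cmd, capture_output=True, text=True)
        # Plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
        print(result.returncode, result.stdout.strip())
```

Running this a number of times from the XI server itself, rather than from a desktop, should show whether the plugin sees packet loss or high round-trip times that an ordinary ping does not. Note that check_icmp uses raw sockets, so it usually has to run as root or with the setuid bit set.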
Re: Nagios XI Detecting flapping that isn't there
Posted: Thu Jul 14, 2011 10:06 am
by BanditBBS
mguthrie wrote:What do you have set for the "max_check_attempts" for these hosts?
max_check_attempts is set to 3, with check_interval set to 15 and retry_interval set to 5.
I'll experiment with check_icmp and see if I can make this any better.
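To make the effect of those settings concrete, here is a simplified Python sketch of how soft and hard host states would progress with max_check_attempts=3, check_interval=15 and retry_interval=5. It assumes the default interval_length of 60 seconds, so the intervals are minutes. The function name host_state_timeline is hypothetical, and the model covers only the soft/hard rules described in the state types documentation, not flap detection itself.

```python
def host_state_timeline(results, max_check_attempts=3,
                        check_interval=15, retry_interval=5):
    """Walk a sequence of check results ("UP" or "DOWN") and return a list
    of (minute, result, state_type) tuples.

    Simplified model: a DOWN result is SOFT until it has been seen
    max_check_attempts times in a row, then HARD. An UP result that
    follows a soft problem is a SOFT recovery; otherwise it is HARD.
    """
    timeline = []
    minute = 0
    failures = 0  # consecutive DOWN results so far

    for result in results:
        if result == "DOWN":
            failures += 1
            state_type = "HARD" if failures >= max_check_attempts else "SOFT"
        else:
            state_type = "SOFT" if 0 < failures < max_check_attempts else "HARD"
            failures = 0

        timeline.append((minute, result, state_type))

        # While in a soft problem state the host is rechecked at
        # retry_interval; otherwise the normal check_interval applies.
        in_soft_problem = result == "DOWN" and state_type == "SOFT"
        minute += retry_interval if in_soft_problem else check_interval

    return timeline


if __name__ == "__main__":
    # One failed check followed by a successful retry.
    for entry in host_state_timeline(["UP", "DOWN", "UP", "UP"]):
        print(entry)
```

In this model a single failed check followed by a good retry never leaves the SOFT state. The host would have to fail the initial check and both retries, roughly ten minutes apart in total, before it is considered HARD DOWN.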
Re: Nagios XI Detecting flapping that isn't there
Posted: Mon Jul 18, 2011 12:07 pm
by mguthrie
Sounds good, let us know if you have additional questions.