Host flapping that isn't flapping
Posted: Wed Oct 01, 2014 12:52 pm
by snapon_admin
Any idea why this host is saying it's flapping when it clearly isn't?
flappingnotflapping.png
State change 0%, in warning state for 1d, still flapping somehow. Further information, flap detection is OFF.
flap settings.png
This service also appears to not be sending notifications, and yet has a contact group associated with it. I'm assuming that's because it's "flapping", but I'm not sure about that. As far as I can tell, this particular service check has NEVER sent a notification, which is odd, because notifications have been enabled since day one.
Re: Host flapping that isn't flapping
Posted: Wed Oct 01, 2014 2:56 pm
by lmiltchev
snapon_admin wrote: This service also appears to not be sending notifications, and yet has a contact group associated with it. I'm assuming that's because it's "flapping", but I'm not sure about that.
You are correct - if the service or host is currently flapping, no one gets notified.
Can you post the service definition?
It is strange that the CCM shows flap detection as disabled but the setting is not taking effect. Was the configuration applied successfully? Did someone enable it in the UI?
Click on the service, go to the "Advanced" tab and post a screenshot of that page. Does it show (on that page) that flap detection has been disabled?
Re: Host flapping that isn't flapping
Posted: Wed Oct 01, 2014 3:02 pm
by snapon_admin
Yeah, actually several configs have been applied correctly. This service has been in Nagios for months and Flapping was disabled right away. I've made several changes and applied configs without issue in that time.
Service def:
Code: Select all
define service {
    host_name                liswsas01p on lisapps14g
    service_description      CIM email
    use                      xiwizard_nrpe_service
    check_command            check_nrpe!cim_email_check!!!!!!!
    is_volatile              0
    max_check_attempts       5
    check_interval           5
    retry_interval           1
    check_period             xi_timeperiod_24x7
    flap_detection_enabled   0
    notification_interval    60
    notification_period      xi_timeperiod_24x7
    notification_options     w,c,r,
    notifications_enabled    1
    contact_groups           CIM Team
    _xiwizard                solaris
    register                 1
}
Advanced tab:
advanced.png
Re: Host flapping that isn't flapping
Posted: Thu Oct 02, 2014 9:40 am
by lmiltchev
Is the service still in the flapping state? When a service is flapping, the "state change" value under the "Advanced" tab should be something other than 0%. Does Nagios Core show the service as "flapping" (click "See this service in Nagios Core")?
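For reference, the "state change" percentage comes from the history of recent check results. The snippet below is a simplified sketch of that idea, not Nagios Core's actual code: it assumes a history of the last 21 results (20 possible transitions) and counts every change equally, whereas Core also weights recent changes more heavily than older ones. The function name `percent_state_change` is made up for this example.

```python
def percent_state_change(states):
    """Unweighted percent state change over the most recent 21 check results.

    Simplification: every transition counts the same. Nagios Core applies
    a weighting that favours newer transitions, which is omitted here.
    """
    history = states[-21:]  # only the last 21 results are considered
    # Count positions where a result differs from the one before it.
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    # 21 results allow at most 20 transitions, so 20 is the fixed divisor.
    return 100.0 * changes / 20


# A service that sat in WARNING for a full day has no transitions at all.
steady = ["WARNING"] * 21
# A service that alternates on every check changes state 20 times.
alternating = ["OK", "WARNING"] * 10 + ["OK"]
# 14 quiet results followed by 7 transitions.
mixed = ["OK"] * 14 + ["WARNING", "OK", "WARNING", "OK", "WARNING", "OK", "WARNING"]
```

Under this model a service that has been in WARNING for a day scores 0%, which is why a 0% state change combined with a "flapping" flag points to stale data rather than real flapping.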
Re: Host flapping that isn't flapping
Posted: Thu Oct 02, 2014 10:10 am
by snapon_admin
In Core flapping says "N/A".
Re: Host flapping that isn't flapping
Posted: Thu Oct 02, 2014 2:25 pm
by lmiltchev
If the service is not flapping in Core, it *has to be* a database issue. Try:
Code: Select all
service nagios stop
killall nagios
service ndo2db stop
killall ndo2db
service ndo2db start
service nagios start
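To cross-check what Core itself believes, independently of the database behind the XI interface, the flap fields can be read straight from Core's status file. This is a rough sketch: it assumes the usual `servicestatus { key=value ... }` block layout of `status.dat`, and the sample text is made up for illustration, not output from this system. The helper names `parse_status_blocks` and `flap_report` are hypothetical.

```python
def parse_status_blocks(text, block_type="servicestatus"):
    """Return a list of dicts, one per `block_type { ... }` block in the text."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            if line == f"{block_type} {{":
                current = {}  # start collecting a new block
        elif line == "}":
            blocks.append(current)
            current = None
        elif "=" in line:
            key, _, value = line.partition("=")  # split on the first '=' only
            current[key] = value
    return blocks


def flap_report(text, host, service):
    """Return the flap-related fields for one service, or None if not found."""
    for block in parse_status_blocks(text):
        if block.get("host_name") == host and block.get("service_description") == service:
            return {
                "is_flapping": block.get("is_flapping") == "1",
                "flap_detection_enabled": block.get("flap_detection_enabled") == "1",
                "percent_state_change": float(block.get("percent_state_change", "0")),
            }
    return None


# Illustrative sample only; on a real system read the file that the
# status_file directive in nagios.cfg points to.
SAMPLE = """
servicestatus {
    host_name=liswsas01p on lisapps14g
    service_description=CIM email
    current_state=1
    is_flapping=0
    percent_state_change=0.00
    flap_detection_enabled=0
    }
"""

report = flap_report(SAMPLE, "liswsas01p on lisapps14g", "CIM email")
```

If `is_flapping` is false here while the XI interface still shows the service as flapping, the two data sources disagree, which is consistent with the database being out of date.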
Re: Host flapping that isn't flapping
Posted: Thu Oct 02, 2014 3:13 pm
by snapon_admin
Before running those commands I went and checked the service again and it's suddenly not flapping for some reason. Earlier today I had run a forced passive check just to clear it from its current state (warning), and it went back into a warning state about 10 minutes later but stayed in flapping status. I don't know when flapping stopped for the service, but it's not currently flapping so that appears to be fixed. My next question is, why was it flapping in the first place and how can I prevent it from happening again? Flap detection has been disabled since this service was put in nearly a year ago.
Also, this particular check (it's a home-built script one of our people made) has a tendency to return different status information while staying in the same state. For example, it will be in warning state with 6 errors, and then will switch to 7 errors, but still be in warning state. If I make this service volatile, will the contacts receive emails for each alert when the status information changes, even if the state doesn't change? Will that cause the service to "flap"?
Re: Host flapping that isn't flapping
Posted: Thu Oct 02, 2014 3:29 pm
by lmiltchev
snapon_admin wrote: Flap detection has been disabled since this service was put in nearly a year ago.
The only reasonable explanation is that someone re-enabled it (at least temporarily) from the web UI.
I wouldn't recommend making the service volatile, because contacts will then be notified every time the service is checked and found in a non-OK state. You will probably need to work on this check and make sure it is developed according to the Nagios plugin development guidelines.
https://nagios-plugins.org/doc/guidelines.html
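As a rough illustration of what the guidelines ask for, a plugin reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and prints a single status line, optionally followed by `|` and performance data. The sketch below is not the actual cim_email_check script: the thresholds, the `CIM EMAIL` label and the `evaluate` function are assumptions made up for this example.

```python
# Standard plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(error_count, warn=5, crit=10):
    """Map an error count to (exit_code, status_line).

    warn and crit are hypothetical thresholds; the real check may use others.
    """
    if error_count is None or error_count < 0:
        return UNKNOWN, "CIM EMAIL UNKNOWN - could not determine error count"
    if error_count >= crit:
        code = CRITICAL
    elif error_count >= warn:
        code = WARNING
    else:
        code = OK
    # Text before '|' is the human-readable status; after it comes perfdata
    # in the form label=value;warn;crit;min
    line = (f"CIM EMAIL {LABELS[code]} - {error_count} errors"
            f" | errors={error_count};{warn};{crit};0")
    return code, line


# Going from 6 to 7 errors changes the text but not the state.
six = evaluate(6)
seven = evaluate(7)

# In a real plugin the last step would be roughly:
#   code, line = evaluate(count); print(line); sys.exit(code)
```

With these assumed thresholds, 6 and 7 errors both return WARNING, so the state does not change and a non-volatile service does not send a fresh problem notification just because the text differs. The error count is still recorded as performance data for graphing.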