Re: [Nagios-devel] BUG HostStateFlapping calculation - RFC

Post by **Guest** » Wed Feb 15, 2006 2:36 pm

On 15 Feb 2006 at 10:28, Percy Jahn wrote:

> Hello,
>
> IMHO, host state flapping is currently calculated in a very odd
> way. A "no-state-change" event is added to the flapping history once
> wait_threshold has elapsed. The current calculation:
> wait_threshold = (hst->total_service_check_interval * interval_length)
>                  / hst->total_services;
>
> In other words: if a host with 10 services, all checked at a 5-minute
> interval, is flapping, a "no-state-change" event is added every
> 50 minutes, but host checks (for a down host) are executed every 30
> seconds. In our installation, we use a service called "config_backup"
> that backs up our routers/switches and runs every 1440 minutes. The
> wait_threshold rises very high with just one such config service
> attached to the host, so a very long time without a state change
> must pass before the host returns to a non-flapping state.
>
> The attached patch calculates the wait_threshold as an
> "average_service_check_interval" (W):
>
> (1/N(1)) + (1/N(2)) + ... + (1/N(n)) = (1/W)
>
> In other words: if a host with 10 services, all checked at a 5-minute
> interval, is flapping, a "no-state-change" event is added every 30
> seconds.
>
> RFC
>
> Best regards
> Percy Jahn
>

I'll put this on my TODO list for Nagios 3.x. I can see advantages
to both methods of calculating wait_threshold. The current
method attempts to make sure that most, if not all, services
associated with the host have been checked before a new host state
(non-)transition is recorded.
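
To make the two calculations easier to compare, here is a minimal sketch in C. It is not the code from the attached patch or from Nagios itself. The function names `mean_wait_threshold` and `combined_wait_threshold` are hypothetical, and the sketch assumes that `total_service_check_interval` is the sum of the per-service check intervals (in interval units) and that `interval_length` is the number of seconds per interval unit. Returning 0 when there is nothing to average is only a choice made for this sketch.

```c
#include <stddef.h>

/* Current method, as quoted: (sum of intervals * interval_length) / count.
 * Under the assumption above this is the arithmetic mean of the service
 * check intervals, expressed in seconds. */
double mean_wait_threshold(const double *intervals, size_t n,
                           double interval_length)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += intervals[i];
    /* Sketch-only fallback: no services means no threshold. */
    return n > 0 ? (total * interval_length) / (double)n : 0.0;
}

/* Proposed method: 1/W = 1/N(1) + 1/N(2) + ... + 1/N(n).
 * W is the interval at which "some service on the host" is checked;
 * the result is converted to seconds. */
double combined_wait_threshold(const double *intervals, size_t n,
                               double interval_length)
{
    double rate = 0.0; /* checks per interval unit, summed over services */
    for (size_t i = 0; i < n; i++) {
        if (intervals[i] > 0.0) /* skip invalid (non-positive) intervals */
            rate += 1.0 / intervals[i];
    }
    /* Sketch-only fallback: no usable intervals means no threshold. */
    return rate > 0.0 ? interval_length / rate : 0.0;
}
```

For ten services that are each checked every 5 minutes, with `interval_length` set to 60, the second function gives 30 seconds, matching the figure in the proposal. The first gives 300 seconds if the sum assumption holds, which is the mean interval rather than the 50 minutes mentioned in the quoted message, so that assumption is worth checking against the real source.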

Ethan Galstad,
Nagios Developer
---
Email: [email protected]
Website: http://www.nagios.org

This post was automatically imported from historical nagios-devel mailing list archives