Re: [Nagios-devel] RFC: Downtime and flapping

On 02/04/2011 11:57 AM, Andreas Ericsson wrote:
> On 02/04/2011 11:30 AM, Jochen Bern wrote:
>> It IIUC also means that during the downtime, the CGI-bins will keep
>> displaying the *historic* flapping state, along with the *current*
>> host/service state.
> Perhaps, but it should clear up fairly rapidly, and if a FLAPPING_START
> notification was sent out, I'd expect to get a FLAPPING_STOP one when
> repairs are done, assuming that happens after downtime has ended.

Right now you have no guarantee of seeing post-downtime FLAPPING_STOPs
(because they're not exempt from being blocked by the downtime, and
because you'd have to skip any kind of testing of the UP-again service
and manually delete the remaining downtime right away to completely
avoid the time gap). The same effect occurs if a notification_period is
used - I *did* search for the "bug" when colleagues reported that, when
following up on a service they last got a FLAPPING_START from, they
found a non-flapping OK in the UI.
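
The gap described above can be illustrated with a minimal sketch. It is
not Nagios code: the Service class, the notify() helper and the field
names are hypothetical, and the only behaviour taken from the discussion
is that no notification type is exempt from the downtime and
notification_period filters.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    in_downtime: bool = False
    in_notification_period: bool = True
    sent: list = field(default_factory=list)   # notifications delivered


def notify(service, kind):
    """Deliver `kind` unless a filter blocks it; return True if sent."""
    # No notification type is exempt in this sketch, mirroring the
    # behaviour described above for FLAPPING_STOP.
    if service.in_downtime or not service.in_notification_period:
        return False
    service.sent.append(kind)
    return True


svc = Service()
notify(svc, "FLAPPING_START")      # delivered before the downtime

svc.in_downtime = True             # repairs begin
notify(svc, "FLAPPING_STOP")       # flapping ends mid-downtime: blocked

svc.in_downtime = False            # downtime ends, nothing is re-sent
print(svc.sent)                    # ['FLAPPING_START']
```

The recipient's last message stays FLAPPING_START even though the
service has long since settled.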

>> Downtime disables notifications anyway, and there already is logic to
>> trigger actions when downtime ends (*). IMHO, the proper way to provide
>> a clean slate after a downtime would be to flush (**) the entire history
>> at that point.
> Effectively lying about state history? No thanks.

You talk about lying, I talk about misleading. Deriving a flapping flag
from a state history whose entries hail from way in the past - no matter
whether updates were blocked by downtime, check_interval, dependencies,
a forced reschedule into the distant future, or whatever - qualifies as
the latter.
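
To make the stale-history argument concrete, here is a minimal sketch
of a flapping flag derived from a fixed-size state history. It is only
loosely modeled on how flap detection is commonly described; the history
length of 21, the 0.75 to 1.25 weighting ramp and both thresholds are
assumptions chosen for illustration, and flush() is a hypothetical
stand-in for the proposed "clean slate", not an existing function.

```python
from collections import deque

HISTORY_LEN = 21          # assumed number of history bins
LOW_THRESHOLD = 20.0      # hypothetical thresholds, in percent
HIGH_THRESHOLD = 50.0


def percent_state_change(history):
    """Weighted share of adjacent history entries that differ, in percent."""
    states = list(history)
    slots = len(states) - 1
    if slots < 1:
        return 0.0
    total = 0.0
    for i in range(1, len(states)):
        if states[i] != states[i - 1]:
            # Oldest transition weighs 0.75, newest 1.25 (linear ramp).
            total += 0.75 + 0.5 * (i - 1) / max(slots - 1, 1)
    return 100.0 * total / slots


class FlapTracker:
    def __init__(self, initial_state="OK"):
        self.history = deque([initial_state] * HISTORY_LEN, maxlen=HISTORY_LEN)
        self.is_flapping = False

    def record(self, state):
        """Store one check result and re-derive the flapping flag."""
        self.history.append(state)
        self._update_flag()

    def flush(self, state):
        """Hypothetical clean slate: refill every bin with one state."""
        self.history = deque([state] * HISTORY_LEN, maxlen=HISTORY_LEN)
        self._update_flag()

    def _update_flag(self):
        pct = percent_state_change(self.history)
        if not self.is_flapping and pct >= HIGH_THRESHOLD:
            self.is_flapping = True
        elif self.is_flapping and pct < LOW_THRESHOLD:
            self.is_flapping = False


tracker = FlapTracker()
for i in range(20):                       # service bounces before downtime
    tracker.record("CRITICAL" if i % 2 == 0 else "OK")

# No results are recorded while updates are blocked, so the flag
# cannot change no matter how long the pause lasts.
tracker.record("OK")                      # first check after the pause
stale_flag = tracker.is_flapping          # still True: history is old

tracker.flush("OK")                       # the proposed reset
fresh_flag = tracker.is_flapping          # False
```

In this sketch a single good result after the pause is nowhere near
enough to clear the flag, because 19 of the 20 transitions it is
computed from predate the pause.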

>> (**) Whether the bins should be reset to OK, PENDING,
>> last-before-downtime or the current post-downtime $*STATE$ (if one is
>> already available) is up for discussion ...
> Current state will always be current state. I'm not going to change
> that, ever. Most of our customers regularly check the ui during repairs
> to see if the service is up and running as expected. Showing anything
> but the *real* current state there would be counterproductive for all
> nagios users.

You might want to note that I never asked to overwrite "the" current
state (rather than the history bins) in the first place. As a matter of
fact, the UI's flapping marker being happily derived from a *stale*
history is sort of my main point.

Regards,
J. Bern
-- 
Jochen Bern, Systemingenieur --- LINworks GmbH
Postfach 100121, 64201 Darmstadt | Robert-Koch-Str. 9, 64331 Weiterstadt
PGP (1024D/4096g) FP = D18B 41B1 16C0 11BA 7F8C DCF7 E1D5 FAF4 444E 1C27
Tel. +49 6151 9067-231, Zentr. -0, Fax -299 - Amtsg. Darmstadt HRB 85202
Unternehmenssitz Weiterstadt, Geschäftsführer Metin Dogan, Oliver Michel

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]