Re: [Nagios-devel] Automatically acknowledge services of an

On 12/14/2010 08:16 AM, Matthieu Kermagoret wrote:
> Hi list,
>
> Sorry for my late answer but thanks to all of you who replied. I'll
> try to explain our issue a bit better.
>
> 2010/12/9 Mathieu Gagné:
>> On 12/8/10 5:08 PM, Julien Mathis wrote:
>> That said, I still do not fully understand what you want to achieve or
>> what you really need. We do agree that you are proposing a "solution" to
>> an unknown/unclear problem (to us).
>>
>> When the host is DOWN, service problems are silenced and NO
>> notifications are sent, they are "muted". Why would you want to
>> acknowledge a service problem if there isn't any notifications sent to
>> contacts?
>>
>> Is there any particular issue you are encountering? What is the course
>> of events and what is the expected behavior?
>>
>
> The main problem we try to fix with this patch is about notifications.
> In fact you can configure services in such a way that notifications
> are sent when their state is UNKNOWN (and that's what we do, as the
> UNKNOWN state can be triggered by a host problem, a service dependency
> issue, or an UNKNOWN return value from a plugin; I don't know if the
> last is definitely wrong or not). So some of our customers want to
> stop notifications of services associated with a host when they
> acknowledge it.
>

It seems you would be better off with a microscopic eventbroker module
that prevents sending UNKNOWN notifications if the host is down and
acknowledged. It could use custom variables for tweaking from the
default behaviour, and something similar could be integrated into a
later Nagios release to be supported from scratch.
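
As a rough illustration of the check such a module would make, here is a
minimal sketch in plain C. The structs, enums and the
`always_notify_unknown` flag are hypothetical stand-ins (the flag models a
per-service tweak that could come from a custom variable); none of them are
real Nagios types, and a real module would read the equivalent fields from
the Nagios host and service objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Nagios object state.
 * These are NOT the real Nagios structs or constants. */
enum host_state { HOST_IS_UP, HOST_IS_DOWN, HOST_IS_UNREACHABLE };
enum svc_state  { SVC_OK, SVC_WARNING, SVC_CRITICAL, SVC_UNKNOWN };

struct host_view {
    enum host_state state;
    bool acknowledged;              /* someone acked the host problem */
};

struct service_view {
    enum svc_state state;
    const struct host_view *host;   /* host the service runs on */
    bool always_notify_unknown;     /* hypothetical opt-out, e.g. set from a custom variable */
};

/* Return true when the notification should be suppressed: the service is
 * UNKNOWN, its host is DOWN and acknowledged, and the service has not
 * opted out of the default behaviour. */
bool should_block_notification(const struct service_view *svc)
{
    assert(svc != NULL && svc->host != NULL);

    if (svc->state != SVC_UNKNOWN)
        return false;
    if (svc->always_notify_unknown)
        return false;
    return svc->host->state == HOST_IS_DOWN && svc->host->acknowledged;
}
```

Only UNKNOWN notifications are affected in this sketch; CRITICAL and
WARNING notifications pass through untouched, which matches the narrow
behaviour described above.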

>> Are service notifications sent to contacts when the host is back UP? Do
>> you want to acknowledge service problems for display purposes only?
>>
>
> No they're not. Andreas is right when he says that the patch is
> "poorly thought out", because it's only a part of the solution we
> wanted to create. The original way we wanted to do it is to keep a
> state about the acknowledgement (whether automatically generated or
> not) and remove it when the host is back up if it was automatically
> generated. However, this change would require modifying the host and
> service structures, which is AFAIK forbidden for the 3.x branch.

It is forbidden, but it's not forbidden to add extra object info in
separate hashlists. Internal state for various things can be kept
there, and we'll mark them as "subject to change" so module authors
know not to use them for anything that's supposed to work for a long
period of time. That being said, such a design still leaves me asking
why the entire thing isn't in a broker module from the start.
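
To make the "separate hashlist" idea concrete, here is a minimal sketch of a
side table keyed by host name that records whether an acknowledgement was
generated automatically, so it could be removed again when the host comes
back up. Everything in it (`extra_table`, `extra_set_auto_acked`, the bucket
count) is a hypothetical illustration, not an existing Nagios data structure
or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define EXTRA_BUCKETS 64            /* arbitrary size for the sketch */

/* One entry of extra per-host state kept outside the host struct. */
struct extra_info {
    char *host_name;
    bool auto_acked;                /* ack was generated automatically */
    struct extra_info *next;        /* chaining for hash collisions */
};

struct extra_table {
    struct extra_info *buckets[EXTRA_BUCKETS];
};

/* Simple multiplicative string hash reduced to a bucket index. */
static unsigned int hash_name(const char *s)
{
    unsigned int h = 5381;
    while (*s)
        h = h * 33u + (unsigned char)*s++;
    return h % EXTRA_BUCKETS;
}

static struct extra_info *extra_find(const struct extra_table *t, const char *name)
{
    struct extra_info *e;
    for (e = t->buckets[hash_name(name)]; e != NULL; e = e->next)
        if (strcmp(e->host_name, name) == 0)
            return e;
    return NULL;
}

/* Record the flag for a host, creating the entry if needed.
 * Returns 0 on success, -1 if memory allocation fails. */
int extra_set_auto_acked(struct extra_table *t, const char *name, bool value)
{
    struct extra_info *e;
    unsigned int idx;

    assert(t != NULL && name != NULL);
    e = extra_find(t, name);
    if (e == NULL) {
        e = malloc(sizeof(*e));
        if (e == NULL)
            return -1;
        e->host_name = malloc(strlen(name) + 1);
        if (e->host_name == NULL) {
            free(e);
            return -1;
        }
        strcpy(e->host_name, name);
        idx = hash_name(name);
        e->next = t->buckets[idx];
        t->buckets[idx] = e;
    }
    e->auto_acked = value;
    return 0;
}

/* Hosts without an entry are treated as "not automatically acked". */
bool extra_is_auto_acked(const struct extra_table *t, const char *name)
{
    const struct extra_info *e;
    assert(t != NULL && name != NULL);
    e = extra_find(t, name);
    return e != NULL ? e->auto_acked : false;
}

/* Release every entry and leave the table empty. */
void extra_free(struct extra_table *t)
{
    size_t i;
    for (i = 0; i < EXTRA_BUCKETS; i++) {
        struct extra_info *e = t->buckets[i];
        while (e != NULL) {
            struct extra_info *next = e->next;
            free(e->host_name);
            free(e);
            e = next;
        }
        t->buckets[i] = NULL;
    }
}
```

Because the state lives in its own table, the host and service structures
stay untouched, which is the point of keeping it separate in the 3.x branch.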

I'd take a patch to block notifications from eventbroker modules in
the blink of an eye if that's the case. NEBERROR_CALLBACKOVERRIDE is
meant for things like that, but it's currently only supported for
host and service checks. Such a patch would make it positively
trivial to write an eventbroker module that does what you want.
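
The following toy model sketches how such an override could work for
notifications: every registered callback is asked first, and the
notification is skipped as soon as one of them returns an override code.
It is a self-contained illustration only. `SKETCH_OVERRIDE` stands in for
NEBERROR_CALLBACKOVERRIDE, and the dispatcher, event struct and callback
names are all hypothetical, not Nagios core code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the broker return codes; values are arbitrary here. */
enum { SKETCH_OK = 0, SKETCH_OVERRIDE = 1 };

/* Hypothetical, flattened view of a pending service notification. */
struct notify_event {
    int service_unknown;   /* service state is UNKNOWN */
    int host_down;         /* host is DOWN */
    int host_acked;        /* host problem has been acknowledged */
};

typedef int (*notify_callback)(const struct notify_event *ev);

/* Example module callback: veto UNKNOWN notifications while the host
 * is down and acknowledged, let everything else through. */
int block_unknown_on_acked_host(const struct notify_event *ev)
{
    if (ev->service_unknown && ev->host_down && ev->host_acked)
        return SKETCH_OVERRIDE;
    return SKETCH_OK;
}

/* Example module callback that never interferes. */
int passive_listener(const struct notify_event *ev)
{
    (void)ev;
    return SKETCH_OK;
}

/* Ask every callback in order. Returns 1 if the notification would be
 * sent, 0 if some callback overrode (blocked) it. */
int dispatch_notification(const notify_callback *cbs, size_t n,
                          const struct notify_event *ev)
{
    size_t i;
    assert(ev != NULL && (cbs != NULL || n == 0));
    for (i = 0; i < n; i++)
        if (cbs[i](ev) == SKETCH_OVERRIDE)
            return 0;
    return 1;
}
```

With a hook of this shape in the core, the module itself shrinks to a
single callback like `block_unknown_on_acked_host`.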

> So we
> went with the "try to get it into upstream" way. I agree that this
> patch as it stands only fills our customers' needs, but does the
> whole solution seem more appealing?
>

I'm not sure. I don't think I've fully understood the problem, tbh.

The workflow, afaiu, is this:
  1. Some work is scheduled for a host, but the host isn't put into
     scheduled downtime.
  2. The host goes down, causing a DOWN state for the host and an UNKNOWN
     state for agent-based service checks.
  3. Someone acks the host with "working on it. It'll be up soon".

Currently, no service notifications should be sent for the unknown
states, since the host is down. If that's not the case, it's a bug
and "patches welcome", or I'll fix it myself when I have time. Or
it could be that notifications are sent to service-contacts who
aren't also host-contacts, since they won't be marked as "already
notified". Hmm. I'm not sure if that's a bug or not, but it seems
an unlikely scenario tbh.

If service notifications still go out in spite of the host being
down (I know this can happen sometimes, although it can be worked

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]