Re: [Nagios-devel] first_notification_delay for hosts
Posted: Thu Dec 01, 2005 3:49 am
On Thu, 24 Nov 2005, Andreas Ericsson wrote:
> This patch adds a variable to the host object configuration,
> first_notification_delay, which causes notifications for a host to be put off
> until a minimum amount of time has passed.
>
> This is intended to artificially mimic the service notification logic that
> allows some time to pass between a detected error and the first notification
> by forcing at least some "sleep-time" between the HARD detection of a downed
> host and the first notification sent for it.
>
> Because of how notifications are scheduled, this means that no host
> notification is sent until the host has first been checked
> max_check_attempts times (run serially), a service (or the host) has been
> checked again and, if the host is still down,
> (first_notification_delay * interval_length) seconds have passed.
>
> I did the documentation update. All credits for the code should go to Mathias
> Sundman, a Sungard employee and also a customer of ours who sent the patch to
> me for review. I'm forwarding it to the list with his explicit consent. I've
> tested it and found it to be in good working order.
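
For readers skimming the archive, the timing rule quoted above can be sketched roughly as follows. This is only an illustration in Python, not the patch's C code: the function name, the argument names, and the choice to measure from the HARD DOWN detection with a `>=` comparison are assumptions. The default of 60 for interval_length matches the stock Nagios default.

```python
def first_notification_due(hard_down_time, now,
                           first_notification_delay, interval_length=60):
    """Return True once the configured delay has elapsed.

    hard_down_time and now are Unix timestamps in seconds.
    first_notification_delay is given in units of interval_length,
    as described in the quoted message. The function itself is a
    hypothetical helper, not part of Nagios.
    """
    delay_seconds = first_notification_delay * interval_length
    return (now - hard_down_time) >= delay_seconds


# Example: tolerate 30 minutes of downtime with interval_length = 60.
went_hard_down = 1_000_000
print(first_notification_due(went_hard_down, went_hard_down + 29 * 60, 30))
print(first_notification_due(went_hard_down, went_hard_down + 30 * 60, 30))
```

With first_notification_delay set to 30, the first call falls inside the 1800-second window and the second call lands exactly on it, so only the second would let a notification through in this sketch.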
Ethan, do you think this patch has any chance of making it into Nagios
2.0?
Just some background on why I wrote this patch: many of the hosts we monitor
are such that we can accept them losing network connectivity for some
time (say 10-30 minutes), but if they go down permanently we want to be
notified of this.
To achieve this we had to set up a dummy notification group for the
host, and then use escalations to be notified after a number of
notification_intervals have elapsed. That solution had a number of
drawbacks and felt more like a workaround than a real solution.
Then I searched the list archive and found a number of other people with
the same problem as me, but no solution other than the escalation method.
So I decided to put this patch together, which works very well for us in
our production environment at least...
Cheers // Mathias
--
A. Because people read from top to bottom.
Q. Why should I not top-post?
_________________________________________________________
 Mathias Sundman                  (^)   ASCII Ribbon Campaign
 NILINGS AB                        X    NO HTML/RTF in e-mail
 Tel: +46-(0)8-666 32 28          / \   NO Word docs in e-mail
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]