
Re: [Nagios-devel] Bug fix priority

Posted: Sat Aug 08, 2009 2:34 pm
by Guest

Julien Mathis wrote:
> Hello All
>
> I am increasingly surprised at how little enthusiasm the scheduling
> bug (the 2010 problem) and bug 57 (http://tracker.nagios.org/view.php?id=57)
> seem to generate.
>
> It is rather problematic to see a scheduler take a break until 2010 on
> the same date as the next check, because we do not monitor 24/24 7/7
> ... If we do not have eyes on the screen, we are not warned ... less
> well to do even on a patch-based cron.
>
> Is nobody affected by this problem? I see a lot of minor bugs on the
> mailing list, and I think this bug has the most impact. If a scheduler
> stops scheduling, it loses some of its appeal ... and moreover, it
> loses credibility.
>
> This bug has been known for several months, and I think it should be
> fixed urgently. I know we tried to solve it a short time ago, without
> success.
>
> What do you think?
>

Agreed, the scheduler is an important piece of Nagios, no doubt.

Regarding http://tracker.nagios.org/view.php?id=57, it looks to me like
the problem is that the function that gets the next valid time does not
take timeperiod exclusions into account. I think the long-term solution
is to make that function aware of timeperiod exclusions.

As a workaround, please test the attached diff if you don't mind.
With this diff, the function checks whether the preferred time falls
inside a valid time range of the timeperiod for the current weekday and,
if so, sets the next valid time to 5 minutes from now (preferred_time +
300 in the diff). The 5 minutes is arbitrary: the function doesn't know
whether the preferred time falls in an exclusion, it only knows that the
preferred time falls within a valid weekday range of the timeperiod it
is working with.

When the time comes to run the check and it falls in an exclusion, but
is still at a valid time for the timeperiod being assessed, the check is
not run but is rescheduled for 5 minutes later again.

This is the only reasonable workaround I could come up with, and I did
only very basic testing of it with the configs pasted in the bug link.

Developers, let me know what you think of this.
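
For anyone who wants to reason about the workaround without reading
utils.c, here is a minimal standalone sketch of the decision it makes for
a single time range. It is not the Nagios code: struct range,
candidate_time() and RETRY_INTERVAL are hypothetical names made up for
this illustration, and it ignores the earliest-time bookkeeping that the
real function does across several ranges and days.

```c
#include <time.h>

/* Assumption: the arbitrary 5-minute retry from the workaround, in seconds. */
#define RETRY_INTERVAL 300

/* Hypothetical, simplified stand-in for one time range of a timeperiod:
 * offsets in seconds from midnight of the day being examined. */
struct range {
	unsigned long range_start;
	unsigned long range_end;
};

/* Returns 1 and stores a candidate in *out if this range can supply a
 * next valid time for preferred_time; returns 0 if the range is over. */
static int candidate_time(time_t day_start, const struct range *r,
                          time_t preferred_time, time_t *out)
{
	time_t day_range_start = day_start + (time_t)r->range_start;
	time_t day_range_end = day_start + (time_t)r->range_end;

	if (day_range_start >= preferred_time) {
		/* Range has not started yet: use its start time. */
		*out = day_range_start;
		return 1;
	}
	if (day_range_end > preferred_time) {
		/* preferred_time lies inside the range. We cannot tell here
		 * whether an exclusion applies, so just retry a bit later. */
		*out = preferred_time + RETRY_INTERVAL;
		return 1;
	}
	return 0; /* Range already ended before preferred_time. */
}
```

With a 09:00-17:00 range and day_start set to 0, a preferred time of
08:00 yields 09:00, a preferred time of 10:00 yields 10:05, and a
preferred time of 18:00 yields no candidate from this range.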


Attachment: utils.diff

--- utils.c 2009-06-17 15:11:04.000000000 +0200
+++ /tmp/utils.c 2009-08-08 17:20:50.000000000 +0200
@@ -1465,10 +1465,14 @@
 
 			/* calculate the time for the start of this time range */
 			day_range_start=(time_t)(day_start + temp_timerange->range_start);
+			day_range_end=(time_t)(day_start + temp_timerange->range_end);
 
-			if((have_earliest_time==FALSE || day_range_start<earliest_time) && day_range_start>=preferred_time){
+			if((have_earliest_time==FALSE || day_range_start<earliest_time) && (day_range_start>=preferred_time || (day_range_start<preferred_time && day_range_end>preferred_time))){
 				have_earliest_time=TRUE;
-				earliest_time=day_range_start;
+				if(day_range_start<preferred_time && day_range_end>preferred_time)
+					earliest_time = preferred_time + 300;
+				else
+					earliest_time=day_range_start;
 				earliest_day=day_start;
 				}
 			}


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]