Re: [Nagios-devel] [PATCH] base/workers: Write only to initialized
Posted: Fri Nov 02, 2012 10:32 am
On 11/01/2012 10:03 PM, [email protected] wrote:
> From: Robin Sonefors
>
> get_job_id returns -1 when there are no free slots in the worker. We
> didn't handle this case with anything other than a comment - the end
> result is that the nagios core will assign the -1 slot of the worker,
> causing memory errors.
>
> This seems to be what sometimes created crashes when shutting
> down/restarting the process. While I haven't been able to create a
> completely reproducible test case, I have a fairly large, completely
> unrelated test suite that used to cause a crash roughly 1/2 the time it
> was executed, and this patch stopped that.
>
> The FIXME to figure out somewhere else to put the check is still in its
> place - I still don't do that - but not dumping core because we've got a
> sizable workload seems reasonable. It's unlikely that another worker is
> available to spread the workload for us, so the error returned about us
> being too busy to do anything should normally already be quite
> indirectly noticeable anyway.
>
> Signed-off-by: Robin Sonefors
> ---
> base/workers.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
Applied. Thanks.
> diff --git a/base/workers.c b/base/workers.c
> index 5338558..9f9aecd 100644
> --- a/base/workers.c
> +++ b/base/workers.c
> @@ -830,8 +830,9 @@ static worker_process *get_worker(worker_job *job)
>
> if (job->id < 0) {
> /* XXX FIXME Fiddle with finding a new, less busy, worker here */
> + return NULL;
> }
> - wp->jobs[job->id % wp->max_jobs] = job;
> + wp->jobs[job->id] = job;
> job->wp = wp;
> return wp;
>
>
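For readers of the archive, the failure mode described in the commit message can be sketched in isolation. This is a minimal illustration, not the actual Nagios code: the struct layouts, `MAX_JOBS`, the helper name `assign_job`, and the body of `get_job_id` are all assumptions made for the example. Only the idea (a -1 return meaning "no free slot" must be checked before it is used as an array index) comes from the patch.

```c
#include <stddef.h>

#define MAX_JOBS 4 /* hypothetical slot count, chosen for the example */

/* Simplified stand-ins for the real structures (assumed layout). */
typedef struct worker_job {
	int id;
} worker_job;

typedef struct worker_process {
	int max_jobs;
	worker_job *jobs[MAX_JOBS];
} worker_process;

/* Hypothetical stand-in: return the first free slot, or -1 if none is free. */
static int get_job_id(const worker_process *wp)
{
	for (int i = 0; i < wp->max_jobs; i++) {
		if (wp->jobs[i] == NULL)
			return i;
	}
	return -1;
}

/* Store the job in a free slot; return NULL instead of indexing with -1. */
static worker_process *assign_job(worker_process *wp, worker_job *job)
{
	job->id = get_job_id(wp);
	if (job->id < 0)
		return NULL; /* worker is full; the caller has to handle this */
	wp->jobs[job->id] = job;
	return wp;
}
```

With a worker that has all `MAX_JOBS` slots filled, `assign_job` returns NULL and leaves the `jobs` array untouched. Note that in C99 and later `-1 % n` evaluates to -1 for any n greater than 1, so an expression like `job->id % wp->max_jobs` does not by itself protect against a negative id.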
--
Andreas Ericsson [email protected]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]