Re: [Nagios-devel] Problems with many hanging Nagios processes (Nagios

Posted: Mon Nov 28, 2005 8:10 am
by Guest
[email protected] wrote:
> Hi everybody,
>
> unfortunately nobody answered Alex from viveconsulting.co.nz, who had a
> problem with "Nagios spawning rogue ..." and mailed the nagios mailing list
> some months ago.


A link to the mail archives would be helpful.


> Right now, we very likely have the same problem he
> described in a very detailed way. I also tried a lot of different things
> (from configuration changes to tuning issues) to find out the real problem
> and I guess the real bottleneck is the pipe used for communication between
> Nagios processes.


Most likely. It's the only real bottleneck in nagios today, so...


> But I did not find many reports (e.g. emails) about this
> problem on the web or in the mail archives.
>
> So why am I writing to the list? Maybe someone can give me a hint on how to
> solve or work around that problem? We have 677 services configured and use
> 350 RRDs. Our Nagios CMS is a PIII 866 MHz with SCSI RAID 5. The system load
> is a little bit more than 1.00. As long as we stay below 1.00 there is no
> problem, but otherwise ... (detailed problem description in Alex's mail)
>

CMS? Content Management System?
Anyways, 677 services shouldn't be a problem.


> This is just our start with Nagios. We want to configure thousands of
> services and more than a hundred hosts. We would also invest in faster
> hardware, dual CPU, 2GB memory and faster SCSI HDDs but is faster hardware
> an option?

It helps, but not very much I'm afraid. The bottleneck requires a kernel
recompile to be solved on most systems, and that's a very bad thing to
do just to fix this particular problem.
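
To get a feel for the limit involved, here is a minimal sketch that
measures how much data a pipe holds before a writer would block. It
assumes the bottleneck discussed here comes down to the kernel's fixed
pipe buffer size, which is my reading of the thread rather than
something stated in it. The function name pipe_capacity is made up for
the illustration, and the script has nothing to do with the Nagios
source itself.

```python
import os

def pipe_capacity(chunk=1024):
    """Return how many bytes fit in a fresh pipe before a write would block.

    Hypothetical helper for illustration only; not part of Nagios.
    """
    r, w = os.pipe()
    os.set_blocking(w, False)  # make a full pipe raise instead of hanging
    total = 0
    try:
        while True:
            try:
                # Nobody reads from r, so the kernel buffer fills up.
                total += os.write(w, b"x" * chunk)
            except BlockingIOError:
                # The buffer is full: a blocking writer would stall here.
                return total
    finally:
        os.close(r)
        os.close(w)

if __name__ == "__main__":
    print("pipe holds", pipe_capacity(), "bytes before a writer blocks")
```

Whatever number it prints on your system, that is the amount of
unread data that can sit in the pipe before writers start to wait on
the reader.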

> Looking at this issue with the focus on implementation: If the
> pipe is the bottleneck it will stay a bottleneck on faster hardware too.
> But maybe faster hardware will allow us to configure 3000 services, which
> would be enough for the Nagios instance. And then, we deploy another Nagios
> instance ...
>

This is definitely a solution. Otherwise you could keep your eyes open
in the somewhat near future for a mail with

[PATCH] checks: Multiplex running checks.

in the topic. I'm working on it right now, but perhaps Ethan won't let
it in for the 2.x branch since it's a fairly massive change.
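
For anyone unfamiliar with the term, below is a rough sketch of what
multiplexing running checks can look like in general: one process
starts several checks and collects their output from a single
select-style loop. This is NOT the patch mentioned above and says
nothing about how it is implemented; run_checks and the example
commands are hypothetical names chosen for the illustration.

```python
import os
import selectors
import subprocess

def run_checks(commands):
    """Run all commands concurrently from one loop.

    Returns {index: (exit_code, output_bytes)}. Illustration only;
    this is not how Nagios or the patch does it.
    """
    sel = selectors.DefaultSelector()
    running = {}
    for i, argv in enumerate(commands):
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        sel.register(proc.stdout, selectors.EVENT_READ, i)
        running[i] = (proc, bytearray())

    results = {}
    while running:
        # Wait until at least one check has output or has closed its stdout.
        for key, _events in sel.select():
            i = key.data
            proc, buf = running[i]
            data = os.read(key.fileobj.fileno(), 4096)
            if data:
                buf.extend(data)
            else:
                # EOF: the check closed its output, so reap it.
                # (wait() blocks if the child closed stdout but keeps running.)
                sel.unregister(key.fileobj)
                key.fileobj.close()
                results[i] = (proc.wait(), bytes(buf))
                del running[i]
    sel.close()
    return results

if __name__ == "__main__":
    # Placeholder commands standing in for check plugins.
    for idx, (code, out) in sorted(run_checks([["echo", "OK"], ["false"]]).items()):
        print(idx, code, out)
```

The point of the design is that each check gets its own pipe and a
single loop services all of them, so no check has to wait on a shared
channel to hand over its result.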

--
Andreas Ericsson [email protected]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]