Re: [Nagios-devel] Problems with many hanging Nagios processes
Hi Mahesh,
On 19 Dec 2006, at 00:42, Mahesh Kunjal wrote:
> Here is what we did to resolve.
>
> 1. Edit the include/nagios.h.in
> change
> #define COMMAND_BUFFER_SLOTS 1024
> to
> #define COMMAND_BUFFER_SLOTS 60000
>
> And change
> #define SERVICE_BUFFER_SLOTS 1024
> to
> #define SERVICE_BUFFER_SLOTS 60000
>
I was intrigued by this as we have a performance issue, but not with
the same symptoms. Our problem is that the number of NSCA processes
increases when the nagios server is under load. They appear to be
blocking on writing to the command pipe. Switching NSCA to single
daemon mode mitigates the problem (slaves will time out their passive
results), but we wanted to know where any slowdowns could be.
From your findings, we've created a performance statistics patch,
attached. This collects the maximum and current values for the
command and service buffer slots and writes them to status.dat
(by default every 10 seconds). What I found with a fake slave sending
128 results every 5 seconds was that the maximum values were fairly
low (under 100), but when I put the server under load, the
maximum_command_buffer_items shot up to 1969 and the
maximum_service_buffer_items shot up to 2156 (I had changed the slot
counts from the defaults to your 60000).
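
To illustrate the kind of bookkeeping described above, here is a minimal sketch of a circular buffer that tracks its current and maximum fill level. It is not the attached patch and is not taken from the Nagios source: the names demo_buffer, demo_buffer_put, demo_buffer_get and DEMO_BUFFER_SLOTS are hypothetical, and the tiny slot count is only for demonstration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot count, standing in for COMMAND_BUFFER_SLOTS. */
#define DEMO_BUFFER_SLOTS 8

/* Hypothetical circular buffer that also records usage statistics. */
typedef struct {
    void *slots[DEMO_BUFFER_SLOTS];
    int head;       /* index of the next slot to write */
    int tail;       /* index of the next slot to read */
    int items;      /* entries currently queued */
    int max_items;  /* highest value 'items' has reached so far */
} demo_buffer;

void demo_buffer_init(demo_buffer *b) {
    assert(b != NULL);
    b->head = b->tail = b->items = b->max_items = 0;
}

/* Queue an entry. Returns 0 on success, -1 if every slot is in use. */
int demo_buffer_put(demo_buffer *b, void *entry) {
    assert(b != NULL);
    if (b->items == DEMO_BUFFER_SLOTS)
        return -1;
    b->slots[b->head] = entry;
    b->head = (b->head + 1) % DEMO_BUFFER_SLOTS;
    b->items++;
    /* Update the high-water mark whenever the fill level sets a new record. */
    if (b->items > b->max_items)
        b->max_items = b->items;
    return 0;
}

/* Dequeue the oldest entry, or return NULL if the buffer is empty. */
void *demo_buffer_get(demo_buffer *b) {
    void *entry;
    assert(b != NULL);
    if (b->items == 0)
        return NULL;
    entry = b->slots[b->tail];
    b->tail = (b->tail + 1) % DEMO_BUFFER_SLOTS;
    b->items--;
    return entry;
}
```

In a sketch like this, a maximum that stays far below the slot count suggests the buffer size is not the bottleneck, while a maximum that reaches the slot count means writers were being turned away.
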
This could show if the buffer is filled at various points or if there
is not enough data ready for Nagios to process further down the chain.
I'd be interested in figures from other systems.
Warning: the patch is not thread safe, so there are no guarantees that
the statistics data will not be corrupted (but it should not affect usual
Nagios operation). It applies onto Nagios 2.5. Tested on Debian with a 2.6
kernel.
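
For anyone wanting to avoid the corruption risk mentioned above, one possible approach is to guard the counters with a mutex. The following is only a sketch under that assumption, not part of the patch: demo_buffer_stats, demo_stats_adjust and demo_stats_snapshot are hypothetical names, and it uses plain POSIX threads calls.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical statistics record; the field names are illustrative only. */
typedef struct {
    pthread_mutex_t lock;
    int current_items;
    int maximum_items;
} demo_buffer_stats;

/*
 * Adjust the current count by delta (+1 on enqueue, -1 on dequeue)
 * and update the maximum, all while holding the lock.
 */
void demo_stats_adjust(demo_buffer_stats *s, int delta) {
    assert(s != NULL);
    pthread_mutex_lock(&s->lock);
    s->current_items += delta;
    if (s->current_items > s->maximum_items)
        s->maximum_items = s->current_items;
    pthread_mutex_unlock(&s->lock);
}

/* Copy out both values under the lock so the reader sees a consistent pair. */
void demo_stats_snapshot(demo_buffer_stats *s, int *current, int *maximum) {
    assert(s != NULL && current != NULL && maximum != NULL);
    pthread_mutex_lock(&s->lock);
    *current = s->current_items;
    *maximum = s->maximum_items;
    pthread_mutex_unlock(&s->lock);
}
```

A record would be initialised with PTHREAD_MUTEX_INITIALIZER for the lock and zero for both counters. The extra locking adds some cost on every enqueue and dequeue, which may or may not be acceptable on a heavily loaded server.
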
Ton
http://www.altinity.com
T: +44 (0)870 787 9243
F: +44 (0)845 280 1725
Skype: tonvoon
...[email truncated]...
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]