Re: [Nagios-devel] NSCA in standalone single-process daemon mode

Thomas Guyot-Sionnest wrote:
> Hi list,
>
> I'm running a big Nagios monitoring system which has about a hundred
> remote passive checks reporting through NSCA. Lately, when I added more
> passive checks, I noticed that the number of "Failed" checks (No results
> received) increased (for most of the checks it's impossible to say whether
> they ran or not).
>
> I'm currently running NSCA in inetd mode using D. J. Bernstein's tcpserver
> program. Since most checks are run by Vixie Cron, and therefore will run at
> the exact same time, my two guesses were that either:
>
> 1. I'm jamming up the monitoring server for more than 10 seconds with all
> the checks.
>
> Or
>
> 2. All NSCA processes writing to the same command file trigger some obscure
> OS or Nagios bug.
>
> I have reasons to think it's not #1, so to test #2 I wanted to run NSCA in
> single-process daemon mode. When I do this it gets the first passive check
> correctly and send_nsca fails on all other checks. Running strace I see that
> it blocks on the poll syscall after processing the first check, and send_nsca
> times out after 10 seconds.
>
> I'm running Nagios 2.0b3 on Slackware 10.1.0, Dual Athlon MP with 4G of ram,
> NSCA Version 2.6, Official & unpatched.
>
> Compiled with Gcc:
> Configured with: ../gcc-3.3.4/configure --prefix=/usr --enable-shared
> --enable-threads=posix --enable-__cxa_atexit --disable-checking
> --with-gnu-ld --verbose --target=i486-slackware-linux
> --host=i486-slackware-linux
> Thread model: posix
> gcc version 3.3.4
>
> Any thoughts on what's going wrong here?
>

Nagios' command-file is being filled up. It can only hold 4096 bytes
(hard OS limit on most unix-like systems) so with 100+ checks going off
at the same time you're lucky to get half of them written to the pipe
before it times out.
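
To see the effect on a given box, here is a rough Python sketch (not part of NSCA or Nagios) that fills a fresh pipe with non-blocking writes and reports how many bytes it takes before a write would block. The function names, the host and service names, and the sample line are made up for illustration; the line follows the PROCESS_SERVICE_CHECK_RESULT external command format. The measured capacity depends on the OS and kernel, so it may well differ from 4096 on your system.

```python
import os


def pipe_capacity(chunk: bytes = b"x" * 512) -> int:
    """Return how many bytes a fresh pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking: a full pipe raises BlockingIOError instead of hanging.
        os.set_blocking(write_fd, False)
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


def results_that_fit(capacity: int, line: bytes) -> int:
    """Number of whole command lines of this size that fit in `capacity` bytes."""
    return capacity // len(line)


# Hypothetical passive check result (host, service and output are invented).
SAMPLE = (
    b"[1111111111] PROCESS_SERVICE_CHECK_RESULT;"
    b"host01;Disk Usage;0;DISK OK - 42% used\n"
)

if __name__ == "__main__":
    cap = pipe_capacity()
    print("pipe capacity on this system:", cap, "bytes")
    print("sample lines fitting in 4096 bytes:", results_that_fit(4096, SAMPLE))
    print("sample lines fitting in this pipe:", results_that_fit(cap, SAMPLE))
```

With result lines of around 80 bytes, only about fifty fit in 4096 bytes, so anything beyond that has to wait until Nagios reads from the pipe.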

--
Andreas Ericsson [email protected]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]