> -----Original Message-----
> From: Andreas Ericsson [mailto:[email protected]]
> Sent: May 3, 2006 4:19
> To: Thomas Guyot-Sionnest
> Cc: [email protected]
> Subject: Re: [Nagios-devel] NSCA in standalone single-process daemon mode
>
> Thomas Guyot-Sionnest wrote:
> > Hi list,
> >
> > I'm running a big Nagios monitoring system which has about a hundred
> > remote passive checks reporting through NSCA. Lately, when I added more
> > passive checks, I noticed that the number of "Failed" checks (No results
> > received) increased (for most of the checks it's impossible to say whether
> > they ran or not).
> >
> > I'm currently running NSCA in inetd mode using D. J. Bernstein's tcpserver
> > program. Since most checks are run by Vixie Cron, and therefore run at
> > the exact same time, my two guesses were that either:
> >
> > 1. I'm jamming up the monitoring server for more than 10 seconds with all
> > the checks.
> >
> > Or
> >
> > 2. All NSCA processes writing to the same command file trigger some obscure
> > OS or Nagios bug.
> >
> > I have reasons to think it's not #1, so to test #2 I wanted to run NSCA in
> > single-process daemon mode. When I do this, it gets the first passive check
> > correctly and send_nsca fails on all other checks. Running strace, I see
> > that it blocks on the poll syscall after processing the first check, and
> > send_nsca times out after 10 seconds.
> >
> > I'm running Nagios 2.0b3 on Slackware 10.1.0, Dual Athlon MP with 4G of RAM,
> > NSCA Version 2.6, Official & unpatched.
> >
> > Compiled with Gcc:
> > Configured with: ../gcc-3.3.4/configure --prefix=/usr --enable-shared
> > --enable-threads=posix --enable-__cxa_atexit --disable-checking
> > --with-gnu-ld --verbose --target=i486-slackware-linux
> > --host=i486-slackware-linux
> > Thread model: posix
> > gcc version 3.3.4
> >
> > Any thoughts on what's going wrong here?
> >
>
> Nagios' command-file is being filled up. It can only hold 4096 bytes
> (hard OS limit on most unix-like systems) so with 100+ checks going off
> at the same time you're lucky to get half of them written to the pipe
> before it times out.
>
I doubt that's the case, since I have "command_check_interval=-1" and nsca
should just block when the pipe is full.
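As a side note on the 4096 figure: on Linux, 4096 bytes is PIPE_BUF, the
largest write that POSIX guarantees will reach a pipe or FIFO atomically. That
is a separate quantity from how much the pipe can buffer in total, which
depends on the kernel. A single command line no longer than PIPE_BUF should
therefore not get interleaved with lines from other writers. Below is a minimal
shell sketch for checking a line against that limit. The function name
fits_atomic and the sample line are only illustrations, and the 512 fallback is
the POSIX minimum for PIPE_BUF:

```shell
# Hypothetical helper: succeed if LINE (plus its trailing newline) fits in
# one atomic write to a pipe or FIFO, i.e. is no longer than PIPE_BUF bytes.
fits_atomic() {
    line=$1
    # Ask the system for PIPE_BUF; fall back to the POSIX minimum of 512.
    limit=$(getconf PIPE_BUF / 2>/dev/null) || limit=512
    # Byte count of the line as it would be written, newline included.
    # The arithmetic expansion strips any padding that wc may add.
    bytes=$(($(printf '%s\n' "$line" | wc -c)))
    [ "$bytes" -le "$limit" ]
}

# Illustrative command line (host and service names are placeholders).
sample='[1146644340] PROCESS_SERVICE_CHECK_RESULT;somehost;someservice;0;OK'
if fits_atomic "$sample"; then
    echo "fits in one atomic write"
else
    echo "may be split across writes"
fi
```

This only speaks to interleaving between concurrent writers; it says nothing
about whether a writer blocks or times out once the pipe is full.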
I noticed in the code that Nagios offloads the pipe into a circular buffer, and
from what I tested it seems that if this buffer fills up, Nagios starts
dropping commands. However, this only occurred when I was sending about 5
times the equivalent of what we currently send to Nagios. The way I was
testing was by running commands similar to this:
`(for ((i=0; i> /path/to/nagios.cmd`
Running that with i<500 just before the passive check results come in affects
a few checks. With i<1000 I get almost no checks in.
With 3000 it blocks on the pipe and takes significantly more time to run
(0m0.053s for the first 1000, 0m0.405s for the next 1000 and 0m33.856s for the
third 1000).
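The one-liner above did not survive archiving intact, so here is a rough
stand-in for that kind of flood test rather than the exact command that was
run. It assumes the standard external command format for passive service
results ([timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output).
The function name gen_flood and the host and service names floodhost and
floodsvcN are made up for illustration:

```shell
# Hypothetical helper: print COUNT passive service check results in Nagios'
# external command format, one per line, on standard output.
gen_flood() {
    count=$1
    now=$(date +%s)   # one timestamp shared by the whole batch
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '[%s] PROCESS_SERVICE_CHECK_RESULT;floodhost;floodsvc%d;0;flood test %d\n' \
            "$now" "$i" "$i"
        i=$((i + 1))
    done
}

# Harmless demonstration: print three lines to the terminal.
gen_flood 3

# To reproduce the timing test, redirect into the command file instead,
# using the same placeholder path as above:
#   time gen_flood 1000 >> /path/to/nagios.cmd
```

Running the redirect three times in a row with 1000 lines each would
correspond to the three timings quoted above, assuming nothing else is
writing to the command file at the same moment.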
I'd really like to try NSCA in standalone mode. Any idea why it stops working
after the first check kicks in?
Thanks,
Thomas Guyot
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]