[Nagios-devel] Problems with extensive passive monitoring

Posted: Mon Oct 09, 2006 4:46 am
by Guest

--8323584-1729058409-1160397968=:14309
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hi all,

in our environment we have a problem with extensive use of the passive
monitoring feature of nagios.

Description in short:
---------------------

In our environment we have more than 250 clients, each of which runs
its own nagios server to monitor itself. Each client runs up to 8 service
checks and posts the results as external commands via send_nsca/nsca to a
master nagios server, which I call the cluster master nagios server, or
CMNS for short.

This CMNS is in turn a client of a site master nagios server (SMNS for
short). The CMNS must forward its messages to the SMNS as external
commands, just as the clients do to the CMNS.
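
For readers unfamiliar with the message format, here is a minimal sketch
of what one passive service check result looks like, both as a line in the
nagios external command file and as a tab-separated input line for
send_nsca. The function names and the sample host and service are made up
for illustration; they are not taken from the setup described here.

```python
import time


def external_command(host, service, return_code, output, timestamp=None):
    """Format one passive service check result as a nagios external command.

    Layout: [time] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def send_nsca_line(host, service, return_code, output):
    """Format the same result as one tab-separated input line for send_nsca."""
    return "\t".join([host, service, str(return_code), output]) + "\n"


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(external_command("node001", "load", 0, "OK - load 0.10"), end="")
    print(send_nsca_line("node001", "load", 0, "OK - load 0.10"), end="")
```

Every message that travels from a client to CMNS, and from CMNS on to
SMNS, is one such line.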

With the built-in features of nagios (we use version 2.5) you can use
send_nsca/nsca to forward messages from CMNS to SMNS too, but this results
in:
* heavy load on CMNS, because at least one external send_nsca command
is forked to forward each message to SMNS (in our environment up to
1000 forks per minute).
* up to 1000 nsca invocations per minute to deliver external command
messages from clients to CMNS
* loss of incoming messages from clients on CMNS, because it reads
from the external command pipe for only 30 seconds, then it pauses.
* child processes of CMNS become children of `init', and all of them
keep writing into the pipe that connects them to the nagios master
process.
* as a result they eat a lot of memory, so a machine with 512MB RAM and
2GB swap must be rebooted after 2 days, otherwise it hangs

The whole description can be read on:
http://www.mountcup.de/tiki/tiki-index. ... monitoring

My solution
-----------
Instead of calling an external program (ocsp_command or ochp_command) for
each external command message to forward it from CMNS to SMNS, let the
nagios process write these messages to a named pipe. The attached patch
provides this functionality for nagios version 2.5.

Then let a helper program on the CMNS side read from this named pipe and
forward the messages through what I call a channel to wherever you
want, in this case to SMNS. I have written a perl program that does this
for you; it is attached as well.
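
To illustrate the idea (this is not the attached perl program, only a
rough sketch in Python), a helper of this kind could look roughly like the
following. It reads lines from a named pipe, groups them into batches and
hands each batch to a `send' callback that stands in for the channel. The
names fifo_path, send and batch_size are assumptions made for this sketch.

```python
import os
import stat


def batches(lines, batch_size):
    """Group non-empty lines into lists of at most batch_size entries."""
    batch = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        batch.append(line)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        # Flush whatever is left when the input ends.
        yield batch


def forward(source, send, batch_size=100):
    """Read messages from source and pass them to send() batch by batch.

    Returns the number of messages forwarded.
    """
    count = 0
    for batch in batches(source, batch_size):
        send(batch)
        count += len(batch)
    return count


def run(fifo_path, send, batch_size=100):
    """Serve a named pipe forever (hypothetical entry point, not called here)."""
    if not os.path.exists(fifo_path):
        os.mkfifo(fifo_path, 0o600)
    elif not stat.S_ISFIFO(os.stat(fifo_path).st_mode):
        raise ValueError("%s exists but is not a named pipe" % fifo_path)
    while True:
        # open() blocks until a writer opens the pipe; iteration ends
        # (EOF) once all writers have closed it, then we reopen.
        with open(fifo_path, "r") as source:
            forward(source, send, batch_size)
```

The point of batching is that one forwarding step (for example a single
send_nsca call fed with many lines on stdin) replaces one fork per
message. Note that this sketch flushes a partial batch only at end of
input; a real helper would probably also flush on a timer so that
messages do not wait while the writer keeps the pipe open.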

What do you think about the option of using named pipes in addition to
running ocsp_command and/or ochp_command as external processes?
The NDO interface can't be used in this case because there aren't any
connectors inside the code for external commands.

best regards
Mike

-----------------------------------------------------------------------------
Mike Becher [email protected]
Leibniz-Rechenzentrum der http://www.lrz.de
Bayerischen Akademie der Wissenschaften phone: +49-89-35831-8721
Gruppe Hochleistungssysteme fax: +49-89-35831-9700
Boltzmannstrasse 1
D-85748 Garching bei Muenchen
Germany
-----------------------------------------------------------------------------
--8323584-1729058409-1160397968=:14309
Content-Type: TEXT/PLAIN; charset=US-ASCII;
name="nagios-2.5-ocxp_command_npipe.patch"
Content-Transfer-Encoding: BASE64
Content-ID:
Content-Description: nagios-2.5-ocxp_command_npipe.patch
Content-Disposition: attachment; filename="nagios-2.5-ocxp_command_npipe.patch"

ZGlmZiAtdSAtciAtTiBuYWdpb3MtMi41L2Jhc2UvY29uZmlnLmMgbmFnaW9z
LW1pYmUtMi41L2Jhc2UvY29uZmlnLmMNCi0tLSBuYWdpb3MtMi41L2Jhc2Uv
Y29uZmlnLmMJMjAwNS0xMi0yNyAwMDoxODoxNC4wMDAwMDAwMDAgKzAxMDAN
CisrKyBuYWdpb3MtbWliZS0yLjUvYmFzZS9jb25maWcuYwkyMDA2LTA5LTI2
IDA3OjM5OjU2LjAwMDAwMDAwMCArMDIwMA0KQEAgLTI3NzAsNiArMjc3MCwx
NCBAQA0KIAkJCXdyaXRlX3RvX2xvZ3NfYW5kX2NvbnNvbGUodGVtcF9idWZm
ZXIsTlNMT0dfVkVSSUZJQ0FUSU9OX0VSUk9SLFRSVUUpOw0KIAkJCWVycm9y
cysrOw0KIAkJICAgICAgICB9DQorICAgIGVsc2Ugew0KKwkgICAgaWYodmVy
aWZ5X2NvbmZpZz09VFJVRSl7DQorCSAgICAgIGNoYXIgcmF3X2NvbW1hbmRf
bGluZVtNQVhfQ09NTUFORF9CVUZGRVJdOw0KKwkJICAgIHByaW50ZigiIG9j
c3BfY29tbWFuZCBpcyBzZXQgdG8gXCIlc1wiXG

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]