[Nagios-devel] Problems with extensive passive monitoring


Hi all,

In our environment we have a problem with extensive use of the passive
monitoring feature of Nagios.

Description in short:
---------------------

We have more than 250 clients, each of which runs its own Nagios server to
monitor itself. Each client runs up to 8 service checks and posts the
results as external commands via send_nsca/nsca to a master Nagios server,
which I call the cluster master Nagios server, or CMNS for short.

This CMNS is in turn a client of a site master Nagios server (SMNS for
short). The CMNS must forward its messages to the SMNS as external
commands, just as the clients do to the CMNS.
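
For readers unfamiliar with passive results: a service check result is
submitted to Nagios as a single PROCESS_SERVICE_CHECK_RESULT line in the
external command file. The following is only a minimal sketch of building
such a line; the function name and the host/service names in the usage
note are made up for illustration and are not part of our setup.

```python
import time


def format_passive_service_result(host, service, return_code, output,
                                  timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError(
            "return_code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) "
            "or 3 (UNKNOWN)")
    if timestamp is None:
        timestamp = int(time.time())
    # External commands are newline-terminated, so keep the plugin
    # output on a single line.
    output = output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)
```

For example, format_passive_service_result("node17", "load", 0,
"OK - load 0.42") yields one line that could be appended to the external
command file of the receiving server.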

With the built-in features of Nagios (we use version 2.5) you can use
send_nsca/nsca to forward messages from the CMNS to the SMNS too, but this
results in:
* heavy load on the CMNS, because at least one external send_nsca
command is forked to forward each message to the SMNS (in our
environment up to 1000 forks per minute).
* up to 1000 nsca processes per minute to deliver external command
messages from the clients to the CMNS.
* loss of incoming messages from the clients on the CMNS, because it
reads from the external command pipe for only 30 seconds at a time
and then pauses.
* child processes of the CMNS becoming children of `init', all of which
keep writing into the pipe that connects them to the Nagios master
process.
* these processes eating a lot of memory, so that a machine with 512MB
RAM and 2GB swap must be rebooted after 2 days or it hangs.

The whole description can be read on:
http://www.mountcup.de/tiki/tiki-index. ... monitoring

My solution
-----------
Instead of calling an external program (ocsp_command or ochp_command) for
each external command message to forward it from the CMNS to the SMNS, let
the Nagios process write these messages to a named pipe. The attached
patch provides this functionality for Nagios version 2.5.
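
To illustrate the writer side of this idea (this is not the patch itself,
which is C code inside Nagios), here is a small sketch of writing one
message to a named pipe without blocking. The function name
write_to_npipe and the choice to drop a message when no reader is
attached are assumptions made only for this example.

```python
import errno
import os


def write_to_npipe(path, line):
    """Try to write one message to a FIFO without blocking.

    Returns True if the whole line was written, False if no reader is
    attached or the pipe buffer is full.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        # ENXIO: no process currently has the FIFO open for reading.
        if exc.errno == errno.ENXIO:
            return False
        raise
    try:
        data = line.encode("ascii", "replace")
        try:
            written = os.write(fd, data)
        except BlockingIOError:
            # The pipe buffer is full; the reader is not keeping up.
            return False
        return written == len(data)
    finally:
        os.close(fd)
```

The point of the non-blocking open is that the writing process never
stalls or forks when the helper on the other end is down; it only pays
for one write per message.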

Then let a helper program on the CMNS read from this named pipe and
forward the messages through what I call a channel to wherever you want,
in this case to the SMNS. I have written a Perl program that does this; it
is attached as well.
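
The reader side can be sketched roughly as follows. This is not the
attached Perl program, only a minimal Python illustration of the same
idea; the name forward_from_npipe, the batch_size parameter and the
forward callable (standing in for the "channel") are hypothetical.

```python
def forward_from_npipe(path, forward, batch_size=50):
    """Read newline-terminated messages from a FIFO and forward them.

    `forward` is any callable that takes a list of message strings.
    Returns the number of messages forwarded once every writer has
    closed the FIFO.
    """
    total = 0
    batch = []
    # Opening a FIFO for reading blocks until a writer opens it.
    with open(path, "r") as pipe:
        for line in pipe:
            line = line.rstrip("\n")
            if not line:
                continue
            batch.append(line)
            if len(batch) >= batch_size:
                forward(batch)
                total += len(batch)
                batch = []
    # Flush whatever is left after the last writer has gone away.
    if batch:
        forward(batch)
        total += len(batch)
    return total
```

A long-running helper would call this in a loop so that the FIFO is
reopened after the writer closes it. Because messages are handed over in
batches, one long-lived process replaces the per-message fork.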

What do you think about the option to use named pipes in addition to
ocsp_command and/or ochp_command running as external processes?
The NDO interface can't be used in this case because there aren't any
connectors inside the code for external commands.

best regards
Mike

-----------------------------------------------------------------------------
Mike Becher [email protected]
Leibniz-Rechenzentrum der http://www.lrz.de
Bayerischen Akademie der Wissenschaften phone: +49-89-35831-8721
Gruppe Hochleistungssysteme fax: +49-89-35831-9700
Boltzmannstrasse 1
D-85748 Garching bei Muenchen
Germany
-----------------------------------------------------------------------------
[Attachment: nagios-2.5-ocxp_command_npipe.patch (base64-encoded; email
truncated in the archive)]


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]