[Nagios-devel] QUERY: Obsessive-Compulsive Processors obsessing too much?


One of the things I ran into while tracking down my runaway process
issue is the handling of the obsessive-compulsive options. In
particular, while the service checks are run in parallel, the ocsp_command
is run in series (along with event handlers and so forth, all by
reap_service_checks). To illustrate, a set of 5 services may well be run
at the same time, taking T(service_run_time), but then the
ocsp_commands are run one after another, taking T(N*ocsp_command_time) :

Parallel Service        | | | | |                                  T
Check Execution         | | | | |    ( run_service_checks() )
                        \ \ | / /                                  I
Reaper                   \ \|/ /
Interval                  \ | /                                    M
                            V        ( reap_service_checks )
                           ---       ( my_system )                 E
Serial                     ---       ( my_system )
Obsessive-Compulsive       ---             |
Execution                  ---             V
                           ---

In the setup that I am working on, I have Nagios running at a rate of at
least 1 service check per second, with an ocsp_command to distribute the
results to other machines. Thus, every reaper_interval (10 seconds),
Nagios blocks while the ocsp_command runs to completion for each reaped
service check, one after another.
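
To put rough numbers on this, here is a minimal sketch of the arithmetic in C. The function names are made up for illustration and are not Nagios code, and the 0.5 second ocsp_command time used in the note afterwards is an assumed figure, not a measurement.

```c
/* Hypothetical helpers for illustration only; not part of Nagios. */

/* Results waiting at each reap: check rate times the reaper interval. */
double results_per_reap(double checks_per_second, double reaper_interval_s)
{
    return checks_per_second * reaper_interval_s;
}

/* Serial cost: each ocsp_command starts only after the previous one
 * has finished, so the total is N * ocsp_command_time. */
double serial_block_seconds(double results, double ocsp_command_s)
{
    return results * ocsp_command_s;
}
```

At 1 check per second and a 10 second reaper interval, results_per_reap(1.0, 10.0) gives 10 results per reap. If each ocsp_command took an assumed 0.5 seconds, serial_block_seconds(10.0, 0.5) gives 5 seconds spent blocked in every 10 second interval.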

Since the ocsp_command depends on TCP handshakes to complete, the
time it takes to finish is noticeably variable, and thus Nagios continually
gets later and later.

My query is: do I shift my distributed monitoring to be more batched and
run it off the periodic execution of
service_perfdata_file_processing_command, or do I change Nagios to run the
ocsp_command in a double fork like run_service_check()?
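
For reference, the double fork idea could look roughly like the sketch below. This is not the Nagios implementation: run_detached() is a hypothetical helper, and running the command through /bin/sh is an assumption made to keep the example short. The caller waits only for the short-lived first child, so a slow command no longer holds it up.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of Nagios: start `command` so that the
 * caller does not wait for it to finish. Returns 0 if the command was
 * handed off, -1 on failure to fork or wait. */
int run_detached(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* First child: fork again, then exit immediately. */
        pid_t grandchild = fork();
        if (grandchild < 0)
            _exit(1);
        if (grandchild == 0) {
            /* Grandchild: orphaned once the first child exits, so it is
             * reparented and reaped by init rather than by the caller. */
            execl("/bin/sh", "sh", "-c", command, (char *)NULL);
            _exit(127); /* only reached if exec failed */
        }
        _exit(0);
    }

    /* Parent: reap the first child, which exits almost at once. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The trade-off is that the caller never sees the command's exit status or output, so a failed ocsp_command would have to report its own errors some other way.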

--==--
Bruce.





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]