Re: [Nagios-devel] RFC: New IPC Method for Check Results


Hendrik Bäcker wrote:
> Ethan Galstad wrote:
>> Proposed solution:
>>
>> The new method I am proposing is simple and straightforward. Why I
>> didn't implement something like this years ago is beyond me. :-)
>>
> Because you just wanted to start your programming career with a pipe??
> *just kidding*
>> Instead of passing check results from child processes to the main Nagios
>> process via two methods (pipe and file), I suggest that all information
>> be written to files in a special check result queue directory (e.g.,
>> var/checkresults). Child processes that perform host/service checks can
>> write all results to a file in the queue directory. The main Nagios
>> process will then periodically process all files/check results in the
>> queue in a time-ordered fashion.
>>
> Some of us will remember my post about "a good way to handle performance
> data", with a small discussion about pipes vs. "spooldirs"?!
> In the current release of the PNP addon we have introduced a small
> daemon that does exactly what you describe above.
> A short digression: Nagios only writes files with perfdata and rotates
> them every x seconds into a spool dir; the daemon reads the files and
> processes them to fill the RRD files.
> This solution brought me from a latency of around 350 seconds (~2000
> service checks) down to 2-5 seconds.

Good to hear that you saw such improvements. Hopefully this will have
similar effects for passive checks...
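
For illustration only, here is a minimal Python sketch of the writer side of
such a queue directory, as a child process might use it. The function name
write_check_result, the key=value file layout, and the "check-" file prefix
are all assumptions made for this example, not the format Nagios uses. The
temp-file-then-rename step is also my own addition, so that a reader never
picks up a half-written file.

```python
import os
import tempfile
import time


def write_check_result(queue_dir, host, service, return_code, output):
    """Write one check result into queue_dir and return the final path.

    Hypothetical layout: one key=value pair per line.
    """
    os.makedirs(queue_dir, exist_ok=True)

    # Write to a hidden temp file in the same directory first, so the
    # final rename stays on one filesystem and is atomic on POSIX.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=queue_dir)
    with os.fdopen(fd, "w") as f:
        f.write(f"finish_time={time.time():.6f}\n")
        f.write(f"host_name={host}\n")
        f.write(f"service_description={service}\n")
        f.write(f"return_code={return_code}\n")
        # Keep the plugin output on a single line.
        escaped = output.replace("\n", "\\n")
        f.write(f"output={escaped}\n")

    # Publish the file under its final name; readers skip dot-files.
    unique = os.path.basename(tmp_path)[len(".tmp-"):]
    final_path = os.path.join(queue_dir, "check-" + unique)
    os.replace(tmp_path, final_path)
    return final_path
```

A child process would call something like
write_check_result("var/checkresults", "web01", "HTTP", 0, "OK") once the
plugin has finished.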

>
> Because of this I would say: this is the right way.
>> Any performance hits that may occur with the new IPC method due to disk
>> thrashing can be minimized if the queue directory is placed on a
>> memory-backed filesystem (e.g., a RAM disk). Whether this will actually
>> be necessary in all but the largest installations remains to be seen.
>>
> I would suggest keeping an eye on the number of files within a
> directory. I know some people with a huge number of distributed Nagios
> servers and a large number of service checks.
> It might be bad if Nagios is down for hours and, on restart, has to
> process thousands of single files, if you are thinking of using one file
> for each result.

I'll make sure that multiple results can be stored in a single file
(ideal for bulk transfers using NSCA). A configurable option will allow
Nagios to process only results made within a certain timeframe. I think
that should take care of it.
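
As a rough sketch of the reader side, the Python below drains a queue
directory, accepts several results per file, drops results older than a
configurable age, and returns the rest in time order. Everything specific
here is an assumption for illustration: the names read_results and
process_queue, the max_age_seconds parameter, and the file layout (key=value
lines, with a blank line between results). It is not the format Nagios
actually uses.

```python
import os
import time


def read_results(path):
    """Parse one queue file into a list of dicts (hypothetical layout)."""
    results, current = [], {}
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                # A blank line ends the current result.
                if current:
                    results.append(current)
                    current = {}
                continue
            key, _, value = line.partition("=")
            current[key] = value
    if current:
        results.append(current)
    return results


def process_queue(queue_dir, max_age_seconds, now=None):
    """Drain queue_dir and return fresh results, oldest first."""
    now = time.time() if now is None else now
    accepted = []
    for name in os.listdir(queue_dir):
        if name.startswith("."):
            continue  # skip files that are still being written
        path = os.path.join(queue_dir, name)
        for result in read_results(path):
            # Discard results outside the configured timeframe.
            if now - float(result["finish_time"]) <= max_age_seconds:
                accepted.append(result)
        os.remove(path)
    # Time-ordered processing: sort by when each check finished.
    accepted.sort(key=lambda r: float(r["finish_time"]))
    return accepted
```

With this layout a bulk transfer could simply append many results to one
file, which keeps the number of files in the directory small.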

>
> Just my 2 Cents.
>
> Kind regards
> Hendrik
>



Ethan Galstad,
Nagios Developer
---
Email: [email protected]
Website: http://www.nagios.org





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]