Re: [Nagios-devel] RFC: New IPC Method for Check Results

Post by **Guest** » Wed Apr 11, 2007 8:17 pm

Hendrik Bäcker wrote:
> Ethan Galstad wrote:
>> Proposed solution:
>>
>> The new method I am proposing is simple and straightforward. Why I
>> didn't implement something like this years ago is beyond me.
>>
> Because you just wanted to start your programming career with a pipe? *just
> kidding*
>> Instead of passing check results from child processes to the main Nagios
>> process via two methods (pipe and file), I suggest that all information
>> be written to files in a special check result queue directory (e.g.,
>> var/checkresults). Child processes that perform host/service checks can
>> write all results to a file in the queue directory. The main Nagios
>> process will then periodically process all files/check results in the
>> queue in a time-ordered fashion.
>>
> Some of you may remember my post about "a good way to handle performance
> data", with its small discussion about pipes vs. "spooldirs".
> In the current release of the PNP addon we have introduced a small
> daemon that does exactly what you describe above.
> A short digression: Nagios only writes files with perfdata and rotates them
> every x seconds into a spool dir; the daemon reads the files and processes
> them to fill the RRD files.
> This solution brought me from a latency of around 350 seconds (~2000
> service checks) down to 2-5 seconds.

Good to hear that you saw such improvements. Hopefully this will have
similar effects for passive checks...

>
> Because of this I would say: this is the right way.
>> Any performance hits that may occur with the new IPC method due to disk
>> thrashing can be minimized if the queue directory is placed on a
>> memory-backed filesystem (e.g., a RAM disk). Whether this will actually
>> be necessary in all but the largest installations remains to be seen.
>>
> I would suggest keeping an eye on the number of files within a
> directory. I know some people with a huge number of distributed Nagios
> servers and a large number of service checks.
> It might be bad if Nagios is down for hours and, on restarting, has to
> process thousands of single files, if you are thinking of using one file
> per result.

I'll make sure that multiple results can be stored in a single file
(ideal for bulk transfers using NSCA). A configurable option will allow
Nagios to process only results made within a certain timeframe. I think
that should take care of it.
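
To make the idea concrete, here is a minimal sketch of such a queue directory in Python. It is only an illustration under stated assumptions, not the actual Nagios implementation: the function names (write_results, process_queue), the "check-" file prefix, the tab-separated record layout, and the max_age parameter are all hypothetical. It assumes plugin output contains no newlines. Each file may hold several results, stale results are skipped, and everything that remains is handled in timestamp order.

```python
import os
import tempfile
import time
from pathlib import Path


def write_results(queue_dir, results):
    """Write a batch of check results to a single file in the queue directory.

    Each result is a (timestamp, host, service, return_code, output) tuple.
    The file is written under a hidden temporary name and renamed afterwards,
    so the reader never picks up a half-written file.
    """
    fd, tmp_name = tempfile.mkstemp(dir=queue_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as fh:
        for timestamp, host, service, code, output in results:
            fh.write(f"{timestamp}\t{host}\t{service}\t{code}\t{output}\n")
    tmp_path = Path(tmp_name)
    # Hypothetical naming scheme: ".tmp-XXXX" becomes "check-XXXX".
    final_path = tmp_path.with_name("check-" + tmp_path.name[len(".tmp-"):])
    os.rename(tmp_path, final_path)
    return final_path


def process_queue(queue_dir, max_age, now=None):
    """Read every finished file in the queue and return results oldest first.

    Results older than max_age seconds are dropped. Processed files are
    deleted so the directory does not grow without bound.
    """
    now = time.time() if now is None else now
    results = []
    for path in Path(queue_dir).glob("check-*"):
        with open(path) as fh:
            for line in fh:
                # Split on the first four tabs only, so output may contain tabs.
                timestamp, host, service, code, output = (
                    line.rstrip("\n").split("\t", 4)
                )
                if now - float(timestamp) <= max_age:
                    results.append(
                        (float(timestamp), host, service, int(code), output)
                    )
        path.unlink()
    results.sort(key=lambda r: r[0])  # time-ordered processing
    return results
```

A child process (or an NSCA-style bulk transfer) would call write_results once per batch, and the main process would call process_queue periodically. Writing to a temporary name and renaming is one common way to avoid reading partial files; whether that is needed depends on how the real writer works.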

>
> Just my 2 Cents.
>
> Kind regards
> Hendrik
>

Ethan Galstad,
Nagios Developer
---
Email: [email protected]
Website: http://www.nagios.org

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]