
Re: [Nagios-devel] RFC: New IPC Method for Check Results

Posted: Wed Apr 11, 2007 1:46 pm
by Guest
Maildir is such a big win over mbox that I also agree that this is a
good idea. I'd suggest you pay attention to writer and reader locking. Do
something that works across many file system implementations (i.e. avoid
flock). I believe that some similar code uses the file mode or name to
indicate that a file is still being written. Of course, you then need to
clean up files that are too old, left over from crashed writers.
Roy
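
A minimal sketch of the name-based approach, in the spirit of Maildir: the writer uses a temporary name, then renames the finished file, so readers never see a half-written result and no flock is needed. The ".tmp-" prefix, the ".result" suffix, and all function names are assumptions made for illustration, not anything Nagios defines. It also assumes the temporary and final names live on the same file system, so that the rename is atomic.

```python
import os
import tempfile
import time


def write_result(queue_dir, payload):
    """Write payload under a temporary name, then rename it into place."""
    # Hypothetical convention: names starting with ".tmp-" are still being written.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=queue_dir)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
    except BaseException:
        os.unlink(tmp_path)
        raise
    # Final name built from time, pid and mkstemp's random suffix (an assumption).
    suffix = os.path.basename(tmp_path)[len(".tmp-"):]
    final_name = "%d-%d-%s.result" % (time.time_ns(), os.getpid(), suffix)
    final_path = os.path.join(queue_dir, final_name)
    os.rename(tmp_path, final_path)  # atomic within one file system
    return final_path


def complete_results(queue_dir):
    """Return only the names of fully written result files."""
    return sorted(n for n in os.listdir(queue_dir) if n.endswith(".result"))


def remove_stale_temp_files(queue_dir, max_age_seconds, now=None):
    """Delete temporary files left behind by crashed writers."""
    now = time.time() if now is None else now
    removed = []
    for name in os.listdir(queue_dir):
        if not name.startswith(".tmp-"):
            continue
        path = os.path.join(queue_dir, name)
        try:
            if now - os.stat(path).st_mtime > max_age_seconds:
                os.unlink(path)
                removed.append(name)
        except FileNotFoundError:
            pass  # another process cleaned it up first
    return removed
```

The age threshold has to be longer than the slowest legitimate writer, otherwise the cleanup could delete a file that is still in use.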

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Hendrik Bäcker
Sent: Wednesday, April 11, 2007 12:42 PM
To: Nagios Developers List
Subject: Re: [Nagios-devel] RFC: New IPC Method for Check Results

Ethan Galstad wrote:
> Proposed solution:
>
> The new method I am proposing is simple and straightforward. Why I
> didn't implement something like this years ago is beyond me. :-)
>
Because you just wanted to start your programming career with a pipe?
*just kidding*
> Instead of passing check results from child processes to the main Nagios
> process via two methods (pipe and file), I suggest that all information
> be written to files in a special check result queue directory (e.g.,
> var/checkresults). Child processes that perform host/service checks can
> write all results to a file in the queue directory. The main Nagios
> process will then periodically process all files/check results in the
> queue in a time-ordered fashion.
>
Some of you will remember my post about "a good way to handle performance
data", with a small discussion about pipes vs. "spooldirs".
In the current release of the PNP addon we have introduced a small
daemon that does exactly what you describe above.
A short digression: Nagios only writes files with perfdata and rotates
them every x seconds into a spool dir; the daemon reads the files and
processes them to fill the RRD files.
This solution brought me from a latency of around 350 seconds (~2000
service checks) down to 2-5 seconds.

Because of this I would say: this is the right way.
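
To illustrate the reading side of such a spool directory, here is a rough sketch of a loop body that processes all finished files oldest first and removes each one afterwards. It is not the PNP daemon's code; the function name, the handler callback, and the rule that dot-prefixed names are unfinished files are assumptions made for the example.

```python
import os


def drain_spool(spool_dir, handle_result):
    """Process every finished spool file in modification-time order."""
    entries = []
    with os.scandir(spool_dir) as it:
        for entry in it:
            # Assumed convention: dot-prefixed names are still being written.
            if entry.name.startswith(".") or not entry.is_file():
                continue
            entries.append((entry.stat().st_mtime_ns, entry.name, entry.path))
    entries.sort()  # oldest first; the name breaks ties deterministically
    for _, _, path in entries:
        with open(path) as handle:
            handle_result(handle.read())
        os.unlink(path)  # remove only after the handler returned
    return len(entries)
```

A daemon would call this every few seconds. Deleting the file only after the handler returns means a crash mid-run leaves the unprocessed files in place for the next pass.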
>
> Any performance hits that may occur with the new IPC method due to disk
> thrashing can be minimized if the queue directory is placed on a
> memory-mapped filesystem. Whether this will actually be necessary or
> not in all but the largest installations remains to be seen.
>
I would suggest keeping an eye on the number of files within a
directory. I know some people with a huge number of distributed Nagios
servers and a large number of service checks.
It might be bad if Nagios is down for hours and, on restarting, has to
process thousands of single files, if you are thinking of using one file
for each result.
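
One possible way to soften this, sketched under my own assumptions: count the backlog and let each pass pick only a bounded number of the oldest files, so a large pile is worked off in slices instead of blocking everything else. The function names and the per-pass limit are hypothetical.

```python
import heapq
import os


def _pending(queue_dir):
    """Yield (mtime, path) for every finished file in the queue directory."""
    with os.scandir(queue_dir) as it:
        for entry in it:
            # Assumed convention: dot-prefixed names are unfinished files.
            if entry.is_file() and not entry.name.startswith("."):
                yield (entry.stat().st_mtime_ns, entry.path)


def count_pending(queue_dir):
    """Number of files waiting, e.g. to log a warning above a threshold."""
    return sum(1 for _ in _pending(queue_dir))


def oldest_batch(queue_dir, limit):
    """Return at most `limit` paths, oldest first, without a full sort."""
    return [path for _, path in heapq.nsmallest(limit, _pending(queue_dir))]
```

This still scans the whole directory on every pass, so it limits the work done per pass, not the cost of listing a very large directory.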

Just my 2 Cents.

Kind regards
Hendrik

_______________________________________________
Nagios-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/lis ... gios-devel





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]