On Tue, Aug 23, 2011 at 5:48 PM, Mark Goldfinch wrote:
> On this particular point, the overall system CPU statistics displayed at
> the top of "top" are an average across all CPUs. As previously mooted,
> Nagios core isn't multi-threaded, so it can only max a single core. 100%
> of 1/8 CPUs == 12.5% hence why you're seeing 87.5% idle time, 7 of your
> cores are not stressed out.
Nagios forks a new process to execute each check, so it will take
advantage of multiple cores as long as the kernel scheduler is working
properly :p On our biggest pollers we see 300-400 checks running in
parallel at any given time during the polling cycle.
I wrote some blog posts about Nagios performance that might help (some
of the topics have already been covered):
http://www.semintelligent.com/blog/?q=Performance
We found that setting the host and service inter-check delay methods to
'n' (no delay) made a big difference. Changing sleep time to 0.02 and
compiling Nagios with nanosleep enabled helps a lot as well. We also
added a few patches to remove hard-coded sleep statements in the code
that were causing Nagios to sleep more than we wanted.
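As a rough sketch, the settings described above would look something like this in nagios.cfg. The directive names are the standard Nagios Core 3.x ones; the values are the ones mentioned in this post. This assumes a source build, and the configure flag named in the comment is my reading of "nanosleep enabled", so check it against your version. The patches that remove hard-coded sleeps are not shown.

```
# nagios.cfg (excerpt) - values taken from the post above
# 'n' = no delay between scheduling of consecutive checks
service_inter_check_delay_method=n
host_inter_check_delay_method=n

# Sub-second sleep between scheduler loop iterations; assumed to need
# Nagios built with nanosleep support (./configure --enable-nanosleep)
sleep_time=0.02
```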
Right now on an HP DL385 (quad core, 8 GB of RAM) we max out at about
10k checks (host and service checks combined) per 5 minutes with a
sustained service check latency of 2-3 seconds.
We have latency requirements that are very specific to our environment
- we keep all pollers at less than 10 secs service latency at all
times.
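For a sense of scale, the 10k checks per 5 minutes figure can be turned into an average per-second rate. This is only back-of-the-envelope arithmetic; the function name is made up for illustration and it assumes checks are spread evenly across the interval.

```python
def checks_per_second(checks: int, window_minutes: float) -> float:
    """Average rate needed to complete `checks` within the window."""
    return checks / (window_minutes * 60)


# About 10k host + service checks per 5-minute polling cycle
rate = checks_per_second(10_000, 5)
print(f"{rate:.1f} checks/sec")
```

That works out to roughly 33 checks started per second on average, sustained across the whole cycle.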
- Max
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: maxs@webwizarddesign.com