
[Nagios-devel] ocsp slows nagios a great deal

Posted: Sun Aug 13, 2006 11:42 am
by Guest
I'm in the process of trying to set up a distributed nagios
environment monitoring about 9,000 services on 2,500 hosts. I'm using
Sunfire V210 servers running Solaris 10.

I've found that the distributed servers which monitor the active
services can run about 1700 checks every 5 minutes if ocsp isn't
enabled, but once I enable obsess_over_services, the number of active
checks I can do goes WAY down. Is this known behavior? It seems like
it might be a bug. Why would it be part of the design to lose so much
performance per server when implementing distributed monitoring?

Here's a breakdown:

- ocsp disabled: 1800 checks / 5 min.

- ocsp command set to /bin/true: 1200 checks / 5 min.

- ocsp command set to a perl program that forks, then pipes output to
send_nsca: 800 checks / 5 min.

- ocsp command set to a shell program that pipes output to send_nsca
(this is the "submit_check_result_via_nsca" script that came with
the nagios distribution): 500 checks / 5 min.
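
To put those figures in relative terms, here is a minimal Python sketch. It is
not part of the setup described here; the labels and the helper name
`summarize` are made up for illustration. It converts each measurement to
checks per second and to a share of the no-ocsp baseline:

```python
# Figures copied from the breakdown above: checks completed per 5-minute window.
WINDOW_SECONDS = 5 * 60

# Labels are shorthand for the four cases listed above (hypothetical names).
measurements = {
    "ocsp disabled": 1800,
    "ocsp command = /bin/true": 1200,
    "perl wrapper to send_nsca": 800,
    "shell wrapper to send_nsca": 500,
}


def summarize(results, baseline_key="ocsp disabled"):
    """Return (label, checks per second, percent of baseline) for each case."""
    baseline = results[baseline_key]
    rows = []
    for label, checks in results.items():
        rate = checks / WINDOW_SECONDS      # average checks per second
        share = 100.0 * checks / baseline   # throughput relative to baseline
        rows.append((label, round(rate, 2), round(share, 1)))
    return rows


if __name__ == "__main__":
    for label, rate, share in summarize(measurements):
        print(f"{label:28s} {rate:5.2f} checks/s  {share:5.1f}% of baseline")
```

On these numbers, even a no-op /bin/true handler leaves about two thirds of
the baseline rate, and the shell wrapper leaves under a third.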

What's the deal? I've followed the instructions in the "performance
tuning" section of the manual, but nothing seems to help much, and I
don't know what else to check. Resources on the machines are not being
fully utilized: there's about 30% free CPU at any given time, and
plenty of RAM (only 500 MB used of 2 GB). Any help would be much
appreciated!

Solaris 10 is fully patched with recommended updates from last week.
I'm running Nagios 2.5 and it's configured like this:

--with-perlcache \
--enable-embedded-perl \
--enable-nanosleep \
--with-gd-inc=$GD_INC_PATH \
--with-gd-lib=$GD_LIB_PATH

Thanks,
Loren


--
loren jan wilson
network engineering, uchicago.edu
1155 rm. 327 ; 773/702-8189





This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]