high service check latency

raggmopp
Posts: 17
Joined: Fri Oct 28, 2011 8:42 am

high service check latency

Post by raggmopp »

Hi all:

Dell PE2950, 16GB ram, plenty of disk space, etc
Just upgraded to Nagios 3.3.1 from Nagios 3.2.3
MySQL 5.0.77
NDO2DB 1.4b9
RRDTool 1.4.5
NRPE 2.8.1

I have been using Nagios for a while (since Nagios 2.x) and have kept upgrading, most recently from 3.2.3. Every other upgrade went off without problems except this one to Nagios 3.3.1.

The Service Check Latency has jumped from being about 1 sec to 200+ seconds. I have searched for tuning tips and have made the following changes in nagios.cfg but with little effect.
max_concurrent_checks=100
check_result_reaper_frequency=15
max_check_result_reaper_time=25


The output below is the result of nagiostats.
Nagios Stats 3.3.1
Copyright (c) 2003-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 07-25-2011
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File: /usr/local/nagios/var/status.dat
Status File Age: 0d 0h 0m 11s
Status File Version: 3.3.1

Program Running Time: 0d 16h 27m 5s
Nagios PID: 6224
Used/High/Total Command Buffers: 0 / 3 / 4096

Total Services: 2023
Services Checked: 2023
Services Scheduled: 2020
Services Actively Checked: 2023
Services Passively Checked: 0
Total Service State Change: 0.000 / 9.870 / 0.020 %
Active Service Latency: 0.008 / 324.337 / 242.021 sec
Active Service Execution Time: 0.011 / 52.108 / 0.874 sec
Active Service State Change: 0.000 / 9.870 / 0.020 %
Active Services Last 1/5/15/60 min: 128 / 902 / 1850 / 1935
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2021 / 1 / 0 / 1
Services Flapping: 0
Services In Downtime: 0

Total Hosts: 152
Hosts Checked: 152
Hosts Scheduled: 28
Hosts Actively Checked: 152
Host Passively Checked: 0
Total Host State Change: 0.000 / 0.000 / 0.000 %
Active Host Latency: 0.000 / 471.193 / 284.723 sec
Active Host Execution Time: 0.007 / 0.162 / 0.044 sec
Active Host State Change: 0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min: 4 / 16 / 29 / 29
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 152 / 0 / 0
Hosts Flapping: 0
Hosts In Downtime: 0

Active Host Checks Last 1/5/15 min: 5 / 17 / 49
Scheduled: 5 / 16 / 44
On-demand: 0 / 1 / 5
Parallel: 5 / 16 / 46
Serial: 0 / 0 / 0
Cached: 0 / 1 / 3
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 179 / 939 / 2898
Scheduled: 179 / 939 / 2898
On-demand: 0 / 0 / 0
Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min: 0 / 0 / 0
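
For anyone who wants to watch these numbers over time rather than eyeball them, here is a minimal Python sketch that pulls the three-value lines out of output like the above. It assumes the three values are min / max / average, which is how the figures above read; the function names and the 60-second threshold are my own illustrative choices, not anything Nagios defines.

```python
import re

# Matches lines such as:
#   "Active Service Latency:  0.008 / 324.337 / 242.021 sec"
# Lines with four values or without a trailing unit are ignored.
TRIPLE = re.compile(
    r"^(?P<name>[^:]+):\s+"
    r"(?P<min>[\d.]+) / (?P<max>[\d.]+) / (?P<avg>[\d.]+) (?:sec|%)$"
)


def parse_triples(text):
    """Return {metric name: (min, max, avg)} for every three-value line."""
    metrics = {}
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            metrics[match.group("name")] = (
                float(match.group("min")),
                float(match.group("max")),
                float(match.group("avg")),
            )
    return metrics


def high_latency(metrics, threshold_sec=60.0):
    """Return latency metrics whose average exceeds threshold_sec (assumed cutoff)."""
    return {
        name: avg
        for name, (_min, _max, avg) in metrics.items()
        if name.endswith("Latency") and avg > threshold_sec
    }
```

Feeding it the saved text of the stats output and calling `high_latency(parse_triples(text))` would list only the latency lines whose average is above the chosen cutoff.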


I have been unable to find a cause or a solution. Has anybody else seen this?

Thanks
xvvivan
Posts: 8
Joined: Thu Oct 27, 2011 10:31 am
Location: Varese - Italy

Re: high service check latency

Post by xvvivan »

Hi,

In the past I had a similar problem and the cause was ndoutils.
Could you try disabling "broker_module" temporarily?
This test is only to identify a possible cause.

Regards

Ivan
raggmopp
Posts: 17
Joined: Fri Oct 28, 2011 8:42 am

Re: high service check latency

Post by raggmopp »

Tried disabling NDO and MySQL - no change. The nagios process cruises along and then, after some time, starts eating memory and spawning a bunch of children that become defunct, and %sys and the load average start going up. At that point, the Service Check Latency starts climbing dramatically.
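
In case it helps anyone chasing the same symptom, here is a rough Python sketch for counting defunct children of a given parent process. It assumes a Linux `ps` that accepts `-eo pid,ppid,stat,comm` and that a state beginning with `Z` marks a defunct process; the function names are hypothetical, and the parent PID would be whatever nagiostats reports as the Nagios PID.

```python
import subprocess


def defunct_children(ps_output, parent_pid):
    """Return PIDs of defunct (state Z) processes whose parent is parent_pid.

    ps_output is expected to have the columns PID, PPID, STAT, COMMAND.
    """
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue
        pid, ppid, stat = fields[0], fields[1], fields[2]
        # Skips the header row and anything else that is not numeric.
        if not (pid.isdigit() and ppid.isdigit()):
            continue
        if int(ppid) == parent_pid and stat.startswith("Z"):
            zombies.append(int(pid))
    return zombies


def ps_snapshot():
    """Capture the current process table (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `len(defunct_children(ps_snapshot(), nagios_pid))` every minute or so and logging it next to the latency figure should show whether the two rise together.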
mguthrie
Posts: 4380
Joined: Mon Jun 14, 2010 10:21 am

Re: high service check latency

Post by mguthrie »

Try:


max_concurrent_checks=0
check_result_reaper_frequency=5
max_check_result_reaper_time=15
If the latency problems still exist after this, post your entire nagios.cfg. There's either a bad setting that's blocking the loop that executes new checks, or your system is grossly underpowered for the checks that have been scheduled.
raggmopp
Posts: 17
Joined: Fri Oct 28, 2011 8:42 am

Re: high service check latency

Post by raggmopp »

Hi all:

Good news. I did a recompile and reinstall, and the Service Check Latency has fallen to 0.5 sec on average (from 600 sec) and is remaining stable.
My previous compile did not include the --with-perlcache option.

A recompile (with the perlcache option) and reinstall has made a dramatic improvement.

Many thanks!
AndersKarl
Posts: 1
Joined: Thu Nov 03, 2011 5:56 am

Re: high service check latency

Post by AndersKarl »

I tried that and it worked great. Thanks!