Service check latency is too high

Anuraag · Post by **Anuraag** » Fri Apr 11, 2014 10:21 am

Hi,

My service check latency is far too high. The details are below.

Tactical Overview:

Service Check Execution Time: 0.20 / 180.08 / 2.073 sec
Service Check Latency: 0.79 / 8307.20 / 4817.833 sec
Host Check Execution Time: 0.00 / 124.14 / 36.090 sec
Host Check Latency: 0.00 / 10.78 / 0.099 sec
# Active Host / Service Checks: 109 / 3144
# Passive Host / Service Checks: 0 / 0

Preflight Nagios check:

Nagios 2.10
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 10-21-2007
License: GPL

HOST SCHEDULING INFORMATION
---------------------------
Total hosts: 109
Total scheduled hosts: 1
Host inter-check delay method: SMART
Average host check interval: 300.00 sec
Host inter-check delay: 300.00 sec
Max host check spread: 30 min
First scheduled check: Fri Apr 11 16:11:34 2014
Last scheduled check: Fri Apr 11 16:11:34 2014

SERVICE SCHEDULING INFORMATION
-------------------------------
Total services: 3144
Total scheduled services: 3144
Service inter-check delay method: SMART
Average service check interval: 784.37 sec
Inter-check delay: 0.25 sec
Interleave factor method: SMART
Average services per host: 28.84
Service interleave factor: 29
Max service check spread: 30 min
First scheduled check: Fri Apr 11 16:12:01 2014
Last scheduled check: Tue Apr 15 05:00:00 2014

CHECK PROCESSING INFORMATION
----------------------------
Service check reaper interval: 10 sec
Max concurrent service checks: Unlimited

PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.
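
For anyone reading along, the scheduling figures above can be cross-checked with a bit of arithmetic. This is only a sketch: the inputs are copied from the output in this post, the function names are made up for illustration, and the formulas are my reading of how the SMART delay relates to the other numbers (average interval divided by number of services), not something taken from the Nagios source. The concurrency figure is a plain rate-times-duration estimate.

```shell
#!/bin/sh
# Cross-check the service scheduling numbers from the preflight output.
# Function names are hypothetical; inputs come from the post above.

inter_check_delay() {
    # $1 = average service check interval (sec), $2 = total scheduled services
    awk -v interval="$1" -v total="$2" 'BEGIN { printf "%.2f\n", interval / total }'
}

checks_per_second() {
    # Rate at which checks must start to keep up with the schedule
    awk -v interval="$1" -v total="$2" 'BEGIN { printf "%.2f\n", total / interval }'
}

average_concurrency() {
    # $1 = checks started per second, $2 = average execution time (sec)
    awk -v rate="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", rate * secs }'
}

delay=$(inter_check_delay 784.37 3144)
rate=$(checks_per_second 784.37 3144)
busy=$(average_concurrency "$rate" 2.073)

echo "inter-check delay:    $delay sec"   # 0.25, matching the reported value
echo "required check rate:  $rate per sec"
echo "avg checks in flight: $busy"
```

These are averages only, so they do not by themselves explain a latency of several thousand seconds.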

Below are the relevant settings from nagios.cfg:

service_check_timeout=180
host_check_timeout=300
use_aggressive_host_checking=0
max_concurrent_checks=0
service_reaper_frequency=10

I do not understand where the problem lies. Could someone please help resolve the issue?
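
If it helps others post the same information, the settings listed above can be pulled straight out of the config file rather than retyped. A minimal sketch, assuming a hypothetical helper name `show_settings`; the path in the usage comment is the usual source-install location and may differ on your system.

```shell
#!/bin/sh
# Print only the check-related settings quoted in this post from a nagios.cfg.
# show_settings is a hypothetical helper, not part of Nagios.

show_settings() {
    # $1 = path to nagios.cfg; commented-out lines are skipped
    grep -E '^[[:space:]]*(service_check_timeout|host_check_timeout|use_aggressive_host_checking|max_concurrent_checks|service_reaper_frequency)[[:space:]]*=' "$1"
}

# Usage (path is an assumption):
#   show_settings /usr/local/nagios/etc/nagios.cfg
```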

sreinhardt · Post by **sreinhardt** » Fri Apr 11, 2014 2:16 pm

Firstly, I cannot stress enough that you should update the Nagios version here; 2.10 is several years old. Otherwise, what do other stats on your system look like, such as load, memory use, disk I/O and disk wait? How long has this been going on, and were any changes made around the time it started happening?

Code:

top -b -n 1 | head -n 1
free -m
df -h
df -i
iostat

Anuraag · Post by **Anuraag** » Fri Apr 11, 2014 4:11 pm

Yes, I agree that the version is very old, and we are planning to upgrade to Nagios XI.
I checked memory usage with nmon, and the stats show the system is pretty much idle. There is around 5 GB of free space. This started happening only yesterday, and I don't know why it suddenly began behaving like this. The server was handling pretty much the same load until the day before.
Another thing I have observed is that a check_ping against any host takes around 2 to 3 minutes to come back with PING OK. So I changed the host check interval from 60 sec to 300 sec; otherwise all the host checks were timing out after 60 sec.

slansing · Post by **slansing** » Mon Apr 14, 2014 9:59 am

Were you pinging an address on the same network as the server? Can you try that and let us know whether it returns reasonably quickly? Was the SSH session to the Core server sluggish, or was only the ping taking a long time to make it out? You could also try traceroutes out to your hosts.
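
One way to make that comparison concrete is to time the same ping against a local and a remote address. This is a rough sketch; `elapsed_seconds` and `compare_ping` are hypothetical helper names, and the addresses in the usage comment are placeholders to be replaced with your own hosts.

```shell
#!/bin/sh
# Time a command in whole seconds (wall clock), discarding its output.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Run "ping -c 4" against each target and report how long it took.
compare_ping() {
    for target in "$@"; do
        secs=$(elapsed_seconds ping -c 4 "$target")
        echo "$target: ping -c 4 took ${secs}s"
    done
}

# Only runs when targets are passed on the command line, for example:
#   sh compare_ping.sh 192.0.2.10 198.51.100.20   (placeholder addresses)
if [ "$#" -gt 0 ]; then
    compare_ping "$@"
fi
```

Four pings on a healthy local network should normally finish in a few seconds, so a result in the minutes range on a same-subnet address would point at the server rather than the path.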

Can you still provide the output of:

Code:

top -b -n 1 | head -n 1
free -m
df -h
df -i
iostat

Please? We'd rather see the details than an interpretation of them.
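
To make the output easy to paste in one go, something along these lines could gather it into a single file. It is a sketch only: `collect_stats` is a hypothetical helper, `uptime` stands in for the first line of top (it prints the same load-average line), and iostat may not be installed, in which case the failure is recorded instead.

```shell
#!/bin/sh
# Collect the requested system stats into one file for posting.
collect_stats() {
    # $1 = output file
    out=$1
    {
        for cmd in "uptime" "free -m" "df -h" "df -i" "iostat"; do
            echo "### $cmd"
            # $cmd is left unquoted on purpose so "free -m" splits into words
            $cmd 2>&1 || echo "(command failed or is not installed)"
            echo
        done
    } > "$out"
}

# Usage:
#   collect_stats /tmp/nagios-host-stats.txt
```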

Nagios Support Forum
