Service check latency is too high

Anuraag
Posts: 4
Joined: Fri Feb 21, 2014 11:46 am

Service check latency is too high

Post by Anuraag »

Hi,

My service check latency is far too high. The relevant output is below.

Tactical Overview (min / max / average):

Service Check Execution Time: 0.20 / 180.08 / 2.073 sec
Service Check Latency: 0.79 / 8307.20 / 4817.833 sec
Host Check Execution Time: 0.00 / 124.14 / 36.090 sec
Host Check Latency: 0.00 / 10.78 / 0.099 sec
# Active Host / Service Checks: 109 / 3144
# Passive Host / Service Checks: 0 / 0


Nagios preflight check:

Nagios 2.10
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 10-21-2007
License: GPL

HOST SCHEDULING INFORMATION
---------------------------
Total hosts: 109
Total scheduled hosts: 1
Host inter-check delay method: SMART
Average host check interval: 300.00 sec
Host inter-check delay: 300.00 sec
Max host check spread: 30 min
First scheduled check: Fri Apr 11 16:11:34 2014
Last scheduled check: Fri Apr 11 16:11:34 2014


SERVICE SCHEDULING INFORMATION
-------------------------------
Total services: 3144
Total scheduled services: 3144
Service inter-check delay method: SMART
Average service check interval: 784.37 sec
Inter-check delay: 0.25 sec
Interleave factor method: SMART
Average services per host: 28.84
Service interleave factor: 29
Max service check spread: 30 min
First scheduled check: Fri Apr 11 16:12:01 2014
Last scheduled check: Tue Apr 15 05:00:00 2014


CHECK PROCESSING INFORMATION
----------------------------
Service check reaper interval: 10 sec
Max concurrent service checks: Unlimited


PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.

Below are the relevant settings from nagios.cfg:

service_check_timeout=180
host_check_timeout=300
use_aggressive_host_checking=0
max_concurrent_checks=0
service_reaper_frequency=10

I do not understand where the problem lies. Could someone please help me resolve the issue?
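
As a rough sanity check on the figures above, the reported inter-check delay and the size of the backlog can be reproduced with a little arithmetic. This is only a sketch: it assumes the SMART method divides the average service check interval by the number of scheduled services, and it takes the third Tactical Overview figure as the average latency. The variable names are illustrative.

```shell
#!/bin/sh
# Figures copied from the preflight and Tactical Overview output above.
avg_interval=784.37     # Average service check interval (sec)
total_services=3144     # Total scheduled services
avg_latency=4817.833    # Average service check latency (sec)

# Assumed SMART rule: spread one pass over all services across the average interval.
delay=$(awk -v i="$avg_interval" -v n="$total_services" 'BEGIN { printf "%.2f", i / n }')

# How many average check intervals the checks are running behind schedule.
behind=$(awk -v l="$avg_latency" -v i="$avg_interval" 'BEGIN { printf "%.1f", l / i }')

echo "inter-check delay: ${delay} sec"
echo "latency / interval: ${behind}"
```

The first value matches the 0.25 sec reported by the preflight check. A ratio well above 1 for the second means that checks are, on average, starting several intervals later than scheduled.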
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Service check latency is too high

Post by sreinhardt »

Firstly, I cannot stress enough that you should update your Nagios version. 2.10 dates from October 2007, so it is more than six years old. Beyond that, what do the other stats on your system look like, such as load, memory use, disk I/O and I/O wait? How long has this been going on, and were any changes made around the time it started happening?

Code:

top -b -n 1 | head -n 1
free -m
df -h
df -i
iostat
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
Anuraag
Posts: 4
Joined: Fri Feb 21, 2014 11:46 am

Re: Service check latency is too high

Post by Anuraag »

Yes, I agree that the version is very old, and we are planning to upgrade to Nagios XI.
I have checked memory usage with nmon and the stats say that it is pretty much idle. There is around 5 GB of free space. This has started happening only since yesterday and I don't know why it has suddenly started behaving like this. The server was handling pretty much the same load until the day before.
Another thing I have observed is that when running check_ping against any host, it takes around 2 to 3 minutes to come back with the output that the PING is OK. So I have changed the host check interval from 60 sec to 300 sec. Otherwise all the host checks were timing out after 60 sec.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Service check latency is too high

Post by slansing »

Were you pinging an address on the same network as the Nagios server? Can you try doing this and let us know if it returns reasonably quickly? Was there sluggishness in the SSH session to the Core server, or was the ping just taking a long time to make it out? You could also try traceroutes out to your hosts.
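
One way to put numbers on this is to time a plain ping and the plugin against the same address. The sketch below is a minimal example under stated assumptions: `elapsed` and `compare_ping` are hypothetical helper names, the plugin path is a guess at a common install location, and the warning and critical thresholds are arbitrary.

```shell
#!/bin/sh
# elapsed CMD...: run a command, discard its output, print wall-clock seconds taken.
elapsed() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Assumed plugin location; adjust to wherever check_ping is installed.
PLUGIN=${PLUGIN:-/usr/local/nagios/libexec/check_ping}

# compare_ping HOST: time a plain 4-packet ping, then the plugin, against HOST.
compare_ping() {
    echo "ping:       $(elapsed ping -c 4 "$1") sec"
    echo "check_ping: $(elapsed "$PLUGIN" -H "$1" -w 100.0,20% -c 500.0,60% -p 4) sec"
}
```

Run it as `compare_ping <address>` once with an address on the server's own subnet and once with a remote host. If the plain ping is quick but the plugin is slow, the delay is more likely on the Nagios server than on the network.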

Can you still provide the output of:

Code:

top -b -n 1 | head -n 1
free -m
df -h
df -i
iostat
Please? We'd rather see the details than an interpretation of them.
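
To make the raw output easy to paste into a reply, the commands can be wrapped in a small script that labels each section. This is a sketch, not an official tool: `run_section` and `collect_stats` are hypothetical names, and `uptime` is swapped in for the first line of `top` because it reports the same load averages without needing a pipe.

```shell
#!/bin/sh
# run_section CMD...: print a header, then the command's raw output,
# or a short note if the command is not installed.
run_section() {
    echo "===== $* ====="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(command not found: $1)"
    fi
    echo
}

# collect_stats: gather the requested system statistics in one pass.
collect_stats() {
    run_section uptime      # load averages (substitute for: top | head -n 1)
    run_section free -m     # memory use in MB
    run_section df -h       # disk space
    run_section df -i       # inode use
    run_section iostat      # disk I/O and I/O wait (sysstat package)
}
```

For example, `collect_stats > nagios-stats.txt` writes everything to a single file that can be attached or pasted as-is.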