Latency Issue

delboy1966
Posts: 94
Joined: Thu Oct 22, 2015 5:26 am

Post by delboy1966 »

[Attachment: Capture.PNG]
I am seeing some high check latency and can't work out what is causing it.

I have a main Nagios server and 4 worker nodes running mod_gearman.

Specs are:
Nagios 4.3.2 core
gearmand-0.25-1
Nagvis 1.8.5
Livestatus 1.2.7i3p2

Running on a VM, which it has been running on for over 2 years now.
64G RAM
6 x CPU

The process queue is about 30 minutes behind the current time.
Here is the output of running nagios with the -s switch:


OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
----------------------------------
Read: 0.293804 sec
Resolve: 0.008414 sec *
Recomb Contactgroups: 0.001341 sec *
Recomb Hostgroups: 0.002187 sec *
Dup Services: 0.021718 sec *
Recomb Servicegroups: 1.774507 sec *
Duplicate: 0.000001 sec *
Inherit: 0.004745 sec *
Register: 0.070415 sec
Free: 0.014057 sec
============
TOTAL: 2.191189 sec * = 0.453228 sec (20.68%) estimated savings


Timing information on configuration verification is listed below.

CONFIG VERIFICATION TIMES
----------------------------------
Object Relationships: 0.009689 sec
Circular Paths: 0.000150 sec
Misc: 0.000084 sec
============
TOTAL: 0.009923 sec


RETENTION DATA TIMES
----------------------------------
Read and Process: 0.659380 sec
============
TOTAL: 0.659380 sec


EVENT SCHEDULING TIMES
-------------------------------------
Get service info: 0.048644 sec
Get host info info: 0.006077 sec
Get service params: 0.000009 sec
Schedule service times: 0.100407 sec
Schedule service events: 0.016305 sec
Get host params: 0.000000 sec
Schedule host times: 0.015831 sec
Schedule host events: 0.002078 sec
============
TOTAL: 0.189351 sec


Projected scheduling information for host and service checks
is listed below. This information assumes that you are going
to start running Nagios with your current config files.

HOST SCHEDULING INFORMATION
---------------------------
Total hosts: 2555
Total scheduled hosts: 2553
Host inter-check delay method: SMART
Average host check interval: 600.00 sec
Host inter-check delay: 0.24 sec
Max host check spread: 40 min
First scheduled check: Sat Sep 16 07:36:27 2017
Last scheduled check: Sat Sep 16 07:46:26 2017


SERVICE SCHEDULING INFORMATION
-------------------------------
Total services: 18188
Total scheduled services: 18170
Service inter-check delay method: SMART
Average service check interval: 634.61 sec
Inter-check delay: 0.03 sec
Interleave factor method: SMART
Average services per host: 7.12
Service interleave factor: 8
Max service check spread: 40 min
First scheduled check: Sat Sep 16 07:36:27 2017
Last scheduled check: Wed Sep 20 10:05:31 2017


CHECK PROCESSING INFORMATION
----------------------------
Average check execution time: 7.69s
Estimated concurrent checks: 354 (59.00 per cpu core)
Max concurrent service checks: Unlimited


PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.
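
For a rough sense of the load these figures imply, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate using Little's law (checks in flight = check rate x execution time). It does not try to reproduce how Nagios derives its own "Estimated concurrent checks" figure, and the variable and function names are made up for illustration.

```python
# Figures copied from the `nagios -s` output above.
total_scheduled_services = 18170
avg_service_interval_s = 634.61
total_scheduled_hosts = 2553
avg_host_interval_s = 600.00
avg_check_execution_s = 7.69


def checks_per_second(count: int, interval_s: float) -> float:
    """Steady-state rate needed to run every check once per interval."""
    return count / interval_s


def checks_in_flight(rate_per_s: float, exec_time_s: float) -> float:
    """Little's law: average checks running at once = rate * time per check."""
    return rate_per_s * exec_time_s


service_rate = checks_per_second(total_scheduled_services, avg_service_interval_s)
host_rate = checks_per_second(total_scheduled_hosts, avg_host_interval_s)
total_rate = service_rate + host_rate
in_flight = checks_in_flight(total_rate, avg_check_execution_s)

print(f"service checks per second: {service_rate:.1f}")
print(f"host checks per second:    {host_rate:.1f}")
print(f"checks in flight (approx): {in_flight:.0f}")
```

With the numbers above this works out to roughly 33 checks per second and about 250 checks running at any moment. If the combined worker slots on the mod_gearman nodes are well below that, the queue would be expected to fall behind, though this alone does not prove that is the cause here.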
dwhitfield
Former Nagios Staff
Posts: 4583
Joined: Wed Sep 21, 2016 10:29 am
Location: NoLo, Minneapolis, MN

Re: Latency Issue

Post by dwhitfield »

delboy1966 wrote:
Nagios 4.3.2 core

Running on a VM, which it has been running on for over 2 years now.

Can you tell us a bit about the upgrade history? When did you upgrade to 4.3.2? Did the issue start after that?

I don't see anything that would obviously fix this, but 4.3.4 is out, so you might try upgrading and seeing if the issue remains: https://github.com/NagiosEnterprises/na ... /Changelog
delboy1966
Posts: 94
Joined: Thu Oct 22, 2015 5:26 am

Re: Latency Issue

Post by delboy1966 »

I have now upgraded to 4.3.4, and the issue still seems to remain.
When I reload Nagios, the scheduling queue shows the next check time as the current time, but then it falls behind again.
Check latency is still around 40 minutes.

Tony
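
Since Livestatus is part of this setup, one way to put a number on the latency is a Stats query against the services table. The sketch below is an illustration only: the socket path is an assumption (it depends on how the broker module is configured), and the helper names are invented for this example.

```python
import json
import socket

# Assumed path; use whatever the livestatus broker_module line points at.
LIVESTATUS_SOCKET = "/usr/local/nagios/var/rw/live"


def build_latency_query(table: str = "services") -> str:
    """Build a Livestatus query for average and maximum check latency."""
    return (
        f"GET {table}\n"
        "Stats: avg latency\n"
        "Stats: max latency\n"
        "OutputFormat: json\n"
        "\n"
    )


def parse_latency_response(body: str) -> dict:
    """A Stats-only query returns one row: [[avg, max]]."""
    avg, maximum = json.loads(body)[0]
    return {"avg_latency_s": float(avg), "max_latency_s": float(maximum)}


def query_livestatus(query: str, path: str = LIVESTATUS_SOCKET) -> str:
    """Send one query over the UNIX socket and return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal end of query
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

On the Nagios host this could be used as `parse_latency_response(query_livestatus(build_latency_query()))`, and run a few times after a reload to watch how quickly the latency climbs.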
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Latency Issue

Post by scottwilkerson »

Could you share your nagios.cfg?

Also, do you have a lot of checks queued in your gearman queue? Are there enough workers to process the checks?
Former Nagios employee
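
One way to answer the queue question is to ask gearmand directly. Its text admin protocol accepts a `status` command and replies with one tab-separated line per queue (function name, total jobs, running jobs, available workers), ending with a line containing a single dot. The sketch below assumes gearmand listens on the default port 4730 on localhost; the function and class names are made up for this example, and the sample data in the usage note is invented.

```python
import socket
from typing import List, NamedTuple


class QueueStatus(NamedTuple):
    name: str
    total: int      # jobs in the queue
    running: int    # jobs currently being worked on
    workers: int    # workers registered for this queue


def parse_status(text: str) -> List[QueueStatus]:
    """Parse the reply to gearmand's `status` admin command."""
    rows = []
    for line in text.splitlines():
        if line == ".":          # end-of-reply marker
            break
        if not line:
            continue
        name, total, running, workers = line.split("\t")
        rows.append(QueueStatus(name, int(total), int(running), int(workers)))
    # Busiest queues first.
    return sorted(rows, key=lambda r: r.total, reverse=True)


def fetch_status(host: str = "127.0.0.1", port: int = 4730,
                 timeout: float = 5.0) -> str:
    """Send `status` to gearmand and read until the terminating dot line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"status\n")
        buf = b""
        while not (buf == b".\n" or buf.endswith(b"\n.\n")):
            data = sock.recv(4096)
            if not data:
                break
            buf += data
    return buf.decode("ascii", "replace")
```

For example, `parse_status("service\t120\t40\t40\nhost\t0\t0\t40\n.\n")` returns the `service` queue first with 120 jobs, 40 running and 40 workers. A `service` or `host` queue whose total keeps growing while the worker count stays flat would suggest there are not enough workers. mod_gearman also ships `gearman_top`, which shows the same information interactively.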
delboy1966
Posts: 94
Joined: Thu Oct 22, 2015 5:26 am

Re: Latency Issue

Post by delboy1966 »

This topic can be closed now.
We decided to spin up a new Nagios server and migrate all the checks across.
The new server runs a new version of Nagios, mod_gearman compiled with the Nagios 4 headers, and Livestatus.
All is working fine at the moment, so the problem seems to be solved.

Thanks for your help.

Tony