Latency Issue
Posted: Sat Sep 16, 2017 1:37 am
I am seeing some high check latency and can't work out what is causing it.
I have a main Nagios server and 4 worker nodes running mod_gearman.
Specs are:
Nagios 4.3.2 core
gearmand-0.25-1
Nagvis 1.8.5
Livestatus 1.2.7i3p2
It runs on a VM, and has done so for over 2 years now:
64G RAM
6 x CPU
The process queue is about 30 minutes behind the current time.
Here is the output of running nagios with the -s switch:
OBJECT CONFIG PROCESSING TIMES (* = Potential for precache savings with -u option)
----------------------------------
Read: 0.293804 sec
Resolve: 0.008414 sec *
Recomb Contactgroups: 0.001341 sec *
Recomb Hostgroups: 0.002187 sec *
Dup Services: 0.021718 sec *
Recomb Servicegroups: 1.774507 sec *
Duplicate: 0.000001 sec *
Inherit: 0.004745 sec *
Register: 0.070415 sec
Free: 0.014057 sec
============
TOTAL: 2.191189 sec * = 0.453228 sec (20.68%) estimated savings
Timing information on configuration verification is listed below.
CONFIG VERIFICATION TIMES
----------------------------------
Object Relationships: 0.009689 sec
Circular Paths: 0.000150 sec
Misc: 0.000084 sec
============
TOTAL: 0.009923 sec
RETENTION DATA TIMES
----------------------------------
Read and Process: 0.659380 sec
============
TOTAL: 0.659380 sec
EVENT SCHEDULING TIMES
-------------------------------------
Get service info: 0.048644 sec
Get host info info: 0.006077 sec
Get service params: 0.000009 sec
Schedule service times: 0.100407 sec
Schedule service events: 0.016305 sec
Get host params: 0.000000 sec
Schedule host times: 0.015831 sec
Schedule host events: 0.002078 sec
============
TOTAL: 0.189351 sec
Projected scheduling information for host and service checks
is listed below. This information assumes that you are going
to start running Nagios with your current config files.
HOST SCHEDULING INFORMATION
---------------------------
Total hosts: 2555
Total scheduled hosts: 2553
Host inter-check delay method: SMART
Average host check interval: 600.00 sec
Host inter-check delay: 0.24 sec
Max host check spread: 40 min
First scheduled check: Sat Sep 16 07:36:27 2017
Last scheduled check: Sat Sep 16 07:46:26 2017
SERVICE SCHEDULING INFORMATION
-------------------------------
Total services: 18188
Total scheduled services: 18170
Service inter-check delay method: SMART
Average service check interval: 634.61 sec
Inter-check delay: 0.03 sec
Interleave factor method: SMART
Average services per host: 7.12
Service interleave factor: 8
Max service check spread: 40 min
First scheduled check: Sat Sep 16 07:36:27 2017
Last scheduled check: Wed Sep 20 10:05:31 2017
CHECK PROCESSING INFORMATION
----------------------------
Average check execution time: 7.69s
Estimated concurrent checks: 354 (59.00 per cpu core)
Max concurrent service checks: Unlimited
PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.
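As a rough sanity check on the figures above, here is a small Python sketch that works out the load from the reported numbers. It assumes the SMART inter-check delay is simply the average check interval divided by the number of scheduled checks, which reproduces the 0.24 sec and 0.03 sec values in the output. The in-flight figure is a plain rate-times-duration estimate (Little's law) and is not the formula Nagios uses for its own "Estimated concurrent checks" line, so it is not expected to match the 354 shown there. The function names are made up for illustration.

```python
# Figures copied from the nagios -s output above.
HOSTS_SCHEDULED = 2553
HOST_INTERVAL_SEC = 600.00
SERVICES_SCHEDULED = 18170
SERVICE_INTERVAL_SEC = 634.61
AVG_EXECUTION_SEC = 7.69


def inter_check_delay(avg_interval_sec, scheduled_checks):
    """Assumed SMART delay: spread all checks evenly over one interval."""
    return avg_interval_sec / scheduled_checks


def checks_per_second(avg_interval_sec, scheduled_checks):
    """Average rate at which checks come due."""
    return scheduled_checks / avg_interval_sec


host_delay = inter_check_delay(HOST_INTERVAL_SEC, HOSTS_SCHEDULED)
service_delay = inter_check_delay(SERVICE_INTERVAL_SEC, SERVICES_SCHEDULED)

# Combined rate the workers have to sustain to keep up.
rate = (checks_per_second(HOST_INTERVAL_SEC, HOSTS_SCHEDULED)
        + checks_per_second(SERVICE_INTERVAL_SEC, SERVICES_SCHEDULED))

# Little's law: checks in flight = arrival rate * average execution time.
in_flight = rate * AVG_EXECUTION_SEC

print(f"host inter-check delay:    {host_delay:.2f} sec")
print(f"service inter-check delay: {service_delay:.2f} sec")
print(f"checks due per second:     {rate:.1f}")
print(f"rough checks in flight:    {in_flight:.0f}")
```

If the total number of worker slots across the 4 mod_gearman nodes is well below the in-flight figure, jobs would pile up in the gearmand queue, which is one possible explanation for the latency.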