Nagios Performance Tuning

Posted: Thu May 03, 2018 6:53 am
by amitgupta19
I am using Nagios 3 on CentOS 5.

Following is the nagiostats output:


Nagios Stats 3.3.1
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 07-25-2011
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File: /var/log/nagios/status.dat
Status File Age: 0d 0h 0m 4s
Status File Version: 3.3.1

Program Running Time: 0d 1h 1m 41s
Nagios PID: 16702
Used/High/Total Command Buffers: 0 / 0 / 4096

Total Services: 2574
Services Checked: 2574
Services Scheduled: 2573
Services Actively Checked: 2574
Services Passively Checked: 0
Total Service State Change: 0.000 / 18.420 / 0.024 %
Active Service Latency: 0.440 / 117.416 / 81.937 sec
Active Service Execution Time: 0.006 / 63.754 / 0.783 sec
Active Service State Change: 0.000 / 18.420 / 0.024 %
Active Services Last 1/5/15/60 min: 371 / 1891 / 2572 / 2573
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2530 / 17 / 1 / 26
Services Flapping: 0
Services In Downtime: 0

Total Hosts: 384
Hosts Checked: 384
Hosts Scheduled: 384
Hosts Actively Checked: 384
Host Passively Checked: 0
Total Host State Change: 0.000 / 0.000 / 0.000 %
Active Host Latency: 0.000 / 111.237 / 75.401 sec
Active Host Execution Time: 4.006 / 5.865 / 4.249 sec
Active Host State Change: 0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min: 9 / 184 / 384 / 384
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 384 / 0 / 0
Hosts Flapping: 0
Hosts In Downtime: 0

Active Host Checks Last 1/5/15 min: 27 / 212 / 681
Scheduled: 20 / 174 / 561
On-demand: 7 / 38 / 120
Parallel: 20 / 174 / 561
Serial: 7 / 35 / 107
Cached: 0 / 3 / 13
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 353 / 2011 / 6441
Scheduled: 353 / 2011 / 6441
On-demand: 0 / 0 / 0
Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min: 0 / 0 / 0


Can anyone please help me identify the issues and how to correct them?

Re: Nagios Performance Tuning

Posted: Thu May 03, 2018 7:19 am
by eloyd
#1 - upgrade to a Nagios Core 4 version. The process threading and handling is so much improved that it's like magic.

#2 - I don't see any "issues" that need to be corrected. You haven't identified anything that you think needs to be changed in your metrics. I can assume that you mean the check latency times are too high for your liking, but that is just me making an assumption. Can you ask a specific question that we can try to answer?

Re: Nagios Performance Tuning

Posted: Thu May 03, 2018 7:36 am
by amitgupta19
Aren't the following two parameters on the high side?

Active Service Latency: 0.440 / 117.416 / 81.937 sec
Active Host Latency: 0.000 / 111.237 / 75.401 sec

If so, what is the impact of the high latency, and how can I correct it?

Re: Nagios Performance Tuning

Posted: Thu May 03, 2018 7:50 am
by eloyd
Latency is the delay between when a check was scheduled to run and when Nagios actually ran it. High latency can have a number of causes, but the #1 reason is likely that you're running Nagios 3. Upgrade to Nagios 4, make no other changes, and I'll bet it will drop dramatically.
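
If you want to track those two numbers before and after the upgrade, here is a minimal Python sketch that pulls the min / max / avg values out of nagiostats-style lines. The sample text is copied from the output posted above; the function names and the 10-second cutoff are my own arbitrary choices for illustration, not anything Nagios defines.

```python
import re

# Two lines copied verbatim from the nagiostats output quoted in this thread.
SAMPLE = """
Active Service Latency: 0.440 / 117.416 / 81.937 sec
Active Host Latency: 0.000 / 111.237 / 75.401 sec
"""

# Matches "<label>: <min> / <max> / <avg> sec".
TRIPLE_RE = re.compile(
    r"^(?P<label>[^:]+):\s+"
    r"(?P<min>[\d.]+) / (?P<max>[\d.]+) / (?P<avg>[\d.]+) sec$"
)


def parse_triples(text):
    """Return {label: (min, max, avg)} for every min/max/avg line in text."""
    stats = {}
    for line in text.splitlines():
        match = TRIPLE_RE.match(line.strip())
        if match:
            stats[match.group("label")] = tuple(
                float(match.group(key)) for key in ("min", "max", "avg")
            )
    return stats


def high_latency(stats, threshold_sec=10.0):
    """List latency labels whose average exceeds threshold_sec.

    The default threshold is a hypothetical cutoff chosen for this example.
    """
    return [
        label
        for label, (_, _, avg) in stats.items()
        if "Latency" in label and avg > threshold_sec
    ]


if __name__ == "__main__":
    stats = parse_triples(SAMPLE)
    for label in high_latency(stats):
        print(f"{label}: average {stats[label][2]:.3f} sec")
```

Run it against a saved copy of your nagiostats output now and again after the upgrade to see whether the averages actually drop.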

Re: Nagios Performance Tuning

Posted: Thu May 03, 2018 8:14 am
by amitgupta19
I can see the following messages in the Nagios event log:

[03-05-2018 06:11:43] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXXXXXXX:Active Directory Domain Services by 10 seconds...
[03-05-2018 06:11:42] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXX:Active Directory Web Services by 7 seconds...
[03-05-2018 06:11:16] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXX:E:\ Drive Space by 11 seconds...
[03-05-2018 06:11:15] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXXXXXXX:C:\ Drive Space by 7 seconds...

Is this OK?

Re: Nagios Performance Tuning

Posted: Thu May 03, 2018 8:17 am
by eloyd
Your problem is that you have too many things going on at once because it's taking so long to finish any one thing. So they all pile up and cause long latency. I strongly suggest upgrading to Nagios Core 4 if that's a possibility. This will likely solve your problems.
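
To get a feel for how often checks are being pushed back, a rough Python sketch like the one below can count the "Nudging" messages and add up the delays. The sample lines are the ones posted above with the redacted host names left as they are; the regular expression is based only on those four lines, and `summarize_nudges` is a hypothetical helper name, so treat this as a starting point rather than a tested log parser.

```python
import re

# Log lines as posted in this thread (host names are redacted in the original).
SAMPLE_LOG = r"""
[03-05-2018 06:11:43] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXXXXXXX:Active Directory Domain Services by 10 seconds...
[03-05-2018 06:11:42] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXX:Active Directory Web Services by 7 seconds...
[03-05-2018 06:11:16] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXX:E:\ Drive Space by 11 seconds...
[03-05-2018 06:11:15] Max concurrent service checks (200) has been reached. Nudging XXXXXXXXXXXXXXXXX:C:\ Drive Space by 7 seconds...
"""

# Pattern inferred from the four sample lines above (an assumption, not a spec).
NUDGE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] "
    r"Max concurrent service checks \((?P<limit>\d+)\) has been reached\.\s+"
    r"Nudging (?P<target>.+) by (?P<seconds>\d+) seconds\.\.\.$"
)


def summarize_nudges(log_text):
    """Count nudge events and total the seconds of delay they added."""
    events = 0
    total_seconds = 0
    limits = set()
    for line in log_text.splitlines():
        match = NUDGE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and unrelated log entries
        events += 1
        total_seconds += int(match.group("seconds"))
        limits.add(int(match.group("limit")))
    return {"events": events, "total_nudge_sec": total_seconds, "limits": limits}


if __name__ == "__main__":
    print(summarize_nudges(SAMPLE_LOG))
```

The 200 in these messages is presumably the `max_concurrent_checks` value from your nagios.cfg. That is my reading of the log text, so confirm it against your own configuration.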

Re: Nagios Performance Tuning

Posted: Thu May 03, 2018 8:19 am
by amitgupta19
Yes, I am in the process of upgrading to Nagios 4.

This thread is about overcoming the current problems in the meantime.

Re: Nagios Performance Tuning

Posted: Fri May 04, 2018 2:28 pm
by kyang
Thanks @eloyd!

Let us know if you have any more questions.