Regarding the production server: I've received multiple complaints that changes take about 2 minutes to be applied, and that "while those changes are taking effect Nagios isn't checking anything." I'm not sure whether the second statement is true.
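One way to test the second complaint would be to watch the newest `last_check` timestamp in the Nagios status file before, during, and after applying changes. This is only a minimal sketch: it assumes the status file holds lines of the form `last_check=<epoch>`, the path in the usage comment is a guess based on the install prefix in the ps output, and `newest_last_check` is a made-up helper name.

```shell
# Hypothetical helper: print the newest last_check epoch found in a
# Nagios status file. Assumes lines of the form "last_check=<epoch>".
newest_last_check() {
    local status_file="$1"
    awk -F= '
        {
            # Strip leading whitespace from the key before comparing.
            key = $1
            sub(/^[ \t]+/, "", key)
        }
        key == "last_check" && ($2 + 0) > (max + 0) { max = $2 }
        END { print (max == "" ? 0 : max) }
    ' "$status_file"
}

# Usage (path is an assumption): run it a few times while changes are
# being applied and compare with "date +%s".
# newest_last_check /usr/local/nagios/var/status.dat
```

If the printed value stops advancing for the whole 2 minutes, that would support the complaint; if it keeps moving, checks are still completing.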
At this time I see:
ps -ef | grep nagios.cfg
root 34212 32642 0 01:31 pts/0 00:00:00 grep nagios.cfg
nagios 51312 1 5 May20 ? 00:33:59 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
nagios 51457 51312 0 May20 ? 00:00:01 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
At this time we are monitoring 2099 hosts and 17410 services.
Check performance (seconds):
Metric                        Min    Max    Avg
Host Check Latency            0.00   1.13   0.01
Host Check Execution Time     0.00   10.00  0.06
Service Check Latency         0.00   1.17   0.02
Service Check Execution Time  0.00   60.01  0.14
Check counts:
Metric                  1-min  5-min   15-min
Active Host Checks      574    1,922   1,923
Passive Host Checks     0      0       0
Active Service Checks   5,183  15,468  15,470
Passive Service Checks  0      10      10
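For a sense of scale, the 5-minute active check counts above can be turned into an average rate per second. A small sketch of that arithmetic; the function name is illustrative only.

```shell
# Average checks per second for a count observed over a window in minutes.
checks_per_second() {
    local count="$1" window_min="$2"
    awk -v c="$count" -v m="$window_min" 'BEGIN { printf "%.1f\n", c / (m * 60) }'
}

checks_per_second 15468 5   # active service checks, 5-min window: prints 51.6
checks_per_second 1922 5    # active host checks, 5-min window: prints 6.4
```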
System metrics:
Load: 1-min 1.15, 5-min 1.11, 15-min 1.14
CPU Stats: User 2.82%, Nice 0.00%, System 1.18%, I/O Wait 0.01%, Steal 0.00%, Idle 95.99%
Memory: Total 129,022 MB, Used 115,618 MB, Free 13,404 MB, Shared 33 MB, Buffers 298 MB, Cached 85,341 MB
Swap: Total 4,094 MB, Used 95 MB, Free 3,999 MB
My top CPU resource users:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26939 mysql 20 0 9.8g 301m 3820 S 69.4 0.2 118658:49 mysqld
33326 apache 20 0 491m 68m 8736 R 69.4 0.1 14:37.10 httpd
9662 nagios 20 0 55988 7420 1060 S 17.8 0.0 5:22.99 ndo2db
9607 nagios 20 0 106m 90m 1336 S 8.6 0.1 0:54.38 nagios
62167 apache 20 0 492m 68m 8744 S 5.9 0.1 11:05.51 httpd
35899 apache 20 0 475m 52m 8332 S 5.6 0.0 0:23.72 httpd
I know there are two performance tuning documents:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
https://assets.nagios.com/downloads/nag ... zation.pdf
After reading these I'm still trying to figure out what the best option is.
Any advice?
Slowness troubleshooting --> 5.4.11 to 5.4.13.
scottwilkerson (DevOps Engineer, Nagios Enterprises) wrote:
Previously we had increased the time following this doc:
https://support.nagios.com/kb/article.php?id=172
For faster restarts it would be better to set this back down to about 20 seconds:
Code: Select all
for i in {1..20} ; do
In the Maximizing Performance doc you listed, there are 2 things that will boost performance the most:
https://assets.nagios.com/downloads/nag ... giosXI.pdf
and
https://assets.nagios.com/downloads/nag ... Server.pdf
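The `for i in {1..20}` line quoted in this reply reads like the header of a bounded wait loop, where the upper end of the range caps the wait. The loop body is not shown in this thread, so the following is only a generic sketch of that pattern and not the actual Nagios XI script; `wait_up_to` and the condition passed to it are hypothetical.

```shell
# Generic bounded wait: run a condition command once per interval and give
# up after max_tries attempts. Returns 0 as soon as the condition succeeds,
# 1 if every attempt fails.
wait_up_to() {
    local max_tries="$1" interval="$2"
    shift 2
    local i=0
    while [ "$i" -lt "$max_tries" ]; do
        "$@" && return 0      # condition met, stop waiting
        sleep "$interval"
        i=$((i + 1))
    done
    return 1                  # timed out
}

# Example: wait at most 20 seconds for a nagios process to appear.
# wait_up_to 20 1 pgrep -x nagios
```

With a loop of this shape, the worst-case wait is roughly max_tries times interval, which is why lowering the bound shortens a slow restart.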
emartine wrote:
It is running on a 1TB Fusion-io card. Before I start messing with the disk and offloading MySQL, I first want to try:
1) Increasing the check interval for test servers from 3 min to 7 min (i.e. checking them less often).
2) Making the time change to 20 seconds, as suggested.
3) Setting check_result_reaper_frequency=3 and max_check_result_reaper_time=10.
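Step 3 amounts to editing two directives in nagios.cfg. A hedged sketch of making that edit with sed: `set_cfg_value` is a made-up helper, it assumes each directive already appears as `key=value` at the start of a line, and the paths in the usage comments are assumptions taken from the ps output earlier in the thread.

```shell
# Hypothetical helper: set key=value in a Nagios-style config file,
# keeping a .bak copy of the original next to it.
set_cfg_value() {
    local file="$1" key="$2" value="$3"
    sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
}

# Usage on a copy first (paths are assumptions):
# cp /usr/local/nagios/etc/nagios.cfg /tmp/nagios.cfg
# set_cfg_value /tmp/nagios.cfg check_result_reaper_frequency 3
# set_cfg_value /tmp/nagios.cfg max_check_result_reaper_time 10
# /usr/local/nagios/bin/nagios -v /tmp/nagios.cfg   # verify the config
```

Note that this helper does nothing if the directive is missing or commented out, so it is worth grepping for the key afterwards.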
scottwilkerson wrote:
These all sound like logical steps.
#1 will make a big difference system-wide.
#2 will make Apply Configuration happen much faster.
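To see why #1 matters, the check load from a group of services scales inversely with the check interval, so going from 3 to 7 minutes cuts that group's load to 3/7 of what it was. A rough sketch of the arithmetic; the thread does not say how many services belong to test servers, so 2100 is a made-up number for illustration only.

```shell
# Checks per minute generated by n services checked every interval_min minutes.
checks_per_minute() {
    local n="$1" interval_min="$2"
    awk -v n="$n" -v i="$interval_min" 'BEGIN { printf "%.1f\n", n / i }'
}

# 2100 is a hypothetical count of test-server services.
checks_per_minute 2100 3   # prints 700.0
checks_per_minute 2100 7   # prints 300.0
```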