Suffering from heavy load problems
Posted: Fri Sep 10, 2010 12:08 pm
I need to open a new case about this issue. We have 290 hosts and 680 services on our server, a 32-bit CentOS 5 system with 4 processors and 12 GB of RAM. The load average rarely, if ever, drops below 4.45. In the past couple of days it has also jumped to 10 and then to 20, which slowed the system down so much that some alerts weren't coming back within 10 seconds. Pulling up the Nagios console was extremely slow as well. Looking at top, I noticed a large number of httpd processes running constantly. I pulled them up on the Apache server-status page to find out what was going on:
0-0 3289 0/210/210 _ 59.71 0 399 0.0 0.36 0.36 10.35.42.242 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
1-0 18732 0/90/199 _ 25.53 2 346 0.0 0.18 0.39 10.35.42.143 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
2-0 - 0/0/202 . 4.84 23 0 0.0 0.00 0.28 ::1 ourserver.nuskin.net OPTIONS * HTTP/1.0
3-0 3292 0/213/213 W 62.26 0 0 0.0 0.48 0.48 10.35.42.200 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
4-0 24678 0/45/191 _ 12.67 0 493 0.0 0.04 0.30 10.35.42.242 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
5-0 27685 0/25/212 W 6.41 0 0 0.0 0.04 0.35 10.35.42.98 ourserver.nuskin.net GET /server-status HTTP/1.1
6-0 3295 0/210/210 _ 60.95 1 386 0.0 0.34 0.34 10.35.42.143 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
7-0 3296 0/213/213 _ 59.40 0 742 0.0 0.33 0.33 10.35.42.242 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
8-0 29254 0/13/186 _ 3.57 2 292 0.0 0.02 0.29 10.35.42.143 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
9-0 23099 0/55/192 W 16.24 0 0 0.0 0.07 0.33 10.35.42.200 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
10-0 16624 0/106/202 _ 28.38 0 569 0.0 0.22 0.40 10.35.42.242 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
11-0 3401 0/209/209 _ 58.83 2 604 0.0 0.33 0.33 10.35.42.189 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
12-0 30270 0/5/155 _ 1.38 1 365 0.0 0.00 0.26 10.35.43.245 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
13-0 26543 0/33/201 L 9.42 0 0 0.0 0.05 0.30 10.35.42.242 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
14-0 3413 0/206/206 W 57.95 0 0 0.0 0.40 0.40 10.35.42.200 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
15-0 4602 0/197/197 W 55.76 0 0 0.0 0.38 0.38 10.35.42.200 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
16-0 21254 0/68/161 W 19.18 0 0 0.0 0.14 0.28 10.35.42.200 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
17-0 26557 0/32/120 _ 9.54 1 311 0.0 0.06 0.27 10.35.42.143 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
18-0 5355 0/190/190 _ 54.01 1 314 0.0 0.30 0.30 10.35.43.245 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
19-0 11133 0/145/145 W 40.12 0 0 0.0 0.22 0.22 10.35.42.200 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
20-0 - 0/0/41 . 5.87 79 0 0.0 0.00 0.06 ::1 ourserver.nuskin.net OPTIONS * HTTP/1.0
21-0 - 0/0/109 . 30.55 125 0 0.0 0.00 0.17 ::1 ourserver.nuskin.net OPTIONS * HTTP/1.0
22-0 13603 0/125/125 _ 36.02 0 303 0.0 0.24 0.24 10.35.42.242 ourserver.nuskin.net GET /nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
This page is being hit constantly:
/nagiosxi/ajaxhelper.php?cmd=getxicoreajax&opts=%7B%22func%
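To put numbers on which clients and paths account for these workers, here is a rough Python sketch that tallies the server-status rows pasted above. It assumes the whitespace-separated column layout shown in this post (client IP in the 11th column, HTTP method in the 13th, request URI in the 14th); the function name tally_status_rows is made up for illustration, and the internal "OPTIONS *" rows from ::1 are skipped.

```python
from collections import Counter


def tally_status_rows(text):
    """Count worker rows per client IP and per request path.

    Assumes rows shaped like the server-status paste in this post:
    srv pid acc mode cpu ss req conn child slot client vhost method uri ...
    """
    by_client = Counter()
    by_path = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not a worker row (too short, or no acc column).
        if len(fields) < 14 or "/" not in fields[2]:
            continue
        client, method, uri = fields[10], fields[12], fields[13]
        # Ignore Apache's internal dummy connections ("OPTIONS *" from ::1).
        if method == "OPTIONS":
            continue
        # Drop the query string so all ajaxhelper.php calls group together.
        path = uri.split("?", 1)[0]
        by_client[client] += 1
        by_path[path] += 1
    return by_client, by_path


if __name__ == "__main__":
    import sys

    clients, paths = tally_status_rows(sys.stdin.read())
    for name, counter in (("client", clients), ("path", paths)):
        for key, count in counter.most_common():
            print(f"{name}\t{key}\t{count}")
```

Saving the rows to a text file and piping it into the script on stdin prints one tab-separated line per client and per path, most frequent first.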
I can't tell what this particular function is, but I'm trying to figure out why it is consuming so much of our server's resources. If I stop the Apache server, the load drops to somewhere between about 1.8 and 3.0. I would like to figure out how to optimize or even cache this particular call so that the server isn't tied up with it.
I've tried some HTTP optimizations, but nothing has had much effect so far.
What can I do to address this issue?
We've thought about using DNX, or even Fusion when it comes out, but I need some help in the meantime.