High load averages after upgrade

vicnanio
Posts: 17
Joined: Tue Oct 15, 2013 2:41 pm

High load averages after upgrade

Post by vicnanio »

I just finished upgrading to the latest version of Nagios XI, and now my server is showing high load averages.
We are running the Linux virtual appliance.

Here is what top looks like:

top - 15:12:00 up 15 min,  2 users,  load average: 13.64, 11.94, 6.56
Tasks: 166 total,  11 running, 155 sleeping,   0 stopped,   0 zombie
Cpu(s): 89.0%us,  9.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.3%hi,  1.0%si,  0.0%st
Mem:   1019516k total,   877748k used,   141768k free,    31500k buffers
Swap:  2064376k total,   138928k used,  1925448k free,    64628k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                        
15141 root      20   0  183m  67m 2260 R  7.2  6.8   0:25.44 mrtg                                                                                           
19238 apache    20   0  445m  25m 4412 S  6.5  2.6   0:05.01 httpd                                                                                          
 1592 apache    20   0  445m  26m 4244 R  5.6  2.6   0:16.52 httpd                                                                                          
15001 apache    20   0  445m  25m 4224 R  5.6  2.5   0:12.70 httpd                                                                                          
15623 apache    20   0  445m  25m 4040 R  5.2  2.5   0:10.94 httpd                                                                                          
 1586 apache    20   0  446m  26m 4316 R  4.9  2.7   0:20.64 httpd                                                                                          
19410 apache    20   0  445m  26m 4596 S  4.9  2.6   0:05.07 httpd                                                                                          
 7537 apache    20   0  437m  18m 4376 S  4.6  1.9   0:18.31 httpd                                                                                          
 7538 apache    20   0  445m  25m 4280 S  4.6  2.6   0:15.68 httpd                                                                                          
 1589 apache    20   0  446m  26m 4332 R  4.2  2.7   0:20.75 httpd                                                                                          
15629 apache    20   0  445m  26m 4512 S  4.2  2.7   0:11.76 httpd                                                                                          
15642 apache    20   0  438m  19m 4496 S  4.2  1.9   0:12.46 httpd                                                                                          
15053 apache    20   0  446m  26m 4564 R  3.6  2.7   0:12.51 httpd    

Has anyone been experiencing this issue?
lmiltchev
Bugs find me
Posts: 13589
Joined: Mon May 23, 2011 12:15 pm

Re: High load averages after upgrade

Post by lmiltchev »

Which Nagios XI version did you upgrade from? What was the load before the upgrade? Are you using Mod-Gearman or MK Livestatus?
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: High load averages after upgrade

Post by scottwilkerson »

Looking at the output you posted, it looks like httpd is where most of the load is. Are there a large number of users accessing the UI?
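
One way to check which command accounts for most of the CPU is to add up the per-process figures by command name. The following is only a minimal sketch: `cpu_by_command` is a hypothetical helper name, not something shipped with Nagios XI, and it assumes input lines of the form "<pcpu> <command>" with no spaces in the command name.

```shell
# Hypothetical helper: sum %CPU per command name, highest total first.
# Expects lines of the form "<pcpu> <command>" on stdin.
cpu_by_command() {
    awk '{ cpu[$2] += $1 }
         END { for (c in cpu) printf "%s %.1f\n", c, cpu[c] }' |
        sort -k2,2nr
}

# Possible use on a live system (assumes a procps-style ps):
#   ps -eo pcpu= -o comm= | cpu_by_command | head
```

Note that the %CPU reported by ps is averaged over each process's lifetime, so the totals will not match the instantaneous figures in top exactly.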
vicnanio
Posts: 17
Joined: Tue Oct 15, 2013 2:41 pm

Re: High load averages after upgrade

Post by vicnanio »

scottwilkerson wrote:Looking at the output you posted, it looks like httpd is where most of the load is. Are there a large number of users accessing the UI?

I have attached another picture of what the load looked like this morning when I got into the office.

Only about 3 people accessed the server.
Last edited by vicnanio on Fri Dec 19, 2014 10:26 am, edited 1 time in total.
vicnanio
Posts: 17
Joined: Tue Oct 15, 2013 2:41 pm

Re: High load averages after upgrade

Post by vicnanio »

lmiltchev wrote:Which Nagios XI version did you upgrade from? What was the load before the upgrade? Are you using Mod-Gearman or MK Livestatus?

We upgraded from Nagios XI 2012R2.4 or 2.0.

We are at 2014R2.0 now.


We are using the virtual appliance, and it currently has 1 CPU and 10GB RAM assigned to it.
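
A load average is usually read relative to the number of CPUs, so a 1-minute load of 13.64 (from the top output earlier in this thread) on a single CPU is far above what the machine can service. A rough sketch of that arithmetic, where `load_per_cpu` is a hypothetical helper name:

```shell
# Hypothetical helper: print a load average divided by the CPU count.
# Arguments: $1 = load average, $2 = number of CPUs.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Figures from this thread: 1-minute load of 13.64 on 1 CPU.
load_per_cpu 13.64 1

# Possible use on a live Linux system (assumes /proc/loadavg and nproc exist):
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

As a rule of thumb, values well above 1.0 per CPU that persist mean processes are queuing for CPU time.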
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: High load averages after upgrade

Post by scottwilkerson »

Wow, this top shows 995 processes; something is definitely wrong here. I can already see WAY too many mrtg processes running...

Additionally, is this machine only running with 1GB of RAM? How many hosts/services are you running?

http://assets.nagios.com/downloads/nagi ... ements.pdf

Can you run the following and send the resulting /tmp/top.txt?

killall -9 mrtg
ps -ef > /tmp/top.txt
thanks
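
Once the `ps -ef` dump exists, the number of lines mentioning a given program can be counted before sending it. This is only a sketch: `count_in_ps_dump` is a hypothetical helper name, and it matches the word anywhere on the line because mrtg typically runs under a perl interpreter, so its name may not be the first word of the command column.

```shell
# Hypothetical helper: count lines in a saved `ps -ef` dump that
# contain the given name as a whole word.
# Arguments: $1 = name to look for, $2 = path to the dump file.
count_in_ps_dump() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable in scripts that run with "set -e".
    grep -cw -- "$1" "$2" || true
}

# Possible use with the file created above:
#   count_in_ps_dump mrtg /tmp/top.txt
```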
vicnanio
Posts: 17
Joined: Tue Oct 15, 2013 2:41 pm

Re: High load averages after upgrade

Post by vicnanio »

scottwilkerson wrote:Wow, this top shows 995 processes; something is definitely wrong here. I can already see WAY too many mrtg processes running...

Additionally, is this machine only running with 1GB of RAM? How many hosts/services are you running?

DUH!

I apologize, that was an oversight on my part. I didn't implement this appliance, I just inherited it and assumed the resources were sufficient. I have increased the RAM and CPUs, and everything now seems to be running within normal ranges. I do have another issue: the performance data is not showing up on the graphs, but I will open a separate thread for that.

Thanks