Performance issues
-
westernuniv
- Posts: 120
- Joined: Tue Aug 21, 2012 9:29 am
Performance issues
I had some performance issues as described in the topic http://support.nagios.com/forum/viewtop ... 4&start=10 but it is now locked so I can't add an update.
I applied the tuning that was recommended and upgraded to 2012R2.5 and things seemed to be running smoothly until today (16 days after changes were made). My CPU is spiking (which in turn is causing other checks to time out) and memory is steadily increasing. The box has 96GB of memory and 74GB is in use.
Watching htop, the CPU goes through the roof when JMX queries are run (pinning all 24 cores over 100%). To me it sounds like there is a memory leak somewhere, since the problem is gradual and only clears on reboot.
httpd seems to spike as well but not as much.
Nagios XI: 2012R2.5
OS: CentOS 6.4 (Final)
PHP Version: 5.3.3
Re: Performance issues
How many checks are you running per 5 minutes?
What is the average check latency/duration for those JMX checks?
Have you added any checks in the last 16 days? If so, what kinds of checks?
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
-
westernuniv
- Posts: 120
- Joined: Tue Aug 21, 2012 9:29 am
Re: Performance issues
Every 5 minutes:
Host Checks: 420
Service Checks: 2123
The average check latency for JMX is around 0.18 seconds.
The average duration is all over the map, anywhere from 0.1 to 46 seconds (based on a random sampling of 20-30 JMX checks).
Within the past 16 days approximately 50 checks have been added. The majority are NRPE checks to other hosts, plus 2-3 JMX checks, an HTTP POST (to test site auth), and some SSH checks.
-
slansing
- Posts: 7698
- Joined: Mon Apr 23, 2012 4:28 pm
- Location: Travelling through time and space...
Re: Performance issues
Is anything else installed on this system besides Nagios XI and its dependencies? Can you determine through top or other sources what is eating this memory up?
"96GB of memory and 74GB is in use"
-
westernuniv
- Posts: 120
- Joined: Tue Aug 21, 2012 9:29 am
Re: Performance issues
Nope, just Nagios and its related dependencies. The things that are using the most memory (but only averaging 0.1 - 0.2% per process) are mysql and httpd.
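With several hundred tasks each using only a fraction of a percent, per-process figures can hide a large total. A minimal sketch of one way to total resident memory by command name is below. The function name `rss_by_command` is made up for illustration, and it assumes input lines of the form "RSS-in-KiB command", with command names that contain no spaces. RSS counts shared pages once per process, so the totals are likely to overstate real usage for something like httpd.

```shell
# Sum resident memory (RSS, KiB) per command name and print the largest
# totals in MiB, biggest first. Reads "rss command" pairs on stdin.
# Optional first argument: how many commands to show (default 10).
rss_by_command() {
    awk '{ rss[$2] += $1 }
         END { for (c in rss) printf "%d %s\n", rss[c] / 1024, c }' |
        sort -rn | head -n "${1:-10}"
}
```

For a live reading, something like `ps -e -o rss= -o comm= | rss_by_command 10` should work with the procps version of ps (the empty `rss=` and `comm=` headers suppress the header line).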
Re: Performance issues
What is the iowait (wa) that is shown when you run top?
Be sure to check out our Knowledgebase for helpful articles and solutions!
-
westernuniv
- Posts: 120
- Joined: Tue Aug 21, 2012 9:29 am
Re: Performance issues
Code:
top - 13:45:01 up 16 days, 5:29, 2 users, load average: 2.36, 3.17, 3.63
Tasks: 686 total, 1 running, 685 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.6%us, 0.7%sy, 0.0%ni, 96.6%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 99059156k total, 98209912k used, 849244k free, 608536k buffers
Swap: 16777208k total, 14656k used, 16762552k free, 22465972k cached
Re: Performance issues
How much of that 74GB is used for disk caching?
Code:
free -m
-
westernuniv
- Posts: 120
- Joined: Tue Aug 21, 2012 9:29 am
Re: Performance issues
Code:
# free -m
             total       used       free     shared    buffers     cached
Mem:         96737      95923        813          0        594      21940
-/+ buffers/cache:      73389      23348
Swap:        16383         14      16369
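For reference, the "-/+ buffers/cache" used figure is just used minus buffers minus cached from the Mem: row: 95923 - 594 - 21940 = 73389 MB. A rough sketch of the same arithmetic follows. The function name `app_used` is hypothetical, and it assumes the older free column layout shown above (total, used, free, shared, buffers, cached); newer versions of free lay the columns out differently, so the field numbers would need adjusting there.

```shell
# Print memory in use excluding buffers and page cache, in the same units
# as the input. Reads "free"-style output on stdin.
# Assumed Mem: row layout: Mem: total used free shared buffers cached
app_used() {
    awk '$1 == "Mem:" { print $3 - $6 - $7 }'
}
```

On a system with that layout, `free -m | app_used` should print the same number that appears on the "-/+ buffers/cache" line.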
Re: Performance issues
Ahhh, so the 74GB already excludes the disk buffers and cache. Can you post the output of the following in code wraps:
Code:
ps -aux | sort -k 3
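One caveat with a plain `sort -k 3`: it compares text from the third field to the end of the line, so 9.5 can land above 85.0. A possible numeric variant is sketched below. The function name `top_by_cpu` is made up for illustration, and it assumes `ps aux`-style input where the third column is %CPU.

```shell
# Sort ps-style lines by the third column (%CPU) numerically, highest first.
# Reads lines on stdin; optional first argument limits the rows (default 15).
# The header line sorts as 0 and therefore drops to the bottom.
top_by_cpu() {
    LC_ALL=C sort -rn -k 3,3 | head -n "${1:-15}"
}
```

Usage would be along the lines of `ps aux | top_by_cpu 20`; changing `-k 3,3` to `-k 4,4` would rank by %MEM instead.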