I'm building a Nagios XI server from the ground up (VM, CentOS 6.3). It runs the latest stable release of Nagios XI and the OS is up to date. As part of this project I've been consolidating the hosts and services that are currently monitored by two separate Nagios Core servers. So far I have added about 350 hosts and 600 services to Nagios XI, and I'm seeing very high CPU and memory utilization.
Memory:
The server has 2GB of RAM. Over 1700MB of that is currently in use, and the server is swapping.
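As a sanity check on that figure, here is a minimal Python sketch that separates memory held by processes from memory the kernel is using for buffers and page cache. It uses the Mem: and Swap: numbers from the top snapshot further down in this post. The function name `app_memory_kb` is only an illustrative name and is not part of any Nagios tool.

```python
def app_memory_kb(used_kb, buffers_kb, cached_kb):
    """Memory held by processes, excluding reclaimable buffers and page cache."""
    return used_kb - buffers_kb - cached_kb


# Figures copied from the top snapshot in this post (all in kB).
total_kb = 1918812
used_kb = 1650208
buffers_kb = 61448
cached_kb = 772084

app_kb = app_memory_kb(used_kb, buffers_kb, cached_kb)
print("used incl. buffers/cache: %d MB" % (used_kb // 1024))
print("used by processes:        %d MB" % (app_kb // 1024))
print("share of total RAM:       %.1f%%" % (100.0 * app_kb / total_kb))
```

With these numbers, about 797 MB is held by processes. The remainder of the "used" figure is buffers and cache.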
CPU:
I have been receiving very frequent CPU utilization alerts, and when I log in to the server to investigate, I see several processes intermittently consuming a large share of the resources. Any help in troubleshooting what's going on?
Our setup is completely stock -- created straight from the VM.
top - 08:50:46 up 1 day, 1:16, 2 users, load average: 7.89, 6.00, 5.87
Tasks: 158 total, 16 running, 142 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.8%us, 3.8%sy, 0.0%ni, 0.0%id, 80.8%wa, 0.0%hi, 9.6%si, 0.0%st
Mem: 1918812k total, 1650208k used, 268604k free, 61448k buffers
Swap: 262136k total, 31452k used, 230684k free, 772084k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
32346 nagios 20 0 28760 3412 388 R 77.6 0.2 0:02.94 nagios
16 root 20 0 0 0 0 R 6.3 0.0 66:58.55 kblockd/0
32367 nagios 20 0 172m 5808 2804 R 3.2 0.3 0:00.13 process_perfdat
417 root 20 0 0 0 0 R 1.4 0.0 23:37.97 jbd2/dm-1-8
32026 postgres 20 0 210m 5180 3696 R 1.4 0.3 0:00.08 postmaster
32011 postgres 20 0 210m 5164 3668 R 0.6 0.3 0:00.88 postmaster
5928 apache 20 0 435m 26m 4552 R 0.3 1.4 14:00.62 httpd
6156 apache 20 0 436m 27m 4532 R 0.3 1.4 13:50.23 httpd
31474 nagios 20 0 28764 3968 952 R 0.3 0.2 0:02.40 nagios
32329 root 20 0 15032 1276 928 R 0.3 0.1 0:00.03 top
1 root 20 0 19360 1328 1040 S 0.0 0.1 0:02.57 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 R 0.0 0.0 0:09.11 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
6 root RT 0 0 0 0 S 0.0 0.0 26:16.96 watchdog/0
7 root 20 0 0 0 0 S 0.0 0.0 1:42.93 events/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cgroup
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm
Processes I've seen hogging resources (CPU % where noted):
mysqld
postmaster
kblockd (>16%)
php
httpd (>40%)
flush-253:1
jbd2
process_perfdat (>13%)
watchdog (>17%)
vmtoolsd (>58%)
nagios (>77%)
Linux 2.6.32-279.14.1.el6.x86_64 (RST-NAGIOSXI-1) 11/17/2012 _x86_64_ (1 CPU)
12:00:02 AM CPU %user %nice %system %iowait %steal %idle
12:10:04 AM all 50.68 0.00 7.37 5.32 0.00 36.63
12:20:02 AM all 40.41 0.00 5.93 2.04 0.00 51.62
12:30:01 AM all 39.75 0.00 5.98 3.27 0.00 51.00
12:40:04 AM all 38.13 0.00 5.74 1.44 0.00 54.69
12:50:06 AM all 41.65 0.00 5.99 2.99 0.00 49.37
01:00:01 AM all 40.68 0.00 6.15 1.47 0.00 51.70
01:10:01 AM all 39.11 0.00 5.62 3.79 0.00 51.47
01:20:01 AM all 38.88 0.00 5.86 3.12 0.00 52.15
01:30:04 AM all 43.41 0.00 6.33 3.08 0.00 47.18
01:40:02 AM all 40.44 0.00 5.95 1.81 0.00 51.80
01:50:03 AM all 40.65 0.00 6.24 3.60 0.00 49.52
02:00:02 AM all 43.88 0.00 6.47 1.39 0.00 48.26
02:10:01 AM all 41.08 0.00 6.20 1.97 0.00 50.74
02:20:01 AM all 36.91 0.00 5.52 1.05 0.00 56.52
02:30:01 AM all 35.73 0.00 5.41 0.89 0.00 57.97
02:40:01 AM all 35.64 0.00 5.41 1.14 0.00 57.81
02:50:01 AM all 36.89 0.00 5.81 1.28 0.00 56.02
03:00:01 AM all 40.58 0.00 6.02 1.99 0.00 51.41
03:10:03 AM all 40.91 0.00 5.98 1.25 0.00 51.87
03:20:02 AM all 39.63 0.00 5.91 1.67 0.00 52.79
03:30:01 AM all 41.21 0.06 6.17 15.58 0.00 36.99
03:40:02 AM all 50.04 0.08 7.85 34.98 0.00 7.04
03:40:02 AM CPU %user %nice %system %iowait %steal %idle
03:50:01 AM all 43.64 0.00 6.61 2.35 0.00 47.41
04:00:01 AM all 41.31 0.00 6.41 1.90 0.00 50.38
04:10:01 AM all 36.92 0.00 5.76 3.00 0.00 54.32
04:20:01 AM all 38.18 0.00 5.80 3.45 0.00 52.57
04:30:01 AM all 35.51 0.00 5.43 2.38 0.00 56.68
04:40:03 AM all 31.15 0.00 5.02 1.49 0.00 62.34
04:50:01 AM all 32.41 0.00 5.29 1.45 0.00 60.85
05:00:01 AM all 30.06 0.00 4.79 0.42 0.00 64.72
05:10:01 AM all 29.38 0.00 4.95 0.89 0.00 64.78
05:20:01 AM all 27.83 0.00 4.44 0.29 0.00 67.44
05:30:01 AM all 28.27 0.00 4.45 0.42 0.00 66.86
05:40:01 AM all 27.81 0.00 4.51 0.43 0.00 67.25
05:50:01 AM all 27.72 0.00 4.59 0.49 0.00 67.21
06:00:01 AM all 27.91 0.00 4.46 0.39 0.00 67.24
06:10:02 AM all 27.77 0.00 4.72 0.89 0.00 66.61
06:20:01 AM all 27.87 0.00 4.57 0.73 0.00 66.83
06:30:01 AM all 27.41 0.00 4.34 0.37 0.00 67.88
06:40:01 AM all 27.70 0.00 4.51 0.39 0.00 67.41
06:50:01 AM all 27.77 0.00 4.60 0.54 0.00 67.08
07:00:01 AM all 27.37 0.00 4.30 0.33 0.00 68.00
07:10:01 AM all 30.25 0.00 4.87 2.12 0.00 62.76
07:20:01 AM all 29.09 0.00 4.67 0.58 0.00 65.66
07:20:01 AM CPU %user %nice %system %iowait %steal %idle
07:30:01 AM all 31.21 0.00 5.09 0.82 0.00 62.87
07:40:01 AM all 32.87 0.00 5.39 0.79 0.00 60.94
07:50:01 AM all 34.48 0.00 5.52 1.70 0.00 58.30
08:00:01 AM all 32.41 0.00 5.27 1.71 0.00 60.61
08:10:01 AM all 34.35 0.00 5.50 2.20 0.00 57.94
08:20:01 AM all 37.07 0.00 5.90 1.33 0.00 55.70
08:30:02 AM all 42.39 0.00 6.69 6.71 0.00 44.21
08:40:01 AM all 44.53 0.00 6.73 3.09 0.00 45.65
08:50:02 AM all 54.54 0.00 7.09 6.01 0.00 32.36
09:00:01 AM all 34.86 0.00 5.47 1.57 0.00 58.10
09:10:01 AM all 26.92 0.00 4.42 0.32 0.00 68.34
Average: all 34.92 0.00 5.42 2.26 0.00 57.39
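To pick the I/O wait spikes out of a long sar report like the one above, something along these lines might help. It is a rough sketch that assumes the 12-hour timestamp layout shown here (time, AM/PM, "all", then the six percentage columns). The function name `high_iowait` and the 10% threshold are my own choices for illustration.

```python
def high_iowait(sar_text, threshold=10.0):
    """Return (time, %iowait) pairs from sar CPU report text where %iowait > threshold."""
    hits = []
    for line in sar_text.splitlines():
        fields = line.split()
        # Data rows: HH:MM:SS AM|PM all %user %nice %system %iowait %steal %idle
        # Repeated header rows have "CPU" in the third column and are skipped,
        # as is the shorter "Average:" row.
        if len(fields) != 9 or fields[2] != "all":
            continue
        iowait = float(fields[6])
        if iowait > threshold:
            hits.append((fields[0] + " " + fields[1], iowait))
    return hits


# A few rows copied from the report above.
sample = """\
03:20:02 AM all 39.63 0.00 5.91 1.67 0.00 52.79
03:30:01 AM all 41.21 0.06 6.17 15.58 0.00 36.99
03:40:02 AM all 50.04 0.08 7.85 34.98 0.00 7.04
03:40:02 AM CPU %user %nice %system %iowait %steal %idle
03:50:01 AM all 43.64 0.00 6.61 2.35 0.00 47.41
"""

for when, iowait in high_iowait(sample):
    print("%s  iowait=%.2f%%" % (when, iowait))
```

On the sample rows this flags the 03:30 and 03:40 intervals, which are the only two in the full report with %iowait above 10.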
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
10  0  31452 205068  64320 826280    0    0     7   555  496  234 32  5 61  2  0

Linux 2.6.32-279.14.1.el6.x86_64 (RST-NAGIOSXI-1) 11/17/2012 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          31.98    0.00    5.00    2.12    0.00   60.90

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              49.94        13.69      1110.86     964440   78235346
dm-0              0.18         0.47         0.98      33104      68696
dm-1            139.62        12.89      1109.89     908026   78166648
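To put the iostat figures into more familiar units, here is a small sketch that converts the per-second block counts to KB/s and shows how write-heavy each device is. It assumes iostat's "blocks" are 512-byte sectors, which is what the iostat man page states for 2.4 and later kernels. The helper name `blocks_to_kb` is hypothetical.

```python
SECTOR_BYTES = 512  # assumption: one iostat "block" is a 512-byte sector


def blocks_to_kb(blocks):
    """Convert a count of 512-byte blocks to kilobytes."""
    return blocks * SECTOR_BYTES / 1024.0


# (Blk_read/s, Blk_wrtn/s) copied from the iostat output above.
devices = {
    "sda": (13.69, 1110.86),
    "dm-0": (0.47, 0.98),
    "dm-1": (12.89, 1109.89),
}

for name in sorted(devices):
    read_bps, write_bps = devices[name]
    print("%-5s read %8.2f KB/s  write %8.2f KB/s  write/read ratio %6.1f"
          % (name, blocks_to_kb(read_bps), blocks_to_kb(write_bps),
             write_bps / read_bps))
```

For sda this works out to about 555 KB/s written, which lines up with the bo column in the vmstat output, and almost all of it goes to dm-1.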