We've added another Nagios XI server to the site and are sending its events to the master server via Outbound Transfers (NRDP). The Inbound Transfers are working OK on the master server.
As we added servers to the remote Nagios server, the load on the master server started to increase, to the point where the load average is now 20+ (compared to 3-5 beforehand). To date, ~100 passive hosts and 600 passive services have been added.
When I look at the "Scheduled events over time" graph (Monitoring Engine Status), the column for Now seems to be permanently capped at just over 500.
It's almost as if Nagios has reached a limit and won't process more than 500 events.
I went through the performance tuning docs some time ago and am pretty happy that it was running at optimum before we started the Inbound Transfers.
We are currently monitoring 950 active hosts and 8,500 active services (half of these are run via gearmand on another server).
Looking at top, it appears that the mysqld process is consistently consuming the most resources:
Tasks: 262 total, 8 running, 254 sleeping, 0 stopped, 0 zombie
Cpu(s): 90.1%us, 9.1%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.1%hi, 0.7%si, 0.0%st
Mem: 3947696k total, 3064476k used, 883220k free, 70680k buffers
Swap: 2359288k total, 70032k used, 2289256k free, 1462800k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2275 mysql 20 0 1697m 66m 3772 S 47.2 1.7 49:15.73 mysqld
33949 apache 20 0 456m 40m 5188 R 20.9 1.0 0:12.67 httpd
30074 apache 20 0 456m 40m 5324 S 13.0 1.0 0:16.34 httpd
15605 apache 20 0 456m 40m 5348 S 12.3 1.0 0:28.20 httpd
24063 nagios 20 0 56960 8116 1068 S 12.3 0.2 2:55.78 ndo2db
37604 apache 20 0 454m 37m 4888 S 11.3 1.0 0:07.91 httpd
13734 apache 20 0 459m 43m 5232 S 10.6 1.1 0:31.50 httpd
26248 apache 20 0 458m 41m 5332 S 10.3 1.1 0:21.26 httpd
34095 apache 20 0 455m 39m 5192 S 10.3 1.0 0:10.93 httpd
44067 apache 20 0 445m 29m 4468 S 10.3 0.8 0:00.94 httpd
53473 apache 20 0 458m 42m 5472 S 10.3 1.1 0:54.72 httpd
27918 apache 20 0 455m 40m 5200 S 10.0 1.0 0:16.61 httpd
30071 apache 20 0 446m 31m 5212 S 10.0 0.8 0:14.44 httpd
31553 apache 20 0 459m 43m 5528 S 10.0 1.1 1:19.72 httpd
21243 apache 20 0 457m 42m 5544 S 9.6 1.1 1:31.44 httpd
29962 apache 20 0 447m 31m 5112 S 9.6 0.8 0:17.70 httpd
30072 apache 20 0 456m 40m 5184 S 9.0 1.0 0:14.78 httpd
35068 apache 20 0 456m 39m 5116 R 8.0 1.0 0:12.08 httpd
26000 apache 20 0 456m 39m 5208 R 7.3 1.0 0:17.28 httpd
22135 apache 20 0 457m 41m 5228 S 5.6 1.1 0:22.90 httpd
43649 apache 20 0 456m 39m 4556 S 5.6 1.0 0:01.60 httpd
27363 apache 20 0 457m 41m 5232 S 5.3 1.1 0:22.19 httpd
35319 apache 20 0 453m 37m 4872 S 5.3 1.0 0:10.90 httpd
30024 apache 20 0 457m 41m 5216 S 5.0 1.1 0:17.96 httpd
23955 gearmand 20 0 533m 4712 1160 S 3.7 0.1 0:46.43 gearmand
56063 apache 20 0 457m 42m 5484 R 2.7 1.1 1:58.71 httpd
44066 apache 20 0 448m 31m 4488 R 2.3 0.8 0:01.18 httpd
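Since httpd shows up as many separate rows in the listing above, a per-command total may give a clearer picture than the single largest row. The following is only a rough sketch, assuming the top output has been saved as plain text with the default 12 columns shown above; `cpu_by_command` is a hypothetical helper name, and the sample uses just the first few rows from the paste.

```python
from collections import defaultdict


def cpu_by_command(top_lines):
    """Sum the %CPU column per COMMAND across top process rows."""
    totals = defaultdict(float)
    for line in top_lines:
        fields = line.split()
        # Assumed layout: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        # Skip the header row, summary rows, and anything too short to parse.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        totals[fields[11]] += float(fields[8])
    return dict(totals)


# First few process rows copied from the top output above.
sample = [
    "PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND",
    "2275 mysql 20 0 1697m 66m 3772 S 47.2 1.7 49:15.73 mysqld",
    "33949 apache 20 0 456m 40m 5188 R 20.9 1.0 0:12.67 httpd",
    "30074 apache 20 0 456m 40m 5324 S 13.0 1.0 0:16.34 httpd",
    "24063 nagios 20 0 56960 8116 1068 S 12.3 0.2 2:55.78 ndo2db",
]

# Print commands ordered by combined %CPU, highest first.
for command, cpu in sorted(cpu_by_command(sample).items(), key=lambda kv: -kv[1]):
    print(f"{command:10s} {cpu:6.1f}")
```

Feeding the full paste through the same function would show how the combined httpd workers compare with mysqld and ndo2db.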