Monitoring Engine very slow to start / restart

Posted: Fri Aug 07, 2015 8:40 am
by pfsweb
Nagios XI with 8000 services and 1116 hosts being monitored.
When we apply configs or restart the monitoring engine, it can take up to 15 minutes before Nagios XI is back to normal operation.
The database is on a separate server and shows nominal memory and CPU utilization.
The Nagios XI server has 12 GB of RAM and 6 processors in a VMware environment.

Please assist.

Re: Monitoring Engine very slow to start / restart

Posted: Fri Aug 07, 2015 11:23 am
by jolson
The most important question here is which version of Nagios XI you're running - if you're on an older version, the 'Apply Configuration' command can take much longer to run than on our newer versions of Nagios XI.

Re: Monitoring Engine very slow to start / restart

Posted: Thu Dec 10, 2015 11:35 am
by pfsweb
Hi,

We are currently using 5.2.0, so we are only a few days behind the latest release. We still have an issue after applying the config: the process state shows as stopped and takes about 10-15 minutes to start. During this 10-15 minute period, service and host check notifications are disabled, and the scheduled next check times are 3-5 minutes behind the actual time.

Also we now have:
1255 Hosts
9234 Services

Re: Monitoring Engine very slow to start / restart

Posted: Thu Dec 10, 2015 11:38 am
by rkennedy
I believe resources could be hitting a throttle. What kind of disks is Nagios XI running on? Can you run the following and post the output?

top|head -5

Re: Monitoring Engine very slow to start / restart

Posted: Fri Dec 11, 2015 11:43 am
by pfsweb
Sure.

top - 10:40:24 up 1 day, 3:31, 1 user, load average: 3.87, 9.15, 7.47
Tasks: 640 total, 2 running, 638 sleeping, 0 stopped, 0 zombie
Cpu(s): 28.5%us, 7.9%sy, 0.0%ni, 60.6%id, 2.5%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 17368188k total, 14320492k used, 3047696k free, 212208k buffers
Swap: 5095420k total, 0k used, 5095420k free, 9986504k cached

Re: Monitoring Engine very slow to start / restart

Posted: Fri Dec 11, 2015 1:34 pm
by hsmith
That's still a somewhat high load for 6 CPUs. What is the clock speed?

What's the output of the lscpu command?
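
To put the load figures in context, here is a minimal sketch that divides each load average by the CPU count. The function name `load_per_cpu` is made up for illustration, and the idea that a sustained value above roughly 1.0 per CPU means work is queuing is a common rule of thumb, not a Nagios-specific threshold. The numbers are the ones from the top output posted above.

```shell
#!/bin/sh
# load_per_cpu LOAD NCPU
# Hypothetical helper: prints LOAD divided by NCPU with three decimals.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.3f\n", load / ncpu }'
}

# 1-, 5- and 15-minute load averages from the top output in this thread, 6 CPUs
load_per_cpu 3.87 6   # prints 0.645
load_per_cpu 9.15 6   # prints 1.525
load_per_cpu 7.47 6   # prints 1.245

# Optional: the same calculation on live values (Linux only, needs nproc)
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen _ < /proc/loadavg
    for l in "$one" "$five" "$fifteen"; do
        load_per_cpu "$l" "$(nproc)"
    done
fi
```

By this measure the 5- and 15-minute averages are above one runnable task per CPU, while the 1-minute average has dropped back below it.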

Re: Monitoring Engine very slow to start / restart

Posted: Fri Dec 11, 2015 2:03 pm
by pfsweb
Results:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 6
On-line CPU(s) list: 0-5
Thread(s) per core: 1
Core(s) per socket: 3
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Stepping: 4
CPU MHz: 2533.423
BogoMIPS: 5066.84
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-5

Re: Monitoring Engine very slow to start / restart

Posted: Fri Dec 11, 2015 2:06 pm
by hsmith
Is it possible to add more cores to this machine?

Re: Monitoring Engine very slow to start / restart

Posted: Fri Dec 11, 2015 3:18 pm
by pfsweb
I should be able to early next year, but unfortunately that's not an option at the moment. Do you believe this issue relates directly to the number of cores we have? Or is there any other possible cause for this sort of issue that we can look into?

Re: Monitoring Engine very slow to start / restart

Posted: Fri Dec 11, 2015 3:26 pm
by jolson
Another optimization that _could_ help would be to utilize a RAM disk if you haven't done so already. Here are a couple of great articles regarding RAM disks in Nagios XI:
https://labs.nagios.com/2015/08/14/util ... -easy-way/
https://assets.nagios.com/downloads/nag ... giosXI.pdf

Ultimately you'll need additional CPUs - the load on your server is too high for the number of CPUs you're currently using, which is very likely causing the slowness you've mentioned.
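
Before trying the RAM disk suggestion above, it may help to see where the monitoring engine currently writes its frequently updated files. The sketch below only reads the config and changes nothing. The function name `ramdisk_candidates` is made up for illustration, the config path is an assumption (the usual location on a Nagios XI install), and the list of directives is a guess at typical RAM disk candidates; the linked articles are the authority on which ones to move.

```shell
#!/bin/sh
# ramdisk_candidates CFG
# Hypothetical helper: prints the nagios.cfg directives that point at
# files or directories the engine rewrites often. Commented-out lines are
# skipped because the pattern is anchored at the start of the line.
ramdisk_candidates() {
    grep -E '^(status_file|check_result_path|temp_path|temp_file|object_cache_file)=' "$1"
}

# Assumed default location; adjust if your install differs.
cfg=/usr/local/nagios/etc/nagios.cfg
if [ -r "$cfg" ]; then
    ramdisk_candidates "$cfg"
fi
```

Running `df -h` on the paths it prints shows whether they already sit on a tmpfs mount or on regular disk.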