Monitoring Engine very slow to start / restart

Post by pfsweb »

Nagios XI with 8000 services and 1116 hosts being monitored.
When we apply configs or restart the monitoring engine, it can take up to 15 minutes before Nagios XI is back to normal operation.
The database is on a separate server and shows nominal memory and CPU utilization.
The Nagios XI server has 12 GB of RAM and 6 processors in a VMware environment.

Please assist.

Post by jolson »

The most important question here is which version of Nagios XI you're running. On older versions, the 'Apply Configuration' command can take much longer to run than on our newer versions of Nagios XI.

Post by pfsweb »

Hi,

We are currently using 5.2.0, so we are only a few days behind the latest release. We are still having an issue after applying the config: the process state shows as stopped and takes about 10-15 minutes to start. During this 10-15 minute period, service and host check notifications are disabled and the scheduled next check times are 3-5 minutes behind the actual time.

Also we now have:
1255 Hosts
9234 Services

Post by rkennedy »

I believe the server could be hitting a resource bottleneck. What kind of disks is Nagios XI running on? Can you run the following and post the output?

top -b -n 1 | head -5

Post by pfsweb »

Sure.

top - 10:40:24 up 1 day, 3:31, 1 user, load average: 3.87, 9.15, 7.47
Tasks: 640 total, 2 running, 638 sleeping, 0 stopped, 0 zombie
Cpu(s): 28.5%us, 7.9%sy, 0.0%ni, 60.6%id, 2.5%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 17368188k total, 14320492k used, 3047696k free, 212208k buffers
Swap: 5095420k total, 0k used, 5095420k free, 9986504k cached

Post by hsmith »

That's still a somewhat high load for 6 CPUs. What is the clock speed?

What's the output of the lscpu command?
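
For a rough sense of scale, each load average can be divided by the CPU count; values that stay above 1.0 suggest tasks are waiting for CPU time. The sketch below uses the three load averages from the top output posted earlier and the 6 CPUs mentioned in this thread. The helper name `load_per_cpu` is made up for illustration, and this is only a rule of thumb, not a diagnosis.

```shell
#!/bin/sh
# Print a load average divided by the CPU count, to one decimal place.
# Note: on Linux, load also counts tasks in uninterruptible sleep
# (often waiting on disk I/O), so a high value is not always pure CPU demand.
load_per_cpu() {
    # $1 = load average, $2 = number of CPUs
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.1f\n", load / cpus }'
}

# 1, 5, and 15 minute load averages from the posted top output, 6 CPUs
for load in 3.87 9.15 7.47; do
    load_per_cpu "$load" 6
done
```

On a live Linux server the same helper could be fed current values, for example `load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"`.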

Post by pfsweb »

Results:


Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 6
On-line CPU(s) list: 0-5
Thread(s) per core: 1
Core(s) per socket: 3
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 26
Stepping: 4
CPU MHz: 2533.423
BogoMIPS: 5066.84
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-5

Post by hsmith »

Is it possible to add more cores to this machine?

Post by pfsweb »

I should be able to early next year, but unfortunately that's not an option at the moment. Do you believe this issue relates directly to the number of cores we have, or is there another possible cause we can look into?

Post by jolson »

Another optimization that _could_ help would be to utilize a RAM disk if you haven't done so already. Here are a couple of great articles regarding RAM disks in Nagios XI:
https://labs.nagios.com/2015/08/14/util ... -easy-way/
https://assets.nagios.com/downloads/nag ... giosXI.pdf
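
For orientation only: a RAM disk on Linux is typically a tmpfs mount. The sketch below is a generic illustration and not the procedure from the linked articles. The function name `setup_ramdisk`, the default mount point, and the size are all placeholders chosen for the example, and the mount has to be run as root. Which Nagios files to relocate onto the RAM disk (for example via the `status_file` or `check_result_path` directives in nagios.cfg) should be taken from the linked documentation.

```shell
#!/bin/sh
# Generic tmpfs RAM disk sketch; the defaults below are placeholders.
setup_ramdisk() {
    mount_point="${1:-/var/nagiosramdisk}"   # hypothetical path
    size="${2:-512m}"                        # hypothetical size
    mkdir -p "$mount_point" || return 1
    # tmpfs lives in memory, so its contents are lost on reboot or unmount
    mount -t tmpfs -o "size=$size" tmpfs "$mount_point" || return 1
    # Show the new mount to confirm it is in place
    df -h "$mount_point"
}

# Run as root, for example:
#   setup_ramdisk /var/nagiosramdisk 512m
```

Because the contents disappear on reboot, anything placed there needs to be recreated at startup.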

Ultimately you'll need additional CPUs. The load on your server is too high for the number of CPUs you're currently using, which is very likely causing the slowness you've mentioned.