Host Checks / Latency

OptimusB
Posts: 146
Joined: Mon Oct 27, 2014 10:08 pm
Location: Canada


Post by OptimusB »

I've noticed latency spikes for some of our hosts, with rtmax jumping as high as 150ms. After investigating, it appears that the Nagios XI server itself is causing these "spikes". Ping times measured from other servers are quite stable, while a ping session run from the Nagios XI server at the same time shows fluctuating ping times. I suspect load is the cause: adding more CPU and RAM does seem to have helped, but the results still aren't solid.

So my question is: how reliable are the rta and rtmax values in the host check data? Are they derived from what the Nagios XI server itself measures, and therefore affected by the load on the XI server? Does anyone else have this issue? I'm just trying to get more reliable data.
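To illustrate why rtmax reacts so much more strongly than rta, here is a minimal sketch of the arithmetic, assuming rta is the average and rtmax the maximum of the round-trip times collected in one check. The sample values and the `summarize_rtt` helper are hypothetical, not taken from any Nagios plugin.

```python
from statistics import mean

def summarize_rtt(samples_ms):
    """Return (rta, rtmax, rtmin) in ms for one check's round-trip samples."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return mean(samples_ms), max(samples_ms), min(samples_ms)

# Hypothetical samples: five echo replies per check.
quiet = [1.0, 1.2, 0.9, 1.1, 1.0]      # monitoring server idle
loaded = [1.0, 1.2, 150.0, 1.1, 1.0]   # one reply processed late under load

for label, samples in (("quiet", quiet), ("loaded", loaded)):
    rta, rtmax, rtmin = summarize_rtt(samples)
    print(f"{label}: rta={rta:.2f}ms rtmax={rtmax:.2f}ms rtmin={rtmin:.2f}ms")
```

A single delayed reply is enough to push rtmax to 150ms, while rtmin is unchanged and rta rises far less. That pattern would be consistent with the delay coming from the measuring host rather than from the network path.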
latency.jpg
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Host Checks / Latency

Post by rkennedy »

Well, if the XI server is under fairly heavy load, you could see an increased delay in responses, so I think this all lines up.

To help us get a better idea - what kind of resources did you have before / after, and how many host / service checks are you running?
Former Nagios Employee
OptimusB

Re: Host Checks / Latency

Post by OptimusB »

This instance has one Gearman worker. The server previously had 2 CPUs and 8GB of RAM; it now has 4 CPUs and 16GB of RAM. According to CCM, we have 221 hosts and 4176 services. Not too much.

We have another instance with about 1500 hosts and 24065 services that runs on 6 CPUs with 20GB of RAM, but it also has 3 Mod-Gearman workers. I am seeing some latency there too, but it is much lower. I guess the best way to handle this is to spin up an additional Gearman worker node and have it handle only host checks, to get better latency results.
rkennedy

Re: Host Checks / Latency

Post by rkennedy »

Not much at all.

Gearman may be the solution. When the system only had 2 CPUs and 8GB of RAM, did you notice any bottlenecks anywhere (CPU or RAM)? Feel free to PM me a profile, and I can take a look to see if anything stands out that would cause this. I may not be able to find much, though, since you've already added the resources.

Another option to throw at you is to use check_by_ssh or check_nrpe as agents. Run these on a 'cloud' service, and you'd be able to get additional information about the latency from an external source.
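As a rough sketch of how the two vantage points could be compared, the snippet below parses the summary line printed by the Linux iputils `ping` command (`rtt min/avg/max/mdev = ... ms`) and flags the case where only the XI server sees a high rtmax. The function names, the sample output strings, and the `factor` threshold are all hypothetical; other ping implementations format their summary differently.

```python
import re

# Matches the iputils summary line, e.g.
# "rtt min/avg/max/mdev = 0.9/1.0/1.2/0.1 ms"
_SUMMARY = re.compile(r"=\s*([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)\s*ms")

def parse_ping_summary(output):
    """Extract rtmin, rta, rtmax and mdev (in ms) from ping output text."""
    match = _SUMMARY.search(output)
    if match is None:
        raise ValueError("no rtt summary line found")
    rtmin, rta, rtmax, mdev = (float(x) for x in match.groups())
    return {"rtmin": rtmin, "rta": rta, "rtmax": rtmax, "mdev": mdev}

def spike_is_local(xi, external, factor=5.0):
    """True when the XI server's rtmax far exceeds the external one.

    The factor is an arbitrary illustrative threshold, not a Nagios setting.
    """
    return xi["rtmax"] > factor * external["rtmax"]

# Made-up sample output from each vantage point for the same target host.
from_xi = parse_ping_summary("rtt min/avg/max/mdev = 0.9/30.8/150.0/59.6 ms")
from_external = parse_ping_summary("rtt min/avg/max/mdev = 0.9/1.0/1.2/0.1 ms")
print(spike_is_local(from_xi, from_external))
```

If the external source reports stable times while the XI server does not, the spikes are more likely caused by the monitoring server than by the monitored host or the network.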