All checks running at same time interval

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
WillemDH
Posts: 2320
Joined: Wed Mar 20, 2013 5:49 am
Location: Ghent

Post by WillemDH »

Hello,

We have to shut down one of our datacenters on Friday, the one where our Nagios server is running. So I asked a colleague to move the Nagios server to another datacenter (VMware). He did this yesterday evening, and today I noticed that all the checks seem to run at the same time interval, while yesterday this was not the case.
See the screenshot of the Monitoring Engine Event Queue.
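
Since the screenshot is not visible here, the same picture can be approximated from the status file. The following is a minimal sketch, not an official Nagios tool: it assumes the status file lives at /usr/local/nagios/var/status.dat (verify the path on your install) and that each scheduled check appears as a `next_check=<unix timestamp>` line. The function names are made up for this illustration.

```python
import re
from collections import Counter

# Assumed default location on a Nagios XI server; adjust if yours differs.
STATUS_FILE = "/usr/local/nagios/var/status.dat"


def next_check_histogram(status_text, bucket_seconds=60):
    """Count scheduled checks per time bucket from status.dat content."""
    counts = Counter()
    pattern = r"^[ \t]*next_check=(\d+)[ \t]*$"
    for match in re.finditer(pattern, status_text, re.MULTILINE):
        timestamp = int(match.group(1))
        if timestamp == 0:
            continue  # assumption: 0 means no active check is scheduled
        # Round the timestamp down to the start of its bucket.
        counts[timestamp - timestamp % bucket_seconds] += 1
    return counts


def busiest_bucket_share(counts):
    """Fraction of all scheduled checks that fall into the fullest bucket."""
    total = sum(counts.values())
    return max(counts.values()) / total if total else 0.0


def report(path=STATUS_FILE):
    """Print the number of scheduled checks per one-minute bucket."""
    with open(path) as handle:
        histogram = next_check_histogram(handle.read())
    for bucket, count in sorted(histogram.items()):
        print(bucket, count)
    print("share in busiest bucket:", busiest_bucket_share(histogram))
```

Calling `report()` on the Nagios server prints one line per minute. A share close to 1.0 would suggest that nearly all checks are bunched into the same minute; evenly spread checks should give a much smaller value.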

As this does not seem an ideal situation, I already tried rebooting the server, but this doesn't seem to help.
In http://support.nagios.com/forum/viewtop ... ention.dat, I was asked to delete the retention.dat file. Is this the only way to spread the service checks more evenly again?

I presume that the vMotion of the virtual Nagios XI server caused the checks to lag and then all start at the same time. Is there any way to prevent this from happening? Next week I'll have to move the server back to the original datacenter.

Grtz

Willem
Nagios XI 5.8.1
https://outsideit.net
abrist
Red Shirt
Posts: 8334
Joined: Thu Nov 15, 2012 1:20 pm

Re: All checks running at same time interval

Post by abrist »

This is usually a sign that the server fell behind in scheduling checks, which in turn is usually caused by performance issues. Is the load average higher than normal? How about the I/O wait?
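
Both numbers can be read quickly on the Nagios host. The sketch below is one possible way to do it on Linux, assuming /proc/stat is available and that the fifth counter on the aggregate `cpu` line is iowait. The helper names are hypothetical, and tools such as top or iostat report the same figures.

```python
import os
import time


def read_cpu_times(stat_text):
    """Return the aggregate CPU counters from /proc/stat content as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Use only the first eight counters (user..steal); guest time is
    # already included in user time and would be counted twice.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


def report(interval_seconds=5):
    """Print the load average and the iowait share over a short interval."""
    print("load average (1, 5, 15 min):", os.getloadavg())
    with open("/proc/stat") as handle:
        first = read_cpu_times(handle.read())
    time.sleep(interval_seconds)
    with open("/proc/stat") as handle:
        second = read_cpu_times(handle.read())
    print("iowait %: {:.1f}".format(iowait_percent(first, second)))
```

Calling `report()` before and after the move gives a rough comparison of how the two datastores behave.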
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.

Re: All checks running at same time interval

Post by WillemDH »

Andy,

As I said, we had to move the server to another datacenter (vMotion). This caused the latency, as we had to temporarily move the server to a less performant datastore. So I do know the reason, but the performance loss is only temporary. It should be OK now. The question is how I can solve it after a vMotion to another datacenter.

Grtz

Re: All checks running at same time interval

Post by abrist »

Remove retention.dat and restart Nagios, or just wait it out while the server catches up.
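
A cautious way to script the first option is sketched below. It rests on several assumptions that should be checked first: that the file is at /usr/local/nagios/var/retention.dat, that the service is named `nagios`, and that it is controlled with `systemctl` (older installs may use `service nagios stop` instead). Nagios is stopped before the file is touched because the daemon normally writes its retention data on shutdown. The file is moved aside rather than deleted, since retained state such as acknowledgements and scheduled downtime may be lost with it.

```python
import os
import subprocess

# Assumed defaults for a Nagios XI install; verify both before use.
RETENTION_FILE = "/usr/local/nagios/var/retention.dat"
SERVICE_NAME = "nagios"


def reset_retention(retention_file=RETENTION_FILE, run=subprocess.check_call):
    """Stop Nagios, move retention.dat aside, then start Nagios again.

    Returns the backup path, or None if there was no file to move.
    """
    backup = None
    run(["systemctl", "stop", SERVICE_NAME])
    try:
        if os.path.exists(retention_file):
            backup = retention_file + ".bak"
            os.replace(retention_file, backup)  # keep a copy instead of deleting
    finally:
        # Always try to bring the service back up, even if the move failed.
        run(["systemctl", "start", SERVICE_NAME])
    return backup
```

Run as root, `reset_retention()` leaves a `retention.dat.bak` next to the original location, so the old state can be inspected or restored later. The `run` parameter exists only so that a different command runner can be substituted.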

Re: All checks running at same time interval

Post by WillemDH »

It seems the server managed to distribute the load evenly again. Thread can be closed.

Thanks!

Willem