Re: Monitoring Engine Process 15 minute delayed start?
Posted: Wed Dec 05, 2012 5:23 pm
by ockmeyer
I haven't waited the full 15 minutes, but after a few minutes it still shows the Monitoring Engine Process "not running".
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Thu Dec 06, 2012 10:16 am
by mguthrie
Would you be interested in a remote session to take a look at this, maybe tomorrow? If so let me know and we'll discuss the details in a PM.
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Thu Dec 06, 2012 11:00 am
by ockmeyer
Sure.
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Thu Dec 06, 2012 11:20 am
by mguthrie
Details sent over PM.
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Fri Dec 07, 2012 3:12 pm
by ockmeyer
Mike worked with me on this and we concluded that the retention.dat file was too large. Deleting it and letting it regenerate seems to have fixed the problem.
I verified it with a second server having an identical problem.
Thanks for all the help, Mike!
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Fri Dec 07, 2012 3:15 pm
by mguthrie
Thanks for updating on this with the solution as well. If you ever see this issue come up again, can we have you send us the oversized retention.dat file so we can try and better trace why that file got so large?
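For anyone who hits the same thing, here is a minimal Python sketch of moving the file aside instead of deleting it outright, so the oversized copy is still around to send in. The helper name `set_aside` and the `.bak` naming are made up for illustration, and this assumes the monitoring engine has been stopped first, since the engine may otherwise write the file back out on shutdown.

```python
import os
import time


def set_aside(path, suffix=None):
    """Rename `path` to `<path>.<suffix>.bak` and return (backup_path, size_in_bytes).

    Returns None if the file does not exist. `suffix` defaults to a timestamp.
    Hypothetical helper, not part of any Nagios tooling.
    """
    if not os.path.exists(path):
        return None
    size = os.path.getsize(path)
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backup = f"{path}.{suffix}.bak"
    # Rename keeps the original contents intact for later inspection.
    os.rename(path, backup)
    return backup, size
```

Usage would look something like `set_aside("/usr/local/nagios/var/retention.dat")`, where that path is only an assumption about a typical install; check where your own retention file lives before running it.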
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Fri Dec 07, 2012 5:06 pm
by ockmeyer
Will do.
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Mon Dec 10, 2012 1:20 pm
by mguthrie
After looking through the file, I'm noticing that there are almost 90,000 host comments, all related to:
"This host has been scheduled for fixed downtime..."
Anything goofy showing up in your downtime or recurring downtime pages?
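A rough way to get that kind of count yourself is sketched below in Python. It assumes the retention file uses the usual block layout, where each entry opens with a line such as `hostcomment {`; the function name `count_blocks` is made up for illustration, so check the layout against your own file.

```python
import re
from collections import Counter

# A block is assumed to open with a line like "hostcomment {".
BLOCK_START = re.compile(r"^\s*(\w+)\s*\{\s*$")


def count_blocks(lines):
    """Count how many blocks of each type appear in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = BLOCK_START.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

To run it against a real file, something like `with open(path, errors="replace") as f: print(count_blocks(f).most_common(5))` would list the most frequent block types, which should make a runaway comment count stand out.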
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Mon Dec 10, 2012 1:48 pm
by ockmeyer
I created three recurring downtime schedules the Friday before all of this started. Each one corresponds to one of our three maintenance windows, and they are based on hostgroups. People were complaining about getting alerts when they were doing maintenance and I thought this would eliminate that, but it appears to have created a bigger problem.
With hundreds of hosts in each hostgroup, is there a better way of configuring this without the side effect of such a large file?
Re: Monitoring Engine Process 15 minute delayed start?
Posted: Mon Dec 10, 2012 3:10 pm
by mguthrie
Unless you've got 30k hosts+services, there's something goofy going on with that scheduler. I'm wondering if LOTS of duplicate downtime schedules are being created somehow. Even if you've got 10k checks total, there shouldn't be that many comments in the file.
You did set this up correctly. We'll do some digging on the possibility of duplicate schedules getting created and see what we can find out...
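In the meantime, one way to check the duplicate theory from the file itself is sketched below in Python. It assumes each `hostcomment` block holds `key=value` lines including `host_name` and `comment_data`; those field names are an assumption about the retention format, and `duplicate_host_comments` is a made-up helper name.

```python
import re
from collections import Counter

HOSTCOMMENT_START = re.compile(r"hostcomment\s*\{")


def duplicate_host_comments(lines):
    """Return {(host_name, comment_data): count} for host comments seen more than once."""
    counts = Counter()
    block = None  # dict of fields while inside a hostcomment block, else None
    for raw in lines:
        line = raw.strip()
        if HOSTCOMMENT_START.fullmatch(line):
            block = {}
        elif line == "}":
            if block is not None:
                counts[(block.get("host_name"), block.get("comment_data"))] += 1
            block = None
        elif block is not None and "=" in line:
            key, _, value = line.partition("=")
            block[key] = value
    return {key: n for key, n in counts.items() if n > 1}
```

If the same host shows up with the same downtime comment hundreds of times, that would point toward duplicate schedules rather than a genuinely large number of hosts.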