Re: CPU Load Spike daily
Posted: Mon Jul 07, 2014 1:11 pm
by BanditBBS
This is a "test" system, as I couldn't release this to the wild with this issue. So I will attempt this fix and report back tomorrow. The machine is unusable right now as we are in the spike window. I will get you the nagios.cfg and PM it to you as requested.
EDIT: Test initiated. I fought through the sluggish system.

Re: CPU Load Spike daily
Posted: Mon Jul 07, 2014 1:47 pm
by BanditBBS
Capture.JPG
I left nagios off for 6 minutes, and it is now 10+ minutes after I turned it back on and the load still seems spread out ok. System is usable as well. True test will be 12:30CST tomorrow, I'll report back then.
Re: CPU Load Spike daily
Posted: Mon Jul 07, 2014 4:55 pm
by slansing
Excellent! We look forward to hearing tomorrow, thanks for checking back in!
Re: CPU Load Spike daily
Posted: Tue Jul 08, 2014 12:34 pm
by BanditBBS
Like clockwork the load spike just started.
Maybe that was just a small spike, it went back down. I know as soon as I type this it'll go back up again....just ignore me, I'll report back in an hour.
Re: CPU Load Spike daily
Posted: Tue Jul 08, 2014 1:02 pm
by BanditBBS
Ok, it's 30 minutes into the spike window and I can say it seems ok. However, after applying that scheduling patch yesterday, look at this:
Service is set up for 5 minute check intervals but the scheduler gave it 10 mins:
Capture1.JPG
After those 10 minutes, this time it gave it 6 minutes:
Capture2.JPG
EDIT: The ones scheduled for hourly are way worse. The next check gets scheduled anywhere between 15 and 40 minutes out.
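For anyone who wants to check this across all services instead of eyeballing screenshots, here is a rough sketch that compares the configured interval with what the scheduler actually handed out. It assumes a Nagios Core style status.dat with `servicestatus { ... }` blocks containing `check_interval`, `last_check` and `next_check`, and the default `interval_length` of 60 seconds; the function names and the 10% tolerance are made up for illustration.

```python
import re


def parse_service_blocks(text):
    """Yield one dict of key=value fields per 'servicestatus { ... }' block."""
    for match in re.finditer(r"servicestatus\s*\{(.*?)\n\s*\}", text, re.DOTALL):
        fields = {}
        for line in match.group(1).splitlines():
            line = line.strip()
            if "=" in line:
                key, _, value = line.partition("=")
                fields[key.strip()] = value.strip()
        yield fields


def scheduling_drift(text, interval_length=60, tolerance=0.1):
    """Return (host, service, configured_min, scheduled_min) for services
    whose next check is scheduled noticeably off the configured interval.

    interval_length (seconds per interval unit) and tolerance (fraction of
    the configured interval) are assumptions; adjust to your own setup.
    """
    drifted = []
    for svc in parse_service_blocks(text):
        try:
            configured = float(svc["check_interval"]) * interval_length
            last_check = int(svc["last_check"])
            next_check = int(svc["next_check"])
        except (KeyError, ValueError):
            continue  # block is missing a field or holds a non-numeric value
        if last_check == 0 or configured <= 0:
            continue  # never checked yet, or checks are not scheduled
        scheduled = next_check - last_check
        if abs(scheduled - configured) > tolerance * configured:
            drifted.append((
                svc.get("host_name", ""),
                svc.get("service_description", ""),
                configured / 60,
                scheduled / 60,
            ))
    return drifted
```

Feed it the contents of your status file (the path depends on your install) and it lists every service where the gap between the last and next check is more than 10% away from the configured interval.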
Re: CPU Load Spike daily
Posted: Tue Jul 08, 2014 1:52 pm
by BanditBBS
And now I just want to give up. It seems that after a week straight of the spike starting at 12:30CST, today it magically started at 13:30CST. The server has been unusable for the past 20 minutes. I'm sure if I stopped nagios for a few minutes it'd be fine until tomorrow.
Re: CPU Load Spike daily
Posted: Tue Jul 08, 2014 4:28 pm
by abrist
Eric[1] is working diligently on this issue. I will make sure to pull him into this topic.
Re: CPU Load Spike daily
Posted: Tue Jul 08, 2014 4:36 pm
by BanditBBS
abrist wrote:Eric[1] is working diligently on this issue. I will make sure to pull him into this topic.
Thanks Andy.
Just to update, here is today's perf graph.
Capture.JPG
FYI - Displaying EST times as I am currently in Ohio
It wasn't nearly as long as the previous 7 days, and not even as long as it looks in this graph, since rrd averages over time. But there were a couple of large spikes that made the machine unusable for 10-15 minutes each. So it did get better after applying that schedule patch, just not 100% better. But then there are the scheduling times, and they are just way too far off. There are items I need checked every 5 minutes, which is why they are set that way; if they don't get checked for 10 minutes instead, it could cost customers money. And for the 1 hour scheduled stuff, some are looking for files that are only present once per hour, so if it checks short, it'll always error out. So in short, the new scheduling is not good at all (except for the load easing).
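To illustrate the rrd averaging point, here is a toy sketch (not rrdtool itself, and the numbers are invented): one-minute load samples get averaged into 30-minute buckets, which is roughly what an AVERAGE consolidation does. A 20-minute spike that straddles two buckets ends up looking like an hour of moderately high load.

```python
def consolidate_average(samples, bucket_size):
    """Average consecutive samples into buckets of bucket_size points."""
    buckets = []
    for start in range(0, len(samples), bucket_size):
        chunk = samples[start:start + bucket_size]
        buckets.append(sum(chunk) / len(chunk))
    return buckets


# One hour of hypothetical 1-minute load samples: baseline 1.0,
# with a 20-minute spike at load 20.0 from minute 20 to minute 39.
load = [1.0] * 60
for minute in range(20, 40):
    load[minute] = 20.0

averaged = consolidate_average(load, 30)
print(max(load), averaged)
```

Both 30-minute buckets come out at roughly 7.3, so the graph shows a lower but longer bump than what actually happened on the box.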
Re: CPU Load Spike daily
Posted: Tue Jul 08, 2014 5:35 pm
by abrist
BanditBBS wrote: So in short, the new scheduling is not good at all (except for the load easing).
I have a suspicion that commit will get reverted . . . .
More on this soon as Eric[1]'s work progresses. . . .
Re: CPU Load Spike daily
Posted: Wed Jul 09, 2014 9:24 am
by BanditBBS
Update: as of 9am CST I have been at double-digit load for 25 minutes and counting. I'm going to kill the nagios process and let it sit for a few minutes just so I can use the server.