CPU Load Spike daily
Re: CPU Load Spike daily
Fair enough. Can we move you to a remote? I want to see this behavior firsthand . . . .
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
Re: CPU Load Spike daily
abrist wrote:Fair enough. Can we move you to a remote, I want to see this behavior first hand . . . .
Umm, no (that felt good!). I don't have it installed anymore. With all the testing I was doing in my environment, I reverted to a clean install, and honestly I don't have the time to test it right now.
All I did was install it and create a couple of services: one with a 5-minute check interval and one with a 60-minute interval. Then restart it (after deleting the retention file or waiting long enough for the checks to be in the past) and let Nagios schedule and run the checks. Then just look at the last check time and next check time; they will be nowhere near 5 or 60 minutes apart. Rescheduling the next check of the service results in the same behavior: the check runs immediately, and the next scheduled check is then 7-10 minutes out for the 5-minute check and 30-40 minutes out for the 1-hour check.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Re: CPU Load Spike daily
Thanks for the details, I will pass them off to the Erics.
Former Nagios employee
"It is turtles. All. The. Way. Down. . . .and maybe an elephant or two."
VI VI VI - The editor of the Beast!
Come to the Dark Side.
- tylerhoadley
- Posts: 43
- Joined: Tue Jul 02, 2013 1:41 pm
Re: CPU Load Spike daily
Hey,
So I looked at the commit changes that were made.
I have a dev XI box where I was running r1.1 with Core 4.0.6. Before upgrading, I set up 1000 (HTTP) service checks and forced a check on all services on this host.
The monitoring engine queue showed a huge spike of checks...
I then downloaded the latest XI tarball, extracted nagios-core (nagios-4.0.7.tar.gz), and patched the two *.c files with the committed code. I tar'd it back up and then updated XI to r1.3 (via ./upgrade).
I stopped Core, deleted the retention.dat file, and waited 5 minutes for these 1000 1-minute checks to lapse (I saw all checks in one column, ready to go). I then started it back up, and here is my queue.

I'm assuming this is what the "commit" is supposed to do... the queued checks look fairly evenly spread.
Please comment.
Re: CPU Load Spike daily
Tyler, yes, that is what it should do; however, in my test I saw the 5-minute check interval being ignored and checks being scheduled anywhere from 7-10 minutes out.
Can you verify, after some of the checks run, that the next scheduled check time is actually what you had set?
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
- tylerhoadley
- Posts: 43
- Joined: Tue Jul 02, 2013 1:41 pm
Re: CPU Load Spike daily
First 5 mins....
Service State: Ok
Duration: 30m 55s
Service Stability: Unchanging (stable)
Last Check: 2014-07-21 15:20:00
Next Check: 2014-07-21 15:25:00
second 5 mins...
Service State: Ok
Duration: 35m 8s
Service Stability: Unchanging (stable)
Last Check: 2014-07-21 15:25:00
Next Check: 2014-07-21 15:30:00
and third 5 min wait....
Service State: Ok
Duration: 40m 5s
Service Stability: Unchanging (stable)
Last Check: 2014-07-21 15:30:00
Next Check: 2014-07-21 15:35:00
Fourth 5 min....
Service State: Ok
Duration: 50m 15s
Service Stability: Unchanging (stable)
Last Check: 2014-07-21 15:40:00
Next Check: 2014-07-21 15:45:00
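For anyone who wants to check readings like these without eyeballing them, here is a minimal sketch that compares each Last Check / Next Check pair against the configured interval. It is not part of Nagios or XI; the helper name `check_gap`, the `samples` list, and the timestamp format are assumptions based only on the values pasted above.

```python
from datetime import datetime, timedelta

# Timestamp format assumed from the status readings pasted in this post.
FMT = "%Y-%m-%d %H:%M:%S"


def check_gap(last_check: str, next_check: str) -> timedelta:
    """Return the time between a service's last check and its next scheduled check."""
    return datetime.strptime(next_check, FMT) - datetime.strptime(last_check, FMT)


# (Last Check, Next Check) pairs copied from the four readings above.
samples = [
    ("2014-07-21 15:20:00", "2014-07-21 15:25:00"),
    ("2014-07-21 15:25:00", "2014-07-21 15:30:00"),
    ("2014-07-21 15:30:00", "2014-07-21 15:35:00"),
    ("2014-07-21 15:40:00", "2014-07-21 15:45:00"),
]

# The service is configured with a 5-minute check interval.
expected = timedelta(minutes=5)

for last, nxt in samples:
    gap = check_gap(last, nxt)
    verdict = "OK" if gap == expected else "MISMATCH"
    print(f"{last} -> {nxt}: gap {gap} ({verdict})")
```

A gap of 7-10 minutes on a 5-minute service would show up here as a MISMATCH.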
Last edited by tylerhoadley on Mon Jul 21, 2014 2:40 pm, edited 2 times in total.
- tylerhoadley
- Posts: 43
- Joined: Tue Jul 02, 2013 1:41 pm
Re: CPU Load Spike daily
I will follow a few other service checks as well, just to be safe.
All seem to be reporting the same "correct" check times.
Last edited by tylerhoadley on Mon Jul 21, 2014 2:50 pm, edited 1 time in total.
Re: CPU Load Spike daily
Very interesting. I can't imagine what I had done wrong, but if I am proven wrong, more power to ya.
If it continues to act right for you, then it had to be something I did.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
Re: CPU Load Spike daily
I believe I found and fixed this issue in commit 4e6eb7c. The problem was that the next check time for services was always getting an additional random delay of between 0 and the check interval. This should only have been true when the next check time is further in the future because of a check timeperiod constraint. The logic for host checks was already correct. Please test this commit and let us know whether it resolves the issue.
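To illustrate the behavior described above, here is a rough sketch of the scheduling rule before and after the fix. It is not the Nagios Core code from the commit (Core is written in C); the function name `next_service_check`, its parameters, and the `buggy` flag are all hypothetical, and times are plain seconds.

```python
import random


def next_service_check(last_check, interval, earliest_allowed=None, buggy=False):
    """Sketch of picking the next service check time, in seconds.

    earliest_allowed: hypothetical stand-in for the earliest time the
    check timeperiod permits, or None when there is no constraint.
    buggy: when True, mimic the old behavior of always adding a random delay.
    """
    next_check = last_check + interval

    # A timeperiod constraint can push the check further into the future.
    pushed = earliest_allowed is not None and earliest_allowed > next_check
    if pushed:
        next_check = earliest_allowed

    # Intended: add a random delay between 0 and the check interval only
    # when the timeperiod pushed the check out. Bug: the delay was always added.
    if buggy or pushed:
        next_check += random.uniform(0, interval)

    return next_check


# A 5-minute (300 s) service with no timeperiod constraint.
print("fixed:", next_service_check(1000, 300))
print("buggy:", next_service_check(1000, 300, buggy=True))
```

Under these assumptions the buggy path schedules a 5-minute check anywhere from 5 to 10 minutes out, which is in the same range as the 7-10 minutes reported earlier in the thread.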
Re: CPU Load Spike daily
estanley wrote:I believe I found and fixed this issue in commit 4e6eb7c. The problem was that the next check time for services was always getting an additional random delay of between 0 and the check interval. This should only have been true when the next check time is further in the future because of a check timeperiod constraint. The logic for host checks was already correct. Please test this commit and let us know whether it resolves the issue.
Eric,
Tyler seems to have tested it well and I am comfortable with his results. Unless you really want me to test the commit, I'd say it's golden.
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github