Hello,
I noticed CPU load is spiking every 7 hours on my Nagios XI production server (see the screenshot). The spikes recur almost exactly every 7 hours and started after the update. I don't know of any checks that run every 7 hours, and the backup only runs once a day, so I have no immediate idea where this is coming from.
Could this be related to the Nagios 2014 R1.1 update? How can I best troubleshoot this? By doing a top at the expected spike time?
Friday 23:04
Saturday 06:04
Saturday 13:04
Saturday 19:59
Sunday 03:09
Sunday 09:54
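For what it's worth, the gaps between the times listed above can be worked out with a short shell sketch. The day offsets (Friday = 0, Saturday = 1, Sunday = 2) and the function names are made up for illustration; only the timestamps come from this post.

```shell
#!/bin/sh
# Minutes elapsed since Friday 00:00 for a "day-offset HH:MM" pair.
# Day offsets are an assumption for this sketch: Friday=0, Saturday=1, Sunday=2.
to_minutes() {
    day=$1
    hh=${2%%:*}
    mm=${2##*:}
    # Strip one leading zero so values like "09" are not parsed as invalid octal.
    hh=${hh#0}
    mm=${mm#0}
    echo $(( day * 1440 + ${hh:-0} * 60 + ${mm:-0} ))
}

# Read "day-offset HH:MM" lines on stdin and print the gap between
# each pair of consecutive spikes.
spike_intervals() {
    prev=""
    while read -r day time; do
        now=$(to_minutes "$day" "$time")
        if [ -n "$prev" ]; then
            delta=$(( now - prev ))
            printf '%dh%02dm\n' $(( delta / 60 )) $(( delta % 60 ))
        fi
        prev=$now
    done
}

spike_intervals <<'EOF'
0 23:04
1 06:04
1 13:04
1 19:59
2 03:09
2 09:54
EOF
```

For the six times above this prints 7h00m, 7h00m, 6h55m, 7h10m and 6h45m, so the period is close to, but not exactly, 7 hours.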
Any other ideas? This is not an urgent issue, as we have 6 vCPUs, but I would like to find out what is causing it.
Grtz
Willem
CPU Load spike every 7 hours
Nagios XI 5.8.1
https://outsideit.net
Re: CPU Load spike every 7 hours
I'm having a very similar problem on my prod servers, and now at home as well. http://support.nagios.com/forum/viewtop ... 16&t=27703 is my thread. The server I am trying to diagnose is the OVF downloaded from Nagios, with no changes made. Hopefully one of our threads gets figured out and it helps the other person!
2 of XI5.6.14 Prod/DR/DEV - Nagios LogServer 2 Nodes
See my projects on the Exchange at BanditBBS - Also check out my Nagios stuff on my personal page at Bandit's Home and at github
- Box293
Re: CPU Load spike every 7 hours
Around the time of the spike, what is being recorded in the event log?
Home > Monitoring Process > Event Log
Re: CPU Load spike every 7 hours
Very strange, but it seems the load spikes have 'normalized'. I did nothing configuration-wise that could explain this change in behaviour. See screenshot. I did not have the opportunity to check the logs when the load spiked.
I'll check again in a week to see if there is any recurring CPU load pattern.
Willem
Re: CPU Load spike every 7 hours
Willem,
Have you been graphing/monitoring the I/O Wait on the nagios servers since your upgrade? (Using the monitoring wizard for nagios server) If so, has that been going high as well?
EDIT: IGNORE THIS - We have determined the I/O wait issue is real for us and a disk issue. Unrelated to the high CPU load spike.
Re: CPU Load spike every 7 hours
Well then...
Fresh install, literally only logged in and ran top. Came back the next day to this. Needless to say, this is something we'll be looking into, now that we know we can replicate it in-house.
Former Nagios employee
Re: CPU Load spike every 7 hours
Yeah, it's not a big issue, but there is definitely a pattern you can see. And if it is only monitoring localhost, what the heck is on a 7 hour loop?
thanks!
Re: CPU Load spike every 7 hours
Good to know you guys can reproduce this. Good luck on finding the root cause!
Grtz
Willem
Re: CPU Load spike every 7 hours
This may be an issue with the way checks are (re)scheduled in Core.
top is good for seeing the load and CPU usage. In addition,
ps -ef f
(with the extra 'f') lists processes as a tree, which makes it easy to see what Nagios is running, as well as which processes are contributing to the load.
Also, nagios.log files from when the load spikes could be helpful. To get more information out of Core, debug_level=28 will write debug info on the process, scheduled events and checks to /usr/local/nagios/var/nagios.debug. The extra writes from debug logging hit performance a bit, so it's not for general production use.
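The value 28 is just the three relevant debug_level bits OR'ed together. A small sketch of that arithmetic, assuming the bit values listed in the comments of the stock nagios.cfg (4 = process information, 8 = scheduled events, 16 = host/service checks); the variable and function names are only for illustration.

```shell
#!/bin/sh
# Bit values as listed in the comments of the sample nagios.cfg (assumed here).
DEBUG_PROCESS=4      # process information
DEBUG_EVENTS=8       # scheduled events
DEBUG_CHECKS=16      # host/service checks

# OR the bits together to get the combined debug level.
debug_level() {
    echo $(( DEBUG_PROCESS | DEBUG_EVENTS | DEBUG_CHECKS ))
}

# Print the nagios.cfg lines to set by hand; nothing is written to disk.
print_debug_settings() {
    echo "debug_level=$(debug_level)"
    echo "debug_file=/usr/local/nagios/var/nagios.debug"
}

print_debug_settings
```

Core has to be restarted to pick up the change, and setting debug_level back to 0 afterwards avoids the extra disk writes.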
Are you seeing problems in monitoring: high check latencies, timeouts or retries? Anything else indicating a problem other than the high load?
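One way to catch the next spike without sitting at the console is to poll the load and dump top and ps -ef f output whenever it crosses a threshold. This is a minimal sketch, assuming a Linux host with /proc/loadavg and the procps versions of top and ps; the function names, log path and threshold are placeholders, not anything shipped with Nagios XI.

```shell
#!/bin/sh
# Succeed when the load value ($1) is at or above the threshold ($2).
# awk is used because sh arithmetic cannot compare decimals.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 >= limit + 0) }'
}

# Append a timestamped snapshot (uptime, top, process tree) to the log file $1.
snapshot() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        uptime
        top -b -n 1 | head -n 25
        ps -ef f
    } >> "$1" 2>&1
}

# Poll the 1-minute load average once a minute and snapshot when it is high.
# $1 = log file, $2 = load threshold (both hypothetical values chosen by you).
watch_load() {
    logfile=$1
    threshold=$2
    while :; do
        load=$(cut -d ' ' -f 1 /proc/loadavg)
        if load_exceeds "$load" "$threshold"; then
            snapshot "$logfile"
        fi
        sleep 60
    done
}
```

Started shortly before an expected spike with something like `watch_load /tmp/load-spikes.log 4 &`, this leaves a log that can be compared against nagios.log for the same minutes.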
Re: CPU Load spike every 7 hours
As I said:
"This is not an urgent issue, as we have 6 vCPUs, but I would like to find out what is causing it."
So no, I'm not having any issues with Nagios atm. But I'm guessing that if we had only 2 vCPUs, it could have been an issue. As I said in a later post, it seems the situation has stabilized, so it's kind of hard to give you any more information.