Hi,
I imagine I can't be the only person who had this issue but I can't find a definitive answer for the current version of Nagios XI.
Almost every night at around 3am, the CPU on one of our servers goes critical for quite a while. While troubleshooting this we have ruled out the obvious tasks that run during the night, so I now want to know exactly which process is causing the CPU spike, as this server is running many services.
Is there a Nagios plugin or built in service that can simply tell me which process is causing the CPU to spike?
thanks
CPU spikes - which process is causing it?
Re: CPU spikes - which process is causing it?
NCPA can be used to poll for processes that are using more than X percent of CPU:
./check_ncpa.py -H server_ip -t 'token' -M 'processes' -q 'cpu_percent=80'
https://www.nagios.org/ncpa/getting-started.php covers how to install, and the service check would look something like the below.
Depending on how long it spikes, you may want to lower the check interval, retry interval and max check attempts on the Check Settings tab. You can also use the Check period option on the same tab to run this check only during the time frame in which the spike occurs.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: CPU spikes - which process is causing it?
Thanks - will have a look at NCPA
Re: CPU spikes - which process is causing it?
OK got that working and it does show 2 tasks have gone critical for example, which is helpful.
I was after something a little more granular that could say for example what task it was, or at least the PID.
Any ideas?
many thanks
Rob
-
bolson
Re: CPU spikes - which process is causing it?
This plugin might be the ticket for you. It aggregates parent and child (worker) processes and returns the top CPU consumer.
https://exchange.nagios.org/directory/P ... op/details
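For illustration only, here is a rough sketch of the idea on a Linux host: sum CPU usage per command name (so a parent and its workers count together) and report the biggest consumer with its PIDs. This is not the code of the linked plugin. The function name `check_top_cpu` and the default thresholds of 80 and 95 are made up for the example, and the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical sketch, not the linked plugin: report the top CPU consumer,
# grouped by command name, with Nagios-style exit codes.
# Caveat: on Linux, ps reports %CPU as CPU time divided by process lifetime,
# so a short spike in a long-running process may be under-reported.

check_top_cpu() {
    # $1 = warning threshold, $2 = critical threshold (illustrative defaults)
    LC_ALL=C ps -e -o pid= -o pcpu= -o comm= |
    awk -v warn="${1:-80}" -v crit="${2:-95}" '
    {
        # Fields: PID, %CPU, then the command name (which may contain spaces)
        name = $3
        for (i = 4; i <= NF; i++) name = name " " $i
        cpu[name] += $2
        pids[name] = (name in pids) ? (pids[name] "," $1) : $1
    }
    END {
        max = -1
        for (n in cpu) if (cpu[n] > max) { max = cpu[n]; best = n }
        if (best == "") {
            print "CPU UNKNOWN - could not read the process list"
            exit 3
        }
        status = "OK"; code = 0
        if (max >= crit)      { status = "CRITICAL"; code = 2 }
        else if (max >= warn) { status = "WARNING";  code = 1 }
        printf "CPU %s - top consumer: %s (%.1f%% CPU, PIDs: %s)\n", status, best, max, pids[best]
        exit code
    }'
}
```

Calling `check_top_cpu 80 95` prints a single status line and returns the matching exit code, so it could be wrapped as a plugin on the monitored host. For catching a short 3am spike, a tool that samples over an interval would likely be more accurate than the lifetime average that `ps` gives.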
Re: CPU spikes - which process is causing it?
Thanks bolson I will run the rule over that plugin and report back.
Re: CPU spikes - which process is causing it?
Hi bolson - I presume this won't run on a remote Windows client as it's a bash script?
Excuse the fairly dumb question but I am new to Nagios. I know I can run VBS and Python on the Windows server....
-
npolovenko
- Support Tech
- Posts: 3457
- Joined: Mon May 15, 2017 5:00 pm
Re: CPU spikes - which process is causing it?
@robatwork, Yes, it can be problematic to run a shell script on Windows. There is a PowerShell script that you can use though:
https://exchange.nagios.org/directory/P ... 29/details
This script will check the current CPU load (percentage and queue) and physical memory utilization on a Windows host and alert based on your arguments. If alerting, it will tell you what process is using the most resources and give you the PID so you can kill it with psexec if the host is inaccessible via traditional methods.
Here's another plugin that does the same:
https://www.itefix.net/check_winprocess
You can choose either one.