I know check_wmi_plus (included with XI) has some options for monitoring CPU usage on a per-process basis. Here are the technical bits from the module itself:
Code:
#----------------------------------------------------------
[checkproc cpuabove]
requires=1.48
inihelp=<<EOT
Check for processes using more than a specified CPU utilisation. To make this work as intended you need to specify some
warning/critical criteria eg -w 50 for warning when a process uses more than 50% CPU. You probably also want to remove
all processes with low CPU from the results. Do this using something like -exc _AvgCPU=@0:5 (which will exclude processes that have CPU utilisation between 0 and 5%)
ARG1 The processname to look for. Use % for wildcards.
The process name typically only includes the actual file name minus its suffix eg firefox, svchost
If there are multiple instances eg svchost, then some versions of Windows have them named all the same while others
such as Windows 2008 Server, have them numbered eg svchost#1, svchost#2, svchost#3. To get all svchost processes you
need to set ARG1 to svchost%
To view all processes set ARG1 to "%" and the full process list will be included in the plugin output.
Note: Use --nodatamode and/or NODATAEXIT settings to control what happens if no matching process is found.
EOT
aligndata=Name,IDProcess
query=select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}" and Name != "Idle" and Name != "_Total"
# run 2 WMI queries, 5 seconds apart. The delay only applies if using --nokeepstate
samples=2
delay=5
customfield=_AvgCPU,PERF_100NSEC_TIMER,PercentProcessorTime,%.1f,100
test=_AvgCPU
test=_ItemCount
# fields to display before we list out all the CPU data
predisplay=_DisplayMsg||~|~| - ||
predisplay=_ItemCount||Total Process Count|||| (Process details on next line)\n
display=_DisplayMsg||~|~| - ||
display=_AvgCPU|%|CPU for {Name} (PID={IDProcess})||||\n
# need to include the {Name} so that performance data is unique to each instance
perf=_ItemCount||Process Count
# perf=_AvgCPU|%|Avg Utilisation CPU_{Name} - don't really need perfdata for each process for this check - use checkproc cpu if you want that
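For what it's worth, the customfield line is where the percentage comes from: as far as I understand it, PERF_100NSEC_TIMER takes the two samples and computes (N1 - N0) / (D1 - D0) x 100, with PercentProcessorTime as N and Timestamp_Sys100NS as D. Here's a rough sketch of that arithmetic. The avg_cpu helper and the sample numbers are made up for illustration and are not part of check_wmi_plus:

```shell
#!/bin/sh
# avg_cpu N0 N1 D0 D1
#   N0/N1: raw PercentProcessorTime from the first and second sample
#   D0/D1: Timestamp_Sys100NS from the first and second sample
# Hypothetical helper that mirrors the PERF_100NSEC_TIMER formula.
avg_cpu() {
  LC_ALL=C awk -v n0="$1" -v n1="$2" -v d0="$3" -v d1="$4" 'BEGIN {
    # guard against a zero or negative sample interval
    if (d1 - d0 <= 0) { print "n/a"; exit }
    printf "%.1f\n", (n1 - n0) / (d1 - d0) * 100
  }'
}

# Made-up samples taken 5 seconds apart (5 s = 50,000,000 ticks of 100 ns)
avg_cpu 1000000 28050000 0 50000000   # prints 54.1
```

Since the raw counter adds up time across all cores, a busy multi-threaded process can probably show more than 100% here.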
That's probably the easiest option. You'll need to enable your Windows environment for WMI monitoring though. We have docs for that:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf
You'll also need to point check_wmi_plus.pl at the .ini file. That's a one-line change from the default copy we distribute (I think):
Code:
our $wmi_ini_file="$conf_file_dir/check_wmi_plus.ini";  # double quotes so Perl interpolates $conf_file_dir
Though there's nothing to allow you to say "only show the top 5" without modifying the plugin itself, you could say "only show me processes using 10% or greater" using -exc like so:
Code:
[root@xi-stable rw]# /usr/local/nagios/libexec/check_wmi_plus.pl -H 192.168.67.99 -u admin -p welcome123 -m checkproc -s cpuabove -a % -exc _AvgCPU=@0:9
OK (Sample Period 2 sec) - Total Process Count=1 (Process details on next line)|'Process Count'=1;
OK - CPU for wscript (PID=30260)=54.1%
The -exc _AvgCPU=@0:9 bit basically says "exclude processes with CPU usage between 0% and 9%". That would help narrow things down a bit, but you would also trigger a CRITICAL if no processes matched those criteria and, in turn, no results were returned (see the --nodatamode / NODATAEXIT note in the help text above). I believe you could then add -w 20 -c 25 to warn at 20% and go critical at 25%.
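If you really do want a "top 5" view without touching the plugin, one option is to post-process its output. This is only a rough sketch: it assumes the detail lines keep the "CPU for NAME (PID=N)=X%" shape shown in the sample output, and top_cpu is a hypothetical helper name, not something that ships with check_wmi_plus:

```shell
#!/bin/sh
# top_cpu [N]: read check_wmi_plus output on stdin and print the N busiest
# processes (default 5), highest CPU first. Hypothetical helper.
top_cpu() {
  n="${1:-5}"
  # Keep only the per-process detail lines and move the percentage to the
  # front, then sort numerically (descending) and keep the first N lines.
  sed -n -E 's/^.*CPU for (.+)=([0-9.]+)%.*$/\2 \1/p' \
    | LC_ALL=C sort -rn \
    | head -n "$n"
}

# Usage idea (needs a reachable Windows host, so it is left commented out):
#   /usr/local/nagios/libexec/check_wmi_plus.pl -H <host> -u <user> -p <pass> \
#       -m checkproc -s cpuabove -a % | top_cpu 5
```

Keep in mind that piping through a filter like this throws away the plugin's exit code, so it's better suited to ad-hoc troubleshooting than to an actual service check.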