Windows process Monitoring

Posted: Tue Jul 04, 2017 7:42 am
by mindspring
Hi There,

I am hoping you can help. Our customer is trying to monitor a specific process (let's call it process.exe). It is an application that can open multiple copies of the same process.exe.

I tried to figure out how to monitor this process using the Windows Server configuration wizard and performance monitors, but I can't find the syntax to use. I followed the KB article below.
https://support.nagios.com/kb/article.php?id=127

I need to be able to see the memory (in MB), CPU (percent) and disk usage (MB/s) for each of the multiple instances of process.exe.
Could you please help with the syntax to use in this section of the monitoring wizard?
(Attachment: snip1.PNG)
Thanks.

Re: Windows process Monitoring

Posted: Wed Jul 05, 2017 2:47 pm
by tgriep
The KB article you linked is more about monitoring counters on a Windows system, not about checking whether a process is running and what it is using.

If you have the NSClient++ agent installed and configured on the Windows server, you can run the Windows Server wizard under the Configure > Configuration Wizards menu.
That will set up the server for basic monitoring such as disk, memory, services and running processes.

To get per-process details, such as the amount of CPU and memory each process is using, you would have to find a plugin that works with NSClient++ on the Nagios Exchange site.
The default Windows Server plugin will not do this.
https://exchange.nagios.org/


Another option is to use the check_wmi_plus plugin; it may have the capabilities to do what you are looking for.
This example shows how much CPU the matching processes are using on the system.

Code:

./check_wmi_plus.pl -H x.x.x.x -u <username> -p <password> -m checkproc -s cpu -a firefox%
OK (Sample Period 39 sec) - Found 2 Instance(s) of "firefox%" running. CPU_firefox(PID=13064)=2.4% CPU_firefox#1(PID=4584)=7.1% |'Process Count'=2; 'Avg Utilisation CPU_firefox'=2.4%; 'Avg Utilisation CPU_firefox#1'=7.1%;
This example will show you the amount of Memory the processes are taking on the system.

Code:

./check_wmi_plus.pl -H x.x.x.x -u <username> -p <password> -m checkproc -s memory -a firefox%
Found 2 Instance(s) of "firefox%" running. OK - firefox: Private Memory=346.133MB, Working Set=448.602MB, Virtual Memory=1.111GBOK - firefox#1: Private Memory=398.922MB, Working Set=563.508MB, Virtual Memory=1.009GB|'Process Count'=2; 'PrivateMemory_firefox'=362946560Bytes; 'TotalWorkingSet_firefox'=470392832Bytes; 'VirtualMemory_firefox'=1192796160Bytes; 'PrivateMemory_firefox#1'=418299904Bytes; 'TotalWorkingSet_firefox#1'=590880768Bytes; 'VirtualMemory_firefox#1'=1083797504Bytes;
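
Since the original request was for memory in MB, here is a minimal Python sketch (not part of the plugin) that pulls the performance data out of output like the above and converts the byte values to MB. The regular expression and the helper names `parse_perfdata` and `bytes_to_mb` are assumptions made for illustration, and the sample string is a shortened copy of the memory output shown above.

```python
import re

# One perfdata item looks like 'label'=value[unit]; (labels are quoted here).
PERF_ITEM = re.compile(r"'([^']+)'=([0-9.]+)([A-Za-z%]*)")


def parse_perfdata(plugin_output):
    """Return {label: (value, unit)} for the perfdata after the first '|'."""
    _, _, perf = plugin_output.partition("|")
    return {label: (float(value), unit)
            for label, value, unit in PERF_ITEM.findall(perf)}


def bytes_to_mb(num_bytes):
    """Convert bytes to MB using 1 MB = 1024 * 1024 bytes."""
    return num_bytes / (1024 * 1024)


# Shortened copy of the memory output shown above.
sample = ("Found 2 Instance(s) running.|'Process Count'=2; "
          "'PrivateMemory_firefox'=362946560Bytes; "
          "'PrivateMemory_firefox#1'=418299904Bytes;")

for label, (value, unit) in parse_perfdata(sample).items():
    if unit == "Bytes":
        print(f"{label}: {bytes_to_mb(value):.3f} MB")
```

With the sample values this yields 346.133 MB and 398.922 MB, which agrees with the Private Memory figures in the human-readable part of the plugin output.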

To set up your Windows server to be polled by WMI, take a look at the link below.
https://assets.nagios.com/downloads/nag ... ios-XI.pdf

If you have any questions, let us know.

Re: Windows process Monitoring

Posted: Thu Jul 06, 2017 2:31 am
by mindspring
Thank you, this has been very useful. I think this will give me what I need. I followed the WMI guide on the server and tested the script, but I seem to be getting random errors.

WMI monitoring for other services on the same server (set up with the config wizard) is working, but this particular check_wmi_plus.pl check doesn't work from the command line.

Firefox is running on this server.
(Attachment: cap1.PNG)
I can telnet to the WMI port on the server and the config wizard probes are working.

Code:

[root@nagiosxi libexec]# telnet 192.168.130.252 135
Trying 192.168.130.252...
Connected to 192.168.130.252.
Escape character is '^]'.
^]
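
The same reachability test can be scripted instead of using telnet. The sketch below is a generic TCP check and the function name `tcp_port_open` is made up for this example. Note that TCP 135 is the RPC endpoint mapper: a successful connection only shows that the first step of a DCOM/WMI session is reachable, since the session then moves to a dynamically assigned port, so it does not prove that WMI queries will succeed.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (address taken from the telnet test above):
# tcp_port_open("192.168.130.252", 135)
```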

These are the two messages that seem to come up.

Code:


[root@nagiosxi libexec]# ./check_wmi_plus.pl -H 192.168.130.252 -u wmiagent -p ###### -m checkproc -s cpu -a firefox%
WMI Query returned no data. The item you were looking for may NOT exist or the software that creates the WMI Class may not be running, or all data has been excluded.

I got this message once or twice, but the one above several times:

Code:


[root@nagiosxi libexec]# ./check_wmi_plus.pl -H 192.168.130.252 -u wmiagent -p ###### -m checkproc -s cpu -a firefox%
Collecting first WMI sample because the previous state data file (/tmp/cwpss_checkproccpu_cpu_192168130252_firefox__.state) contained no data. Results will be shown the next time the plugin runs.

WMI is working for other probes on this server.
(Attachment: cap2.PNG)
Any ideas?

Thanks.

Re: Windows process Monitoring

Posted: Thu Jul 06, 2017 10:01 am
by tgriep
The "WMI Query returned no data." message could be a permission problem with the user account, but there are not enough details to say what the issue is.
If you run the command again with the -d option added, it will show the debug output, which may reveal why the check is not working for you.

Code:

./check_wmi_plus.pl -H 192.168.130.252 -u wmiagent -p ###### -m checkproc -s cpu -a firefox% -d
The "Collecting first WMI sample because the previous state data file" message is normal when the check is run for the first time.
Some plugins need to run the check twice so that they can calculate values from the difference between the current run and the previous one.
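
To illustrate why two samples are needed: the CPU check reads the raw counters PercentProcessorTime and Timestamp_Sys100NS, and a raw counter of type PERF_100NSEC_TIMER only becomes a percentage once the difference between two readings is taken. The Python sketch below shows that arithmetic under the standard formula 100 * (N1 - N0) / (D1 - D0). It is not the plugin's own code, and the sample numbers are invented.

```python
def cpu_percent(prev, curr):
    """Percentage from two raw samples: 100 * (N1 - N0) / (D1 - D0)."""
    delta_busy = curr["PercentProcessorTime"] - prev["PercentProcessorTime"]
    delta_time = curr["Timestamp_Sys100NS"] - prev["Timestamp_Sys100NS"]
    if delta_time <= 0:
        raise ValueError("samples must be taken at different times")
    return 100.0 * delta_busy / delta_time


# Hypothetical raw samples taken 10 seconds apart (units are 100 ns ticks).
first = {"PercentProcessorTime": 50_000_000,
         "Timestamp_Sys100NS": 131_000_000_000_000_000}
second = {"PercentProcessorTime": 57_100_000,
          "Timestamp_Sys100NS": 131_000_000_100_000_000}

print(f"{cpu_percent(first, second):.1f}%")
```

With only the first sample there is nothing to subtract from, which is why the first run can only store its reading and report on the next run.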

Re: Windows process Monitoring

Posted: Mon Jul 10, 2017 3:20 am
by mindspring
Thanks for that. This is the more detailed error message I get now. I tried with the wmiagent credentials and even with the administrator login, and I get the same error even with a more general process like explorer.exe.
This seems to be the important bit:

Code:


WMI Query returned no data. The item you were looking for may NOT exist or the software that creates the WMI Class may not be running, or all data has been excluded

Below is the entire output in debug mode.
Any ideas?
Thanks.

Code:

[root@nagiosxi libexec]# ./check_wmi_plus.pl -H 192.168.130.252 -u wmiagent -p ###### -m checkproc -s cpu -a explorer%  -d
Command Line (v1.6): ./check_wmi_plus.pl -H 192.168.130.252 -u USER -p PASS -m checkproc -s cpu -a explorer% -d
Base Dir: /usr/local/nagios/libexec
Conf File Dir: /usr/local/nagios/libexec
Loaded Conf File /usr/local/nagios/libexec/check_wmi_plus.conf
Opening Ini Files ...
   opening first ini file: /usr/local/nagios/libexec/check_wmi_plus.ini
   checking ini dir /usr/local/nagios/libexec, found 1 file(s)
   opening ini file: check_wmi_plus.ini
Global Static Ini Variables: $VAR1 = {};
Found Group checkproc
GROUP MEMBERS $VAR1 = [
          'checkproc cmdline',
          'checkproc memory',
          'checkproc memoryabove',
          'checkproc memorytotals',
          'checkproc cpu',
          'checkproc cpuabove',
          'checkproc count',
          'checkproc info'
        ];
Found Member cpu
Processing INI Section: checkproc cpu
Settings for this section are:
-------------------------------------------------------------------
      aligndata => Name,IDProcess
    customfield => _AvgCPU,PERF_100NSEC_TIMER,PercentProcessorTime,%.1f,100
          delay => 5
        display => _AvgCPU|%|CPU_{Name}(PID={IDProcess})||||   
        inihelp => Check cpu details for individual processes
ARG1  The processname to look for. Use % for wildcards.
   The process name typically only includes the actual file name minus its suffix eg firefox, svchost
   If there are multiple instances eg svchost, then some versions of Windows have them named all the same while others
   such as Windows 2008 Server, have them numbered eg svchost#1, svchost#2, svchost#3. To get all svchost processes you
   need to set ARG1 to svchost%
   To view all processes set ARG1 to "%" and the full process list will be included in the plugin output.
Note:  Use --nodatamode and/or NODATAEXIT settings to control what happens if no matching process is found.
           perf => _ItemCount||Process Count
_AvgCPU|%|Avg Utilisation CPU_{Name}
     predisplay => _DisplayMsg||~|~| - ||
_ItemCount| Instance(s)|Found |~|. || of "{_arg1}" running
          query => select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
       requires => 1.48
        samples => 2
           test => _AvgCPU
_ItemCount
-------------------------------------------------------------------
All Static Ini Variables: $VAR1 = {};
Query Extenstions: $VAR1 = [];
   Original Query:select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
        New Query:select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
Starting Keep State Mode
STATE FILE: /tmp/cwpss_checkproccpu_cpu_192168130252_explorer__.state
Checking previous data's expiry - Timestamp 1499674505 vs Expiry After 1499670908 (Keep State Expiry setting is 3600sec)
Using Existing WMI DATA of:$VAR1 = [
          [
            {
              '_ItemCount' => '0',
              '_KeepStateCreateTimestamp' => 1499674505
            }
          ]
        ];
Round #2 of 2
QUERY: /usr/bin/wmic '-U' 'USER%PASS' '--namespace' 'root/cimv2' '//192.168.130.252' 'select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "explorer%"'
OUTPUT: 
WMI DATA:$VAR1 = [
          [
            {
              '_ChecksOK' => 1,
              '_KeepStateSamplePeriod' => 3,
              '_ItemCount' => '0',
              '_KeepStateCreateTimestamp' => 1499674505
            }
          ],
          [
            {
              '_ItemCount' => 0
            }
          ]
        ];
Storing new WMI results in the state file $VAR1 = [
          [
            {
              '_KeepStateCreateTimestamp' => 1499674508,
              '_ItemCount' => 0
            }
          ]
        ];
Copying predefined fields to the last WMI result set [0] to [1]
NEW WMI DATA:$VAR1 = [
          [
            {
              '_ItemCount' => '0'
            }
          ],
          [
            {
              '_KeepStateSamplePeriod' => 3,
              '_ChecksOK' => 1,
              '_KeepStateCreateTimestamp' => 1499674505,
              '_ItemCount' => 0
            }
          ]
        ];
JOIN PARAMETERS  $VAR1 = [];
$VAR2 = [];
$VAR3 = [
          [
            {
              '_ItemCount' => '0'
            }
          ],
          [
            {
              '_KeepStateSamplePeriod' => 3,
              '_ChecksOK' => 1,
              '_KeepStateCreateTimestamp' => 1499674505,
              '_ItemCount' => 0
            }
          ]
        ];
$VAR4 = 1;
WMI Query returned no data. The item you were looking for may NOT exist or the software that creates the WMI Class may not be running, or all data has been excluded
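
The "Checking previous data's expiry" line in the debug output above can be read as a simple age check on the state file. The sketch below is an interpretation of that debug line, not the plugin's actual code: the function name and the strict "less than" comparison are assumptions. The numbers come from the output above, where the new state timestamp (1499674508) minus 3600 gives the "Expiry After" value 1499670908.

```python
def state_is_expired(state_timestamp, now, expiry_seconds=3600):
    """Return True if the stored sample is older than the expiry window."""
    expire_after = now - expiry_seconds
    # Assumption: a sample taken exactly at the boundary is still usable.
    return state_timestamp < expire_after


# Values from the debug output above: the previous sample is 3 seconds old.
print(state_is_expired(1499674505, now=1499674508))

# Hypothetical later run, just over an hour after the stored sample.
print(state_is_expired(1499674505, now=1499674505 + 3601))
```

In the first case the stored sample is reused as the earlier of the two readings. In the second case the plugin would have to start again with a fresh first sample.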

This is what I get when I use the administrator login, which is quite odd. I suspect it could be because I applied the WMI Nagios XI configuration only to the wmiagent login and not to the administrator login. In effect, administrator might have a lower access level than wmiagent, which would explain this error.

Code:


[root@nagiosxi libexec]# ./check_wmi_plus.pl -H 192.168.130.252 -u administrator -p #######  -m checkproc -s cpu -a explorer% -d
Command Line (v1.6): ./check_wmi_plus.pl -H 192.168.130.252 -u USER -p PASS -m checkproc -s cpu -a explorer% -d
Base Dir: /usr/local/nagios/libexec
Conf File Dir: /usr/local/nagios/libexec
Loaded Conf File /usr/local/nagios/libexec/check_wmi_plus.conf
Opening Ini Files ...
   opening first ini file: /usr/local/nagios/libexec/check_wmi_plus.ini
   checking ini dir /usr/local/nagios/libexec, found 1 file(s)
   opening ini file: check_wmi_plus.ini
Global Static Ini Variables: $VAR1 = {};
Found Group checkproc
GROUP MEMBERS $VAR1 = [
          'checkproc cmdline',
          'checkproc memory',
          'checkproc memoryabove',
          'checkproc memorytotals',
          'checkproc cpu',
          'checkproc cpuabove',
          'checkproc count',
          'checkproc info'
        ];
Found Member cpu
Processing INI Section: checkproc cpu
Settings for this section are:
-------------------------------------------------------------------
      aligndata => Name,IDProcess
    customfield => _AvgCPU,PERF_100NSEC_TIMER,PercentProcessorTime,%.1f,100
          delay => 5
        display => _AvgCPU|%|CPU_{Name}(PID={IDProcess})||||   
        inihelp => Check cpu details for individual processes
ARG1  The processname to look for. Use % for wildcards.
   The process name typically only includes the actual file name minus its suffix eg firefox, svchost
   If there are multiple instances eg svchost, then some versions of Windows have them named all the same while others
   such as Windows 2008 Server, have them numbered eg svchost#1, svchost#2, svchost#3. To get all svchost processes you
   need to set ARG1 to svchost%
   To view all processes set ARG1 to "%" and the full process list will be included in the plugin output.
Note:  Use --nodatamode and/or NODATAEXIT settings to control what happens if no matching process is found.
           perf => _ItemCount||Process Count
_AvgCPU|%|Avg Utilisation CPU_{Name}
     predisplay => _DisplayMsg||~|~| - ||
_ItemCount| Instance(s)|Found |~|. || of "{_arg1}" running
          query => select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
       requires => 1.48
        samples => 2
           test => _AvgCPU
_ItemCount
-------------------------------------------------------------------
All Static Ini Variables: $VAR1 = {};
Query Extenstions: $VAR1 = [];
   Original Query:select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
        New Query:select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
Starting Keep State Mode
STATE FILE: /tmp/cwpss_checkproccpu_cpu_192168130252_explorer__.state
Checking previous data's expiry - Timestamp 1499674664 vs Expiry After 1499671087 (Keep State Expiry setting is 3600sec)
Using Existing WMI DATA of:$VAR1 = [
          [
            {
              '_ItemCount' => '0',
              '_KeepStateCreateTimestamp' => 1499674664
            }
          ]
        ];
Round #2 of 2
QUERY: /usr/bin/wmic '-U' 'USER%PASS' '--namespace' 'root/cimv2' '//192.168.130.252' 'select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "explorer%"'
OUTPUT: [librpc/rpc/dcerpc_util.c:1290:dcerpc_pipe_auth_recv()] Failed to bind to uuid 4d9f4ab8-7d1c-11cf-861e-0020af6e7c57 - NT_STATUS_NET_WRITE_FAULT
[librpc/rpc/dcerpc_connect.c:790:dcerpc_pipe_connect_b_recv()] failed NT status (c0000022) in dcerpc_pipe_connect_b_recv
[wmi/wmic.c:196:main()] ERROR: Login to remote object.
NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied

Could not find the CLASS: line - an error occurred
WMI DATA:$VAR1 = [
          [
            {
              '_KeepStateSamplePeriod' => 23,
              '_ItemCount' => '0',
              '_KeepStateCreateTimestamp' => 1499674664
            }
          ]
        ];
UNKNOWN - The WMI query had problems. You might have your username/password wrong or the user's access level is too low. Wmic error text on the next line.
[librpc/rpc/dcerpc_util.c:1290:dcerpc_pipe_auth_recv()] Failed to bind to uuid 4d9f4ab8-7d1c-11cf-861e-0020af6e7c57 - NT_STATUS_NET_WRITE_FAULT
[librpc/rpc/dcerpc_connect.c:790:dcerpc_pipe_connect_b_recv()] failed NT status (c0000022) in dcerpc_pipe_connect_b_recv
[wmi/wmic.c:196:main()] ERROR: Login to remote object.
NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied


Re: Windows process Monitoring

Posted: Mon Jul 10, 2017 9:31 am
by tgriep
The wmiagent account looks like it can log in to the system, but the root/cimv2 namespace may not have been enabled or granted remote access, so take a look at the instructions and verify that it is set up.

The Administrator account is not logging in at all. If it is a domain account, you would have to use this format for the login name:

Code:

domain/user
Or, if it is a workgroup-only system, you could use this format for the username:

Code:

workgroup/user
But I think one of the settings for root/cimv2 access was missed, and after fixing that it should work for you.

Re: Windows process Monitoring

Posted: Mon Jul 31, 2017 9:26 am
by mindspring
Hi There,

Thanks, I eventually got it working by using the admin login. For some reason even the wmiagent user in the admin group couldn't get access. I followed the guide but it still fails - will try to fix that later.

My latest issue is the accuracy of the probe. It shows the following. Have a look at PID 15888, which is almost always at 100%.
(Attachment: nag1.PNG)

The process with PID 15888 is at 100% and is often at this level, but when I check Task Manager on the server, it never goes higher than 20% or so. Please see below.
(Attachment: nag2.PNG)
Any idea why Nagios wouldn't report it correctly? I know it is based on what WMI reports, but the history shows that the CPU has never been at 100%.
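
One possible explanation, which is not confirmed anywhere in this thread: the per-process "% Processor Time" performance counter is measured relative to a single logical processor, so a process that keeps one core busy reads about 100%, while Task Manager shows each process as a share of all processors combined. The Python sketch below only illustrates that arithmetic. The function name and the core count of 5 are assumptions, since the server's processor count is not given here.

```python
def normalise_by_cores(per_process_percent, logical_cores):
    """Scale a per-core style percentage to a share of total CPU capacity."""
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    return per_process_percent / logical_cores


# Hypothetical: 100% from the counter on a server with 5 logical processors.
print(normalise_by_cores(100.0, 5))
```

If the server really has around five logical processors, a counter value of 100% and a Task Manager value of about 20% would describe the same load.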

Re: Windows process Monitoring

Posted: Mon Jul 31, 2017 1:02 pm
by tgriep
Can you post how you ran the command so I can try to recreate it?
What version of the plugin are you running? Run the following and post the output here.

Code:

/usr/local/nagios/libexec/check_wmi_plus.pl --version

Re: Windows process Monitoring

Posted: Mon Dec 18, 2017 6:22 am
by mindspring
Thanks, I am trying to pick up on this again. On the same server I can't even seem to get it to report anymore.

Code:


[root]@nagiosxi /usr/local/nagios/libexec] $ ./check_wmi_plus.pl -H 192.168.130.252 -u xxxx\administrator -p xxxx -m checkproc -s cpu -a explorer% -d
Command Line (v1.6): ./check_wmi_plus.pl -H 192.168.130.252 -u USER -p PASS -m checkproc -s cpu -a explorer% -d
Base Dir: /usr/local/nagios/libexec
Conf File Dir: /usr/local/nagios/libexec
Loaded Conf File /usr/local/nagios/libexec/check_wmi_plus.conf
Opening Ini Files ...
   opening first ini file: /usr/local/nagios/libexec/check_wmi_plus.ini
   checking ini dir /usr/local/nagios/libexec, found 1 file(s)
   opening ini file: check_wmi_plus.ini
Global Static Ini Variables: $VAR1 = {};
Found Group checkproc
GROUP MEMBERS $VAR1 = [
          'checkproc cmdline',
          'checkproc memory',
          'checkproc memoryabove',
          'checkproc memorytotals',
          'checkproc cpu',
          'checkproc cpuabove',
          'checkproc count',
          'checkproc info'
        ];
Found Member cpu
Processing INI Section: checkproc cpu
Settings for this section are:
-------------------------------------------------------------------
      aligndata => Name,IDProcess
    customfield => _AvgCPU,PERF_100NSEC_TIMER,PercentProcessorTime,%.1f,100
          delay => 5
        display => _AvgCPU|%|CPU_{Name}(PID={IDProcess})||||   
        inihelp => Check cpu details for individual processes
ARG1  The processname to look for. Use % for wildcards.
   The process name typically only includes the actual file name minus its suffix eg firefox, svchost
   If there are multiple instances eg svchost, then some versions of Windows have them named all the same while others
   such as Windows 2008 Server, have them numbered eg svchost#1, svchost#2, svchost#3. To get all svchost processes you
   need to set ARG1 to svchost%
   To view all processes set ARG1 to "%" and the full process list will be included in the plugin output.
Note:  Use --nodatamode and/or NODATAEXIT settings to control what happens if no matching process is found.
           perf => _ItemCount||Process Count
_AvgCPU|%|Avg Utilisation CPU_{Name}
     predisplay => _DisplayMsg||~|~| - ||
_ItemCount| Instance(s)|Found |~|. || of "{_arg1}" running
          query => select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
       requires => 1.48
        samples => 2
           test => _AvgCPU
_ItemCount
-------------------------------------------------------------------
All Static Ini Variables: $VAR1 = {};
Query Extenstions: $VAR1 = [];
   Original Query:select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
        New Query:select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "{_arg1}"
Starting Keep State Mode
STATE FILE: /tmp/cwpss_checkproccpu_cpu_192168130252_explorer__.state
Checking previous data's expiry - Timestamp 1499768784 vs Expiry After 1513592346 (Keep State Expiry setting is 3600sec)
Data has expired - getting data again
Round #1 of 1
QUERY: /usr/bin/wmic '-U' 'USER%PASS' '--namespace' 'root/cimv2' '//192.168.130.252' 'select Name,IDProcess,PercentProcessorTime,Timestamp_Sys100NS from Win32_PerfRawData_PerfProc_Process WHERE Name like "explorer%"'
OUTPUT: [librpc/rpc/dcerpc_util.c:1290:dcerpc_pipe_auth_recv()] Failed to bind to uuid 4d9f4ab8-7d1c-11cf-861e-0020af6e7c57 - NT_STATUS_NET_WRITE_FAULT
[librpc/rpc/dcerpc_connect.c:790:dcerpc_pipe_connect_b_recv()] failed NT status (c0000022) in dcerpc_pipe_connect_b_recv
[wmi/wmic.c:196:main()] ERROR: Login to remote object.
NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied

Could not find the CLASS: line - an error occurred
WMI DATA:$VAR1 = undef;
UNKNOWN - The WMI query had problems. You might have your username/password wrong or the user's access level is too low. Wmic error text on the next line.
[librpc/rpc/dcerpc_util.c:1290:dcerpc_pipe_auth_recv()] Failed to bind to uuid 4d9f4ab8-7d1c-11cf-861e-0020af6e7c57 - NT_STATUS_NET_WRITE_FAULT
[librpc/rpc/dcerpc_connect.c:790:dcerpc_pipe_connect_b_recv()] failed NT status (c0000022) in dcerpc_pipe_connect_b_recv
[wmi/wmic.c:196:main()] ERROR: Login to remote object.
NTSTATUS: NT_STATUS_ACCESS_DENIED - Access denied


It would be appreciated if you could help.
You also asked for the version:

Code:

[root]@nagiosxi/usr/local/nagios/libexec] $ ./check_wmi_plus.pl --version
Version: 1.6


Re: Windows process Monitoring

Posted: Mon Dec 18, 2017 12:27 pm
by tgriep
When supplying the domain and username, you need to use a forward slash, not a backslash, for the user account. Try this to see whether the plugin can log in to your Windows server:

Code:

./check_wmi_plus.pl -H 192.168.130.252 -u xxxx/administrator -p xxxx -m checkproc -s cpu -a explorer% -d
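
A likely reason the backslash form fails on the command line is that the shell treats an unquoted backslash as an escape character and removes it before the plugin ever sees the username. The Python sketch below shows this with `shlex`, which follows POSIX-style splitting rules similar to the shell's. The helper `to_wmic_user` is a made-up convenience for converting to the forward-slash form and is not part of the plugin.

```python
import shlex


def to_wmic_user(user):
    """Rewrite DOMAIN\\user as DOMAIN/user, the form suggested above."""
    return user.replace("\\", "/")


# What a POSIX-style shell hands to the program for the unquoted argument.
typed = r"-u xxxx\administrator"
print(shlex.split(typed))

# The forward-slash form passes through unchanged.
print(shlex.split("-u " + to_wmic_user(r"xxxx\administrator")))
```

The first line shows the domain and user name fused into a single word with no separator, so the login is attempted with a name that does not exist. The second shows the value arriving intact.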