Monitoring windows process age

Posted: Fri Apr 01, 2016 11:32 am
by GreatWolfResorts
Is there a way I can monitor a Windows server for a particular process that has been running too long? We have session processes in a specific application that can sometimes become orphaned but keep putting load on the CPU and databases. Because of this, we have to routinely check for these session processes and kill them. I was hoping I could leverage Nagios to do this check, but the best I've come across is check_proc_age.sh, which appears to focus on the local Linux box. Any direction is greatly appreciated!

Dan

Re: Monitoring windows process age

Posted: Fri Apr 01, 2016 11:36 am
by hsmith
Is that a piece of information you could grab with a batch file or PowerShell script? If so, the answer is yes. If Windows does not provide the information, this would be much trickier.

Re: Monitoring windows process age

Posted: Sat Apr 02, 2016 10:40 pm
by GreatWolfResorts
For those who may be interested in a similar check, I've put together a PowerShell script and used NRPE to call it on the server hosting the processes in question. Here are the pieces:

PowerShell Script:

Code:

# Thresholds: a process started before these times is too old.
$wlevel = (Get-Date).AddDays(-2)
$clevel = (Get-Date).AddDays(-5)

try {
    # Fetch the matching processes and convert the WMI CreationDate to a DateTime.
    $a = @(Get-WmiObject Win32_Process -Filter "Name = 'SunSystemsSession.exe'" -ErrorAction Stop |
        Select-Object Name, ProcessId, @{Name="StartTime"; Expression={ $_.ConvertToDateTime($_.CreationDate) }})
} catch {
    Write-Host "CRITICAL - Problems retrieving processes!"
    exit 2
}

# Write the details to a file so the Nagios output stays short.
$a | Format-List | Out-File 'C:\Program Files\NSClient++\Scripts\Check_SunSystemsSession_output.txt'

# Collect the IDs of processes older than each threshold.
$crit = @($a | Where-Object { $_.StartTime -lt $clevel } | ForEach-Object { $_.ProcessId })
$warn = @($a | Where-Object { $_.StartTime -lt $wlevel } | ForEach-Object { $_.ProcessId })

if ($a.Count -eq 0) {
    Write-Host "OK - No sessions found"
    exit 0
} elseif ($crit.Count -gt 0) {
    Write-Host "CRITICAL - Process ID $($crit -join ', ') older than 5 days"
    exit 2
} elseif ($warn.Count -gt 0) {
    Write-Host "WARNING - Process ID $($warn -join ', ') older than 2 days"
    exit 1
} else {
    Write-Host "OK - All sessions are current"
    exit 0
}
Here I define the two thresholds (warning and critical) as today minus 2 days and today minus 5 days. Then I pull the process name, ID and start time using gwmi (Get-WmiObject) and convert the WMI creation date to a DateTime, so it can be compared directly against wlevel and clevel. In addition, I send the full results to a file named after the process. This keeps the output displayed in Nagios clean and shows only what is important.

I then use an "if" statement to check the start time of each process returned (mind you, there can be multiple). The critical level is checked first: if any process started before the critical level the result is CRITICAL, if any started before the warning level it is WARNING, and otherwise it is OK. The output is a single line for Nagios plus the matching exit code (0, 1 or 2). If processes are aging, the output string includes their process ID numbers to make identifying them on the server quick and easy.
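
If you would rather not hard-code the 2 and 5 day limits, the threshold check can be pulled into a small function. This is only a rough sketch: the function name Get-ProcessAgeState and its parameters are made up for illustration and are not part of the script above. It takes a list of start times and returns the Nagios exit code.

```powershell
# Hypothetical helper (not part of the original script): classify process
# start times against warning/critical ages given in days.
function Get-ProcessAgeState {
    param(
        [datetime[]]$StartTimes = @(),
        [int]$WarnDays = 2,
        [int]$CritDays = 5,
        [datetime]$Now = (Get-Date)
    )
    # Returns a Nagios exit code: 0 = OK, 1 = WARNING, 2 = CRITICAL.
    $state = 0
    foreach ($start in $StartTimes) {
        $ageDays = ($Now - $start).TotalDays
        if ($ageDays -ge $CritDays) { return 2 }      # one critical process is enough
        if ($ageDays -ge $WarnDays) { $state = 1 }    # remember the warning, keep looking
    }
    return $state
}

# Example with a fixed "now" so the result does not depend on the current time.
$now = Get-Date '2016-04-02 12:00'
Get-ProcessAgeState -StartTimes @($now.AddHours(-3), $now.AddDays(-3)) -Now $now
```

In the example one process is 3 hours old and the other 3 days old, so the call should come back as 1 (WARNING). Passing the current time in as a parameter is just a convenience that makes the logic easy to try out without waiting for a process to age.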

NSClient++ NSC.ini:

Code:

Under [External Scripts] append:
check_sunsystemssession_age=cmd /c echo scripts/check_sunsystemssession_age.ps1; exit $LastExitCode | powershell.exe -command -
Also make sure that the following are enabled by removing the comment or adding:

Code:

[modules]
NRPEListener.dll
NSClientListener.dll
CheckExternalScripts.dll
Use the check_nrpe command in Nagios:

Code:

$USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c check_sunsystemssession_age
Change the names as you see fit. I made the name unique to the process in question to allow for multiple process checks on a single server. I'm sure the code can be refined a bit, but this should do the trick for you.
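
For anyone wiring this into the Nagios object configuration, the check_nrpe line could be wrapped in a command and service definition roughly like the one below. Treat it as a sketch: the host name winserver01 is hypothetical, and generic-service is assumed to be a service template that already exists in your configuration.

```
define command {
    command_name    check_sunsystemssession_age
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c check_sunsystemssession_age
}

define service {
    use                     generic-service               ; assumed template name
    host_name               winserver01                   ; hypothetical host
    service_description     SunSystems Session Age
    check_command           check_sunsystemssession_age
}
```

Keeping one command per process name matches the naming idea above, so several process checks can sit side by side on the same server.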

Re: Monitoring windows process age

Posted: Mon Apr 04, 2016 9:50 am
by lmiltchev
Thanks for sharing, GreatWolfResorts!