Monitoring jenkins Slaves

Posted: Thu Mar 17, 2016 7:26 am
by neworderfac33
Good afternoon,

Does anyone here have any experience of monitoring Jenkins slaves using Nagios Core?

Some of my build servers run a Jenkins slave for more than one Jenkins master, and I want to monitor each slave individually rather than just monitoring java.exe.

Here's what I have so far (with certain key information changed), copying the command information from the Processes tab within Task Manager on my Jenkins slave PC:

Code:


define service{
       use                      generic-service
       host_name                MySlaveComputer
       service_description      Java Slave process on jenkins-build
       check_command            check_nt!PROCSTATE! -d SHOWALL -l 'c:\\program files\\java\\jre1.8.0_60\\bin\\java.exe -Xrs -jar "E:\\Jenkins-Slave\\slave.jar" -jnlpURL https://MyBuildServerURL/computer/MySlaveComputer/slave-agent.jnlp -noCertificateCheck -jnlpcredentials MyJenkinsAccount:MyJenkinsAccountPassword'
       }
With this, Nagios returns:

Code:

e: - total: 279.36 Gb - used: 18.24 Gb (7%) - free 261.12 Gb (93%) 
Java Slave process on jenkins-build
CRITICAL	03-17-2016 12:18:57	0d 0h 43m 17s	3/3	c:\program files\java\jre1.8.0_60\bin\java.exe -Xrs -jar "E:\Jenkins-Slave\slave.jar" -jnlpURL https://MyBuildServerURL/computer/MySlaveComputer/slave-agent.jnlp -noCertificateCheck -jnlpcredentials MyJenkinsAccount:MyJenkinsAccountPassword: not running 
I'm not sure if it's simply a matter of re-positioning some quotes and/or double quotes, or whether check_nt simply can't accept a command parameter of this complexity.

Thanks in advance for your assistance

Pete

Re: Monitoring jenkins Slaves

Posted: Thu Mar 17, 2016 4:38 pm
by rkennedy
Can you run it from the CLI and post the input and output for us? That will make it a bit easier to test.

Also, I did find a check_jenkins_slave plugin on our Exchange (https://exchange.nagios.org/directory/P ... es/details). Would this work for you?

Re: Monitoring jenkins Slaves

Posted: Fri Mar 18, 2016 11:15 am
by neworderfac33
Good afternoon and thanks for your reply.

I don't appear to be able to get check_nt to take notice of ANY parameters after the name of the executable, even from the command prompt:

Code:

/usr/local/nagios/libexec/check_nt -H MYSERVERID -p 12489 -v PROCSTATE -l java.exe
This returns "OK: all processes are running OK". I can add anything I want after it (I've been trying to add the value from the "Command Line" column of the "Processes" tab to identify a unique instance of java.exe) and Nagios still returns OK.

A useful exercise, but I think I'm going to have to knock this one on the head.

Have a good weekend, all!

Pete

Re: Monitoring jenkins Slaves

Posted: Fri Mar 18, 2016 2:41 pm
by jolson
neworderfac33 wrote: "I don't appear to be able to get check_nt to take notice of ANY parameters after the name of the executable, even from the command prompt"
Not a good sign. Which version of NSClient are you currently using? It's possible that a newer version might handle this kind of complexity better - I have only good things to say about the 0.4.x releases. https://www.nsclient.org/download/

Re: Monitoring jenkins Slaves

Posted: Tue Mar 22, 2016 5:18 am
by neworderfac33
Good morning,

It's 0.4.3.143.

The problem I have is that within Task Manager, there are two occurrences of Jenkins-Slave.exe, each slaved to a different Jenkins master.
Task Manager differentiates between the two only by what is shown in the "Command Line" column (which isn't displayed by default; it has to be added manually).

I don't think that the client is the issue; it's just the complexity of the syntax of these two strings that Nagios has trouble dealing with.

Cheers

Re: Monitoring jenkins Slaves

Posted: Tue Mar 22, 2016 7:09 am
by neworderfac33
So, here's what I'd LIKE to be able to run from the command prompt:

Code:

check_nt -H ThisComputer -p 12489 -v PROCSTATE -l java.exe "c:\program Files\java\jre1.8.0-73\java.exe -Xrs -jar E:\jenkins-Slave\slave.jar -jnlpURL https://MyJenkinsBuildServer//computer/ThisComputer/slave-agent.jnlp -noCertificateCheck -jnlpCredentials MyAccount:MyAccountPassword"
It just ignores everything after " -l java.exe "

Cheers

Pete

Re: Monitoring jenkins Slaves

Posted: Tue Mar 22, 2016 4:48 pm
by ssax
I've not been able to get it to work without a custom plugin and NRPE. Here's how I've done it before:

You can create a PowerShell script that searches the processes for the passed-in batch or Java application file name (not path).

In your nsclient.ini, add something like this under [/settings/external scripts/scripts]:

Code:

check_pstate = cmd /c echo scripts\check_pstate.ps1 "$ARG1$"; exit($lastexitcode) | powershell.exe -command -
Create the PowerShell script in your NSClient++\scripts directory.

Your PowerShell script would need to check for the command line details like you've been trying. You could use something like this:

Code:

Get-WmiObject Win32_Process -Filter "CommandLine like '%your unique commandline details%'"
If it finds a match, exit 0 (OK); if it doesn't, exit 2 (CRITICAL).
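
To make that concrete, here is a rough sketch of what scripts\check_pstate.ps1 might look like. It is untested against your setup and rests on a few assumptions of mine: the parameter names ($Pattern, $ProcessName) are made up for illustration, the slaves are assumed to run as java.exe as in the earlier posts, and I compare the command line in PowerShell rather than in the WMI filter so that backslashes and other special characters in the pattern need no escaping.

```powershell
# check_pstate.ps1 (sketch): report whether a process with a given
# command-line fragment is running, using Nagios plugin exit codes.
param(
    # Fragment that identifies one slave, e.g. part of its -jnlpURL value
    [string]$Pattern,

    # Assumption: the slaves run as java.exe, as in the posts above
    [string]$ProcessName = 'java.exe'
)

if (-not $Pattern) {
    Write-Output "UNKNOWN: no command-line pattern supplied"
    exit 3   # Nagios UNKNOWN
}

# Restricting by process name keeps the check from matching the cmd.exe
# wrapper, whose own command line may also contain the pattern.
$candidates = @(Get-WmiObject Win32_Process -Filter "Name = '$ProcessName'")

# Plain, case-insensitive substring comparison. CommandLine can be empty
# for processes the account running NSClient++ cannot inspect.
$found = @($candidates | Where-Object {
    $_.CommandLine -and
    $_.CommandLine.IndexOf($Pattern, [System.StringComparison]::OrdinalIgnoreCase) -ge 0
})

if ($found.Count -gt 0) {
    Write-Output "OK: $($found.Count) $ProcessName process(es) matching '$Pattern'"
    exit 0   # Nagios OK
}

Write-Output "CRITICAL: no $ProcessName process matching '$Pattern'"
exit 2       # Nagios CRITICAL
```

With two slaves on one machine, you would define one service per master and pass a fragment that only appears in that slave's command line, such as the master's host name from the -jnlpURL value. Argument passing over NRPE usually has to be enabled on the NSClient++ side before "$ARG1$" is filled in.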

You can read more on plugin development here:

http://nagios.sourceforge.net/docs/3_0/pluginapi.html


Let me know if you have any questions.