Monitoring jenkins Slaves

neworderfac33
Posts: 329
Joined: Fri Jul 24, 2015 11:04 am

Monitoring jenkins Slaves

Post by neworderfac33 »

Good afternoon,

Does anyone here have any experience of monitoring Jenkins slaves using Nagios Core?

Some of my build servers can have a Jenkins slave connected to more than one Jenkins master, and I want to be able to monitor each slave individually, rather than just monitoring java.exe.

Here's what I have so far (with certain key information changed), copying the command information from the Processes tab within Task Manager on my Jenkins slave PC:

Code: Select all


define service{
       use                      generic-service
       host_name                MySlaveComputer
       service_description      Java Slave process on jenkins-build
       check_command            check_nt!PROCSTATE! -d SHOWALL -l 'c:\\program files\\java\\jre1.8.0_60\\bin\\java.exe -Xrs -jar "E:\\Jenkins-Slave\\slave.jar" -jnlpURL https://MyBuildServerURL/computer/MySlaveComputer/slave-agent.jnlp -noCertificateCheck -jnlpcredentials MyJenkinsAccount:MyJenkinsAccountPassword'
       }
With this, Nagios returns:

Code: Select all

e: - total: 279.36 Gb - used: 18.24 Gb (7%) - free 261.12 Gb (93%) 
Java Slave process on jenkins-build
CRITICAL	03-17-2016 12:18:57	0d 0h 43m 17s	3/3	c:\program files\java\jre1.8.0_60\bin\java.exe -Xrs -jar "E:\Jenkins-Slave\slave.jar" -jnlpURL https://MyBuildServerURL/computer/MySlaveComputer/slave-agent.jnlp -noCertificateCheck -jnlpcredentials MyJenkinsAccount:MyJenkinsAccountPassword: not running 
I'm not sure if it's simply a matter of re-positioning some quotes and/or double quotes, or whether check_nt simply can't accept a command parameter of this complexity.

Thanks in advance for your assistance

Pete
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Monitoring jenkins Slaves

Post by rkennedy »

Can you run it over the CLI, and post the input/output for us? This will make it a bit easier to test with.

Also - I did find a check_jenkins_slave plugin on our exchange (https://exchange.nagios.org/directory/P ... es/details), would this work for you?
neworderfac33
Posts: 329
Joined: Fri Jul 24, 2015 11:04 am

Re: Monitoring jenkins Slaves

Post by neworderfac33 »

Good afternoon and thanks for your reply.

I don't appear to be able to get check_nt to take notice of ANY parameters after the name of the executable, even from the command prompt:

Code: Select all

/usr/local/nagios/libexec/check_nt -H MYSERVERID -p 12489 -v PROCSTATE -l java.exe
returns "OK: all processes are running OK". I can add anything I want after it (I've been trying to append the value from the "Command Line" column of the "Processes" tab to identify a unique instance of java.exe) and Nagios still returns OK.

A useful exercise, but I think I'm going to have to knock this one on the head.

Have a good weekend, all!

Pete
jolson
Attack Rabbit
Posts: 2560
Joined: Thu Feb 12, 2015 12:40 pm

Re: Monitoring jenkins Slaves

Post by jolson »

"I don't appear to be able to get check_nt to take notice of ANY parameters after the name of the executable, even from the command prompt:"
Not a good sign. Which version of NSClient are you currently using? It's possible that a newer version might handle this kind of complexity better - I have only good things to say about the 0.4.x releases. https://www.nsclient.org/download/
neworderfac33
Posts: 329
Joined: Fri Jul 24, 2015 11:04 am

Re: Monitoring jenkins Slaves

Post by neworderfac33 »

Good morning,

It's 0.4.3.143.

The problem I have is that within Task Manager, there are two occurrences of Jenkins-Slave.exe, each slaved to a different Jenkins master.
Task Manager differentiates between the two via what is shown in the "Command Line" column (which isn't displayed by default, it had to be added manually).

I don't think that the client is the issue; it's just the complexity of the syntax of these two strings that Nagios has trouble dealing with.

Cheers
neworderfac33
Posts: 329
Joined: Fri Jul 24, 2015 11:04 am

Re: Monitoring jenkins Slaves

Post by neworderfac33 »

So, here's what I'd LIKE to be able to run from the command prompt:

Code: Select all

check_nt -H ThisComputer -p 12489 -v PROCSTATE -l java.exe "c:\program Files\java\jre1.8.0-73\java.exe -Xrs -jar E:\jenkins-Slave\slave.jar -jnlpURL https://MyJenkinsBuildServer//computer/ThisComputer/slave-agent.jnlp -noCertificateCheck -jnlpCredentials MyAccount:MyAccountPassword"
It just ignores everything after " -l java.exe "

Cheers

Pete
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Monitoring jenkins Slaves

Post by ssax »

I've not been able to get it to work without a custom plugin and NRPE. Here's how I've done it before:

You can create a PowerShell script that searches the processes for the passed-in batch or Java application file name (not path).

In your nsclient.ini add something like this under [/settings/external scripts/scripts]

Code: Select all

check_pstate = cmd /c echo scripts\check_pstate.ps1 "$ARG1$"; exit($lastexitcode) | powershell.exe -command -
Create the PowerShell script in your NSClient++\scripts directory.

Your PowerShell script would need to check for the command line details like you've been trying. You could use something like this:

Code: Select all

Get-WmiObject Win32_Process -Filter "CommandLine like '%your unique commandline details%'"
If it finds a matching process, exit 0 (OK); if it doesn't, exit 2 (CRITICAL).

You can read more on plugin development here:

http://nagios.sourceforge.net/docs/3_0/pluginapi.html
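
For what it's worth, the match-then-exit logic described above can be sketched independently of PowerShell. The following Python is only an illustration, not the actual check_pstate.ps1: `check_process` and the sample command lines (including the master-a/master-b URLs) are hypothetical, and the case-insensitive substring match is an assumption about how you'd want the comparison to behave. The exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
# Nagios plugin exit codes (standard convention).
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_process(command_lines, needle):
    """Return (exit_code, message) for a Nagios-style process check.

    command_lines: iterable of process command-line strings (entries may be
                   None, since some processes expose no command line).
    needle:        fragment that uniquely identifies the wanted process.
    """
    if not needle:
        return UNKNOWN, "UNKNOWN: no command-line fragment given"
    wanted = needle.lower()
    # Assumption: a case-insensitive substring match is good enough here.
    matches = [c for c in command_lines if c and wanted in c.lower()]
    if matches:
        return OK, "OK: %d process(es) matching '%s'" % (len(matches), needle)
    return CRITICAL, "CRITICAL: no process matching '%s'" % needle


if __name__ == "__main__":
    # Hypothetical sample data: two agents on one PC, one per master.
    sample = [
        "java.exe -Xrs -jar slave.jar -jnlpURL https://master-a.example/computer/MySlaveComputer/slave-agent.jnlp",
        "java.exe -Xrs -jar slave.jar -jnlpURL https://master-b.example/computer/MySlaveComputer/slave-agent.jnlp",
    ]
    code, message = check_process(sample, "master-a.example")
    print(code, message)
```

A real plugin would print the message and then exit with the returned code (for example via `sys.exit(code)`). Matching on the part of the command line that differs between the two agents, here the master's URL, is what lets each one be monitored as its own service.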


Let me know if you have any questions.