Monitoring JMX

Postby jankogaga » Wed May 09, 2018 8:21 am

Hi,

I am running Nagios Core 4.2.0 on a KVM virtual machine with CentOS 7.
I followed this guide:
https://assets.nagios.com/downloads/nag ... ios-XI.pdf

The check_jmx plugin has been installed on the remote server running JMX.
nrpe.cfg on the JMX server has been configured to support JMX checks by adding:
command[check_jmx]=/usr/lib64/nagios/check_jmx $ARG1$

On the Nagios Core server, I have created a .cfg file for the remote server running JMX.
Please find the attached jmx.cfg file.

After implementing all of the above and restarting the appropriate services,
I can see in the Nagios Core GUI that the
Heap memory usage service is being checked, but it shows the error:
(Return code of 255 is out of bounds).
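
For context, Nagios Core accepts only plugin exit codes 0 through 3 and reports anything else as out of bounds. A minimal Python sketch of that mapping follows; the function name `describe_return_code` is hypothetical and not part of Nagios.

```python
# Standard Nagios plugin exit codes and the states they map to.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def describe_return_code(code: int) -> str:
    """Return the state name for a plugin exit code (hypothetical helper)."""
    if code in STATES:
        return STATES[code]
    # Anything outside 0-3 is not a valid plugin state.
    return f"(Return code of {code} is out of bounds)"


if __name__ == "__main__":
    for code in (0, 2, 255):
        print(code, describe_return_code(code))
```

An exit code of 255 therefore usually means the check command itself failed to run as intended, rather than that the monitored value crossed a threshold.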

Thanks,
Dragan
Attachments
jmx.cfg
(1.21 KiB) Downloaded 14 times
jankogaga
 
Posts: 28
Joined: Thu Apr 19, 2018 8:16 am

Re: Monitoring JMX

Postby mcapra » Wed May 09, 2018 8:30 am

Your service definition's check_command directive has a syntax error in it:
Code: Select all
define service {
        use                             generic-service
        host_name                       test-difin.abz-testing.de
        service_description             Heap memory usage
        check_command                   check_nrpe!check_jmx!-a '-U service:jmx:rmi:///jndi/rmi://127.0.0.1:1099/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used I HeapMemoryUsage -J used -vvvv -w 4248302272 -c 5498760192'
        notifications_enabled           1
}


There should be a dash in -I HeapMemoryUsage. Try changing that and see if it makes a difference.

If that doesn't help, can you share the output of the following commands executed from the CLI of the remote machine (10.30.30.33):

Code: Select all
/usr/lib64/nagios/check_jmx -U service:jmx:rmi:///jndi/rmi://127.0.0.1:1099/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 4248302272 -c 5498760192
ls -al /usr/lib64/nagios/
echo $JAVA_HOME


Also, if we could see your command definition for the check_nrpe command, that may be useful.
Former Nagios employee
http://www.mcapra.com/
mcapra
 
Posts: 3239
Joined: Thu May 05, 2016 3:54 pm

Re: Monitoring JMX

Postby jankogaga » Wed May 09, 2018 9:54 am

Inserting the dash doesn't make a difference.
The requested output is:

Code: Select all
[root@test-difin /]# /usr/lib64/nagios/check_jmx -U service:jmx:rmi:///jndi/rmi://127.0.0.1:1099/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 4248302272 -c 5498760192
JMX OK - HeapMemoryUsage.used=226035976 | HeapMemoryUsage.used=226035976,committed=2263351296;init=1054867456;max=14974713856;used=226035976
[root@test-difin /]# ls -al /usr/lib64/nagios/
total 24
drwxr-xr-x  2 root   root   4096 Feb 20 16:15 .
dr-xr-xr-x 66 root   root   4096 Feb 21 09:20 ..
-rwxr-xr-x  1 nagios nagios  140 Jan 16 15:10 check_jmx
-rwxr-xr-x  1 nagios nagios 9625 Jan 16 15:10 jmxquery.jar
[root@test-difin /]# echo $JAVA_HOME

[root@test-difin /]#

I can see that /usr/local/nagios/libexec/check_nrpe on the Nagios Core server is a binary file.
It is attached here; please just remove the .txt extension.

Thanks,
Dragan
Attachments
check_nrpe.txt
(79.61 KiB) Downloaded 13 times

Re: Monitoring JMX

Postby tgriep » Thu May 10, 2018 4:11 pm

Here are a couple of things to check.

In the nrpe.cfg config file on the remote system, this option has to be set to 1 to allow NRPE to accept arguments. Make sure it is set.
Code: Select all
dont_blame_nrpe=1


Change it if needed and restart NRPE to see if that fixes the issue.

Also, make sure the command_line of the check_nrpe command definition is set to the following:
Code: Select all
$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$ $ARG3$


To test whether the NRPE agent can be accessed by the Nagios server, run the following on the Nagios server. It should display the NRPE agent's version.
Code: Select all
/usr/local/nagios/libexec/check_nrpe -H 10.30.30.33


Let us know what you find.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
tgriep
Madmin
 
Posts: 6683
Joined: Thu Oct 30, 2014 9:02 am

Re: Monitoring JMX

Postby jankogaga » Fri May 11, 2018 4:33 am

I have changed dont_blame_nrpe to 1 and restarted nrpe, but nothing changed.

I have found in /usr/local/nagios/etc/objects/commands.cfg

Code: Select all
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$


but I would rather not change it, since the other Nagios checks work fine with it.

The output of the command:

Code: Select all
[root@monitor ~]# /usr/local/nagios/libexec/check_nrpe -H 10.30.30.33
connect to address 10.30.30.33 port 5666: Connection refused


Port 5666 is enabled on the host where the JMX server is located (and no additional firewall is turned on there).
I have found within nrpe.cfg

Code: Select all
server_port=5666
allowed_hosts=127.0.0.1,192.168.0.0/16,172.17.0.0/16,10.0.0.0/8


Since the IP address of the Nagios server (the VPN one) is 10.9.0.66, my understanding is that
the Nagios server is allowed to access the JMX server.
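
That reasoning about allowed_hosts can be checked with a short Python sketch. It assumes every entry is a plain IP address or CIDR range (hostnames are not handled), and `is_allowed` is a hypothetical helper, not part of NRPE.

```python
import ipaddress

# allowed_hosts value as posted from nrpe.cfg.
ALLOWED = "127.0.0.1,192.168.0.0/16,172.17.0.0/16,10.0.0.0/8"


def is_allowed(client_ip: str, allowed_hosts: str) -> bool:
    """Return True if client_ip falls inside any comma-separated entry."""
    client = ipaddress.ip_address(client_ip)
    for entry in allowed_hosts.split(","):
        # A bare address such as 127.0.0.1 is treated as a /32 network.
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if client in network:
            return True
    return False


if __name__ == "__main__":
    print(is_allowed("10.9.0.66", ALLOWED))
```

Note that allowed_hosts only comes into play once a daemon accepts the TCP connection. A "Connection refused" error happens earlier, at the TCP level, so it points to nothing listening on the port rather than to this setting.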

Thanks,
Dragan

Re: Monitoring JMX

Postby mcapra » Fri May 11, 2018 8:14 am

jankogaga wrote:I have found in /usr/local/nagios/etc/objects/commands.cfg

Code: Select all
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$


Your check_nrpe command definition only accepts 1 argument $ARG1$ and in your service's check_command directive you are attempting to pass it 2 arguments:

Code: Select all
check_command                   check_nrpe!check_jmx!-a '-U service:jmx:rmi:///jndi/rmi://127.0.0.1:1099/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 4248302272 -c 5498760192'


Essentially, this means everything following check_jmx in your check_command directive isn't making it to check_nrpe. You'll need to create a separate command definition that accepts multiple arguments. Or, for a slightly less clean solution, change your service's check_command directive to only pass in a single argument like so:
Code: Select all
check_command                   check_nrpe!check_jmx -a '-U service:jmx:rmi:///jndi/rmi://127.0.0.1:1099/jmxrmi -O java.lang:type=Memory -A HeapMemoryUsage -K used -I HeapMemoryUsage -J used -vvvv -w 4248302272 -c 5498760192'
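
The effect of the missing $ARG2$ can be illustrated with a simplified Python model of macro substitution. This is only a sketch: real Nagios handles many more macros and escaping rules, the check_command is shortened, and `expand_command` is a hypothetical name.

```python
import re

ONE_ARG = "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$"
THREE_ARGS = "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$ $ARG3$"
# Shortened version of the check_command from the service definition.
CHECK = "check_nrpe!check_jmx!-a '-O java.lang:type=Memory -A HeapMemoryUsage -K used'"


def expand_command(command_line: str, check_command: str) -> str:
    """Substitute $ARGn$ macros in command_line (simplified model)."""
    # The first '!'-separated field is the command name; the rest are
    # $ARG1$, $ARG2$, and so on.
    _name, *args = check_command.split("!")

    def substitute(match: re.Match) -> str:
        index = int(match.group(1)) - 1
        # A macro with no matching argument expands to an empty string.
        return args[index] if 0 <= index < len(args) else ""

    return re.sub(r"\$ARG(\d+)\$", substitute, command_line).rstrip()


if __name__ == "__main__":
    print(expand_command(ONE_ARG, CHECK))
    print(expand_command(THREE_ARGS, CHECK))
```

With the one-argument definition, the expanded command ends at `-c check_jmx`, so the `-a '...'` part never reaches check_nrpe. With the three-argument definition it is passed through.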


jankogaga wrote:Port 5666 is enabled on the host where the JMX server is located (and no additional firewall is turned on there).

Can you share the output of the following command executed from the CLI of your Nagios Core machine:
Code: Select all
nmap -sS -O -p5666 10.30.30.33


You may need to install the nmap or net-tools package on your Nagios Core machine if the command cannot be found.

Re: Monitoring JMX

Postby tgriep » Fri May 11, 2018 11:52 am

Thanks mcapra for the help.

If you are going to use arguments in your service check, then you will have to change the check_nrpe command as follows.
Code: Select all
$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$ $ARG3$

If not, the Heap memory usage check will not work.
Making this change should not affect your other service checks, as they only use $ARG1$.

Can you run this command on the remote system as root and post it here?
Code: Select all
netstat -anp |grep 5666


If it shows that it is run by xinetd, then you will need to add the IP address of the Nagios server to this file.
Code: Select all
/etc/xinetd.d/nrpe

When the agent is started by xinetd, it does not use allowed_hosts from the nrpe.cfg file.
This is the option you have to edit to add the addresses. They need to be separated by a space, not a comma.
Code: Select all
only_from       = 127.0.0.1 192.168.112.130
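
The difference between the two list formats can be sketched in Python. This is an illustration only: `to_only_from` is a hypothetical helper, and the example uses plain IP addresses taken from this thread.

```python
def to_only_from(allowed_hosts: str) -> str:
    """Turn a comma-separated allowed_hosts value into a space-separated list."""
    hosts = [h.strip() for h in allowed_hosts.split(",") if h.strip()]
    return " ".join(hosts)


if __name__ == "__main__":
    # 10.9.0.66 is the Nagios server address mentioned earlier in the thread.
    print("only_from       = " + to_only_from("127.0.0.1,10.9.0.66"))
```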

Re: Monitoring JMX

Postby jankogaga » Mon May 14, 2018 3:07 am

Thank you both for great support.
I have changed the check_nrpe command definition to:
Code: Select all
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ $ARG2$ $ARG3$

and have restarted Nagios service.

The heap memory usage error is still showing.

Here are the outputs:
On Nagios Core
Code: Select all
[root@monitor ~]# nmap -sS -O -p5666 10.30.30.33

Starting Nmap 6.40 ( http://nmap.org ) at 2018-05-14 09:34 CEST
Nmap scan report for 10.30.30.33
Host is up (0.00035s latency).
PORT     STATE  SERVICE
5666/tcp closed nrpe
Too many fingerprints match this host to give specific OS details

OS detection performed. Please report any incorrect results at http://nmap.org/submit/
Nmap done: 1 IP address (1 host up) scanned in 3.49 seconds

On Remote JMX server:
Code: Select all
[root@test-difin /]# netstat -anp |grep 5666

This doesn't return anything.

Re: Monitoring JMX

Postby mcapra » Mon May 14, 2018 7:55 am

Simply put, based on this nmap output:

Code: Select all
[root@monitor ~]# nmap -sS -O -p5666 10.30.30.33

Starting Nmap 6.40 ( http://nmap.org ) at 2018-05-14 09:34 CEST
Nmap scan report for 10.30.30.33
Host is up (0.00035s latency).
PORT     STATE  SERVICE
5666/tcp closed nrpe


The route from the `monitor` machine to `10.30.30.33` on port 5666 is closed. I would suggest double checking your network configuration for this setup, both on the machines themselves and on any equipment between them.

It would also appear based on this output:
Code: Select all
[root@test-difin /]# netstat -anp |grep 5666


That there is no process on the `test-difin` machine that is listening on port 5666. Are you certain either the NRPE or xinetd daemon is running?
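
A plain TCP connection attempt separates these cases in a similar way to the nmap states. The Python sketch below is a rough stand-in, not a replacement for nmap; `probe_port` and its return labels are hypothetical.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port by attempting a connection (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # something accepted the connection
    except ConnectionRefusedError:
        return "closed"  # host answered, but nothing is listening
    except socket.timeout:
        return "filtered"  # no answer, possibly dropped by a firewall
    except OSError:
        return "unreachable"  # routing or name resolution problem


# Example call from the Nagios server (not executed here):
# probe_port("10.30.30.33", 5666)
```

A "closed" result matches the empty netstat output: the host is reachable, but no NRPE or xinetd process is bound to port 5666.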

Re: Monitoring JMX

Postby tgriep » Mon May 14, 2018 11:57 am

Thanks @mcapra for the help.

Since the output of the netstat command does not show the NRPE agent listening, the check cannot be run from the Nagios server.
Try reinstalling the NRPE agent on that remote server and see if that helps.
The link below has the instructions for compiling NRPE from source.
https://support.nagios.com/kb/article.php?id=515
