NMHeapDump.hprof

Posted: Mon Jun 11, 2018 1:48 am
by nms_system_support
Hello,

We used to check the Heap Dump process with check_nrpe!check_services!-a '/NMHeapDump.hprof'!!!!!!

But the process has now been updated.

The new process they sent us is the one below:

root 9491 1 13 12:28 ? 00:09:40 /opt/adva/share/jre/bin/java -d64 -server -Xmx3000M -cp lib/mediation.jar -XX:HeapDumpPath=NMHeapDumps -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.xml.bind.v2.runtime.JAXBContextImpl.fastBoot=true -Djava.awt.headless=true -Dfile.encoding=UTF-8 -javaagent:lib/aspectjweaver.jar -Dorg.apache.activemq.SERIALIZABLE_PACKAGES=* -Djava.util.logging.config.file=./logging.properties com.adva.nlms.mediation.Launcher


How can we check the new process from Nagios?

Thank you

Re: NMHeapDump.hprof

Posted: Mon Jun 11, 2018 6:57 am
by mcapra
I'd suggest wrapping that in a script to interpret the output following the Nagios plugin development guidelines. Install an agent such as NRPE or NCPA on the remote machine and execute the script via check_nrpe or check_ncpa.

If you're not sure where to start, please provide the full output of the following command executed from the remote machine:

Code: Select all

/opt/adva/share/jre/bin/java -d64 -server -Xmx3000M -cp lib/mediation.jar -XX:HeapDumpPath=NMHeapDumps -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.xml.bind.v2.runtime.JAXBContextImpl.fastBoot=true -Djava.awt.headless=true -Dfile.encoding=UTF-8 -javaagent:lib/aspectjweaver.jar -Dorg.apache.activemq.SERIALIZABLE_PACKAGES=* -Djava.util.logging.config.file=./logging.properties com.adva.nlms.mediation.Launcher && echo $?
Assuming all of the dump/hprof files land in the same general area on the system, you could also rig up a file system check with the folder watch wizard instead of monitoring the process/JVM itself.
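As a rough illustration of the wrapper-script idea, here is a minimal sketch of a plugin that watches a directory for heap dump files. It assumes the .hprof files land directly in one directory and that any dump at all should be CRITICAL; the function name check_heap_dumps and the output wording are hypothetical, and the directory has to be supplied by you, since -XX:HeapDumpPath=NMHeapDumps is relative to wherever the JVM was started.

```shell
#!/bin/sh
# Hypothetical plugin sketch: report CRITICAL when any *.hprof file exists
# in the given directory. Return codes follow the Nagios plugin convention:
# 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
check_heap_dumps() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "HEAPDUMP UNKNOWN: $dir is not a directory"
        return 3
    fi
    # Count heap dump files directly inside the directory (no recursion).
    count=$(find "$dir" -maxdepth 1 -type f -name '*.hprof' | wc -l | tr -d ' ')
    if [ "$count" -gt 0 ]; then
        echo "HEAPDUMP CRITICAL: $count heap dump file(s) in $dir | dumps=$count"
        return 2
    fi
    echo "HEAPDUMP OK: no heap dump files in $dir | dumps=0"
    return 0
}
```

To run it as a plugin through NRPE or NCPA, the script would end with something like check_heap_dumps "$1"; exit $? so that the return code becomes the plugin's exit status.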

Re: NMHeapDump.hprof

Posted: Mon Jun 11, 2018 11:13 am
by cdienger
Thanks for the assist, @mcapra!

Re: NMHeapDump.hprof

Posted: Tue Jun 12, 2018 4:48 am
by nms_system_support
Hello,

The administrator of the system is not comfortable running the following command:

/opt/adva/share/jre/bin/java -d64 -server -Xmx3000M -cp lib/mediation.jar -XX:HeapDumpPath=NMHeapDumps -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.xml.bind.v2.runtime.JAXBContextImpl.fastBoot=true -Djava.awt.headless=true -Dfile.encoding=UTF-8 -javaagent:lib/aspectjweaver.jar -Dorg.apache.activemq.SERIALIZABLE_PACKAGES=* -Djava.util.logging.config.file=./logging.properties com.adva.nlms.mediation.Launcher && echo $?

Do you know exactly what this command does?

Thank you

Re: NMHeapDump.hprof

Posted: Tue Jun 12, 2018 6:47 am
by mcapra
Oh hey, I didn't realize that was an actual process you were trying to monitor. I totally misunderstood the original post. My mistake. Disregard that command.

If you could share the remote machine's nrpe.cfg file, that might be helpful.

Without knowing more about your current NRPE configuration, check_procs is what I would use. You could do a simple process argument match on -cp lib/mediation.jar:

Code: Select all

[root@nagiosxi ~]# /usr/local/nagios/libexec/check_procs -c 1: -a "-cp lib/mediation.jar"
PROCS OK: 1 process with args '-cp lib/mediation.jar' | procs=1;;1:;0;
The -c 1: threshold causes a CRITICAL state if fewer than one process contains the argument string given with -a.

Code: Select all

[root@nagiosxi ~]# /usr/local/nagios/libexec/check_procs -c 1: -a "-cp lib/mediation.jar"
PROCS OK: 1 process with args '-cp lib/mediation.jar' | procs=1;;1:;0;

## then we kill the process

[root@nagiosxi ~]# /usr/local/nagios/libexec/check_procs -c 1: -a "-cp lib/mediation.jar"
PROCS CRITICAL: 0 processes with args '-cp lib/mediation.jar' | procs=0;;1:;0;
If you installed NRPE via the official documentation, you may need to add a separate command definition in the NRPE configuration file to allow check_nrpe to provide the all-important -a argument. Here's what I've used:

Code: Select all

command[check_procs_single]=/usr/local/nagios/libexec/check_procs -c 1: -a "$ARG1$"
This allows me to pass a single argument to the check_procs plugin through the newly created check_procs_single command. In action:

Code: Select all

[root@nagiosxi ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.206.136 -c check_procs_single -a "-cp lib/mediation.jar"
PROCS OK: 1 process with args '-cp lib/mediation.jar' | procs=1;;1:;0;

## then we kill the process

[root@nagiosxi ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.206.136 -c check_procs_single -a "-cp lib/mediation.jar"
PROCS CRITICAL: 0 processes with args '-cp lib/mediation.jar' | procs=0;;1:;0;
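
One thing worth verifying if the $ARG1$ definition does not seem to take effect: NRPE only accepts arguments from check_nrpe when it was built with command-argument support and dont_blame_nrpe=1 is set in nrpe.cfg on the remote machine. The sketch below checks only the config setting; the function name nrpe_args_enabled is hypothetical and the config path is passed in because its location depends on how NRPE was installed.

```shell
# Sketch: succeed only if the given nrpe.cfg enables command arguments.
# This does not verify that NRPE was compiled with argument support.
nrpe_args_enabled() {
    cfg="$1"
    # Match an uncommented "dont_blame_nrpe=1" line, tolerating spaces.
    grep -Eq '^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$cfg"
}
```

For example, on a source install the call might be nrpe_args_enabled /usr/local/nagios/etc/nrpe.cfg && echo "arguments enabled" (that path is an assumption). NRPE needs a restart after nrpe.cfg changes.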

Re: NMHeapDump.hprof

Posted: Tue Jun 12, 2018 6:52 am
by mcapra
Oh hey, I didn't realize that was an actual process you were trying to monitor. I totally misunderstood the original post. My mistake. Disregard that command.

You could try making this simple change in the Core Config Manager:

Code: Select all

check_nrpe!check_services!-a 'lib/mediation.jar'!!!!!!
And if that fails, use the check_procs approach described in my previous post.

Re: NMHeapDump.hprof

Posted: Tue Jun 12, 2018 7:18 am
by nms_system_support
mcapra, thank you for the reply.

So the process that I have to check is lib/mediation.jar?

The output from the GUI is:

[nagios@nagios ~]# /usr/local/nagios/libexec/check_nrpe -H x.x.x.x -t 30 -c check_procs -a 'lib/mediation.jar'
PROCS WARNING: 181 processes | procs=181;lib/mediation.jar;;0;

Re: NMHeapDump.hprof

Posted: Tue Jun 12, 2018 8:27 am
by mcapra
nms_system_support wrote: So the process that I have to check is lib/mediation.jar?
I don't know what you have to check for because I don't know anything about this process. All I have provided is a check_procs setup that will match lib/mediation.jar, because that seemed like a convenient term to match on in the process you provided.

If it is not a convenient term to match (because there are ~200 matching processes and you really only care about one of them), you might try expanding on the -a portion of this command, perhaps by including all of the arguments from the java command you referenced.
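
To get a feel for how selective a candidate string is before putting it into -a, a small sketch like the following can help. It only approximates check_procs, which looks for processes whose arguments contain the given string; the function name count_matching_args is hypothetical, and it reads the process list from standard input so it can be fed from ps on the remote machine.

```shell
# Sketch: count lines on stdin that contain the given string literally.
# -F disables regex interpretation; "--" protects patterns starting with "-".
count_matching_args() {
    grep -F -c -- "$1" || true   # grep exits 1 on zero matches; still prints 0
}
```

Usage on the remote machine could look like ps -eo args= | count_matching_args "com.adva.nlms.mediation.Launcher". The count may be one too high, because the pipeline's own grep can show up in the ps output. A string that yields exactly one match is a reasonable candidate for -a.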

Re: NMHeapDump.hprof

Posted: Tue Jun 12, 2018 2:29 pm
by cdienger
Thanks again for the assist, @mcapra!

Re: NMHeapDump.hprof

Posted: Mon Jun 18, 2018 7:21 am
by nms_system_support
Hello again,

I need to understand something.

What we had until now is the following:

[nagios@nagios ~]# /usr/local/nagios/libexec/check_nrpe -H x.x.x.x -t 30 -c check_services -a '/NMHeapDump.hprof'
*** /NMHeapDump.hprof: Nok ***

Could you please tell me how check_services works?

Is it something like check_services -a name_of_service? Is check_services a plugin? (I cannot find it in /usr/local/nagios/libexec.)


Thank you all for your help
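
For context on the question above: the name passed to check_nrpe with -c is looked up in the NRPE configuration on the remote machine, as a command[...] line of the same form as the check_procs_single definition earlier in this thread. That is why it does not need to exist in libexec on the Nagios server. The sketch below prints what a given name maps to; the function name nrpe_command_line is hypothetical, the config file has to be the remote machine's nrpe.cfg, and it does not follow any include or include_dir directives that config may use.

```shell
# Sketch: print the command line that an nrpe.cfg maps to a command name.
nrpe_command_line() {
    name="$1"
    cfg="$2"
    # Definitions look like: command[NAME]=/path/to/plugin args...
    # -F treats the brackets literally; commented-out lines are skipped;
    # cut drops everything up to and including the first "=".
    grep -F "command[$name]=" "$cfg" | grep -v '^[[:space:]]*#' | cut -d= -f2-
}
```

Running nrpe_command_line check_services /path/to/nrpe.cfg on the remote machine (the path is a placeholder) would show which script check_services actually runs and how $ARG1$ is passed to it.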