Re: NRPE sudo on RHEL 8

Posted: Thu Jul 16, 2020 9:40 am
by drakedts
I have copied JvmInspector.jar into /usr/local/bin. I'm not sure how that can matter, since the check_jvm script has the full path hardcoded, but it's no problem to have an extra copy while testing.

Here is my /usr/lib64/nagios/plugins/check_jvm. As discussed earlier in the thread, it is the same as the one downloaded from the source site, except that I set the path to JvmInspector.jar to /usr/local/libexec rather than /usr/local/bin. I have tried copying JvmInspector.jar into /usr/local/bin and pointing this script at that copy, but it makes no difference: the check works when I log in as the nrpe user and run the check command directly, but not when nrpe runs as a daemon and tries to run the exact same command. Also as indicated earlier, this version has known bugs (namely, it doesn't report OK/critical values at the appropriate times). But for consistency while tracking down why the nrpe daemon cannot sudo, this is the one I'm using for now. Once the sudo issue is solved, I'll try my fixed version again.

Code: Select all

#!/bin/bash

#  This script is Nagios plugin, part of JvmInspector tool
#  Version 2014101401 (YYYYMMDDxx)
#
#  Author: Dimitar Fidanov <[email protected]>
#
#  The latest version can be found at:
#  https://fidanov.net/c0d3/nagios-plugins/jvminspector/
#
#  See README for more details
#
#  This program is free software: you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation, either version 3 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#  along with this program.  If not, see <http://www.gnu.org/licenses/>.

##############################
### PATH TO JvmInspector.jar

JVMINSPECTOR="/usr/local/libexec/JvmInspector.jar"  

##############################

export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
export ENV=""
export CDPATH=""

STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
MSG_OK="OK"
MSG_WARNING="WARNING"
MSG_CRITICAL="CRITICAL"
MSG_UNKNOWN="UNKNOWN"
SCRIPT_NAME=$(basename $0)

p_ok () {
  echo "$MSG_OK $1"
  exit "$STATE_OK"
}
p_warning () {
  echo "$MSG_WARNING $1"
  exit "$STATE_WARNING"
}
p_critical () {
  echo "$MSG_CRITICAL $1"
  exit "$STATE_CRITICAL"
}
p_unknown () {
  echo "$MSG_UNKNOWN $1"
  exit "$STATE_UNKNOWN"
}

usage () {
  cat << EOF
This is a Nagios plugin that checks local JVM properties such as heap and non-heap memory, threads, etc.

Usage: $SCRIPT_NAME -n|--name <java_name> -p|--property <property> -w|--warning <warn> -c|--critical <crit>

Where: <property> is one of: "heap|non-heap|threads|classes|sessions"
Hint: You can use "jps -l" or "java -jar JvmInspector.jar all" to get the java name
Example: $SCRIPT_NAME -n org.apache.catalina.startup.Bootstrap -p heap -w 1073741824 -c 2147483648

EOF
exit 0
}

[ "$#" -eq 0 ] && usage

while [ ! -z "$1" ]; do
  case $1 in
    -n|--name)     shift; NAME="$1";;
    -p|--property) shift; PROPERTY="$1";;
    -w|--warning)  shift; WARNING="$1";;
    -c|--critical) shift; CRITICAL="$1";;
    -h|--help)     usage;;
  esac
  shift
done

[ -z "$NAME" ] && p_unknown "Missing JVM app class name, use -n <value>"
[ -z "$PROPERTY" ] && p_unknown "Missing property, use -p <value>"
[ -z "$WARNING" ] && p_unknown "Missing warning thresholds, use -w <value>"
[ -z "$CRITICAL" ] && p_unknown "Missing critical thresholds, use -c <value>"

expr ${WARNING}  : '[0-9]\+$' >/dev/null || p_unknown "Invalid warning threshold"
expr ${CRITICAL}  : '[0-9]\+$' >/dev/null || p_unknown "Invalid critical threshold"
[ -f "$JVMINSPECTOR" ] || p_unknown "Can't find JvmInspector.jar, please install it and set JVMINSPECTOR var in this script"

PSLINE="$(ps axo pid,uid,command | grep '[j]ava' | grep "$NAME" | head -1)"
PID="$(echo $PSLINE | awk '{print $1}')"
PUID="$(echo $PSLINE | awk '{print $2}')"

[ -z "${PID}" ] && p_unknown "Can't find JVM with class name: $NAME"
expr ${PID}  : '[0-9]\+$' >/dev/null || p_unknown "Bug"

[ "${PUID}" = "${EUID}" ] || p_unknown "JVM is running with different username, run this script with UID $PUID"

TIMEOUT="" ; timeout --version >/dev/null 2>&1 && TIMEOUT="timeout 7"
JVMDATA="$(${TIMEOUT} java -jar ${JVMINSPECTOR} ${PID} 2>&1)"
[ $? -ne 0 ] && p_unknown "Can't connect to JVM: ${JVMDATA}" 

echo "$JVMDATA" | grep "class count" >/dev/null 2>/dev/null || p_unknown "Can't connect to the JVM: $JVMDATA"

#echo "$JVMDATA"  # debug

if [ "${PROPERTY}" = "threads" ]; then
        RESULT="$(printf "%s" "$JVMDATA" | awk '/^  thread count/{print $3}')"
        FRESULT="${RESULT}"
        PERFDATA="${PROPERTY}=${RESULT};;;"
elif [ "${PROPERTY}" = "classes" ]; then
        RESULT=$(printf "%s" "$JVMDATA" | awk '/^  class count/{print $3}')
        FRESULT="${RESULT}"
        PERFDATA="${PROPERTY}=${RESULT};;;"
elif [ "${PROPERTY}" = "heap" ]; then
        TEMPDATA=$(printf "%s" "$JVMDATA" | awk 'BEGIN { FS = ": " } ;/^  heap memory/{print $2}')
        MAX=$(printf "%s" "$TEMPDATA" | awk 'BEGIN { FS="|" } {print $1}' | awk 'BEGIN { FS="=" } {print $2}')
        COMMITED=$(printf "%s" "$TEMPDATA" | awk 'BEGIN { FS="|" } {print $2}' | awk 'BEGIN { FS="=" } {print $2}')
        USED=$(printf "%s" "$TEMPDATA" | awk 'BEGIN { FS="|" } {print $3}' | awk 'BEGIN { FS="=" } {print $2}')
        RESULT="${USED}"
        FRESULT=$(echo "${RESULT}" | numfmt --to=iec 2>/dev/null) || FRESULT="${RESULT}"
        PERFDATA="max=${MAX};;; commited=${COMMITED};;; used=${USED};;;"
elif [ "${PROPERTY}" = "non-heap" ]; then
        TEMPDATA=$(printf "%s" "$JVMDATA" | awk 'BEGIN { FS = ": " } ;/^  non-heap memory/{print $2}')
        MAX=$(printf "%s" "$TEMPDATA" | awk 'BEGIN { FS="|" } {print $1}' | awk 'BEGIN { FS="=" } {print $2}')
        COMMITED=$(printf "%s" "$TEMPDATA" | awk 'BEGIN { FS="|" } {print $2}' | awk 'BEGIN { FS="=" } {print $2}')
        USED=$(printf "%s" "$TEMPDATA" | awk 'BEGIN { FS="|" } {print $3}' | awk 'BEGIN { FS="=" } {print $2}')
        RESULT="${USED}"
        FRESULT=$(echo "${RESULT}" | numfmt --to=iec 2>/dev/null) || FRESULT="${RESULT}"
        PERFDATA="max=${MAX};;; commited=${COMMITED};;; used=${USED};;;"
elif [ "${PROPERTY}" = "sessions" ]; then
	RESULT=$(printf "%s" "$JVMDATA" | awk '/^  active sessions/' | sed 's/^.*total\=\([0-9]*\)|.*$/\1/g')	
        FRESULT="${RESULT}"
        PERFDATA="sessions=${RESULT};;;"
else 
	p_unknown "Invalid property"
fi

[ -z "${RESULT}" ] && p_unknown "Invalid data"
expr ${RESULT}  : '-\?[0-9]\+$' >/dev/null || p_unknown "Invalid data"

if [ "${RESULT}" -ge "$CRITICAL" ]; then
	p_critical "${FRESULT} |${PERFDATA}"
elif [ "${RESULT}" -ge "$WARNING" ]; then
	p_warning "${FRESULT} |${PERFDATA}"
else 
	p_ok "${FRESULT} |${PERFDATA}"
fi

exit 0

Re: NRPE sudo on RHEL 8

Posted: Thu Jul 16, 2020 3:21 pm
by tgriep
I am trying to set up Tomcat on a CentOS 8 system to see if I can replicate this, but I am having limited success getting it to start.
While setting it up, I had to set some environment variables, and that gave me an idea.
When you log in to the server in a shell, I bet the environment variables get set and the plugin works.
When it runs from NRPE, I guess they are not set, so try putting them in the check_jvm script and see if it runs.
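
One way to test this guess without touching the daemon is to run the same check under a nearly empty environment. The sketch below only approximates what a daemon-spawned process sees; it is not a reproduction of NRPE's real environment. `run_clean_env` is a hypothetical helper name, and the PATH value is an arbitrary minimal choice.

```shell
# Hypothetical helper: run a command with the environment wiped except for a
# minimal PATH. This is an approximation of a daemon's environment, nothing more.
run_clean_env() {
  env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin "$@"
}

# Example (plugin path and arguments taken from this thread; adjust as needed):
#   run_clean_env /usr/lib64/nagios/plugins/check_jvm \
#     -n org.apache.catalina.startup.Bootstrap -p heap -w 1073741824 -c 2147483648
```

If the check still works under `run_clean_env`, missing environment variables are unlikely to be the cause.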

Re: NRPE sudo on RHEL 8

Posted: Thu Jul 16, 2020 3:37 pm
by drakedts
I think I offered earlier in this thread to provide the Tomcat RPM that I built. If you want it, just let me know the best way to upload it (it's about 9 MB).

Which environment variables should I set? The environment has quite a bit of stuff, about 87 kB!

Code: Select all

[nrpe@lnx-ethosapi2-test ~]$ set | wc
   2421    7504   89530
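
Much of that output is probably not environment at all: in bash, `set` with no arguments prints every shell variable and function, while `env` prints only what child processes inherit. A narrower comparison, sketched below, lists the variable names that a login shell exports but a running process lacks. `missing_in_daemon` is a hypothetical helper, and reading `/proc/<pid>/environ` assumes Linux and sufficient privileges.

```shell
# Hypothetical helper: print the names of variables found in a saved `env`
# listing ($2, newline-separated) but absent from a process environ file
# ($1, NUL-separated, as in /proc/<pid>/environ).
# Caveat: values that contain newlines will confuse this simple parser.
missing_in_daemon() {
  tr '\0' '\n' < "$1" |
    awk -F= 'NR == FNR { seen[$1] = 1; next } !($1 in seen) { print $1 }' - "$2"
}

# Example, run as root (the pgrep pattern is an assumption about the process name):
#   su - nrpe -c env > /tmp/login.env
#   missing_in_daemon "/proc/$(pgrep -o -x nrpe)/environ" /tmp/login.env
```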

Re: NRPE sudo on RHEL 8

Posted: Thu Jul 16, 2020 4:16 pm
by tgriep
Any of the CATALINA environment variables, but especially CATALINA_HOME, should be set.

Re: NRPE sudo on RHEL 8

Posted: Fri Jul 17, 2020 8:12 am
by drakedts
No CATALINA variables are set:

Code: Select all

[nrpe@lnx-ethosapi2-test ~]$ set | grep CATALINA
[nrpe@lnx-ethosapi2-test ~]$ 
It's a bit long, but I'll just attach the full environment so you can see it. Nothing jumps out to me as being relevant to Tomcat, but maybe you'll see something I missed?
set.txt

Re: NRPE sudo on RHEL 8

Posted: Fri Jul 17, 2020 12:00 pm
by tgriep
From the testing I have done, the environment variables are not what is causing the issue.
I think a PAM policy is blocking the java application.

I created a simple script and put the following in it which should print the version.

Code: Select all

sudo -u tomcat /usr/bin/java -version
It did not work.

I changed it to this so the nrpe user can run it.

Code: Select all

/usr/bin/java -version
That failed as well, so something is blocking the java application from running, and I do not see why.
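
When a command fails only under NRPE, the error text is easy to lose, because stderr is typically not part of what comes back through `check_nrpe`. A wrapper like the sketch below folds stderr and the real exit status into the single output line. `diag` is a hypothetical name; the return codes follow the Nagios plugin convention used in check_jvm (0 = OK, 3 = UNKNOWN).

```shell
# Hypothetical wrapper: run a command, capture stdout and stderr together, and
# report the command's real exit status on one line in Nagios plugin style.
diag() {
  out=$("$@" 2>&1) && rc=0 || rc=$?
  # Flatten to a single line so nothing is dropped from single-line output.
  flat=$(printf '%s' "$out" | tr '\n' ' ')
  if [ "$rc" -eq 0 ]; then
    echo "OK rc=$rc output: $flat"
    return 0
  fi
  echo "UNKNOWN rc=$rc output: $flat"
  return 3
}

# Example body for a test script that NRPE calls:
#   diag /usr/bin/java -version
```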

Re: NRPE sudo on RHEL 8

Posted: Mon Jul 20, 2020 10:14 am
by tgriep
To get the check_jvm plugin to work on a CentOS 8 system, I installed the java-latest-openjdk package from the EPEL repository.
This is the version that was installed:
java-latest-openjdk.x86_64 1:14.0.1.7-2.rolling.el8 epel

After that, I edited the check_jvm plugin and changed the java path to point at the Java binary from the EPEL package.

Change this line from

Code: Select all

JVMDATA="$(${TIMEOUT} java -jar ${JVMINSPECTOR} ${PID} 2>&1)"
to

Code: Select all

JVMDATA="$(${TIMEOUT} /usr/lib/jvm/java-14-openjdk-14.0.1.7-2.rolling.el8.x86_64/bin/java -jar ${JVMINSPECTOR} ${PID} 2>&1)"
If you installed a different version than the above, adjust the path to the java application.

After I saved the change, the check_jvm plugin worked when run by the NRPE agent.

Code: Select all

/usr/local/nagios/libexec/check_nrpe -H localhost -c tomcat_heap
OK 35M |max=876609536;;; commited=76021760;;; used=36456400;;;

Re: NRPE sudo on RHEL 8

Posted: Mon Jul 20, 2020 4:14 pm
by drakedts
That is rather crazy. The Java packages that Tomcat is configured to use are these (such an ancient version is mandated by the application that runs within Tomcat):

java-1.8.0-openjdk-devel-1.8.0.252.b09-3.el8_2.x86_64
java-1.8.0-openjdk-headless-1.8.0.252.b09-3.el8_2.x86_64
java-1.8.0-openjdk-1.8.0.252.b09-3.el8_2.x86_64

I tried installing both Java 11 and Java 14 in parallel with Java 8. Running the check_jvm with either 11 or 14 works! I would not have expected that given that it is connecting to a daemon running Java 8. But it does!

The only difference from your code is that in check_jvm I used the alternatives path to Java rather than hard-coding the exact version; going through alternatives should be more resilient to upgrades. For example, "/etc/alternatives/jre_11/bin/java" for Java 11 and "/etc/alternatives/jre_14/bin/java" for Java 14.
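
A small variation on this, sketched below, lets the script try several candidate paths in order and use the first one that is executable, so the check keeps working if one alternative is removed. `pick_java` and `JAVA_BIN` are hypothetical names and are not part of the upstream check_jvm.

```shell
# Hypothetical helper: print the first argument that is an executable file.
pick_java() {
  for candidate in "$@"; do
    if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Possible use inside check_jvm (paths as mentioned in this thread):
#   JAVA_BIN=$(pick_java /etc/alternatives/jre_11/bin/java \
#                        /etc/alternatives/jre_14/bin/java) \
#     || p_unknown "No usable java binary found"
#   JVMDATA="$(${TIMEOUT} "$JAVA_BIN" -jar ${JVMINSPECTOR} ${PID} 2>&1)"
```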

I think this solution will work. I wish we knew why Java 8 on RHEL 8 won't work. Maybe Red Hat configured some extra security settings on that older version? I don't know much about how Java really works; I'm just recalling that on other OSes there's a Java control panel where one can adjust a bunch of settings.

Re: NRPE sudo on RHEL 8

Posted: Tue Jul 21, 2020 9:37 am
by tgriep
It is kind of strange that the older version of Java does not run under NRPE. I don't know if it is a security setting, just how it works, or a bug in that version of Java.
Using the /etc/alternatives/ path is a good idea for future upgrades.

I don't think it matters that the version of Java running the plugin is different from the one running the Tomcat process.
As long as it opens the jar file and processes the commands to connect to Tomcat, it should work.

Re: NRPE sudo on RHEL 8

Posted: Fri Jul 24, 2020 8:50 am
by drakedts
So on RHEL 8 we have Tomcat running with OpenJDK 1.8, while check_jvm is hard-coded to run OpenJDK 11 instead. The check works just fine even though the Java versions differ; I've tested it in production and it all works.

So this thread can be locked or archived. Thank you for the help tracking down this weird problem!