
Script returns different result when ran locally versus nrpe

Posted: Tue Oct 28, 2014 1:04 pm
by rickwilson7425
I have a script that I can run locally and get the proper return. When I try to run it using check_nrpe, either on the system itself or from the Nagios server, I get an improper response.

This is the response running locally:

./check_redis.sh
REDIS Sentinel 16379 on dencita1 is active and PONG was the response
REDIS Nuxeo 6379 on dencita1 is active and PONG was the response


This is the response when running with nrpe:

./check_nrpe -H localhost -c check_redis
REDIS Sentinel 16379 on dencita1 not active and did not PONG


The script is here:

Code: Select all

#!/usr/bin/env bash

################# Check SENTINEL Port #################

HOSTNAME_REDIS=`hostname -s`
# Gather ports (default sentinel port is 26379)
# SEN_PORT=`grep -P '^port \d+' /etc/redis/rsentinel.conf || echo "port 26379" | cut -f2 -d' '`
# SEN_PORT=`grep -P '^port \d+' /etc/redis/rsentinel.conf | cut -f2 -d' '`
# SEN_PORT=`grep -P '^port ' /etc/redis/rsentinel.conf | cut -f2 -d' '`
SEN_PORT=`grep -w port /etc/redis/rsentinel.conf | cut -f2 -d' '`
#  if [[ `redis-cli -p $SEN_PORT PING | grep 'PONG' 2>&1 >/dev/null` ]];then
  if [[ `redis-cli -p $SEN_PORT PING | grep 'PONG'` ]];then
    # Success
    echo "REDIS Sentinel $SEN_PORT on $HOSTNAME_REDIS is active and PONG was the response"
  else
    # Failure
    echo "REDIS Sentinel $SEN_PORT on $HOSTNAME_REDIS not active and did not PONG"
    exit 1
  fi
  
################# Check MOBILE Port #################

if [[ `find /apps/opt -maxdepth 1 -type d | grep /apps/opt/Tomcat_JDM` \
   || `find /apps/opt -maxdepth 1 -type d | grep /apps/opt/Tomcat_PWM` \
   || `find /apps/opt -maxdepth 1 -type d | grep /apps/opt/Tomcat_FDS` \
   || `find /apps/opt -maxdepth 1 -type d | grep /apps/opt/Tomcat_SYNC` \
   || `find /apps/opt -maxdepth 1 -type d | grep /apps/opt/Tomcat_REGISTRAR` ]];then

  MOBIL_PORT=`grep -w MobileServices_Redis_Master /etc/redis/rsentinel.conf | cut -f5 -d' '`
  if [ -z "$MOBIL_PORT" ];then
    echo "MOBILE REDIS NOT DEFINED IN SENTINEL. WARNING MR. ROBINSON"
    exit 178
  fi
#  if [[ `redis-cli -p $MOBIL_PORT PING | grep 'PONG' 2>&1 >/dev/null` ]];then
  if [[ `redis-cli -p $MOBIL_PORT PING | grep 'PONG'` ]];then
    # Success
    echo "REDIS Mobile $MOBIL_PORT on $HOSTNAME_REDIS is active and PONG was the response"
  else
    # Failure
    echo "REDIS Mobile $MOBIL_PORT on $HOSTNAME_REDIS not active and did not PONG"
    exit 1
  fi
fi

################### Check NUXEO Port #################

if [[ `find /apps/opt -maxdepth 1 -type d | grep /apps/opt/nuxeo-jdm-server` ]];then
#  NUXEO_PORT=`ls -1 /etc/redis/*.conf | cut -f1 -d' ' | grep -v $MOBIL_PORT`
#  NUXEO_PORT=`grep -P '^#sentinel monitor Nuxeo_Redis_Master ' /etc/redis/rsentinel.conf | cut -f5 -d' '`
  NUXEO_PORT=`grep -w Nuxeo_Redis_Master /etc/redis/rsentinel.conf | cut -f5 -d' '`
  if [ -z "$NUXEO_PORT" ];then
    echo "NUXEO REDIS NOT DEFINED IN SENTINEL. WARNING MR. ROBINSON"
    exit 178
  fi
#  if [[ `redis-cli -p $NUXEO_PORT PING | grep 'PONG' 2>&1 >/dev/null` ]];then
  if [[ `redis-cli -p $NUXEO_PORT PING | grep 'PONG'` ]];then
    # Success
    echo "REDIS Nuxeo $NUXEO_PORT on $HOSTNAME_REDIS is active and PONG was the response"
  else
    # Failure
    echo "REDIS Nuxeo $NUXEO_PORT on $HOSTNAME_REDIS not active and did not PONG"
    exit 1
  fi
fi
Any ideas?

Thanks

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 1:18 pm
by slansing
Not too familiar with REDIS or checks against it. It looks like it is submitting something to port 26379 and getting an invalid response, which the plugin is correctly handling and exiting with a 3. Also, you are checking against the local Nagios server; is that also where you have your REDIS server? I would think your nrpe check from Nagios would look similar to:

Code: Select all

./check_nrpe -H redis.server.addr -c check_redis
And then you would have check_redis defined as a command in the nrpe.cfg on that remote host. Is that not the case?
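
For reference, an NRPE command definition on the remote host generally takes the form sketched below. The plugin path is an assumption for illustration only; the thread does not say where check_redis.sh is installed.

```
# nrpe.cfg on the Redis host (the path below is hypothetical)
command[check_redis]=/usr/local/nagios/libexec/check_redis.sh
```

With a definition like this in place, check_nrpe -H <host> -c check_redis asks the NRPE daemon on that host to run the script and return its output.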

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 1:29 pm
by rickwilson7425
The redis server and the Nagios server are different systems.

It doesn't matter if I run the check_nrpe command from the redis server or the Nagios server, I get the same response, which is invalid.

When running through nrpe it finds the proper port, but when it reports back it says that the port is not active, which is not true. You can see from the two different responses that one says that port 16379 is active and it received the proper PONG response; the other says that the same port is not active and did not PONG.

The line which was checking port 26379 is commented out.
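
When a check behaves differently under nrpe, it can help to make the script report what the command actually returned, including anything written to stderr, since the backtick test in the script only sees stdout. The following is a minimal sketch of that idea, not the script from this thread; check_reply is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, capture stdout and stderr together,
# and report whether the expected text appeared in the reply.
check_reply() {
  local expected="$1"
  shift
  local reply
  # 2>&1 folds error messages (for example "command not found") into the reply
  reply=$("$@" 2>&1)
  if printf '%s' "$reply" | grep -q -- "$expected"; then
    echo "OK: got $expected"
    return 0
  fi
  # On failure, show what came back instead of a fixed message
  echo "FAIL: expected $expected, got: $reply"
  return 1
}
```

Calling it as check_reply PONG redis-cli -p 16379 PING would then print the shell's own error text in the failure message if redis-cli cannot be started, rather than only "did not PONG".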

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 1:30 pm
by rickwilson7425
Here is the file the script is checking against:

Code: Select all

cat /etc/redis/rsentinel.conf
daemonize yes

### Configuration for Nuxeo Redis
#sentinel monitor Nuxeo_Redis_Master dencita1.jeppesen.com 6379 2
#sentinel down-after-milliseconds Nuxeo_Redis_Master 4000
#sentinel failover-timeout Nuxeo_Redis_Master 8000
#sentinel parallel-syncs Nuxeo_Redis_Master 1

### Configuration for MobileServices Redis
sentinel monitor MobileServices_Redis_Master 10.1.88.39 6380 2
sentinel down-after-milliseconds MobileServices_Redis_Master 4000
sentinel failover-timeout MobileServices_Redis_Master 8000
sentinel config-epoch MobileServices_Redis_Master 0

# Generated by CONFIG REWRITE
port 16379
dir "/export/home/jdmpadm"
sentinel leader-epoch MobileServices_Redis_Master 0
sentinel known-slave MobileServices_Redis_Master 10.1.88.37 6380
sentinel known-sentinel MobileServices_Redis_Master 10.1.88.39 16380 09a9634ccf4cbe52fcccacd775603a4f461e2f9c
sentinel current-epoch 0

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 2:06 pm
by lmiltchev
./check_redis.sh
REDIS Sentinel 16379 on dencita1 is active and PONG was the response
REDIS Nuxeo 6379 on dencita1 is active and PONG was the response
Did you run this check as root? Do you get the same output when you run it as nagios user?

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 2:11 pm
by rickwilson7425
Got the same response as either root or nagios: good running locally, bad running through nrpe.

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 2:46 pm
by rickwilson7425
I figured it out:

I had to change this line:

if [[ `redis-cli -p $SEN_PORT PING | grep 'PONG'` ]];then

To this, adding the full path to the command:

if [[ `/usr/local/bin/redis-cli -p $SEN_PORT PING | grep 'PONG'` ]];then
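
An alternative to hard-coding the path on every call is to extend PATH once at the top of the script. This is only a sketch: it assumes redis-cli lives in /usr/local/bin as in the fix above, and resolve_cmd is a hypothetical helper added for illustration.

```shell
#!/usr/bin/env bash
# Assumption: redis-cli is installed in /usr/local/bin, as in the fix above.
# Prepend that directory so later calls resolve even under a minimal PATH.
PATH="/usr/local/bin:$PATH"
export PATH

# Hypothetical helper: print where a command resolves, or return non-zero.
resolve_cmd() {
  command -v "$1" 2>/dev/null
}

if REDIS_CLI=$(resolve_cmd redis-cli); then
  echo "redis-cli resolved to $REDIS_CLI"
else
  echo "redis-cli not found in PATH=$PATH"
fi
```

To reproduce the problem outside of nrpe, running the script with a reduced environment, for example env -i PATH=/bin:/usr/bin ./check_redis.sh, may show the same failure. The exact PATH the NRPE daemon runs with is an assumption here and depends on how it is started.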

Now that that works, another question -

The proper response from the script is this:

REDIS Sentinel port 16379 on dencita1 is active and PONG was the response
REDIS Nuxeo port 6379 on dencita1 is active and PONG was the response


Only the first line shows up in the Nagios monitor screen. How can I get both lines to display?

Re: Script returns different result when ran locally versus

Posted: Tue Oct 28, 2014 4:41 pm
by abrist
Do you see both lines on the "details" page for the object in question?

Re: Script returns different result when ran locally versus

Posted: Wed Oct 29, 2014 7:31 am
by rickwilson7425
Yes, I do

Re: Script returns different result when ran locally versus

Posted: Wed Oct 29, 2014 4:36 pm
by sreinhardt
You need to use a \n in your output to Nagios, not an actual newline or a second echo. The \n will be replaced internally with a newline for pretty output, but it will not cause Core to treat the second line as long output. You will probably want to store your results in a variable and do a single echo at the end instead of multiple echoes.
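
A minimal sketch of that suggestion follows. The names output and append_result are hypothetical, and the messages are the ones from earlier in the thread. How the literal \n is rendered may depend on the Nagios version, so treat this as something to test rather than a confirmed fix.

```shell
#!/usr/bin/env bash
# Accumulate all result messages in one variable, then print once.
output=""

# Hypothetical helper: append a message, separated from the previous one
# by a literal backslash-n (two characters), not a real newline.
append_result() {
  if [ -z "$output" ]; then
    output="$1"
  else
    output="${output}\\n$1"
  fi
}

append_result "REDIS Sentinel 16379 on dencita1 is active and PONG was the response"
append_result "REDIS Nuxeo 6379 on dencita1 is active and PONG was the response"

# printf '%s\n' prints the text as-is, so the \n stays literal and the
# plugin emits a single physical line.
printf '%s\n' "$output"
```

In the real script, each check would call append_result instead of echo, and the single printf would run just before the final exit.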