
NRPE is not able to parse the argument properly

Posted: Mon Jul 17, 2017 8:28 pm
by ddaluka
Hi,

I have Tomcat 8 installed on my application server, with 3-4 applications deployed under it. I am using check_nrpe with the check_tomcatApplication plugin to monitor the service status of each application.

Here is my entry from command.cfg:

command[check_tomcatApplication]=/usr/local/nagiosxi/libexec/check_tomcatApplication $ARG1$

Here is the service check defined:

Code: Select all

define service {        
        service_description             checkHOMEApp
        use                             xiwizard_nrpe_service
        hostgroup_name                  testHostGroup
        check_command                   check_nrpe!check_tomcatApplication!-a '-H localhost -P 28080 -u admin -p admin -V 8 -a home'!!!!!!!
        max_check_attempts              3
        check_interval                  5
        retry_interval                  1
        check_period                    xi_timeperiod_24x7
        notification_interval           1440
        notification_period             xi_timeperiod_24x7
        contact_groups                  admins
        register                        1
        }
Earlier I was using the generic-service template, which had check_interval set to 10 minutes, retry_interval to 1 minute, and max_check_attempts to 3. The issue is:

When I run a test check from the UI, everything works fine and the status shows the application as running. When Nagios runs its scheduled checks, the status first shows as running; then, even before the next check is due (10 minutes later), the status is still green but with the following output:

Code: Select all

Unknown argument: home

Version 1.02

check_tomcatApplication is a Nagios plugin to check a specific Tomcat Application.

check_tomcatApplication -u user -p password -h host -P port -a application

Options:
 -u/--user)
 User name for authentication on Tomcat Manager Application
 -p/--password)
 Password for authentication on Tomcat Manager Application
 -H/--host)
 Host Name of the server
 -P/--port)
 Port Number Tomcat service is listening on
 -a/--appname)
 Application name to be checked
 -V/--tomcat_version)
 Version of the Tomcat. Default is Tomcat 6
I tried changing the service template and the check intervals, but the issue is the same. Here is the plugin content:

Code: Select all

#!/bin/sh

#VARIAVEIS NAGIOS
NAGIOS_OK=0
NAGIOS_WARNING=1
NAGIOS_CRITICAL=2
NAGIOS_UNKNOWN=3

PROGNAME=`basename $0 .sh`
VERSION="Version 1.02"
TOMCAT_VERSION="6"

WGET=/usr/bin/wget
GREP=/bin/grep

print_version() {
    echo "$VERSION"
}

print_help() {
    print_version $PROGNAME $VERSION
    echo ""
    echo "$PROGNAME is a Nagios plugin to check a specific Tomcat Application."
    echo ""
    echo "$PROGNAME -u user -p password -h host -P port -a application"
    echo ""
    echo "Options:"
    echo "  -u/--user)"
    echo "     User name for authentication on Tomcat Manager Application"
    echo "  -p/--password)"
    echo "     Password for authentication on Tomcat Manager Application"
    echo "  -H/--host)"
    echo "     Host Name of the server"
    echo "  -P/--port)"
    echo "     Port Number Tomcat service is listening on"
    echo "  -a/--appname)"
    echo "     Application name to be checked"
    echo "  -V/--tomcat_version)"
    echo "     Version of the Tomcat. Default is Tomcat 6"
    exit $ST_UK
}

if [ ! -x "$WGET" ]
then
        echo "wget not found!"
        exit $NAGIOS_CRITICAL
fi

if [ ! -x "$GREP" ]
then
        echo "grep not found!"
        exit $NAGIOS_CRITICAL
fi

if test -z "$1"
then
        print_help
        exit $NAGIOS_CRITICAL
fi

while test -n "$1"; do
    case "$1" in
        --help|-h)
            print_help
            exit $ST_UK
            ;;
        --version|-v)
            print_version $PROGNAME $VERSION
            exit $ST_UK
            ;;
        --user|-u)
            USER=$2
            shift
            ;;
        --password|-p)
            PASSWORD=$2
            shift
            ;;
        --host|-H)
            HOST=$2
            shift
            ;;
        --port|-P)
            PORT=$2
            shift
            ;;
        --appname|-a)
            APP=$2
            shift
            ;;
        --tomcat_version|-V)
            TOMCAT_VERSION=$2
            shift
            ;;
        *)
            echo "Unknown argument: $1"
            print_help
            exit $ST_UK
            ;;
        esac
    shift
done

# Default URL - Tomcat 6
URL="http://$USER:$PASSWORD@$HOST:$PORT/manager/list"

if [ $TOMCAT_VERSION = 7 -o $TOMCAT_VERSION = 8 ]
then
        URL="http://$USER:$PASSWORD@$HOST:$PORT/manager/text/list"
fi

if wget -o /dev/null -O - $URL | grep -q "^/$APP:running"
then
        echo "OK: Application $APP is running!"
#        wget -o /dev/null -O - http://$USER:$PASSWORD@$HOST:$PORT/manager/status?XML=true |sed -e "s/\/>/\/>\n/g"|egrep "(connector|requestInfo|<memory)"|sed -e "s/\"//g"|sed -e "s/'//g"|awk -v app=$APP '{
#        if ($0 ~ "connector name=")  {  value=$2; all=substr(value,6,13);  ncount=index(all,"<")-2; connector=substr(all,0,ncount);}
#        if ($0 ~ "<memory ") {  jm=$9; ccount=index(jm,"/")-1; jmax=substr($9,0,ccount); print app"_JVM_OK:|" app"_JVM_"$7"MB;;;0 "app"_JVM_"$8"MB;;;0 "app"_JVM_"jmax"MB;;;0"};
#        if ($0 ~ "<requestInfo") {  print app"_"connector" OK:|"connector"_"$2"ms;;;0 "connector"_"$3"ms;;;0 "connector"_"$4"ms;;;0 "connector"_"$5"ms;;;0 "connector"_"$6"ms;;;0 "connector"_"$7"ms;;;0"; } ;}'
        exit $NAGIOS_OK
else
        echo "CRITICAL: Application $APP is not running!"
        exit $NAGIOS_CRITICAL
fi
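
One detail in the listing above may explain why the service stays green while showing the help text: the script calls `exit $ST_UK`, but `ST_UK` is never defined (only the `NAGIOS_*` variables are). With an empty expansion, `exit` returns the status of the last command, which on this path is a successful `echo`. A minimal sketch of that behavior, where `demo_exit` is a hypothetical name used only for illustration:

```shell
#!/bin/sh
# Illustration only: demo_exit is a hypothetical helper, not part of the plugin.
# It runs a child shell that mimics the plugin's "unknown argument" path,
# where ST_UK is never assigned.
demo_exit() {
    sh -c 'unset ST_UK
           echo "Unknown argument: home"
           exit $ST_UK'     # expands to a bare "exit": status of the echo above
    echo "exit status: $?"
}

demo_exit
```

If this path is meant to report UNKNOWN, replacing `$ST_UK` with `$NAGIOS_UNKNOWN` (already set to 3 at the top of the script) would likely do it.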

Re: NRPE is not able to parse the argument properly

Posted: Tue Jul 18, 2017 1:30 pm
by tgriep
Can you edit the check command on the Nagios XI system and change it from

Code: Select all

check_command                   check_nrpe!check_tomcatApplication!-a '-H localhost -P 28080 -u admin -p admin -V 8 -a home'!!!!!!!
to

Code: Select all

check_command                   check_nrpe!check_tomcatApplication!-a '-H localhost -P 28080 -u admin -p admin -V 8 --appname home'!!!!!!!
See if that fixes the issue.

I think either the check_nrpe plugin or the agent sees both of the -a options and is only passing one of them.
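
To show how the plugin would react if one of the -a flags were lost on the way, here is a minimal sketch of its option loop reduced to a function. `parse_args` is a hypothetical name, and the idea that check_nrpe or the agent drops the flag is an assumption, not something confirmed:

```shell
#!/bin/sh
# Sketch of the plugin's option loop as a function.
# parse_args is a hypothetical name used only for this illustration.
parse_args() {
    APP=""
    while test -n "$1"; do
        case "$1" in
            --host|-H)            shift ;;   # inner shift skips the flag's value
            --port|-P)            shift ;;
            --user|-u)            shift ;;
            --password|-p)        shift ;;
            --tomcat_version|-V)  shift ;;
            --appname|-a)         APP=$2; shift ;;
            *)
                echo "Unknown argument: $1"
                return 3
                ;;
        esac
        shift
    done
    echo "APP=$APP"
}

# All flags intact: the application name is picked up.
parse_args -H localhost -P 28080 -u admin -p admin -V 8 -a home

# Assumed failure mode: the "-a" in front of "home" never arrives.
parse_args -H localhost -P 28080 -u admin -p admin -V 8 home || echo "returned $?"
```

With the flag missing, `home` falls through to the catch-all branch, which matches the "Unknown argument: home" output in the first post.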

Re: NRPE is not able to parse the argument properly

Posted: Thu Jul 20, 2017 6:14 am
by ddaluka
Hi,

Thanks a lot for your response. I will check this today and will let you know the result.

Re: NRPE is not able to parse the argument properly

Posted: Thu Jul 20, 2017 10:36 am
by ddaluka
Hi,

I checked again after changing the command arguments. My service definition now looks like this:

Code: Select all

define service {        
        service_description             checkHOMEApp
        use                             xiwizard_nrpe_service
        hostgroup_name                  testHostGroup
        check_command                   check_nrpe!check_tomcatApplication!-a '-H localhost -P 28080 -u admin -p admin -V 8 --appname home'!!!!!!!
        max_check_attempts              3
        check_interval                  5
        retry_interval                  1
        check_period                    xi_timeperiod_24x7
        notification_interval           1440
        notification_period             xi_timeperiod_24x7
        contact_groups                  admins
        register                        1
        }
Even after this change, the issue is the same. What I am trying to understand is:

1. Why is it inconsistent? My service check is continuously flapping between OK and Critical.

2. Earlier my check interval was set to 10 minutes, and I have already changed it to 5 minutes. Why is the next check for my service then flapping between 5 and 10 minutes?

3. I suspect that my service is holding on to the previous event for some reason. I checked the service at around 16:22, and below is the status:
[Attachment: home check 1.JPG]
At 16:23, the service status changed to the one below.
[Attachment: home Check 2.JPG]
Can you please suggest what is going wrong here? Do I need to clean up any database? I have already restarted all the services.

Re: NRPE is not able to parse the argument properly

Posted: Thu Jul 20, 2017 2:21 pm
by mcapra
In terms of the flapping problem, it's probably an issue with the plugin check_TomcatApplication itself. Looking at line 108:

Code: Select all

if wget -o /dev/null -O - $URL | grep -q "^/$APP:running"
That's a pretty weak test of "up-ness". If the wget fails for any reason, it'll look like the application isn't running.
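
As a rough sketch of a stricter test, the fetch result and the application state can be judged separately, so that a failed wget reports UNKNOWN instead of "not running". `classify_app` is a hypothetical helper and the sample listing is made up for illustration:

```shell
#!/bin/sh
# Sketch only: classify_app is a hypothetical helper, not part of the plugin.
# $1 = exit status of the fetch (e.g. wget), $2 = body of the manager list,
# $3 = application name. Prints a Nagios-style line and returns its status.
classify_app() {
    fetch_status=$1
    body=$2
    app=$3

    if [ "$fetch_status" -ne 0 ]; then
        echo "UNKNOWN: could not fetch the Tomcat manager list (exit $fetch_status)"
        return 3
    fi

    if printf '%s\n' "$body" | grep -q "^/$app:running"; then
        echo "OK: Application $app is running!"
        return 0
    fi

    echo "CRITICAL: Application $app is not running!"
    return 2
}

# Made-up sample in the "/path:state:..." shape the plugin greps for.
sample='OK - Listed applications for virtual host localhost
/home:running:0:home
/docs:stopped:0:docs'

classify_app 0 "$sample" home || true
classify_app 0 "$sample" docs || true
classify_app 4 "" home || true
```

In the plugin itself this could be fed with something like `body=$(wget -o /dev/null -O - "$URL"); classify_app $? "$body" "$APP"`, using the same wget flags as the original line.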

Re: NRPE is not able to parse the argument properly

Posted: Thu Jul 20, 2017 4:53 pm
by dwhitfield
Considering @mcapra's find, I might suggest one of the following:

https://exchange.nagios.org//directory/ ... py/details
https://exchange.nagios.org//directory/ ... pl/details

Let us know if neither one of those will work for you.

Re: NRPE is not able to parse the argument properly

Posted: Thu Jul 20, 2017 4:54 pm
by tgriep
Thanks @mcapra for the help.
You may want to search the Exchange Site for a different plugin that works better in your environment.
https://exchange.nagios.org/index.php?o ... ord=tomcat

Re: NRPE is not able to parse the argument properly

Posted: Fri Jul 21, 2017 5:06 am
by ddaluka
Hi @tgriep, @dwhitfield, @mcapra

Thank you all for your responses. I am now fairly sure that it is not the plugin that is causing the problem. It has something to do with the Nagios database.

I have now changed the service to use check_tomcat.py, and the checks on my Nagios are now flapping between check_tomcat.py and the previously configured check_TomcatApplication plugin, the same as I described for check_interval.

Is it possible to recreate or repair my database? Please note: I have already run the repair database script that Nagios XI provides, but it didn't help.

Re: NRPE is not able to parse the argument properly

Posted: Fri Jul 21, 2017 8:55 am
by tgriep
One thing you can try is to make a copy of the existing check and see if the copy fails the same way.
That would help tell whether the plugin or the database is at fault.

Re: NRPE is not able to parse the argument properly

Posted: Mon Jul 24, 2017 7:10 am
by ddaluka
Hi @tgriep,

The copy is working absolutely fine, without any flapping. Do you know if there is a safe way of recreating the database? Otherwise I will have to uninstall and then install again.