Nagios - Status UNKNOWN for bash script

BettyRNorahDeniels
Posts: 11
Joined: Tue Feb 15, 2022 7:13 am

Post by BettyRNorahDeniels

Hi,
I am trying to monitor my EC2 load balancer (ELB) through Nagios using a bash script. Below is the script I am trying to use with Nagios.

#!/bin/sh

# Nagios plugin exit codes
ST_OK=0
ST_WR=1
ST_CR=2
ST_UK=3

LB_NAME="xxx"
AWS_REGION="us-west-2"
PROFILE="default"

# Query the health of every instance behind the load balancer (JSON on stdout)
CMD=$(/usr/bin/aws elb describe-instance-health --region "${AWS_REGION}" --load-balancer-name "${LB_NAME}" --profile "${PROFILE}")

if [ $? -eq 0 ]; then
    # Count the instances reported as InService, and all instances
    IN_SERVICE_COUNT=$(echo "${CMD}" | jq -c '.InstanceStates[].State' | grep -c InService)
    TOTAL_COUNT=$(echo "${CMD}" | jq -c '.InstanceStates[].State' | wc -l)

    if [ "${IN_SERVICE_COUNT}" -eq 0 ]; then
        # No healthy instance at all
        NAGIOS_STATE=CRITICAL
        EXIT_CODE=$ST_CR
    elif [ "${TOTAL_COUNT}" -eq "${IN_SERVICE_COUNT}" ]; then
        # Every instance is healthy
        NAGIOS_STATE=OK
        EXIT_CODE=$ST_OK
    elif [ "${IN_SERVICE_COUNT}" -lt "${TOTAL_COUNT}" ]; then
        # Some, but not all, instances are healthy
        NAGIOS_STATE=WARNING
        EXIT_CODE=$ST_WR
    fi
    echo "${NAGIOS_STATE}: ELB:${LB_NAME} is running fine. Total #instances:${TOTAL_COUNT} Healthy instances:${IN_SERVICE_COUNT}"
else
    # The aws call itself failed, so the state cannot be determined
    echo "Failed to retrieve ELB Instances health from AWS"
    EXIT_CODE=$ST_UK
fi
exit ${EXIT_CODE}
The above script works fine when I run it manually. I have also run it as the nagios user, and I get a result like the one below:

OK: ELB:xxx is running fine Total:18 Healthy:18
So I don't think it is a permission issue. I have configured AWS credentials for the nagios user. However, in the Nagios interface the status is always "UNKNOWN".
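Since the script prints only a fixed message when the aws call fails, the actual reason for the UNKNOWN state is not visible in the Nagios interface. One way to surface it might look like the sketch below. It assumes the failure comes from the aws call itself; the helper name `run_or_unknown` is hypothetical and not part of the original script. It also prints the user and `HOME`, because the AWS CLI reads its credentials from `$HOME/.aws` by default, and whether `HOME` is set when Nagios launches the check is an assumption worth verifying.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): run a command and,
# if it fails, report its stderr in a Nagios-style UNKNOWN line.
run_or_unknown() {
    # Keep stderr in a temporary file so it does not mix with the JSON on stdout.
    err_file=$(mktemp) || { echo "UNKNOWN: could not create temp file"; return 3; }
    output=$("$@" 2>"$err_file")
    status=$?
    err_text=$(cat "$err_file")
    rm -f "$err_file"

    if [ "$status" -ne 0 ]; then
        # Show who ran the check and which HOME it saw, plus the error text.
        echo "UNKNOWN: command failed (exit ${status}, user=$(id -un), HOME=${HOME:-unset}): ${err_text}"
        return 3
    fi

    # On success, pass the command's stdout through unchanged.
    echo "${output}"
    return 0
}
```

In the script above, the aws call could be wrapped as `CMD=$(run_or_unknown /usr/bin/aws elb describe-instance-health ...)` with the same arguments as before; when the helper returns non-zero, `CMD` holds the UNKNOWN line, which the script can print before exiting with 3.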

Below is the command definition in command.cfg:

define command {
    command_name check_elb_status
    command_line /usr/local/nagios/libexec/check_elb_status.sh
}

Below is the service definition in the host file:

define service {
    use generic-service
    host_name Prod-ELB
    service_description Prod ELB Status
    check_command check_elb_status
}
I have used the same script with NRPE from a different host, and there I do get the result.

Entry in nrpe.cfg:

command[check_elb_sts]=/usr/local/nagios/libexec/check_elb_status.sh

Service definition in the host file:

define service {
    use generic-service
    host_name xxx
    service_description Prod ELB Status
    check_command check_nrpe!check_elb_sts
}
I don't know why the script does not return a result when it is run directly on the Nagios host.
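A manual run as the nagios user and a run started by the Nagios daemon can differ in their environment variables, so one way to narrow this down might be to run the plugin with an emptied environment and compare the output. The sketch below is only an approximation of what a daemon-launched check sees; the helper name `run_with_clean_env` and the `PATH` value are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command with every inherited variable removed,
# keeping only a minimal PATH (value assumed for illustration).
run_with_clean_env() {
    env -i PATH=/usr/bin:/bin "$@"
}
```

For example, `run_with_clean_env /usr/local/nagios/libexec/check_elb_status.sh; echo "exit code: $?"` shows whether the plugin still returns OK when variables such as `HOME` are not inherited from the login shell.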
Please help me resolve the issue.
Thanks in advance!!!!