I have a number of service checks set up for Hadoop, one of which monitors the TaskTracker. I wrote the following script for that check:
_
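# STATE_OK and STATE_CRITICAL are not set in this snippet; they are assumed to be defined elsewhere.
service hadoop-0.20-mapreduce-tasktracker status | grep 'is running'
x=$?   # exit status of grep: 0 if 'is running' was found, 1 otherwise
if [[ "$x" == "0" ]]; then
    echo 'Tasktracker is up and running'
    exit $STATE_OK
else
    echo 'Tasktracker appears to be down'
    exit $STATE_CRITICAL
fi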
_
When I run the script directly on the local box, it works exactly as it should. Under Nagios, however, the result appears to be inverted: it reports STATE_CRITICAL when x is 0 and STATE_OK when x is 1. Has anyone experienced this? Please let me know if anything is unclear.
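To make the intended behavior explicit, here is a minimal sketch of the mapping I expect between the status output and the Nagios state. The values 0 and 2 are the standard Nagios plugin exit codes for OK and CRITICAL; setting them inline is an assumption for this sketch, and `tasktracker_state` is a hypothetical helper name, not part of the real check.

```shell
#!/bin/bash
# Standard Nagios plugin exit codes, set inline for this sketch
# (assumption: the real script gets them from somewhere else).
STATE_OK=0
STATE_CRITICAL=2

# Hypothetical helper: reads the output of `service ... status` on stdin,
# prints a message, and returns the matching Nagios state.
tasktracker_state() {
    if grep -q 'is running'; then
        echo 'Tasktracker is up and running'
        return "$STATE_OK"
    else
        echo 'Tasktracker appears to be down'
        return "$STATE_CRITICAL"
    fi
}

# Intended use inside the check script (left commented out here):
#   service hadoop-0.20-mapreduce-tasktracker status 2>&1 | tasktracker_state
#   exit $?
```

Feeding the helper a line containing "is running" should return 0, and any other input should return 2. That is the behavior I see locally and the opposite of what Nagios reports.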