Hello All,
I need to monitor a service memcached-monitor.service and memcached.service which is under the path /opt/airwatch/memcached . So when i do systemctl status xxx the output shows active and running .
Is there a way to monitor this in nagios . Alert should be triggered when the service is not running .
Monitoring service status on linux server
Monitoring service status on linux server
You do not have the required permissions to view the files attached to this post.
Re: Monitoring service status on linux server
You could use NCPA to monitor the service is running. NCPA has a built-in check for a whether a named service is running.
1. Install NCPA on the target machine. NCPA is a FOSS program we maintain: https://github.com/NagiosEnterprises/ncpa
2. Pick a password, open ports, etc to make the check work.
3. Set up a check like so inside nagios XI
More documentation here: https://support.nagios.com/kb/article/s ... ce_started
1. Install NCPA on the target machine. NCPA is a FOSS program we maintain: https://github.com/NagiosEnterprises/ncpa
2. Pick a password, open ports, etc to make the check work.
3. Set up a check like so inside nagios XI
Code: Select all
./check_ncpa.py -H <host ip or FQDN> -t Str0ngT0k3n -M services -q service=memcached,status=runningMore documentation here: https://support.nagios.com/kb/article/s ... ce_started
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
Re: Monitoring service status on linux server
Thats great .
But is there any other way .
But is there any other way .
Re: Monitoring service status on linux server
Sure, you could use NSCA to send a passive check to the Nagios server, indicating whether the memcached service is running.
The difference between an active check and a passive check is an active check is a check that the Nagios monitoring engine initiates, and a passive one is one that's initiated on the remote server and sent to the listening monitoring engine.
Here's also a step-by-step guide to setting up NSCA: http://nagios.sourceforge.net/download/ ... _Setup.pdf
Note that if you're only sending a passive check, you can skip the parts about using xinetd and probably don't have to configure your firewall to allow any incoming ports.
The difference between an active check and a passive check is an active check is a check that the Nagios monitoring engine initiates, and a passive one is one that's initiated on the remote server and sent to the listening monitoring engine.
Here's also a step-by-step guide to setting up NSCA: http://nagios.sourceforge.net/download/ ... _Setup.pdf
Note that if you're only sending a passive check, you can skip the parts about using xinetd and probably don't have to configure your firewall to allow any incoming ports.
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
Re: Monitoring service status on linux server
Thanks for that .
Can we use the below script ? Is this script proper
#!/usr/bin/env bash
# Author: Jon Schipp
# 2015-03-09 [Pascal Hegy] - Add sudo for linux
# 2015-03-09 [Pascal Hegy] - Change USER variable to USERNAME to avoid the use and confusion with the USER env variable
# 2017-08-30 [Roberto Leibman] - Reordered checks to make sure dead and inactive get checked first
# 2018-04-25 [Robin Gierse] - Update check via systemctl for Linux with grep to produce better output for systemctl
# 2019-03-15 [nem / liberodark] - Add support for check all failed services in linux
########
# Examples:
# 1.) List services for osx
# $ ./check_service.sh -l -o osx
#
# 2.) Check status of SSH service on a linux machine
# $ ./check_service.sh -o linux -s sshd
# 3.) Manually select service management tool and service
# $ ./check_service.sh -o linux -t "service rsyslog status"
# Exemple for check all failed services
# $ ./check_service.sh -o linux -t "systemctl list-units --state=failed"
# Nagios Exit Codes
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3
# Weather or not we can trust the exit code from the service management tool.
# Defaults to 0, put to 1 for systemd. Otherwise we must rely on parsing the
# output from the service management tool.
TRUST_EXIT_CODE=0
usage()
{
cat <<EOF
Check status of system services for Linux, FreeBSD, OSX, and AIX.
Options:
-s <service> Specify service name
-l List services
-o <os> OS type, "linux/osx/freebsd/aix"
-u <user> User if you need to ``sudo -u'' for launchctl (def: nagios, linux and osx only)
-t <tool> Manually specify service management tool (def: autodetect) with status and service
e.g. ``-t "service nagios status"''
EOF
}
argcheck() {
# if less than n argument
if [ $ARGC -lt $1 ]; then
echo "Missing arguments! Use \`\`-h'' for help."
exit 1
fi
}
os_check() {
if [ "$OS" == null ]; then
unamestr=$(uname)
if [[ $unamestr == 'Linux' ]]; then
OS='linux'
elif [[ $unamestr == 'FreeBSD' ]]; then
OS='freebsd'
elif [[ $unamestr == 'Darwin' ]]; then
OS='osx'
else
echo "OS not recognized, Use \`-o\` and specify the OS as an argument"
exit 3
fi
fi
}
determine_service_tool() {
if [[ $OS == linux ]]; then
if command -v systemctl >/dev/null 2>&1; then
SERVICETOOL="systemctl status $SERVICE | grep -i Active"
LISTTOOL="systemctl"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME systemctl status $SERVICE"
LISTTOOL="sudo -u $USERNAME systemctl"
fi
TRUST_EXIT_CODE=1
elif command -v service >/dev/null 2>&1; then
SERVICETOOL="service $SERVICE status"
LISTTOOL="service --status-all"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME service $SERVICE status"
LISTTOOL="sudo -u $USERNAME service --status-all"
fi
elif command -v initctl >/dev/null 2>&1; then
SERVICETOOL="status $SERVICE"
LISTTOOL="initctl list"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME status $SERVICE"
LISTTOOL="sudo -u $USERNAME initctl list"
fi
elif command -v chkconfig >/dev/null 2>&1; then
SERVICETOOL=chkconfig
LISTTOOL="chkconfig --list"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME chkconfig"
LISTTOOL="sudo -u $USERNAME chkconfig --list"
fi
elif [ -f /etc/init.d/$SERVICE ] || [ -d /etc/init.d ]; then
SERVICETOOL="/etc/init.d/$SERVICE status | tail -1"
LISTTOOL="ls -1 /etc/init.d/"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME /etc/init.d/$SERVICE status | tail -1"
LISTTOOL="sudo -u $USERNAME ls -1 /etc/init.d/"
fi
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
if [[ $OS == freebsd ]]; then
if command -v service >/dev/null 2>&1; then
SERVICETOOL="service $SERVICE status"
LISTTOOL="service -l"
elif [ -f /etc/rc.d/$SERVICE ] || [ -d /etc/rc.d ]; then
SERVICETOOL="/etc/rc.d/$SERVICE status"
LISTTOOL="ls -1 /etc/rc.d/"
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
if [[ $OS == osx ]]; then
if [ -f /usr/sbin/serveradmin >/dev/null 2>&1 ] && serveradmin list | grep "$SERVICE" 2>&1 >/dev/null; then
SERVICETOOL="serveradmin status $SERVICE"
LISTTOOL="serveradmin list"
elif [ -f /Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin >/dev/null 2>&1 ] && \
/Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin list | \
grep "$SERVICE" 2>&1 >/dev/null; then
SERVICETOOL="/Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin status $SERVICE"
LISTTOOL="/Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin list"
elif command -v launchctl >/dev/null 2>&1; then
SERVICETOOL="launchctl list | grep -v ^- | grep $SERVICE || echo $SERVICE not running! "
LISTTOOL="launchctl list"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME launchctl list | grep -v ^- | grep $SERVICE || echo $SERVICE not running! "
LISTTOOL="sudo -u $USERNAME launchctl list"
fi
elif command -v service >/dev/null 2>&1; then
SERVICETOOL="service --test-if-configured-on $SERVICE"
LISTTOOL="service list"
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
if [[ $OS == aix ]]; then
if command -v lssrc >/dev/null 2>&1; then
SERVICETOOL="lssrc -s $SERVICE | grep -v Subsystem"
LISTTOOL="lssrc -a"
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
}
ARGC=$#
LIST=0
MANUAL=0
OS=null
SERVICETOOL=null
LISTTOOL=null
SERVICE=".*"
#USERNAME=nagios
argcheck 1
while getopts "hls
t:u:" OPTION
do
case $OPTION in
h)
usage
exit 0
;;
l)
LIST=1
;;
s)
SERVICE="$OPTARG"
;;
o)
if [[ "$OPTARG" == linux ]]; then
OS="$OPTARG"
elif [[ "$OPTARG" == osx ]]; then
OS="$OPTARG"
elif [[ "$OPTARG" == freebsd ]]; then
OS="$OPTARG"
elif [[ "$OPTARG" == aix ]]; then
OS="$OPTARG"
else
echo "Unknown type!"
exit 1
fi
;;
t)
MANUAL=1
MANUALSERVICETOOL="$OPTARG"
;;
u)
USERNAME="$OPTARG"
;;
\?)
exit 1
;;
esac
done
os_check
if [ $MANUAL -eq 1 ]; then
SERVICETOOL=$MANUALSERVICETOOL
else
determine_service_tool
fi
# -l conflicts with -t
if [ $MANUAL -eq 1 ] && [ $LIST -eq 1 ]; then
echo "Options conflict: \`\`-t'' and \`\`-l''"
exit 2
fi
if [ $LIST -eq 1 ]; then
if [[ $LISTTOOL != null ]]; then
$LISTTOOL
exit 0
else
echo "OS not specified! Use \`\`-o''"
exit 2
fi
fi
# Check the status of a service
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?
## Exit code from the service tool - if it's non-zero, we should
## probably return CRITICAL. (though, in some cases UNKNOWN would
## probably be more appropriate)
[ $EXIT_CODE -ne 0 ] && echo "$STATUS_MSG" && exit $CRITICAL
## For systemd and most systems, $EXIT_CODE can be trusted - if it's 0, the service is running.
## Ref https://github.com/jonschipp/nagios-plugins/issues/15
[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK
case $STATUS_MSG in
*stop*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*STOPPED*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*not*running*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*NOT*running*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*NOT*RUNNING*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
#*inactive*)
# echo "$STATUS_MSG"
# exit $CRITICAL
# ;;
*dead*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*running*)
echo "$STATUS_MSG"
exit $OK
;;
*RUNNING*)
echo "$STATUS_MSG"
exit $OK
;;
*SUCCESS*)
echo "$STATUS_MSG"
exit $OK
;;
*[eE]rr*)
echo "Error in command: $STATUS_MSG"
exit $CRITICAL
;;
*[fF]ailed*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*[eE]nable*)
echo "$STATUS_MSG"
exit $OK
;;
*[dD]isable*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*[cC]annot*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*[aA]ctive*)
echo "$STATUS_MSG"
exit $OK
;;
*Subsystem*not*on*file)
echo "$STATUS_MSG"
exit $CRITICAL
;;
[1-9][1-9]*)
echo "$SERVICE running: $STATUS_MSG"
exit $OK
;;
"")
echo "$SERVICE is not running: no output from service command"
exit $CRITICAL
;;
*)
echo "Unknown status: $STATUS_MSG"
echo "Is there a typo in the command or service configuration?: $STATUS_MSG"
exit $UNKNOWN
;;
*0\ loaded*)
echo "$STATUS_MSG"
exit $OK
;;
esac
Can we use the below script ? Is this script proper
#!/usr/bin/env bash
# Author: Jon Schipp
# 2015-03-09 [Pascal Hegy] - Add sudo for linux
# 2015-03-09 [Pascal Hegy] - Change USER variable to USERNAME to avoid the use and confusion with the USER env variable
# 2017-08-30 [Roberto Leibman] - Reordered checks to make sure dead and inactive get checked first
# 2018-04-25 [Robin Gierse] - Update check via systemctl for Linux with grep to produce better output for systemctl
# 2019-03-15 [nem / liberodark] - Add support for check all failed services in linux
########
# Examples:
# 1.) List services for osx
# $ ./check_service.sh -l -o osx
#
# 2.) Check status of SSH service on a linux machine
# $ ./check_service.sh -o linux -s sshd
# 3.) Manually select service management tool and service
# $ ./check_service.sh -o linux -t "service rsyslog status"
# Exemple for check all failed services
# $ ./check_service.sh -o linux -t "systemctl list-units --state=failed"
# Nagios Exit Codes
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3
# Weather or not we can trust the exit code from the service management tool.
# Defaults to 0, put to 1 for systemd. Otherwise we must rely on parsing the
# output from the service management tool.
TRUST_EXIT_CODE=0
usage()
{
cat <<EOF
Check status of system services for Linux, FreeBSD, OSX, and AIX.
Options:
-s <service> Specify service name
-l List services
-o <os> OS type, "linux/osx/freebsd/aix"
-u <user> User if you need to ``sudo -u'' for launchctl (def: nagios, linux and osx only)
-t <tool> Manually specify service management tool (def: autodetect) with status and service
e.g. ``-t "service nagios status"''
EOF
}
argcheck() {
# if less than n argument
if [ $ARGC -lt $1 ]; then
echo "Missing arguments! Use \`\`-h'' for help."
exit 1
fi
}
os_check() {
if [ "$OS" == null ]; then
unamestr=$(uname)
if [[ $unamestr == 'Linux' ]]; then
OS='linux'
elif [[ $unamestr == 'FreeBSD' ]]; then
OS='freebsd'
elif [[ $unamestr == 'Darwin' ]]; then
OS='osx'
else
echo "OS not recognized, Use \`-o\` and specify the OS as an argument"
exit 3
fi
fi
}
determine_service_tool() {
if [[ $OS == linux ]]; then
if command -v systemctl >/dev/null 2>&1; then
SERVICETOOL="systemctl status $SERVICE | grep -i Active"
LISTTOOL="systemctl"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME systemctl status $SERVICE"
LISTTOOL="sudo -u $USERNAME systemctl"
fi
TRUST_EXIT_CODE=1
elif command -v service >/dev/null 2>&1; then
SERVICETOOL="service $SERVICE status"
LISTTOOL="service --status-all"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME service $SERVICE status"
LISTTOOL="sudo -u $USERNAME service --status-all"
fi
elif command -v initctl >/dev/null 2>&1; then
SERVICETOOL="status $SERVICE"
LISTTOOL="initctl list"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME status $SERVICE"
LISTTOOL="sudo -u $USERNAME initctl list"
fi
elif command -v chkconfig >/dev/null 2>&1; then
SERVICETOOL=chkconfig
LISTTOOL="chkconfig --list"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME chkconfig"
LISTTOOL="sudo -u $USERNAME chkconfig --list"
fi
elif [ -f /etc/init.d/$SERVICE ] || [ -d /etc/init.d ]; then
SERVICETOOL="/etc/init.d/$SERVICE status | tail -1"
LISTTOOL="ls -1 /etc/init.d/"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME /etc/init.d/$SERVICE status | tail -1"
LISTTOOL="sudo -u $USERNAME ls -1 /etc/init.d/"
fi
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
if [[ $OS == freebsd ]]; then
if command -v service >/dev/null 2>&1; then
SERVICETOOL="service $SERVICE status"
LISTTOOL="service -l"
elif [ -f /etc/rc.d/$SERVICE ] || [ -d /etc/rc.d ]; then
SERVICETOOL="/etc/rc.d/$SERVICE status"
LISTTOOL="ls -1 /etc/rc.d/"
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
if [[ $OS == osx ]]; then
if [ -f /usr/sbin/serveradmin >/dev/null 2>&1 ] && serveradmin list | grep "$SERVICE" 2>&1 >/dev/null; then
SERVICETOOL="serveradmin status $SERVICE"
LISTTOOL="serveradmin list"
elif [ -f /Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin >/dev/null 2>&1 ] && \
/Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin list | \
grep "$SERVICE" 2>&1 >/dev/null; then
SERVICETOOL="/Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin status $SERVICE"
LISTTOOL="/Applications/Server.app/Contents/ServerRoot/usr/sbin/serveradmin list"
elif command -v launchctl >/dev/null 2>&1; then
SERVICETOOL="launchctl list | grep -v ^- | grep $SERVICE || echo $SERVICE not running! "
LISTTOOL="launchctl list"
if [ $USERNAME ]; then
SERVICETOOL="sudo -u $USERNAME launchctl list | grep -v ^- | grep $SERVICE || echo $SERVICE not running! "
LISTTOOL="sudo -u $USERNAME launchctl list"
fi
elif command -v service >/dev/null 2>&1; then
SERVICETOOL="service --test-if-configured-on $SERVICE"
LISTTOOL="service list"
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
if [[ $OS == aix ]]; then
if command -v lssrc >/dev/null 2>&1; then
SERVICETOOL="lssrc -s $SERVICE | grep -v Subsystem"
LISTTOOL="lssrc -a"
else
echo "Unable to determine the system's service tool!"
exit 1
fi
fi
}
ARGC=$#
LIST=0
MANUAL=0
OS=null
SERVICETOOL=null
LISTTOOL=null
SERVICE=".*"
#USERNAME=nagios
argcheck 1
while getopts "hls
do
case $OPTION in
h)
usage
exit 0
;;
l)
LIST=1
;;
s)
SERVICE="$OPTARG"
;;
o)
if [[ "$OPTARG" == linux ]]; then
OS="$OPTARG"
elif [[ "$OPTARG" == osx ]]; then
OS="$OPTARG"
elif [[ "$OPTARG" == freebsd ]]; then
OS="$OPTARG"
elif [[ "$OPTARG" == aix ]]; then
OS="$OPTARG"
else
echo "Unknown type!"
exit 1
fi
;;
t)
MANUAL=1
MANUALSERVICETOOL="$OPTARG"
;;
u)
USERNAME="$OPTARG"
;;
\?)
exit 1
;;
esac
done
os_check
if [ $MANUAL -eq 1 ]; then
SERVICETOOL=$MANUALSERVICETOOL
else
determine_service_tool
fi
# -l conflicts with -t
if [ $MANUAL -eq 1 ] && [ $LIST -eq 1 ]; then
echo "Options conflict: \`\`-t'' and \`\`-l''"
exit 2
fi
if [ $LIST -eq 1 ]; then
if [[ $LISTTOOL != null ]]; then
$LISTTOOL
exit 0
else
echo "OS not specified! Use \`\`-o''"
exit 2
fi
fi
# Check the status of a service
STATUS_MSG=$(eval "$SERVICETOOL" 2>&1)
EXIT_CODE=$?
## Exit code from the service tool - if it's non-zero, we should
## probably return CRITICAL. (though, in some cases UNKNOWN would
## probably be more appropriate)
[ $EXIT_CODE -ne 0 ] && echo "$STATUS_MSG" && exit $CRITICAL
## For systemd and most systems, $EXIT_CODE can be trusted - if it's 0, the service is running.
## Ref https://github.com/jonschipp/nagios-plugins/issues/15
[ $TRUST_EXIT_CODE -eq 1 ] && [ $EXIT_CODE -eq 0 ] && echo "$STATUS_MSG" && exit $OK
case $STATUS_MSG in
*stop*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*STOPPED*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*not*running*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*NOT*running*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*NOT*RUNNING*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
#*inactive*)
# echo "$STATUS_MSG"
# exit $CRITICAL
# ;;
*dead*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*running*)
echo "$STATUS_MSG"
exit $OK
;;
*RUNNING*)
echo "$STATUS_MSG"
exit $OK
;;
*SUCCESS*)
echo "$STATUS_MSG"
exit $OK
;;
*[eE]rr*)
echo "Error in command: $STATUS_MSG"
exit $CRITICAL
;;
*[fF]ailed*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*[eE]nable*)
echo "$STATUS_MSG"
exit $OK
;;
*[dD]isable*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*[cC]annot*)
echo "$STATUS_MSG"
exit $CRITICAL
;;
*[aA]ctive*)
echo "$STATUS_MSG"
exit $OK
;;
*Subsystem*not*on*file)
echo "$STATUS_MSG"
exit $CRITICAL
;;
[1-9][1-9]*)
echo "$SERVICE running: $STATUS_MSG"
exit $OK
;;
"")
echo "$SERVICE is not running: no output from service command"
exit $CRITICAL
;;
*)
echo "Unknown status: $STATUS_MSG"
echo "Is there a typo in the command or service configuration?: $STATUS_MSG"
exit $UNKNOWN
;;
*0\ loaded*)
echo "$STATUS_MSG"
exit $OK
;;
esac
Re: Monitoring service status on linux server
Sure, it looks fine. Let me know if you run into any issues.
If you didn't get an 8% raise over the course of the pandemic, you took a pay cut.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
Discussion of wages is protected speech under the National Labor Relations Act, and no employer can tell you you can't disclose your pay with your fellow employees.
Re: Monitoring service status on linux server
You can close the ticket 
Thank you so much
Thank you so much
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Monitoring service status on linux server
Locking threaddeek wrote:You can close the ticket
Thank you so much