Services don't warn on stop/waiting
Posted: Fri Jun 12, 2015 8:49 am
by courhar
I'm pretty new to Nagios and have been playing around with Nagios XI, using it to monitor services on Ubuntu servers. I'm trying to monitor a service called heat-api on an Ubuntu server. When I stop the service on the server itself, it comes up in Nagios as stop/waiting, yet it reports green. How can I configure this to go red and alert? I had a look at the service monitoring settings in the Core Configuration Manager but got a bit lost. I'm guessing I need to change something in there, maybe an argument? The check currently runs through a check_nrpe command. Any pointers would be great.
Thanks
Courhar
Re: Services don't warn on stop/waiting
Posted: Fri Jun 12, 2015 9:01 am
by lmiltchev
How did you install the Linux agent (NRPE + Nagios plugins) on the Ubuntu box? Did you follow this document? Can you show us the actual command that you are running from the command line and its output?
Re: Services don't warn on stop/waiting
Posted: Fri Jun 12, 2015 9:16 am
by courhar
I used the guide you linked when I first installed the agent, then ran the Linux wizard, selecting Ubuntu as the OS. In the services section I entered the service name, and it seems to have picked it up fine: it says when the service is running, but when the service is stopped it reports stop/waiting and still shows as green. I've attached a snippet of the service management common settings for the service I'm trying to monitor.
Thanks
courhar
Re: Services don't warn on stop/waiting
Posted: Fri Jun 12, 2015 9:48 am
by courhar
Attached is the output in the dashboard. As you can see, it sees that the service is stop/waiting but doesn't alarm.
Re: Services don't warn on stop/waiting
Posted: Fri Jun 12, 2015 10:25 am
by lmiltchev
Do you get the expected output if you run the check as the "nagios" user? Run the following commands and show us the output:
Code:
/usr/local/nagios/libexec/check_nrpe -H <client ip> -t 30 -c check_init_service -a 'heat-api'
echo $?
su nagios
/usr/local/nagios/libexec/check_nrpe -H <client ip> -t 30 -c check_init_service -a 'heat-api'
echo $?
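For reading the `echo $?` results: Nagios plugins report their state through the exit code, where 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The sketch below turns the number into a state name. The function name `nagios_state_name` is made up for illustration and is not part of NRPE or the Nagios plugins.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios): map a plugin exit code
# to the state name Nagios associates with it.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "OUT-OF-RANGE" ;;   # anything outside 0-3 is not a standard plugin state
    esac
}
```

To use it, call `nagios_state_name "$?"` on the line directly after the check_nrpe command, since `$?` is overwritten by every subsequent command.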
Re: Services don't warn on stop/waiting
Posted: Mon Jun 15, 2015 2:53 am
by courhar
So I ran those commands and got the expected outputs (I stopped the service after the first two runs). The problem, however, is that the Nagios XI dashboard doesn't seem to pick up that stop/waiting is bad, so it doesn't alarm me when that service is off. As shown in the second command, the processes are green and 'Ok' regardless of start/running or stop/waiting.
Thanks
Harry
Re: Services don't warn on stop/waiting
Posted: Mon Jun 15, 2015 9:31 am
by tmcdonald
On the remote machine, can you please show us how you have the check_init_service command defined? It will be in your nrpe.cfg.
Re: Services don't warn on stop/waiting
Posted: Tue Jun 16, 2015 2:55 am
by courhar
So this is the definition of the service on the Nagios server; is this what you meant? Is there something to enter here, or somewhere else on the Nagios side, where you define stop/waiting as an error?
Thanks
Courhar
Re: Services don't warn on stop/waiting
Posted: Tue Jun 16, 2015 10:03 am
by lmiltchev
"So i ran those commands and got the expected outputs"
What you expect might be different from what I expect. Can you please run the following commands from the CLI and show us the output, both when the service on the remote box is running and when it is stopped:
Code:
/usr/local/nagios/libexec/check_nrpe -H <client ip> -t 30 -c check_init_service -a 'heat-api'
echo $?
su nagios
/usr/local/nagios/libexec/check_nrpe -H <client ip> -t 30 -c check_init_service -a 'heat-api'
echo $?
"So this is the definition of the service on the nagios server, is this what you meant?"
Actually, tmcdonald wanted to see the check_init_service command definition on the remote machine (devops-os-hos01.devops.local), not on the Nagios XI server. If you used our Linux agent installer, the "check_init_service" command definition should be located in the "/usr/local/nagios/etc/nrpe/common.cfg" file on the client.
Re: Services don't warn on stop/waiting
Posted: Fri Jun 19, 2015 3:44 am
by courhar
Hi guys, I've been away, so sorry for the lack of response. Looking in common.cfg, I found where check_init_service is defined:
command[check_init_service]=sudo /usr/local/nagios/libexec/check_init_service $ARG1$
So I followed that path, and the script below is the result.
#!/bin/sh
PROGNAME=`basename $0`
print_usage() {
    echo "Usage: $PROGNAME"
}
print_help() {
    echo ""
    print_usage
    echo ""
    echo "This plugin checks the status of services normally started by the init process."
    echo ""
    support
    exit 0
}
case "$1" in
    --help)
        print_help
        exit 0
        ;;
    -h)
        print_help
        exit 0
        ;;
    *)
        if [ $# -eq 1 ]; then
            /sbin/service $1 status
            ret=$?
            case "$ret" in
                0)
                    exit $ret
                    ;;
                *)
                    exit 2
                    ;;
            esac
        else
            echo "ERROR: No service name specified on command line"
            exit 3
        fi
        ;;
esac
I also ran the commands you asked for; the results are shown in the pictures.
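One thing worth noting about the script: it exits 0 whenever `/sbin/service $1 status` exits 0, and 2 for any other exit code. The text that the status command prints plays no part in the result. If the status command on the client exits 0 while printing stop/waiting (which would match the green result described in this thread, but is an assumption and not confirmed here), the check can never go red. A minimal sketch of one possible workaround is to look at the text as well. The function name `classify_status` is hypothetical, and treating the string "stop/waiting" as a failure is an assumption based only on the output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: pick a Nagios exit code from both the text and the
# exit code of the status command.
#   $1 = text printed by "service <name> status"
#   $2 = exit code of that command
classify_status() {
    output=$1
    rc=$2
    # Non-zero exit from the status command: CRITICAL, same as the script above.
    if [ "$rc" -ne 0 ]; then
        return 2
    fi
    # Assumption: "stop/waiting" in the text means the service is not running.
    case "$output" in
        *stop/waiting*) return 2 ;;
    esac
    return 0
}

# Possible use inside the plugin, in place of the inner "case" on $ret:
#   out=$(/sbin/service "$1" status 2>&1)
#   rc=$?
#   echo "$out"
#   classify_status "$out" "$rc"
#   exit $?
```

Running `sudo /usr/local/nagios/libexec/check_init_service heat-api; echo $?` directly on the client, once with the service running and once with it stopped, would show whether the exit code actually changes before anything is modified.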