
Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 2:02 pm
by linuser
On my remote system I am using the "check_init_service" plugin script so that Nagios can capture the status of a service. I had to modify the script a bit to work on RHEL7, since it uses systemctl to manage services. The problem is that the Nagios server now gets the data back and posts it in the UI (see image), but the status never changes: it stays green and 'OK' even if the service is down. I'd like the status to change if the service is down or can't be checked for some reason. Here is the original script:

Code: Select all

#!/bin/sh

PROGNAME=`basename $0`

print_usage() {
        echo "Usage: $PROGNAME"
}

print_help() {
        echo ""
        print_usage
        echo ""
        echo "This plugin checks the status of services normally started by the init process."
        echo ""
        support
        exit 0
}


case "$1" in
        --help)
                print_help
                exit 0
                ;;
        -h)
                print_help
                exit 0
                ;;
        *)

                if [ $# -eq 1 ]; then
                        /sbin/service $1 status
                        ret=$?
                        case "$ret" in
                             0)
                                exit $ret
                                ;;
                             *)
                                exit 2
                                ;;
                        esac
                else
                        echo "ERROR: No service name specified on command line"
                        exit 3
                fi
                ;;
esac
This is the part I changed:

Code: Select all

 if [ $# -eq 1 ]; then
                        /bin/systemctl status $1 | awk 'NR==3'
                        ret=$?
                        case "$ret" in
So basically I changed the /sbin/service line to use /bin/systemctl and print only the 3rd line. The output shows up in the Nagios UI as seen in the image, and this is where the functionality pretty much stops. I need a way for the server to distinguish between states and update the status appropriately. If the service is down I need a "Critical/Red" status, and if the service can't be checked I need an "Unknown/amber" status, etc. How can I do this?

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 2:21 pm
by linuser
UPDATE - It appears that Nagios WILL respond appropriately if the check fails altogether. I just re-enabled SELinux and now the Nagios server goes "Critical/Red" with a socket timeout. So I suspect anything that keeps Nagios from getting any data would trip this alarm. The more specific question is: if Nagios does get data, but it's the wrong data (in my case, a service that reports down), how do I get Nagios to go "Critical/Red" in that case too?

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 2:35 pm
by hsmith
You're going to want to add an if statement to your script. What is happening is that your awk succeeded, so the script exits with a code of 0, which Nagios thinks is fine and dandy.

You'll want to do something along the lines of if NR=3 exit 3 to make it exit with a non-zero code that Nagios will flag (3 shows up as unknown; use 2 for critical). Make sense?
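
To illustrate the point about awk's exit code, here is a minimal sketch. fake_status is a made-up stand-in for "systemctl status" on a stopped service, and its exit code of 3 is an assumption for the demo. It shows that $? after a pipe reports the status of the last command (awk), while capturing the output first keeps the command's own status.

```shell
#!/bin/sh
# Hypothetical stand-in for "systemctl status <stopped service>":
# prints three lines and exits non-zero (3 is assumed for this demo).
fake_status() {
    printf 'unit.service\n   Loaded: loaded\n   Active: inactive (dead)\n'
    return 3
}

# After a pipeline, $? is the exit status of the LAST command (awk),
# so the failure of fake_status is lost.
fake_status | awk 'NR==3' >/dev/null
pipeline_rc=$?

# Capturing the output in a variable keeps the command's own exit status.
output=$(fake_status)
direct_rc=$?

echo "pipeline_rc=$pipeline_rc direct_rc=$direct_rc"
```

Run as written, this prints pipeline_rc=0 direct_rc=3, which is why the modified plugin always reports OK.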

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 2:47 pm
by linuser
hsmith wrote: You're going to want to add an if statement to your script. What is happening is that your awk succeeded, so the script exits with a code of 0, which Nagios thinks is fine and dandy.

You'll want to do something along the lines of if NR=3 exit 3 to make it exit with a non-zero code that Nagios will flag (3 shows up as unknown; use 2 for critical). Make sense?

I'm not the best at scripting :( But maybe you can show me what the code would look like. What my awk 'NR==3' does is print only the 3rd line of stdout. The typical output is something like this:

Code: Select all

[root@rec3 ~]# systemctl status bgpd
bgpd.service - BGP routing daemon
   Loaded: loaded (/usr/lib/systemd/system/bgpd.service; enabled)
   Active: active (running) since Wed 2015-10-28 12:44:37 CDT; 1h 57min ago
  Process: 15792 ExecStart=/usr/sbin/bgpd -u quagga -g quagga -f /etc/quagga/bgpd.conf -d -P 17500 (code=exited, status=0/SUCCESS)
 Main PID: 15793 (bgpd)


   CGroup: /system.slice/bgpd.service
           └─15793 /usr/sbin/bgpd -u quagga -g quagga -f /etc/quagga/bgpd.conf -d -P 17500

I don't need all that; I just want the 3rd line. The 3rd line will always be there; it just says something different if the service is stopped, like this:

Code: Select all

[root@rec3 ~]# systemctl status ngpd
ngpd.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
I need something that will recognize anything other than "active (running)" and bring that to my attention.
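
One way that rule could look, as a rough sketch only: classify_active_line is a hypothetical helper name, and the choice of CRITICAL for every non-running state and UNKNOWN for a missing "Active:" line is an assumption about what is wanted here. It takes the 3rd line as text and prints the Nagios exit code to use (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# classify_active_line is a hypothetical helper, not part of any Nagios plugin.
# Input: the "Active:" line from "systemctl status". Output: a Nagios exit code.
classify_active_line() {
    case "$1" in
        *"Active: active (running)"*) echo 0 ;;  # OK
        *"Active:"*)                  echo 2 ;;  # CRITICAL: any other state
        *)                            echo 3 ;;  # UNKNOWN: no Active line found
    esac
}

# Sample lines taken from the output shown in this thread.
classify_active_line "   Active: active (running) since Wed 2015-10-28 12:44:37 CDT; 1h 57min ago"
classify_active_line "   Active: inactive (dead)"
classify_active_line ""
```

The three calls print 0, 2 and 3 respectively. In a plugin, the printed value would be passed to exit.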

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 2:59 pm
by linuser
Yeah, changing the script to "exit 3" does not change the behavior. It is still showing "OK/Green" with an inactive/dead service.

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 3:03 pm
by linuser
Disregard that; I'm pretty sure I changed it in the wrong place. You are saying I need something that can tell when systemctl exits with any code other than 0, correct?

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 3:12 pm
by hsmith
Correct. I'm not a bash scripting expert myself, but I can try to whip a quick example up for you if you'd like. I wrote a really horrible Nagios Plugin in bash that uses a bunch of if statements.

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 3:17 pm
by linuser
OK, I would appreciate that. Just for the record, if you run systemctl status on a service that you know is dead, you will still get an exit code of 0 and a success.

Code: Select all

[root@rec3 plugins]# systemctl status bgpd
bgpd.service - BGP routing daemon
   Loaded: loaded (/usr/lib/systemd/system/bgpd.service; enabled)
   Active: inactive (dead) since Wed 2015-10-28 14:56:10 CDT; 16min ago
  Process: 15792 ExecStart=/usr/sbin/bgpd -u quagga -g quagga -f /etc/quagga/bgpd.conf -d -P 17500 (code=exited, status=0/SUCCESS)
 Main PID: 15793 (code=exited, status=0/SUCCESS)
So I'm not sure whether this kind of script will do the trick or not.

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 3:27 pm
by hsmith
Issue the 'systemctl status <whatever>' command and then do an 'echo $?' right after to check that for sure. When I tried it, I found that is not the case :/
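
For reference, a rough sketch of how that exit code could be turned into a Nagios state. The mapping assumes "systemctl status" follows the LSB-style convention described in the systemctl man page (0 for an active unit, 3 for a unit that is not running, 4 for an unknown unit). Older systemd versions may behave differently, so confirm with 'echo $?' on your own host. nagios_code_for is a hypothetical helper name, and treating codes 1 to 3 as CRITICAL is just one possible policy.

```shell
#!/bin/sh
# nagios_code_for is a hypothetical helper. It maps an exit code from
# "systemctl status" to a Nagios code (0 OK, 2 CRITICAL, 3 UNKNOWN).
nagios_code_for() {
    case "$1" in
        0)     echo 0 ;;  # unit is active -> OK
        1|2|3) echo 2 ;;  # unit is not running -> CRITICAL (assumed policy)
        *)     echo 3 ;;  # anything else, e.g. no such unit -> UNKNOWN
    esac
}

# On a systemd host the plugin body could be (bgpd is the service from this thread):
#   systemctl status bgpd >/dev/null 2>&1
#   exit "$(nagios_code_for $?)"
for rc in 0 3 4; do
    echo "systemctl rc=$rc -> nagios $(nagios_code_for "$rc")"
done
```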

Re: Nagios does not update status in the UI

Posted: Wed Oct 28, 2015 3:32 pm
by hsmith
Quick example script. It's pretty hack-y, but hey, it does the thing.

Code: Select all

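#!/bin/bash
# Usage: <this script> <service name>
# Nagios exit codes: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN

S1="active"

# Without a service name the check cannot run at all.
if [ $# -ne 1 ]; then
        echo "ERROR: No service name specified on command line"
        exit 3
fi

# Second field of the 3rd line of output, e.g. "active" or "inactive".
Info=$(systemctl status "$1" 2>/dev/null | awk 'NR==3 {print $2}')

# Quote both sides so an empty $Info does not break the test.
if [ "$Info" == "$S1" ]
then
        echo "yay $1 is running!"
        exit 0
else
        echo "boooo $1 is not running!"
        # 2 is CRITICAL (red) in Nagios; 3 would show up as UNKNOWN.
        exit 2
fi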
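
A possible alternative that avoids parsing the 3rd line, offered as a sketch rather than a finished plugin: "systemctl is-active" prints the unit state and exits 0 only when the unit is active. check_service is a hypothetical function name, and the SYSTEMCTL variable is a hypothetical override added here so the function can be tried without systemd; it defaults to the real systemctl.

```shell
#!/bin/sh
# Sketch only. SYSTEMCTL is a hypothetical override for testing;
# by default the real systemctl binary is used.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# check_service (hypothetical name) prints one status line for the Nagios UI
# and returns the Nagios code: 0 OK, 2 CRITICAL, 3 UNKNOWN.
check_service() {
    if [ -z "$1" ]; then
        echo "UNKNOWN: no service name given"
        return 3
    fi
    # is-active prints the state (active, inactive, failed, ...) and
    # exits 0 only when the unit is active.
    state=$("$SYSTEMCTL" is-active "$1" 2>/dev/null)
    rc=$?
    case "$rc" in
        0)
            echo "OK: $1 is $state"
            return 0 ;;
        126|127)
            # The shell could not run the command at all.
            echo "UNKNOWN: cannot run $SYSTEMCTL"
            return 3 ;;
        *)
            echo "CRITICAL: $1 is ${state:-not reporting a state}"
            return 2 ;;
    esac
}

# In a plugin file the last lines would be:
#   check_service "$1"
#   exit $?
```

Because the return value is used as the plugin's exit code, Nagios sees red for a stopped service and amber when the check itself cannot run.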