Nagios does not update status in the UI

linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Nagios does not update status in the UI

Post by linuser »

On my remote system I am using the "check_init_service" script plugin to let Nagios capture the status of a service. I had to modify the script a bit to work on RHEL7, since it uses systemctl for starting services. The problem is that the Nagios server now gets the data back and posts it in the UI (see image), but the status never changes: it stays green and 'OK' even if the service is down. I'd like the status to change if the service is down or can't be checked for some reason. Here is the original script:

Code:

#!/bin/sh

PROGNAME=`basename $0`

print_usage() {
        echo "Usage: $PROGNAME"
}

print_help() {
        echo ""
        print_usage
        echo ""
        echo "This plugin checks the status of services normally started by the init process."
        echo ""
        support
        exit 0
}


case "$1" in
        --help)
                print_help
                exit 0
                ;;
        -h)
                print_help
                exit 0
                ;;
        *)

                if [ $# -eq 1 ]; then
                        /sbin/service $1 status
                        ret=$?
                        case "$ret" in
                             0)
                                exit $ret
                                ;;
                             *)
                                exit 2
                                ;;
                        esac
                else
                        echo "ERROR: No service name specified on command line"
                        exit 3
                fi
                ;;
esac
This is the part I changed:

Code:

 if [ $# -eq 1 ]; then
                        /bin/systemctl status $1 | awk 'NR==3'
                        ret=$?
                        case "$ret" in
So basically I changed the /sbin/service line to use /bin/systemctl and print only the 3rd line. The output shows up in the Nagios UI as seen in the image, and that is where the functionality pretty much stops. I need a way for the server to distinguish between states and update the status appropriately. If the service is down I need a "Critical/Red" status, and if the service can't be checked I need an "Unknown/amber" status, etc. How can I do this?
Attachments
Capture (1).PNG (10.7 KiB)
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: Nagios does not update status in the UI

Post by linuser »

UPDATE - It appears that Nagios WILL respond appropriately if the check fails altogether. I just re-enabled SELinux and now the Nagios server goes "Critical/Red" with a socket timeout. So I suspect anything that causes Nagios not to get any data would trip this alarm. The more specific question is: if Nagios does get data, but it's the wrong data (in my case, a service that reports down), how do I get Nagios to go "Critical/Red" in that case too?
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Nagios does not update status in the UI

Post by hsmith »

You're going to want to add an if statement to your script. What is happening is that the exit code of a pipeline is the exit code of its last command, so because your awk was successful the script exits with a code of 0, which Nagios thinks is fine and dandy.

You'll want to do something along the lines of "if the third line doesn't say active, exit 2" so that it exits in a way that Nagios will see as critical (exit 3 would show up as unknown). Make sense?
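
To make the point about awk concrete, here is a minimal sketch (not part of the original plugin). The function fake_status is a hypothetical stand-in for "systemctl status" on a stopped service, assumed here to exit with 3. It shows that $? after the pipeline is awk's status, and one way to keep the real status by capturing the output first.

```shell
#!/bin/sh
# Hypothetical stand-in for `systemctl status <service>` on a stopped unit.
# The exit status of 3 is an assumption made for this demo.
fake_status() {
    printf 'svc.service\n   Loaded: loaded\n   Active: inactive (dead)\n'
    return 3
}

# Piping straight into awk: $? is awk's status, so the 3 is lost.
fake_status | awk 'NR==3' >/dev/null
masked_rc=$?

# Capture the output first, remember the real status, then filter.
if output=$(fake_status); then rc=0; else rc=$?; fi
line=$(printf '%s\n' "$output" | awk 'NR==3')

echo "masked=$masked_rc real=$rc line=$line"
```

With the real status saved in rc, the script can still print only the third line for the UI while basing its exit code on what the status command reported.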
Former Nagios Employee.
me.
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: Nagios does not update status in the UI

Post by linuser »

hsmith wrote:You're going to want to add an if statement to your script. What is happening is that the exit code of a pipeline is the exit code of its last command, so because your awk was successful the script exits with a code of 0, which Nagios thinks is fine and dandy.

You'll want to do something along the lines of "if the third line doesn't say active, exit 2" so that it exits in a way that Nagios will see as critical (exit 3 would show up as unknown). Make sense?

Not the best at scripting :( But maybe you can help show me what the code would look like. What my "awk 'NR==3'" does is print only the 3rd line of stdout. The typical output is something like this:

Code:

[root@rec3 ~]# systemctl status bgpd
bgpd.service - BGP routing daemon
   Loaded: loaded (/usr/lib/systemd/system/bgpd.service; enabled)
   Active: active (running) since Wed 2015-10-28 12:44:37 CDT; 1h 57min ago
  Process: 15792 ExecStart=/usr/sbin/bgpd -u quagga -g quagga -f /etc/quagga/bgpd.conf -d -P 17500 (code=exited, status=0/SUCCESS)
 Main PID: 15793 (bgpd)


   CGroup: /system.slice/bgpd.service
           └─15793 /usr/sbin/bgpd -u quagga -g quagga -f /etc/quagga/bgpd.conf -d -P 17500

I don't need all that; I just want that 3rd line. The 3rd line will always be there, it will just say something different if the service is stopped. Like this:

Code:

[root@rec3 ~]# systemctl status ngpd
ngpd.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)
I need something that will recognize anything other than "active (running)" and bring that to my attention.
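
One way to express that requirement is sketched below. The helper name check_active_line is hypothetical, and the mapping (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) follows the standard Nagios plugin return codes. It looks for the "Active:" line in the status text rather than relying on it being line 3, which is a design choice of this sketch and not something taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper: takes the text of `systemctl status <service>` as $1,
# prints a one-line message and returns a Nagios-style code.
check_active_line() {
    # First line that starts with "Active:", with leading spaces stripped.
    active_line=$(printf '%s\n' "$1" | awk '/^ *Active:/ {sub(/^ */, ""); print; exit}')
    case "$active_line" in
        "Active: active (running)"*)
            echo "OK - $active_line"; return 0 ;;
        "")
            echo "UNKNOWN - no Active: line found"; return 3 ;;
        *)
            echo "CRITICAL - $active_line"; return 2 ;;
    esac
}

# Sample text copied from the stopped-service output above.
sample='ngpd.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)'

rc=0
check_active_line "$sample" || rc=$?
echo "exit code: $rc"
```

In a real plugin the function would be fed the captured output of /bin/systemctl status "$1", and the script would finish with exit $rc so that Nagios sees the code.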
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: Nagios does not update status in the UI

Post by linuser »

Yea, changing the script to "exit 3" does not change the behavior. Still showing "OK/Green" with an inactive/dead service.
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: Nagios does not update status in the UI

Post by linuser »

Disregard that, I'm pretty sure I changed it in the wrong place. You are saying I need something that can tell when systemctl exits with any code other than 0, correct?
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Nagios does not update status in the UI

Post by hsmith »

Correct. I'm not a bash scripting expert myself, but I can try to whip a quick example up for you if you'd like. I wrote a really horrible Nagios Plugin in bash that uses a bunch of if statements.
linuser
Posts: 102
Joined: Fri Sep 18, 2015 9:53 am

Re: Nagios does not update status in the UI

Post by linuser »

Ok, I would appreciate that. Just for the record, if you systemctl a service that you know is dead, you will still get an exit code of 0, and a success.

Code:

[root@rec3 plugins]# systemctl status bgpd
bgpd.service - BGP routing daemon
   Loaded: loaded (/usr/lib/systemd/system/bgpd.service; enabled)
   Active: inactive (dead) since Wed 2015-10-28 14:56:10 CDT; 16min ago
  Process: 15792 ExecStart=/usr/sbin/bgpd -u quagga -g quagga -f /etc/quagga/bgpd.conf -d -P 17500 (code=exited, status=0/SUCCESS)
 Main PID: 15793 (code=exited, status=0/SUCCESS)
So not sure if this script mode will do the trick or not.
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Nagios does not update status in the UI

Post by hsmith »

Run 'systemctl status <whatever>' and then 'echo $?' right after, to check that for sure. When I tried it, that was not the case :/ (a stopped service did not exit with 0).
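
If the exit code from that test is what you want to key on, a rough sketch of the translation could look like this. The helper name systemctl_rc_to_nagios is made up for illustration, and the assumption that 3 means "not running" follows the LSB-style convention, so confirm it with 'echo $?' on your own system first.

```shell
#!/bin/sh
# Hypothetical helper: translate an exit status from `systemctl status`
# into a Nagios plugin return code (0 = OK, 2 = CRITICAL, 3 = UNKNOWN).
systemctl_rc_to_nagios() {
    case "$1" in
        0) echo 0 ;;   # unit is active -> OK
        3) echo 2 ;;   # assumed to mean "not running" -> CRITICAL
        *) echo 3 ;;   # anything else -> UNKNOWN
    esac
}

# Show the mapping for a few sample status values.
for rc in 0 3 4; do
    echo "systemctl=$rc nagios=$(systemctl_rc_to_nagios "$rc")"
done
```

In the plugin this would sit right after the status call, with nothing piped in between: run /bin/systemctl status "$1", save $? into a variable, and then exit with the translated value.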
hsmith
Agent Smith
Posts: 3539
Joined: Thu Jul 30, 2015 11:09 am
Location: 127.0.0.1

Re: Nagios does not update status in the UI

Post by hsmith »

Quick example script. It's pretty hack-y, but hey, it does the thing.

Code:

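#!/bin/bash

# State we expect to see in the "Active:" line of `systemctl status`.
S1="active"

# Third line of the status output, second field (e.g. "active" or "inactive").
Info=$(systemctl status "$1" | awk 'NR==3' | awk '{print $2}')

# Quoted so the test still works when $Info is empty.
if [ "$Info" = "$S1" ]
then
        echo "yay" "$1" "is running!"
        exit 0      # Nagios OK
else
        echo "boooo" "$1" "is not running!"
        exit 2      # Nagios CRITICAL (3 would show as UNKNOWN)
fi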