Nagios event handler

Locked
anusha
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Nagios event handler

Post by anusha »

Hi,

I am trying to use an event handler to restart the JVM on a remote server.

command.cfg :

Code: Select all

define command{
        command_name  restart-jvm
        command_line  /usr/local/nagios/etc/event_handlers/script.sh  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
        }

server.cfg :

Code: Select all

define service{
        use                             local-service         ; Name of service template to use
        host_name                      XXXXX
        service_description             JVM status
        check_command                   check_nrpe!check_jvm
        check_interval                  2               ; Actively check the service every 2 minutes
        retry_interval                  1               ; Schedule service check retries at 1 minute intervals
        max_check_attempts              2               ; Check the service up to 2 times before it enters a hard state
        event_handler                   restart-jvm
        event_handler_enabled           1


}
event-handler script:

Code: Select all

#!/bin/sh
# Script for restarting the JVM on the local machine
#
# Note: This script will only restart the JVM if the service is
#       retried 3 times (in a "soft" state) or if the service somehow
#       manages to fall into a "hard" error state.
#

# What state is the JVM service in?
case "$1" in
OK)
        ;;
WARNING)
      
        ;;
UNKNOWN)
      
        ;;
CRITICAL)
        
        case "$2" in
        SOFT)

               
                case "$3" in

                3)
                        echo -n "Restarting JVM (3rd soft critical state)..."
                      /etc/init.d/tomcatd recycle
                        ;;
                        esac
                ;;

       
        HARD)
                echo -n "Restarting JVM..."
 
                 /etc/init.d/tomcatd recycle
                ;;
        esac
        ;;
esac
exit 0
But the above script is not working.

In some links I see that we need to add the nagios user to the sudoers file. Can you please let me know if there is an alternative way to do this?
Last edited by dwhitfield on Wed Jan 03, 2018 1:59 pm, edited 1 time in total.
Reason: code blocks FTW
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios event handler

Post by cdienger »

Can you clarify what you mean by "it's not working"? Is the script even running? Have you tried having it do something else (like writing to a file, as shown in this XI doc: https://assets.nagios.com/downloads/nag ... ios-XI.pdf)? You don't have to edit sudoers, but you must make sure that /etc/init.d/tomcatd can be executed by the nagios user when it's called. This can be achieved by setting different permissions on the file or by making sure the nagios user belongs to the correct group.
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
anusha
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Post by anusha »

Hi cdienger,

I am trying to restart Tomcat when it is down. Please find the files below and correct me if I am wrong.

command.cfg

Code: Select all

define command{
        command_name  restart-jvm
        command_line  /usr/local/nagios/etc/event_handlers/restart-jvm.sh  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
        }

host.cfg

Code: Select all

define service{
        use                             local-service         ; Name of service template to use
        host_name                       XXXX
        service_description             JVM status
        check_command                   check_nrpe!check_jvm
        check_interval                  3               ; Actively check the service every 3 minutes
        retry_interval                  1               ; Schedule service check retries at 1 minute intervals
        max_check_attempts              5               ; Check the service up to 5 times before it enters a hard state
        event_handler                  restart-jvm

}
eventhandler script:

Code: Select all

#!/bin/sh
# What state is the JVM service in?
case "$1" in
OK)
        # The service just came back up, so don't do anything...
        ;;
WARNING)
        # We don't really care about warning states, since the service is probably still running...
        ;;
UNKNOWN)
        # We don't know what might be causing an unknown error, so don't do anything...
        ;;
CRITICAL)
        # Perhaps we should restart the server...

        # Is this a "soft" or a "hard" state?
        case "$2" in

        # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
        # check before it turns into a "hard" state and contacts get notified...
        SOFT)
                case "$3" in
                3)
                        mail -s "Errors in the logs" sample@mail.com
                        echo -n "Restarting JVM (3rd soft critical state)..."
                        # Run the init script as wasadmin in a single sudo call
                        sudo -su wasadmin /etc/init.d/tomcatd start
                        mail -s "Sampls" sample@mail.com
                        ;;
                esac
                ;;

        HARD)
                echo -n "Restarting JVM..."
                sudo -su wasadmin /etc/init.d/tomcatd start
                ;;
        esac
        ;;
esac
exit 0

Whenever the server is down I get the mails from my script, but it does not restart Tomcat. Can you please help me with this?
Last edited by dwhitfield on Wed Jan 17, 2018 5:57 pm, edited 1 time in total.
Reason: code blocks FTW
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios event handler

Post by cdienger »

What are the permissions on the script? Run ls -l /usr/local/nagios/etc/event_handlers/script.sh to check them and make sure the execute bit is set.
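The same check can be scripted. This is only a minimal sketch: check_handler is a made-up helper name, not something Nagios ships, and it tests the file against whichever user runs it, so it is most useful when run as the nagios user.

```shell
#!/bin/sh
# check_handler is a hypothetical helper, not part of Nagios.
# It reports whether the given file exists and is executable by the
# user running this script.
check_handler() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ -x "$1" ]; then
        echo "executable"
    else
        echo "not executable"
    fi
}
```

For example, calling check_handler /usr/local/nagios/etc/event_handlers/script.sh should report "executable" before you move on to debugging the script itself.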

I modified the script slightly to write an entry to a log whenever the SOFT, HARD, or OK state is seen. See below:

Code: Select all

#!/bin/sh
# Script for restarting the JVM on the local machine
#
# Note: This script will only restart the JVM if the service is
#       retried 3 times (in a "soft" state) or if the service somehow
#       manages to fall into a "hard" error state.
#

# What state is the JVM service in?
case "$1" in
OK)
        echo OK >> /tmp/test.txt
        date >> /tmp/test.txt
        ;;
WARNING)
        ;;
UNKNOWN)
        ;;
CRITICAL)
        case "$2" in
        SOFT)
                case "$3" in
                3)
                        echo -n "Restarting JVM (3rd soft critical state)..."
                        echo SOFT >> /tmp/test.txt
                        date >> /tmp/test.txt
                        /etc/init.d/tomcatd recycle
                        ;;
                esac
                ;;
        HARD)
                echo -n "Restarting JVM..."
                echo HARD >> /tmp/test.txt
                date >> /tmp/test.txt
                /etc/init.d/tomcatd recycle
                ;;
        esac
        ;;
esac
exit 0
This seems to work. Note that it requires /tmp/test.txt to exist and be writable:

touch /tmp/test.txt
chmod a+rw /tmp/test.txt
anusha
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Post by anusha »

Thank you @cdienger. I have tried the same but no luck.

I tried executing the script on the remote server and it is working. But when I am trying to restart through event handler it is not working.

Worked script:

with nagios user: sudo -su wasadmin /etc/init.d/tomcatd start - is working
tacolover101
Posts: 432
Joined: Mon Apr 10, 2017 11:55 am

Re: Nagios event handler

Post by tacolover101 »

anusha wrote:Thank you @cdienger. I have tried the same but no luck.

I tried executing the script on the remote server and it is working. But when I am trying to restart through event handler it is not working.

Worked script:

with nagios user: sudo -su wasadmin /etc/init.d/tomcatd start - is working

Scripts will execute as the nagios user. Your command above uses the user wasadmin, not nagios.

You need to add a sudoers entry so that the nagios user has permission to restart the tomcatd service.
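
To confirm which account the handler really runs under, one option is to log it from inside the script. A minimal sketch, where log_runner is a hypothetical helper and the log file is whatever path you pass in:

```shell
#!/bin/sh
# log_runner is a hypothetical helper, not part of Nagios.
# It appends the effective user name to the given log file so you can
# see which account executed the event handler.
log_runner() {
    echo "handler running as $(id -un)" >> "$1"
}
```

Call it at the top of the event handler, for example log_runner /tmp/test.txt. If the log shows nagios, test the sudo command as that user with sudo -n added, so that it fails with an error instead of waiting for a password prompt.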
anusha
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Post by anusha »

Thank you @tacolover101 for your quick response. That worked for me, but only when the script contains just the single line below.

restart-jvm.sh

Code: Select all

sudo -su wasadmin /etc/init.d/tomcatd start
My idea is to use the script below, but with it I get an "NRPE: Unable to read output" error. Can you please help me with this?

script:

Code: Select all

case "$1" in
OK)
	# The service just came back up, so don't do anything...
	;;
WARNING)
	# We don't really care about warning states, since the service is probably still running...
	;;
UNKNOWN)
	# We don't know what might be causing an unknown error, so don't do anything...
	;;
CRITICAL)
	
	case "$2" in

	SOFT)
			

		case "$3" in
				

		3)
		
		echo -n "Restarting JVM (3rd soft critical state)..."
	
			sudo -su wasadmin /etc/init.d/tomcatd start
			

			;;
			esac
		;;

	HARD)
		echo -n "Restarting JVM..."
			sudo -su wasadmin /etc/init.d/tomcatd start

		
		;;
	esac
	;;
esac
exit 0
Last edited by dwhitfield on Mon Jan 29, 2018 2:08 pm, edited 1 time in total.
Reason: code blocks FTW
kyang

Re: Nagios event handler

Post by kyang »

If you are receiving that error with NRPE, take a look at our KB article on troubleshooting it:
https://support.nagios.com/kb/article/n ... t-620.html
anusha
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Post by anusha »

Hi Team,

I am trying to restart the JVM in the last soft state. I am using the script below and getting the error below. Can someone please help me with this?

script:

Code: Select all

case "$1" in
OK)
 
	# The service just came back up, so don't do anything...
	;;
WARNING)
	# We don't really care about warning states, since the service is probably still running...
	;;
UNKNOWN)
	# We don't know what might be causing an unknown error, so don't do anything...
	;;
CRITICAL)
	
	case "$2" in
		
	
	SOFT)
			
		
		case "$3" in
				
		
		3)
		

			sudo -su wasadmin /etc/init.d/tomcatd start
		

			;;
			esac
		;;
				
	
	HARD)
		
		sudo -su wasadmin  /etc/init.d/tomcatd start

		
		;;
	esac
	;;
esac
exit 0

ERROR:

SERVICE EVENT HANDLER:XXX;JVM status;CRITICAL;SOFT;1;restart-jvm!restart-jvm
[1518505203] wproc: SERVICE EVENTHANDLER job 17 from worker Core Worker 19975 is a non-check helper but exited with return code 3
[1518505203] wproc: early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
[1518505203] wproc: stdout line 01: NRPE: Unable to read output
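
For reference, the wait_status=768 in the log matches the reported return code 3. On Linux the exit code of a normally exited process sits in the second byte of the wait status, and 3 is the UNKNOWN code in the Nagios plugin convention. A small sketch of the arithmetic (the function name is made up for illustration):

```shell
#!/bin/sh
# exit_code_from_wait_status is a hypothetical helper name.
# For a normally exited child on Linux, the exit code is stored in
# bits 8-15 of the wait status: shift right by 8 and mask to one byte.
exit_code_from_wait_status() {
    echo $(( ($1 >> 8) & 255 ))
}

exit_code_from_wait_status 768   # prints 3
```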
Last edited by Anonymous on Tue Feb 13, 2018 3:21 pm, edited 1 time in total.
Reason: code blocks
npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Nagios event handler

Post by npolovenko »

@anusha, I don't quite understand. You're checking the JVM using NRPE on the remote server, right? Then why is your event handler restarting tomcat locally? Here's a little example of how it should be:
#!/bin/sh
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c start_tomcat
;;
esac
exit 0
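Here $1 is the service state and $2 is the host address, so the command definition has to pass $SERVICESTATE$ and $HOSTADDRESS$ in that order. Then, on the remote server, you need to define the command in nrpe.cfg: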

Code: Select all

command[start_tomcat]=sudo -su wasadmin /etc/init.d/tomcatd start
With that in place, I can run this command from the Nagios server's command line:

Code: Select all

/usr/local/nagios/libexec/check_nrpe -H remote_server_ip -p 5666 -c start_tomcat
Please make sure you can run the above command in a similar manner and that it actually starts the tomcat before moving on to the event handler.
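
Once the manual call works, the state checks from the earlier scripts can be combined with the remote call. The following is only a sketch under several assumptions: the command definition passes $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ in that order, the NRPE command on the remote host is named start_tomcat to match the check_nrpe call above, and handle_event, restart_remote, CHECK_NRPE and DRY_RUN are names invented for this example.

```shell
#!/bin/sh
# Sketch of an event handler that starts Tomcat remotely through NRPE.
# Assumed argument order (set in the command definition):
#   $1=$SERVICESTATE$ $2=$SERVICESTATETYPE$ $3=$SERVICEATTEMPT$ $4=$HOSTADDRESS$
# CHECK_NRPE and DRY_RUN are illustrative variables, not Nagios settings.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

restart_remote() {
    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$CHECK_NRPE -H $1 -p 5666 -c start_tomcat"
    else
        "$CHECK_NRPE" -H "$1" -p 5666 -c start_tomcat
    fi
}

handle_event() {
    state="$1"; state_type="$2"; attempt="$3"; host="$4"
    case "$state" in
    CRITICAL)
        case "$state_type" in
        SOFT)
            # Act only on the 3rd soft attempt, as in the scripts above.
            if [ "$attempt" = "3" ]; then
                restart_remote "$host"
            fi
            ;;
        HARD)
            restart_remote "$host"
            ;;
        esac
        ;;
    esac
    # An event handler should not fail the worker, so always report success.
    return 0
}

# Run only when all four arguments are supplied.
if [ "$#" -eq 4 ]; then
    handle_event "$@"
fi
```

Setting DRY_RUN=1 lets you check the logic without touching the remote server. Keep in mind that a 3rd soft attempt only exists when max_check_attempts is 4 or higher, because the final attempt is always the hard state.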