Nagios event handler


Postby anusha » Tue Jan 02, 2018 3:42 am

Hi,

I am trying to use an event handler to restart the JVM on a remote server.

command.cfg :

Code: Select all
define command{
        command_name  restart-jvm
        command_line  /usr/local/nagios/etc/event_handlers/script.sh  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
        }



server.cfg :


Code: Select all
define service{
        use                             local-service         ; Name of service template to use
        host_name                      XXXXX
        service_description             JVM status
        check_command                   check_nrpe!check_jvm
        check_interval                  2              ; Actively check the service every 2 minutes
        retry_interval                  1               ; Schedule service check retries at 1 minute intervals
        max_check_attempts              2           ; Check the service up to 2 times before a HARD state
        event_handler                   restart-jvm
        event_handler_enabled           1


}


event-handler script:

Code: Select all
#script for restarting the web server on the local machine
#
# Note: This script will only restart the web server if the service is
#       retried 3 times (in a "soft" state) or if the web service somehow
#       manages to fall into a "hard" error state.
#


# What state is the HTTP service in?
case "$1" in
OK)
        ;;
WARNING)
     
        ;;
UNKNOWN)
     
        ;;
CRITICAL)
       
        case "$2" in
        SOFT)

               
                case "$3" in

                3)
                        echo -n "Restarting JVM (3rd soft critical state)..."
                      /etc/init.d/tomcatd recycle
                        ;;
                        esac
                ;;

       
        HARD)
                echo -n "Restarting JVM..."

                 /etc/init.d/tomcatd recycle
                ;;
        esac
        ;;
esac
exit 0


But the above script is not working.

Some links say that the nagios user needs to be added to the sudoers file. Can you please let me know if there is an alternative way to do this?
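
A note on the service definition above: Nagios numbers check attempts from 1 and treats the attempt that equals max_check_attempts as HARD. With max_check_attempts set to 2, the handler would therefore see CRITICAL SOFT 1 and then CRITICAL HARD 2, so the `3)` branch under SOFT is never reached. A minimal sketch of that counting rule for non-OK results follows; the function name `state_type` is hypothetical and only models the rule, it is not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper: models how Nagios labels a non-OK check result.
# $1 = current attempt number, $2 = max_check_attempts
state_type() {
    if [ "$1" -ge "$2" ]; then
        echo HARD
    else
        echo SOFT
    fi
}

# With max_check_attempts 2, list the states the event handler would see.
for attempt in 1 2; do
    echo "attempt $attempt: $(state_type "$attempt" 2)"
done
```

If a restart on the third soft attempt is wanted, max_check_attempts would need to be at least 4.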
Last edited by dwhitfield on Wed Jan 03, 2018 1:59 pm, edited 1 time in total.
Reason: code blocks FTW
anusha
 
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Postby cdienger » Wed Jan 03, 2018 2:39 pm

Can you clarify what you mean by "it's not working"? Is the script even running? Have you tried having the script do something else (like writing to a file, as shown in this XI doc: https://assets.nagios.com/downloads/nag ... ios-XI.pdf)? You don't have to edit sudoers, but you must make sure that /etc/init.d/tomcatd can be executed by the nagios user when it's called. You can achieve this by changing the permissions on the file or by making sure the user belongs to the correct group.
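
One way to answer the "is it even running?" question is to have the handler record each invocation. The sketch below is only an illustration: the variable HANDLER_LOG, its default path, and the function name `log_invocation` are assumptions, not something the XI doc prescribes.

```shell
#!/bin/sh
# Hypothetical debug helper for an event handler script.
# HANDLER_LOG is an assumed variable name; the default path is only an example.
HANDLER_LOG="${HANDLER_LOG:-/tmp/event-handler-debug.log}"

log_invocation() {
    # Record when the handler ran, as which user, and with which arguments.
    printf '%s user=%s args=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(id -un)" "$*" >> "$HANDLER_LOG"
}

# Log the arguments this script was called with.
log_invocation "$@"
```

Placed at the top of the handler, this shows whether Nagios calls the script at all, which user it runs as, and which state, state type, and attempt values it receives.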
cdienger
Support Tech
 
Posts: 919
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios event handler

Postby anusha » Wed Jan 17, 2018 2:16 am

Hi cdienger,

I am trying to restart Tomcat when it is down. Please see the files below and correct me if I am wrong.

command.cfg
Code: Select all
define command{
        command_name  restart-jvm
        command_line  /usr/local/nagios/etc/event_handlers/restart-jvm.sh  $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$
        }



host.cfg

Code: Select all
define service{
        use                             local-service         ; Name of service template to use
        host_name                       XXXX
        service_description             JVM status
        check_command                   check_nrpe!check_jvm
        check_interval                  3              ; Actively check the service every 3 minutes
        retry_interval                  1               ; Schedule service check retries at 1 minute intervals
        max_check_attempts              5           ; Check the service up to 5 times before a HARD state
        event_handler                  restart-jvm

}


eventhandler script:

Code: Select all
case "$1" in
OK)
        # The service just came back up, so don't do anything...
        ;;
WARNING)
        # We don't really care about warning states, since the service is probably still running...
        ;;
UNKNOWN)
        # We don't know what might be causing an unknown error, so don't do anything...
        ;;
CRITICAL)
        #  perhaps we should restart the server...

        # Is this a "soft" or a "hard" state?
        case "$2" in

        # We're in a "soft" state, meaning that Nagios is in the middle of retrying the
# check before it turns into a "hard" state and contacts get notified...
        SOFT)

         
                case "$3" in
 
                3)
                mail -s "Errors in the logs" sample@mail.com
                echo -n "Restarting JVM (3rd soft critical state)..."
                     
                        sudo -su wasadmin
                         /etc/init.d/tomcatd start
                        mail -s "Sampls" sample@mail.com

                        ;;
                        esac
                ;;

   
        HARD)
                echo -n "Restarting JVM..."
           
  sudo -su wasadmin
                 /etc/init.d/tomcatd start


                ;;
        esac
        ;;
esac
exit 0



Whenever the server is down I get the emails sent by my script, but Tomcat is not restarted. Can you please help me with this?
Last edited by dwhitfield on Wed Jan 17, 2018 5:57 pm, edited 1 time in total.
Reason: code blocks FTW
anusha
 
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Postby cdienger » Thu Jan 18, 2018 5:50 pm

What are the permissions set to on the script? Run ls -l /usr/local/nagios/etc/event_handlers/script.sh to see them, and make sure the execute permission is set.
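
As a rough sketch of that permission check, the helper below reports whether a file is executable by the user running it. The function name `check_executable` is hypothetical, and the result is only meaningful when the check is run as the nagios user.

```shell
#!/bin/sh
# Hypothetical helper: report whether a handler script is executable
# by the user running this check (run it as the nagios user).
check_executable() {
    if [ -x "$1" ]; then
        echo "executable: $1"
    else
        echo "NOT executable: $1"
        return 1
    fi
}

# Example call; the path is the one used in this thread.
check_executable /usr/local/nagios/etc/event_handlers/script.sh || true
```

If the script turns out not to be executable, chmod +x on the file is the usual fix.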

I modified the script slightly to write an entry to a log file whenever the SOFT, HARD, or OK state is seen. See below:

Code: Select all
#script for restarting the web server on the local machine
#
# Note: This script will only restart the web server if the service is
#       retried 3 times (in a "soft" state) or if the web service somehow
#       manages to fall into a "hard" error state.
#


# What state is the HTTP service in?
case "$1" in
OK)
      echo OK >> /tmp/test.txt
        date >> /tmp/test.txt
        ;;
WARNING)
        ;;
UNKNOWN)
        ;;
CRITICAL)
        case "$2" in
        SOFT)

                case "$3" in

                3)
                        echo -n "Restarting JVM (3rd soft critical state)..."
                        echo SOFT >> /tmp/test.txt
                        date >> /tmp/test.txt

                        /etc/init.d/tomcatd recycle
                        ;;
                     esac
                ;;


        HARD)
                echo -n "Restarting JVM..."
                 echo HARD >> /tmp/test.txt
                 date >> /tmp/test.txt
                 /etc/init.d/tomcatd recycle
                ;;
        esac
        ;;
esac
exit 0


This seems to work. Note that this requires a /tmp/test.txt that the nagios user can write to:

touch /tmp/test.txt
chmod a+rw /tmp/test.txt
cdienger
Support Tech
 
Posts: 919
Joined: Tue Feb 07, 2017 11:26 am

Re: Nagios event handler

Postby anusha » Sat Jan 27, 2018 10:44 am

Thank you @cdienger. I have tried the same, but no luck.

I tried executing the script on the remote server and it works. But when I try to restart through the event handler, it does not work.

Working command:

Run as the nagios user, this works: sudo -su wasadmin /etc/init.d/tomcatd start
anusha
 
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Postby tacolover101 » Mon Jan 29, 2018 2:02 am

anusha wrote: Thank you @cdienger. I have tried the same, but no luck.

I tried executing the script on the remote server and it works. But when I try to restart through the event handler, it does not work.

Working command:

Run as the nagios user, this works: sudo -su wasadmin /etc/init.d/tomcatd start


Event handler scripts execute as the nagios user. Your command above runs as the wasadmin user, not as nagios.

You need to add a sudoers entry so that the nagios user has permission to restart the tomcatd service.
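
A sketch of what such a sudoers entry could look like is shown below. The helper `sudoers_rule` and the file name in the comment are hypothetical; the user names and the init script path are the ones from this thread. It only prints the rule so that it can be reviewed before installing it with visudo.

```shell
#!/bin/sh
# Hypothetical helper: print a sudoers rule letting one user run a single
# command as another user without a password prompt.
# $1 = invoking user, $2 = target user, $3 = exact command line to allow
sudoers_rule() {
    printf '%s ALL=(%s) NOPASSWD: %s\n' "$1" "$2" "$3"
}

# Rule for this thread's case; review it, then install it with visudo
# (for example: visudo -f /etc/sudoers.d/nagios-tomcat, an assumed file name).
sudoers_rule nagios wasadmin "/etc/init.d/tomcatd start"
```

With a rule like this, the handler would call sudo -u wasadmin /etc/init.d/tomcatd start on a single line. A sudo call on a line of its own does not change the user for the lines that follow it, and the -s form goes through a shell, so it may need a different rule; sudo -l run as the nagios user shows what is actually permitted.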
tacolover101
 
Posts: 368
Joined: Mon Apr 10, 2017 11:55 am

Re: Nagios event handler

Postby anusha » Mon Jan 29, 2018 4:18 am

Thank you @tacolover101 for your quick response. That worked for me when the script contains only a single line (see below).

restart-jvm.sh
Code: Select all
sudo -su wasadmin /etc/init.d/tomcatd start


My idea is to use the script below, but with it I get an "NRPE: Unable to read output" error. Can you please help me with this?

script:
Code: Select all
case "$1" in
OK)
   # The service just came back up, so don't do anything...
   ;;
WARNING)
   # We don't really care about warning states, since the service is probably still running...
   ;;
UNKNOWN)
   # We don't know what might be causing an unknown error, so don't do anything...
   ;;
CRITICAL)
   
   case "$2" in

   SOFT)
         

      case "$3" in
            

      3)
      
      echo -n "Restarting JVM (3rd soft critical state)..."
   
         sudo -su wasadmin /etc/init.d/tomcatd start
         

         ;;
         esac
      ;;

   HARD)
      echo -n "Restarting JVM..."
         sudo -su wasadmin /etc/init.d/tomcatd start

      
      ;;
   esac
   ;;
esac
exit 0
Last edited by dwhitfield on Mon Jan 29, 2018 2:08 pm, edited 1 time in total.
Reason: code blocks FTW
anusha
 
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Postby kyang » Mon Jan 29, 2018 2:43 pm

If you are receiving that error with NRPE, take a look at our KB article on troubleshooting it:
https://support.nagios.com/kb/article/n ... t-620.html
Be sure to check out our Knowledgebase for helpful articles and solutions!
kyang
Support Tech
 
Posts: 1478
Joined: Tue Jul 25, 2017 3:35 pm

Re: Nagios event handler

Postby anusha » Tue Feb 13, 2018 3:40 am

Hi Team,

I am trying to restart the JVM in the last soft state. I am using the script below and getting the error shown after it. Can someone please help me with this?

script:

Code: Select all
case "$1" in
OK)

   # The service just came back up, so don't do anything...
   ;;
WARNING)
   # We don't really care about warning states, since the service is probably still running...
   ;;
UNKNOWN)
   # We don't know what might be causing an unknown error, so don't do anything...
   ;;
CRITICAL)
   
   case "$2" in
      
   
   SOFT)
         
      
      case "$3" in
            
      
      3)
      

          sudo -su wasadmin /etc/init.d/tomcatd start
      

         ;;
         esac
      ;;
            
   
   HARD)
      
      sudo -su wasadmin  /etc/init.d/tomcatd start

      
      ;;
   esac
   ;;
esac
exit 0



ERROR:

SERVICE EVENT HANDLER:XXX;JVM status;CRITICAL;SOFT;1;restart-jvm!restart-jvm
[1518505203] wproc: SERVICE EVENTHANDLER job 17 from worker Core Worker 19975 is a non-check helper but exited with return code 3
[1518505203] wproc: early_timeout=0; exited_ok=1; wait_status=768; error_code=0;
[1518505203] wproc: stdout line 01: NRPE: Unable to read output
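
For reading the log lines above: wait_status=768 and "return code 3" are the same information, because on Linux the exit code of a normally exited process sits in the high byte of the wait status. A small sketch of the arithmetic (the function name is hypothetical):

```shell
#!/bin/sh
# Decode a wait status as reported in the wproc log line.
# On Linux the exit code of a normally exited process is the high byte.
exit_code_from_wait_status() {
    echo $(( ($1 >> 8) & 255 ))
}

exit_code_from_wait_status 768
```

Exit code 3 is UNKNOWN in the Nagios plugin convention, and the stdout line suggests that the command actually executed was check_nrpe and that the remote command returned no output.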
Last edited by kyang on Tue Feb 13, 2018 3:21 pm, edited 1 time in total.
Reason: code blocks
anusha
 
Posts: 21
Joined: Mon Sep 11, 2017 4:38 am

Re: Nagios event handler

Postby npolovenko » Tue Feb 13, 2018 4:08 pm

@anusha, I don't quite understand. You're checking the JVM using NRPE on the remote server, right? Then why is your event handler restarting tomcat locally? Here's a little example of how it should be:
#!/bin/sh
case "$1" in
OK)
;;
WARNING)
;;
UNKNOWN)
;;
CRITICAL)
/usr/local/nagios/libexec/check_nrpe -H "$2" -p 5666 -c start_tomcat
;;
esac
exit 0

Here $1 is the service state and $2 is the host address, so the command definition would pass $SERVICESTATE$ $HOSTADDRESS$ to the script. On the remote server, you need to define the command in nrpe.cfg:
Code: Select all
command[start_tomcat]=sudo -su wasadmin  /etc/init.d/tomcatd start

You can then run this command from the Nagios server's command line:
Code: Select all
/usr/local/nagios/libexec/check_nrpe -H remote_server_ip -p 5666 -c start_tomcat

Please make sure you can run the above command in a similar manner and that it actually starts the tomcat before moving on to the event handler.
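
Putting the pieces of this thread together, a handler along the following lines might work. It is a sketch under assumptions: it expects the four arguments from the second command definition in this thread ($SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$), it assumes an NRPE command named start_tomcat exists on the remote host, and the CHECK_NRPE override and the function name `handle_event` are hypothetical additions for dry runs.

```shell
#!/bin/sh
# Sketch of an event handler that starts Tomcat on the monitored host via NRPE.
# Expected arguments: $1=$SERVICESTATE$  $2=$SERVICESTATETYPE$
#                     $3=$SERVICEATTEMPT$  $4=$HOSTADDRESS$
# CHECK_NRPE can be overridden (assumed variable name) for a dry run.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

handle_event() {
    state=$1 state_type=$2 attempt=$3 host=$4

    # Only CRITICAL states trigger a restart attempt.
    [ "$state" = "CRITICAL" ] || return 0

    case "$state_type" in
    SOFT)
        # Act only on the 3rd soft attempt.
        [ "$attempt" = "3" ] || return 0
        ;;
    HARD)
        ;;
    *)
        return 0
        ;;
    esac

    # Ask the NRPE daemon on the remote host to run its start_tomcat command.
    "$CHECK_NRPE" -H "$host" -p 5666 -c start_tomcat
}

# Run only when arguments were supplied, as Nagios does.
if [ "$#" -gt 0 ]; then
    handle_event "$@"
fi
```

Running it by hand as the nagios user, for example with the arguments CRITICAL HARD 5 followed by the remote host's address, is a quick way to test it before relying on Nagios to trigger it.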
npolovenko
Support Tech
 
Posts: 1290
Joined: Mon May 15, 2017 5:00 pm

