Error: External command failed -> SCHEDULE_FORCED_SVC_CHECK
Posted: Wed May 13, 2015 12:49 pm
Hi,
I need to stop monitoring a particular service once it runs successfully for the day and then reschedule the check for the next day.
My script is below:
***************************************************************************************************************************************************************************************************
#!/bin/sh
if [ $# -lt 3 ]; then
    echo "./check_morning_jobs.sh JOBNAME HOSTNAME IP"
    exit 0
fi
LOGFILE="/usr/local/nagios/libexec/morningjobs/MORNING_JOB_$1"
> $LOGFILE
JOBNAME=$1
HOSTNAME=$2
SERVICENAME=`cat ../etc/objects/UAT_as400.services.cfg | grep $JOBNAME -B 1 | head -1 | awk -F"service_description" '{print $2}' | sed 's/^ *//g'`
IP=$3
TOMORROW=`date --date="+1 day 00:40:00" +%s`
#echo "Tomorrow= `date -d@$TOMORROW`"
NEXTHOUR=`date --date="+1 hour" +%s`
NOW=`date +%s`
/usr/local/nagios/libexec/check_by_ssh -H $IP -l nagios -t 120 -C "/home/nagios/jobLog.sh $JOBNAME"
STATUS=$?
echo "Status of $JOBNAME: $STATUS" >> $LOGFILE
if [ $STATUS -eq 0 ]; then
    #echo "Job $JOBNAME ran successfully for today"
    echo "Re-scheduling the job for tomorrow."
    /usr/bin/printf "[%lu] SCHEDULE_FORCED_SVC_CHECK;$HOSTNAME;$SERVICENAME;$TOMORROW\n" $NOW > /usr/local/nagios/var/rw/nagios.cmd
    #/usr/bin/printf "[%lu] SCHEDULE_FORCED_SVC_CHECK;$HOSTNAME;$SERVICENAME;'$NEXTHOUR\n" $NOW > /usr/local/nagios/var/rw/nagios.cmd
    /usr/bin/printf "[%lu] ADD_SVC_COMMENT;$HOSTNAME;$SERVICENAME;1;nagiosadmin;Morning check OK\n" $NOW > /usr/local/nagios/var/rw/nagios.cmd
    cat /usr/local/nagios/var/nagios.log | grep "$SERVICENAME" | grep SCHEDULE_FORCED_SVC_CHECK | tail -1 >> $LOGFILE
    cat $LOGFILE
    exit 0
else
    echo "Job $JOBNAME did not run for today. Will check after 10 mins."
    cat $LOGFILE
    exit 2
fi
***************************************************************************************************************************************************************************************************
When I run the script from the shell it runs fine, but when the check is rescheduled from the web interface by clicking "Re-schedule the next check of this service", it fails with the following log entry:
[1431538590] Error: External command failed -> SCHEDULE_FORCED_SVC_CHECK;UAT_CMP_AS400_CWUDB2T2;;1431560400
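One thing visible in that log line: the service description field between the host name and the timestamp is empty (two consecutive semicolons), so SERVICENAME seems to have been empty when the command was written. A possible cause, which is only an assumption here, is that the relative path ../etc/objects/UAT_as400.services.cfg resolves differently when Nagios runs the script from another working directory. The sketch below does the same lookup against an explicit path and lets the caller detect an empty result. The function name lookup_service and the absolute path in the usage note are hypothetical, not taken from a working setup.

```shell
#!/bin/sh
# Hypothetical helper: print the service_description found on the line
# directly above the first line in the config file that mentions the job.
# Prints nothing if the job name does not appear in the file.
# Note: grep -B is a GNU grep option, as in the original script.
lookup_service() {
    cfg=$1
    job=$2
    grep -B 1 -- "$job" "$cfg" | head -1 |
        awk -F"service_description" '{print $2}' | sed 's/^ *//'
}
```

In the script this could be used as SERVICENAME=$(lookup_service /usr/local/nagios/etc/objects/UAT_as400.services.cfg "$JOBNAME"), followed by a test such as [ -z "$SERVICENAME" ] that exits with status 3 (UNKNOWN) before anything is written to nagios.cmd. The absolute path is assumed from the other paths in the script.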
My configs are:
[nagios@ukcpwmon01 var]$ sudo cat /etc/group | grep nag
nagios:x:10002:nagios,apache
nagcmd:x:10003:
nagiocmd:x:10004:nagios,nobody,apache
[nagios@ukcpwmon01 var]$ sestatus
SELinux status: disabled
[nagios@ukcpwmon01 var]$ uname -a
Linux ukcpwmon01 2.6.32-431.29.2.el6.x86_64 #1 SMP Sun Jul 27 15:55:46 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
[nagios@ukcpwmon01 var]$ ls -la
total 17184
drwxrwxr-x. 5 nagios nagios 4096 May 13 18:46 .
drwxr-xr-x. 9 root root 4096 Jan 21 11:49 ..
drwxrwxr-x. 2 nagios nagios 4096 May 12 23:59 archives
-rw-r--r--. 1 nagios nagios 114416 May 13 18:16 livestatus.log
-rw-r--r--. 1 nagios nagios 8466 May 13 18:16 nagios.configtest
-rw-r--r-- 1 nagios nagios 6 May 13 18:16 nagios.lock
-rw-r--r--. 1 nagios nagios 8733424 May 13 18:45 nagios.log
-rw-r--r--. 1 nagios nagios 1524525 May 13 18:16 objects.cache
-rw-r--r--. 1 nagios nagios 1524525 May 13 18:16 objects.precache
-rw------- 1 nagios nagios 2825311 May 13 18:16 retention.dat
drwxrwsr-x. 2 nagios nagiocmd 4096 May 13 18:16 rw
drwxr-xr-x. 3 root root 4096 Jan 21 11:43 spool
-rw-rw-r-- 1 nagios nagios 2816943 May 13 18:46 status.dat
[nagios@ukcpwmon01 rw]$ ls -la
total 8
drwxrwsr-x. 2 nagios nagiocmd 4096 May 13 18:16 .
drwxrwxr-x. 5 nagios nagios 4096 May 13 18:46 ..
srw-rw---- 1 nagios nagiocmd 0 May 13 18:16 live
prw-rw---- 1 nagios nagiocmd 0 May 13 18:46 nagios.cmd
srw-rw---- 1 nagios nagiocmd 0 May 13 18:16 nagios.qh
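Since the listing shows nagios.cmd as a named pipe writable only by user nagios and group nagiocmd, a quick way to rule out permissions is to test the pipe as the user that actually executes the check. This is a minimal sketch; the function name check_cmd_pipe is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether the given path is a named pipe
# that the current user can write to. Returns 0 on success, 1 otherwise.
check_cmd_pipe() {
    pipe=$1
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe"
        return 1
    fi
    if [ ! -w "$pipe" ]; then
        echo "not writable by $(id -un): $pipe"
        return 1
    fi
    echo "ok: $pipe"
}
```

For example, check_cmd_pipe /usr/local/nagios/var/rw/nagios.cmd could be run once from the shell and once from inside the plugin, to compare the two environments.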
Really need to get this fixed. Please help!