Nagios Support Forum

Posted: **Wed May 13, 2015 12:49 pm**

Hi,
I need to stop monitoring a particular service once it runs successfully for the day and then reschedule the check for the next day.

My script looks like below:

***************************************************************************************************************************************************************************************************
#!/bin/sh

if [ $# -lt 3 ]; then
echo "./check_morning_jobs.sh JOBNAME HOSTNAME IP"
exit 0
fi

LOGFILE="/usr/local/nagios/libexec/morningjobs/MORNING_JOB_$1"
> $LOGFILE

JOBNAME=$1
HOSTNAME=$2
SERVICENAME=`cat ../etc/objects/UAT_as400.services.cfg | grep $JOBNAME -B 1 | head -1 | awk -F"service_description" '{print $2}' | sed 's/^ *//g'`
IP=$3
TOMORROW=`date --date="+1 day 00:40:00" +%s`
#echo "Tomorrow= `date -d@$TOMORROW`"
NEXTHOUR=`date --date="+1 hour" +%s`
NOW=`date +%s`

/usr/local/nagios/libexec/check_by_ssh -H $IP -l nagios -t 120 -C "/home/nagios/jobLog.sh $JOBNAME"
STATUS=$?

echo "Status of $JOBNAME: $STATUS" >> $LOGFILE
if [ $STATUS -eq 0 ]; then
#echo "Job $JOBNAME ran successfully for today"
echo "Re-scheduling the job for tomorrow."
/usr/bin/printf "[%lu] SCHEDULE_FORCED_SVC_CHECK;$HOSTNAME;$SERVICENAME;$TOMORROW\n" $NOW > /usr/local/nagios/var/rw/nagios.cmd
#/usr/bin/printf "[%lu] SCHEDULE_FORCED_SVC_CHECK;$HOSTNAME;$SERVICENAME;'$NEXTHOUR\n" $NOW > /usr/local/nagios/var/rw/nagios.cmd
/usr/bin/printf "[%lu] ADD_SVC_COMMENT;$HOSTNAME;$SERVICENAME;1;nagiosadmin;Morning check OK\n" $NOW > /usr/local/nagios/var/rw/nagios.cmd
cat /usr/local/nagios/var/nagios.log | grep "$SERVICENAME" | grep SCHEDULE_FORCED_SVC_CHECK | tail -1 >> $LOGFILE
cat $LOGFILE
exit 0
else
echo "Job $JOBNAME did not run for today. Will check after 10 mins."
cat $LOGFILE
exit 2
fi
***************************************************************************************************************************************************************************************************

When I run the script from the shell it works fine, but when it is rescheduled from the web page by clicking "Re-schedule the next check of this service" it fails with the following log entry:

[1431538590] Error: External command failed -> SCHEDULE_FORCED_SVC_CHECK;UAT_CMP_AS400_CWUDB2T2;;1431560400

My configs are:
[nagios@ukcpwmon01 var]$ sudo cat /etc/group | grep nag
nagios:x:10002:nagios,apache
nagcmd:x:10003:
nagiocmd:x:10004:nagios,nobody,apache

[nagios@ukcpwmon01 var]$ sestatus
SELinux status: disabled

[nagios@ukcpwmon01 var]$ uname -a
Linux ukcpwmon01 2.6.32-431.29.2.el6.x86_64 #1 SMP Sun Jul 27 15:55:46 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

[nagios@ukcpwmon01 var]$ ls -la
total 17184
drwxrwxr-x. 5 nagios nagios 4096 May 13 18:46 .
drwxr-xr-x. 9 root root 4096 Jan 21 11:49 ..
drwxrwxr-x. 2 nagios nagios 4096 May 12 23:59 archives
-rw-r--r--. 1 nagios nagios 114416 May 13 18:16 livestatus.log
-rw-r--r--. 1 nagios nagios 8466 May 13 18:16 nagios.configtest
-rw-r--r-- 1 nagios nagios 6 May 13 18:16 nagios.lock
-rw-r--r--. 1 nagios nagios 8733424 May 13 18:45 nagios.log
-rw-r--r--. 1 nagios nagios 1524525 May 13 18:16 objects.cache
-rw-r--r--. 1 nagios nagios 1524525 May 13 18:16 objects.precache
-rw------- 1 nagios nagios 2825311 May 13 18:16 retention.dat
drwxrwsr-x. 2 nagios nagiocmd 4096 May 13 18:16 rw
drwxr-xr-x. 3 root root 4096 Jan 21 11:43 spool
-rw-rw-r-- 1 nagios nagios 2816943 May 13 18:46 status.dat

[nagios@ukcpwmon01 rw]$ ls -la
total 8
drwxrwsr-x. 2 nagios nagiocmd 4096 May 13 18:16 .
drwxrwxr-x. 5 nagios nagios 4096 May 13 18:46 ..
srw-rw---- 1 nagios nagiocmd 0 May 13 18:16 live
prw-rw---- 1 nagios nagiocmd 0 May 13 18:46 nagios.cmd
srw-rw---- 1 nagios nagiocmd 0 May 13 18:16 nagios.qh

Really need to get this fixed. Please help!

Posted: **Wed May 13, 2015 4:34 pm**

> When I run the script from the shell it runs fine but when re-scheduled from the webpage by clicking "Re-schedule the next check of this service" it fails with the following log:

When you run the script from the shell, try running it as the 'nagios' user. Does your result change? Do you need to prepend the command with sudo?

It would also be useful if you could post the appropriate section of your commands.cfg file, as well as your service.cfg.

Posted: **Thu May 14, 2015 3:43 am**

I am already running the script as the nagios user from the shell.

COMMANDS.CFG
define command{
command_name check_morning_job
command_line /usr/local/nagios/libexec/check_morning_job.sh "$ARG1$" "$ARG2$" "$ARG3$"
}

SERVICES.CFG
define service{
use cmp-morning-service
host_name UAT_CMP_AS400_CWUDB2T2
service_description AS400 jobs status - Datawarehouse Extract CIF-3088 1
check_command check_morning_job!DTAWEXT001!UAT_CMP_AS400_CWUDB2T2!192.168.249.132
check_period 24x7_morning
}

Prepending sudo to the command line (sudo /usr/local/nagios/libexec/check_morning_job.sh "$ARG1$" "$ARG2$" "$ARG3$") gives the following error:
Remote command execution failed: Permission denied, please try again.

Any thoughts?

Posted: **Thu May 14, 2015 1:47 pm**

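The script is not passing in the service name: the failed command in your log has an empty service field (UAT_CMP_AS400_CWUDB2T2;;1431560400). The script looks the name up through the relative path ../etc/objects/UAT_as400.services.cfg, which most likely does not resolve when Nagios runs the script from a working directory other than the one you use in your shell. You also don't need sudo.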

You could change your command to be:

Code: Select all

define command{
command_name check_morning_job
command_line /usr/local/nagios/libexec/check_morning_job.sh "$ARG1$" "$ARG2$" "$ARG3$" "$SERVICEDESC$"
}

And your script:

Code: Select all

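#!/bin/sh

# Four arguments are now required; $4 is the expanded $SERVICEDESC$ macro.
if [ $# -lt 4 ]; then
    echo "./check_morning_jobs.sh JOBNAME HOSTNAME IP SERVICENAME"
    exit 3  # UNKNOWN, so a usage error is not reported to Nagios as OK
fi

LOGFILE="/usr/local/nagios/libexec/morningjobs/MORNING_JOB_$1"
> "$LOGFILE"

JOBNAME=$1
HOSTNAME=$2
IP=$3
SERVICENAME=$4  # replaces the grep against ../etc/objects/UAT_as400.services.cfg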
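
With the service description arriving as the fourth argument, the reschedule step further down the script can be guarded against the empty-name case seen in the log. The following is only a sketch: the function names and the CMDFILE override are made up for illustration, while the command file path and the SCHEDULE_FORCED_SVC_CHECK format are taken from the original script.

```shell
#!/bin/sh
# Sketch only: the function names and the CMDFILE override are illustrative
# assumptions, not part of the original script.

# Print one SCHEDULE_FORCED_SVC_CHECK line.
# $1 host, $2 service description, $3 check time (epoch), $4 submit time (epoch)
format_forced_svc_check() {
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' "$4" "$1" "$2" "$3"
}

# Write the command to the Nagios command file, refusing an empty service
# description (the cause of the ";;" in the failed log entry).
submit_forced_svc_check() {
    if [ -z "$2" ]; then
        echo "empty service description, not submitting" >&2
        return 3
    fi
    format_forced_svc_check "$1" "$2" "$3" "$(date +%s)" \
        > "${CMDFILE:-/usr/local/nagios/var/rw/nagios.cmd}"
}
```

In the script this would be called as submit_forced_svc_check "$HOSTNAME" "$SERVICENAME" "$TOMORROW" in place of the printf line. Passing the values as printf arguments rather than embedding them in the format string also keeps a "%" in a service description from being interpreted.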

Posted: **Fri May 15, 2015 11:05 am**

That actually worked!
Appreciate your assistance Sir. Thanks much.

Posted: **Fri May 15, 2015 11:28 am**

Great! Do you have any follow up questions or are we clear to lock the thread?

Posted: **Wed May 20, 2015 4:10 am**

Yes I have one more question:

I have a service which monitors a particular job. The job runs between 00:40 and 23:59 every day.
If the job runs successfully once, then I am supposed to stop monitoring it for the day and re-schedule the next check for the next morning at 00:40.

Similarly, there are other jobs which are weekly or monthly. For example, a particular job runs every Monday at 3AM. If it runs successfully, I need to reschedule the next check for 3AM the following Monday.

Is there a simpler way of doing this? A way where I could halt the monitoring till the next valid start time of a timeperiod. How do I get the value of the next valid check time?

Please advise.

Posted: **Wed May 20, 2015 11:59 am**

That's a bit tricky. You might need to use a combination of DISABLE_SVC_CHECK and ENABLE_SVC_CHECK (the service-level counterparts of DISABLE_HOST_CHECK and ENABLE_HOST_CHECK):

http://old.nagios.org/developerinfo/ext ... mand_id=54

Tie that into an event handler that checks the status of the last check, then kicks off an "at" command to schedule the re-enable, but that gets nasty quickly.
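
A rough sketch of what such an event handler could look like, to show the moving parts rather than as a tested solution. It uses the service-level commands DISABLE_SVC_CHECK and ENABLE_SVC_CHECK because the object being paused is a service. The function names, the argument order and the CMDFILE override are assumptions made for this example, and it assumes the nagios user is allowed to use at(1).

```shell
#!/bin/sh
# Hypothetical event handler sketch. Assumed arguments:
#   $SERVICESTATE$ $HOSTNAME$ $SERVICEDESC$ <at(1) time spec...>
CMDFILE="${CMDFILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Print one external command line for a service: [now] COMMAND;host;service
svc_command() {
    printf '[%s] %s;%s;%s\n' "$(date +%s)" "$1" "$2" "$3"
}

# Act only when the service state is OK.
should_pause() {
    [ "$1" = "OK" ]
}

# Print the shell line that at(1) will run later to re-enable the check.
# Assumes host and service names contain no double quotes, "$" or backticks.
resume_job() {
    printf 'echo "[$(date +%%s)] ENABLE_SVC_CHECK;%s;%s" > "%s"\n' "$1" "$2" "$CMDFILE"
}

handle_event() {
    state=$1; host=$2; service=$3
    shift 3
    should_pause "$state" || return 0
    # Stop active checks now, then queue the re-enable for the given time.
    svc_command DISABLE_SVC_CHECK "$host" "$service" > "$CMDFILE"
    resume_job "$host" "$service" | at "$@"
}

# Run only when called with a full argument list, e.g. "... 00:40 tomorrow".
if [ "$#" -ge 4 ]; then
    handle_event "$@"
fi
```

One reason this gets nasty: Nagios runs event handlers on state changes, so a service that stays OK from one day to the next never triggers the handler. The re-enable also depends on the at job surviving until it is due.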

Error: External command failed -> SCHEDULE_FORCED_SVC_CHECK
