Michael Streb wrote:
> Hi,
>
> I already posted this issue a while ago, see
> "[Nagios-devel] blocking character of event_handlers" on the list.
>
> Regards,
>
> Michael
>
Ah, I had a look, but didn't find your post before.
Solution #1 doesn't work: even if I start my event handler
in the background, Nagios still waits for the event handler to
finish.
So, in my opinion, there are two problems with event handlers/notifications:
1. They are executed sequentially and block Nagios while they run.
2. Nagios runs amok when the event handler hits an early_timeout,
because the main process keeps trying to read from the killed
command's pipe in a never-ending loop.
I wrote a small patch for the second issue; maybe someone with more C
skills wants to have a look...
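Independent of the attached patch, here is a minimal sketch of one way to keep such a read loop bounded. It is not the Nagios code: read_with_deadline() is a hypothetical helper, and using poll() with a deadline is a swapped-in technique rather than what my_system() does. It assumes the loop spins because read() keeps returning a recoverable error while some process still holds the write end of the pipe open.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not part of Nagios: read from fd until EOF, until
 * the buffer is full, or until timeout_sec seconds have passed.
 * Returns the number of bytes read, or -1 on an unrecoverable error.
 * Sets *timed_out to 1 if the deadline ended the loop, else 0. */
static ssize_t read_with_deadline(int fd, char *buf, size_t cap,
                                  int timeout_sec, int *timed_out)
{
    time_t deadline = time(NULL) + timeout_sec;
    size_t used = 0;

    assert(buf != NULL && timed_out != NULL);
    *timed_out = 0;

    while (used < cap) {
        time_t now = time(NULL);
        struct pollfd pfd;
        int ready;
        ssize_t n;

        /* the loop is bounded: never keep retrying past the deadline */
        if (now >= deadline) {
            *timed_out = 1;
            break;
        }

        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;

        /* sleep until data or EOF arrives, or the remaining time is up */
        ready = poll(&pfd, 1, (int)(deadline - now) * 1000);
        if (ready == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        if (ready == 0)
            continue;           /* nothing to read; deadline check ends the loop */

        n = read(fd, buf + used, cap - used);
        if (n > 0) {
            used += (size_t)n;
            continue;
        }
        if (n == 0)
            break;              /* EOF: every writer has closed the pipe */
        if (errno == EINTR || errno == EAGAIN)
            continue;           /* recoverable, but still subject to the deadline */
        return -1;
    }

    return (ssize_t)used;
}
```

A caller would pass the read end of the command's pipe and the command timeout, then treat *timed_out == 1 the same way as an early timeout. Because the deadline is checked on every pass, a recoverable error can no longer keep the loop running forever.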
Regards,
Sven
Attachment: nagios_utils.c_mysystem.diff
--- nagios-3.0.4/base/utils.c 2008-10-15 19:43:55.000000000 +0200
+++ nagios-3.0.4-patched/base/utils.c 2008-10-30 16:40:50.000000000 +0100
@@ -581,6 +581,20 @@
/* initialize dynamic buffer */
dbuf_init(&output_dbuf,dbuf_chunk);
+ /* if there was a critical return code and no output AND the command time exceeded the timeout thresholds, assume a timeout */
+ if(result==STATE_CRITICAL && bytes_read==-1 && (end_time.tv_sec-start_time.tv_sec)>=timeout){
+
+ /* set the early timeout flag */
+ *early_timeout=TRUE;
+
+ /* try to kill the command that timed out by sending termination signal to child process group */
+ kill((pid_t)(-pid),SIGTERM);
+ sleep(1);
+ kill((pid_t)(-pid),SIGKILL);
+ }
+
+ log_debug_info(DEBUGL_COMMANDS,1,"Execution time=%.3f sec\n, early timeout=%d, result=%d\n",*exectime,*early_timeout,result);
+
/* try and read the results from the command output (retry if we encountered a signal) */
do{
bytes_read=read(fd[0],buffer,sizeof(buffer)-1);
@@ -604,7 +618,7 @@
if(bytes_read==0)
break;
- }while(1);
+ }while(*early_timeout==FALSE);
/* cap output length - this isn't necessary, but it keeps runaway plugin output from causing problems */
if(max_output_length>0 && output_dbuf.used_size>max_output_length)
@@ -616,20 +630,6 @@
/* free memory */
dbuf_free(&output_dbuf);
- /* if there was a critical return code and no output AND the command time exceeded the timeout thresholds, assume a timeout */
- if(result==STATE_CRITICAL && bytes_read==-1 && (end_time.tv_sec-start_time.tv_sec)>=timeout){
-
- /* set the early timeout flag */
- *early_timeout=TRUE;
-
- /* try to kill the command that timed out by sending termination signal to child process group */
- kill((pid_t)(-pid),SIGTERM);
- sleep(1);
- kill((pid_t)(-pid),SIGKILL);
- }
-
- log_debug_info(DEBUGL_COMMANDS,1,"Execution time=%.3f sec\n, early timeout=%d, result=%d",*exectime,*early_timeout,result);
-
#ifdef USE_EVENT_BROKER
/* send data to event broker */
broker_system_command(NEBTYPE_SYSTEM_COMMAND_END,NEBFLAG_NONE,NEBATTR_NONE,start_time,end_time,*exectime,timeout,*early_timeout,result,cmd,(output==NULL)?NULL:*output,NULL);
@@ -4565,4 +4565,3 @@
return OK;
}
-
This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]