Re: [Nagios-devel] eventhandler timeout 3.0.4



Michael Streb wrote:
> Hi,
>
> I already posted this issue a while ago, see
> "[Nagios-devel] blocking character of event_handlers" on the list.
>
> Regards,
>
> Michael
>

Ah, I had a look, but didn't find your post before.
Solution #1 doesn't work: even if I start my event handler
in the background, Nagios still waits for the event handler to
finish.
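
One plausible reason backgrounding does not help is that the background job inherits the write end of the output pipe, so the reader never sees EOF until that job exits. The following is a minimal standalone sketch of that effect, not the Nagios code itself. The function name `seconds_until_eof` and its `detach_output` parameter are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper, not part of Nagios.
 * Forks a "handler" that starts a background job sleeping for 1 second and
 * then exits at once. Returns how many seconds the parent waits for EOF on
 * the handler's output pipe, or -1.0 on error. If detach_output is nonzero,
 * the background job first points stdout and stderr at /dev/null. */
double seconds_until_eof(int detach_output)
{
    int fd[2];
    char buffer[256];
    struct timeval start, end;
    ssize_t n;
    pid_t pid;

    if (pipe(fd) == -1)
        return -1.0;
    gettimeofday(&start, NULL);

    pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1.0;
    }

    if (pid == 0) {
        /* "handler": its stdout is the write end of the pipe */
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]);

        if (fork() == 0) {
            /* background job: inherits stdout unless it detaches */
            if (detach_output) {
                int devnull = open("/dev/null", O_WRONLY);
                if (devnull != -1) {
                    dup2(devnull, STDOUT_FILENO);
                    dup2(devnull, STDERR_FILENO);
                    close(devnull);
                }
            }
            sleep(1);
            _exit(0);
        }
        _exit(0); /* the handler itself returns immediately */
    }

    /* parent: the handler exits quickly, so waitpid() does not block long */
    close(fd[1]);
    waitpid(pid, NULL, 0);

    /* read until EOF; EOF only arrives once every copy of the write end is closed */
    do {
        n = read(fd[0], buffer, sizeof(buffer));
    } while (n > 0 || (n == -1 && errno == EINTR));
    close(fd[0]);

    gettimeofday(&end, NULL);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_usec - start.tv_usec) / 1e6;
}
```

Calling `seconds_until_eof(0)` should take about as long as the background job runs, while `seconds_until_eof(1)` should return almost immediately. If the same mechanism is at work here, redirecting the handler's stdout and stderr (for example to /dev/null) before backgrounding it may be worth a try.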


So, in my opinion, there are two problems with event handlers/notifications:

1. They are executed sequentially and block Nagios while they run.
2. Nagios runs amok when the event handler hits an early_timeout,
because the main process tries to read from the killed pipe in a
never-ending loop.
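
For the second problem, one general way to keep a pipe reader from waiting forever is to bound the read loop with `poll()` and a deadline. This is a different technique from the one used in the attached patch and is only a sketch under that assumption. The names `read_with_deadline` and `demo_deadline` are hypothetical and do not exist in Nagios.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Milliseconds elapsed since *start (wall clock). */
static long ms_since(const struct timeval *start)
{
    struct timeval now;
    gettimeofday(&now, NULL);
    return (now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_usec - start->tv_usec) / 1000L;
}

/* Hypothetical helper: read from fd into buf until EOF, until buf is full,
 * or until timeout_ms has elapsed. Sets *timed_out to 1 if the deadline was
 * hit. Returns the number of bytes read, or -1 on a non-recoverable error. */
ssize_t read_with_deadline(int fd, char *buf, size_t cap,
                           int timeout_ms, int *timed_out)
{
    struct timeval start;
    size_t used = 0;

    assert(buf != NULL && timed_out != NULL);
    *timed_out = 0;
    gettimeofday(&start, NULL);

    while (used < cap) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        long left = timeout_ms - ms_since(&start);
        int ready;
        ssize_t n;

        if (left <= 0) {              /* deadline already passed */
            *timed_out = 1;
            break;
        }
        ready = poll(&pfd, 1, (int)left);
        if (ready == -1) {
            if (errno == EINTR)       /* interrupted by a signal, retry */
                continue;
            return -1;
        }
        if (ready == 0) {             /* nothing arrived before the deadline */
            *timed_out = 1;
            break;
        }
        n = read(fd, buf + used, cap - used);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)                   /* EOF: every writer closed the pipe */
            break;
        used += (size_t)n;
    }
    return (ssize_t)used;
}

/* Usage example: returns the timed_out flag (or -1 on error) after reading
 * from a pipe whose writer either lingers or closes its end. */
int demo_deadline(int keep_writer_open)
{
    int fd[2], timed_out = 0;
    char buf[64];
    ssize_t n;

    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], "ok", 2) != 2) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (!keep_writer_open)
        close(fd[1]);

    n = read_with_deadline(fd[0], buf, sizeof(buf), 200, &timed_out);

    close(fd[0]);
    if (keep_writer_open)
        close(fd[1]);
    return (n == 2) ? timed_out : -1;
}
```

With a lingering writer, `demo_deadline(1)` gives up after roughly 200 ms and reports a timeout. With the writer closed, `demo_deadline(0)` sees EOF and reports none. The caller still has to decide what to do with the child process after a timeout, for example signalling its process group as the patch does.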

I wrote a small patch for the second issue; maybe someone with more C
skills wants to have a look...

Regards,
Sven

Attachment: nagios_utils.c_mysystem.diff

--- nagios-3.0.4/base/utils.c 2008-10-15 19:43:55.000000000 +0200
+++ nagios-3.0.4-patched/base/utils.c 2008-10-30 16:40:50.000000000 +0100
@@ -581,6 +581,20 @@
/* initialize dynamic buffer */
dbuf_init(&output_dbuf,dbuf_chunk);

+ /* if there was a critical return code and no output AND the command time exceeded the timeout thresholds, assume a timeout */
+ if(result==STATE_CRITICAL && bytes_read==-1 && (end_time.tv_sec-start_time.tv_sec)>=timeout){
+
+ /* set the early timeout flag */
+ *early_timeout=TRUE;
+
+ /* try to kill the command that timed out by sending termination signal to child process group */
+ kill((pid_t)(-pid),SIGTERM);
+ sleep(1);
+ kill((pid_t)(-pid),SIGKILL);
+ }
+
+ log_debug_info(DEBUGL_COMMANDS,1,"Execution time=%.3f sec\n, early timeout=%d, result=%d\n",*exectime,*early_timeout,result);
+
/* try and read the results from the command output (retry if we encountered a signal) */
do{
bytes_read=read(fd[0],buffer,sizeof(buffer)-1);
@@ -604,7 +618,7 @@
if(bytes_read==0)
break;

- }while(1);
+ }while(!*early_timeout);

/* cap output length - this isn't necessary, but it keeps runaway plugin output from causing problems */
if(max_output_length>0 && output_dbuf.used_size>max_output_length)
@@ -616,20 +630,6 @@
/* free memory */
dbuf_free(&output_dbuf);

- /* if there was a critical return code and no output AND the command time exceeded the timeout thresholds, assume a timeout */
- if(result==STATE_CRITICAL && bytes_read==-1 && (end_time.tv_sec-start_time.tv_sec)>=timeout){
-
- /* set the early timeout flag */
- *early_timeout=TRUE;
-
- /* try to kill the command that timed out by sending termination signal to child process group */
- kill((pid_t)(-pid),SIGTERM);
- sleep(1);
- kill((pid_t)(-pid),SIGKILL);
- }
-
- log_debug_info(DEBUGL_COMMANDS,1,"Execution time=%.3f sec\n, early timeout=%d, result=%d",*exectime,*early_timeout,result);
-
#ifdef USE_EVENT_BROKER
/* send data to event broker */
broker_system_command(NEBTYPE_SYSTEM_COMMAND_END,NEBFLAG_NONE,NEBATTR_NONE,start_time,end_time,*exectime,timeout,*early_timeout,result,cmd,(output==NULL)?NULL:*output,NULL);
@@ -4565,4 +4565,3 @@

return OK;
}
-

This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]