
Re: [Nagios-devel] Antwort: Re: Antwort: Host check retries -

Posted: Tue Jun 17, 2008 8:19 am
by Guest






Indeed, it is a Nagios 3.x bug :(

I did some debugging in the base/check.c file and found the faulty function:

/* check for hosts that never returned from a check... */
void check_for_orphaned_hosts(void){
        host *temp_host=NULL;
        time_t current_time=0L;
        time_t expected_time=0L;


       
        log_debug_info(DEBUGL_FUNCTIONS,0,"check_for_orphaned_hosts()\n");

        /* get the current time */
        time(&current_time);

        /* check all hosts... */
       
        for(temp_host=host_list;temp_host!=NULL;temp_host=temp_host->next){

                /* skip hosts that are not currently executing */
                if(temp_host->is_executing==FALSE)
                        continue;

                /* determine the time at which the check results should have come in (allow 10 minutes slack time) */
                expected_time=(time_t)(temp_host->next_check+temp_host->latency+host_check_timeout+check_reaper_interval+600);

                /* this host was supposed to have executed a while ago, but for some reason the results haven't come back in... */
                if(expected_time<current_time){

                        /* log a warning */
                        logit(NSLOG_RUNTIME_WARNING,TRUE,"Warning: The check of host '%s' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the host...\n",temp_host->name);

                        /* extra debug output (my additions); time_t values are cast for a matching format specifier */
                        logit(NSLOG_RUNTIME_WARNING,TRUE,"----------------  %lu < %lu\n",(unsigned long)expected_time,(unsigned long)current_time);
                        logit(NSLOG_RUNTIME_WARNING,TRUE,"---------------- next_check            %lu\n",(unsigned long)temp_host->next_check);
                        logit(NSLOG_RUNTIME_WARNING,TRUE,"---------------- latency               %g\n",temp_host->latency);
                        logit(NSLOG_RUNTIME_WARNING,TRUE,"---------------- host_check_timeout    %d\n",host_check_timeout);
                        logit(NSLOG_RUNTIME_WARNING,TRUE,"---------------- check_reaper_interval %d\n",check_reaper_interval);

                        log_debug_info(DEBUGL_CHECKS,1,"Host '%s' was orphaned, so we're scheduling an immediate check...\n",temp_host->name);

                        /* decrement the number of running host checks */
                        if(currently_running_host_checks>0)
                                currently_running_host_checks--;

                        /* disable the executing flag */
                        temp_host->is_executing=FALSE;

                        /* schedule an immediate check of the host */
                       
                        schedule_host_check(temp_host,current_time,CHECK_OPTION_ORPHAN_CHECK);
                        }

                }

        return;
        }

Some of the logit() calls are my own debugging additions. When Nagios
looks for orphaned host checks, it compares the current time with the
time by which the result was expected to appear. This expected time is
the sum of the next scheduled check time, the latency, the
host_check_timeout and check_reaper_interval directives plus 10

...[email truncated]...
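
To make the comparison described above easier to follow, here is a minimal standalone sketch of the same arithmetic. It is not the Nagios code itself: the sketch_host struct, the helper names and the ORPHAN_SLACK_SECONDS constant are made up for illustration, and only the fields used in the formula are modelled. The 600 seconds correspond to the "10 minutes slack time" mentioned in the source comment.

```c
#include <time.h>

/* Hypothetical, trimmed-down stand-in for the Nagios host struct;
 * only the fields used by the orphan test are kept. */
typedef struct {
    const char *name;
    int is_executing;   /* non-zero while a check is in flight */
    time_t next_check;  /* time the check was scheduled for */
    double latency;     /* scheduling latency in seconds */
} sketch_host;

/* Slack from the quoted source comment: 10 minutes (assumed name). */
#define ORPHAN_SLACK_SECONDS 600

/* Latest time at which the check result is still expected:
 * next_check + latency + timeout + reaper interval + slack. */
time_t expected_result_time(const sketch_host *h,
                            int host_check_timeout,
                            int check_reaper_interval)
{
    return (time_t)(h->next_check + h->latency + host_check_timeout
                    + check_reaper_interval + ORPHAN_SLACK_SECONDS);
}

/* Returns 1 if the host is marked as executing and its deadline
 * has already passed, 0 otherwise. */
int looks_orphaned(const sketch_host *h, time_t now,
                   int host_check_timeout, int check_reaper_interval)
{
    if (!h->is_executing)
        return 0;
    return expected_result_time(h, host_check_timeout,
                                check_reaper_interval) < now;
}
```

With illustrative values next_check = 1000, latency = 0.5, host_check_timeout = 30 and check_reaper_interval = 10, the deadline works out to 1640 (the fractional part is dropped by the cast), so the host would only be flagged once the current time reaches 1641 or later.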


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]