
Re: host check orphaned

Posted: Wed Mar 11, 2015 9:58 am
by bosecorp
It's actually for both.

But over time I see the number going up and down.

Re: host check orphaned

Posted: Wed Mar 11, 2015 2:01 pm
by jdalrymple
bosecorp

From: http://nagios.sourceforge.net/docs/3_0/configmain.html
Host Check Timeout

Format: host_check_timeout=<seconds>
Example: host_check_timeout=60

This is the maximum number of seconds that Nagios will allow host checks to run. If checks exceed this limit, they are killed and a CRITICAL state is returned and the host will be assumed to be DOWN. A timeout error will also be logged.

There is often widespread confusion as to what this option really does. It is meant to be used as a last ditch mechanism to kill off plugins which are misbehaving and not exiting in a timely manner. It should be set to something high (like 60 seconds or more), so that each host check normally finishes executing within this time limit. If a host check runs longer than this limit, Nagios will kill it off thinking it is a runaway process.

Increasing this limit has no immediate negative impact on Nagios. It is often necessary if you have some very rigorous custom plugins that need a lot of time to run. It can mask problems, though, where plugins aren't finishing in what would otherwise be considered an appropriate amount of time. I would just keep an eye on the system for long-running plugins. Use the "Monitoring Engine Performance" chart from the Performance link on the Home page.
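
To see which limit is currently in effect, the directive can be read straight out of the main config file. This is only a minimal sketch: the helper name get_host_check_timeout is made up for illustration, and the default path /usr/local/nagios/etc/nagios.cfg is an assumption about a typical install, so pass your own path if it differs.

```shell
#!/bin/sh
# Hypothetical helper: print the value of host_check_timeout from a Nagios
# main config file. The default path is an assumption; pass your own as $1.
get_host_check_timeout() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Commented-out lines (starting with #) do not match the pattern.
    # If the directive appears more than once, the last value is printed.
    awk -F= '
        /^[ \t]*host_check_timeout[ \t]*=/ {
            gsub(/[ \t\r]/, "", $2)   # strip stray whitespace around the value
            value = $2
        }
        END { if (value != "") print value }
    ' "$cfg"
}
```

Calling get_host_check_timeout with no argument reads the assumed default path; it prints nothing if the directive is not set in that file.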

Re: host check orphaned

Posted: Wed Mar 11, 2015 2:11 pm
by bosecorp
Thanks for the response. It's very clear now what that functionality does.

Now, I still have "host check orphaned". Nothing that I have tried has worked.

Re: host check orphaned

Posted: Wed Mar 11, 2015 3:33 pm
by jdalrymple
Do we know if the host checks that are timing out are for hosts that really exist?

As abrist asked - what do you get when you run the host check from the gearman server command line?

Re: host check orphaned

Posted: Wed Mar 11, 2015 3:42 pm
by bosecorp
How do I run the host check from the gearman server command line?

Also, how do I check which plugin is taking longer than expected?

Re: host check orphaned

Posted: Wed Mar 11, 2015 3:52 pm
by jdalrymple
If using XI defaults, log in to the gearman server and execute:

Code:

/usr/local/nagios/libexec/check_icmp -H hostip -p 5

where hostip is the IP address of the host to be monitored. Your path may be different from the one I included, but check_icmp should be the plugin being used.
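
To also see how long a plugin takes when run by hand, the command can be wrapped in a small timing helper. This is a rough sketch, not anything shipped with Nagios or XI: the name time_check is made up, and it assumes a date command that supports +%s (GNU and BSD date both do).

```shell
#!/bin/sh
# Hypothetical helper: run a plugin command as given, let its normal output
# through, then report its exit status and wall-clock time in whole seconds.
# Nagios plugin exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
time_check() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
}
```

For example, time_check /usr/local/nagios/libexec/check_icmp -H hostip -p 5 prints the plugin output followed by a summary line. An elapsed time close to your host_check_timeout would suggest that the plugin itself is the slow part.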

I would expect that your XI interface would indicate what hosts the checks are timing out on, does it not?

Re: host check orphaned

Posted: Wed Mar 11, 2015 3:54 pm
by bosecorp
Check Type               1-min   5-min   15-min
Active Host Checks       0       0       46
Passive Host Checks      0       0       136
Active Service Checks    0       0       6,280
Passive Service Checks   0       0       648

&

Metric                         Min        Max          Avg
Host Check Latency             0.00 sec   561.71 sec   12.94 sec
Host Check Execution Time      0.00 sec   10.06 sec    1.21 sec
Service Check Latency          0.00 sec   22.31 sec    2.53 sec
Service Check Execution Time   0.00 sec   10.00 sec    0.07 sec

Re: host check orphaned

Posted: Wed Mar 11, 2015 4:00 pm
by jdalrymple
I meant from your Home --> Host Detail page, not from the Home --> Performance page.

Re: host check orphaned

Posted: Wed Mar 11, 2015 4:08 pm
by bosecorp
What I am trying to do is schedule a forced immediate check, but I still see the same thing.

Re: host check orphaned

Posted: Wed Mar 11, 2015 4:15 pm
by jdalrymple
Are ALL of your hosts broken? Are ALL of your hosts monitored through this broken gearman server? Are ALL of the hosts on that gearman server showing a broken status?