Re: [Nagios-devel] Nagios retries checks too soon.

Post by **Guest** » Fri Jun 10, 2011 6:54 am

On 06/09/2011 08:14 PM, Paul M. Dubuc wrote:
> Andreas Ericsson wrote:
>> I'm not sure. I'm also not sure which behaviour is intended. Arguably, either
>> is correct and Nagios is doing one of two right things.
> I'm not sure. If a test times out and Nagios tries again 10 seconds later
> instead of the 60 seconds specified, that could cause problems; load related
> problems when you have many of these tests running and timing out and problems
> for the system under test not having sufficient time to recover before the
> next check is done.

True, but *if* someone has the latter kind of problem, I'd expect them to
keep it in mind while writing the configuration, too. IIRC, the actual
code adds check_interval/retry_interval to the variable that holds the
(previous) scheduled check time, i.e., the time when the previous check
was supposedly *started* (assuming negligible check latency), not the
time when it finished.
Configuring a retry_interval of one minute for a service whose sustainable
request rate may be *less* than one per minute sounds dubious to me.
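
To illustrate the scheduling arithmetic described above, here is a minimal sketch. It is not the actual Nagios code: the function names and the 50-second timeout are assumptions made up for the example. Only the 60-second retry_interval and the resulting 10-second gap come from this thread.

```python
def next_scheduled_start(prev_scheduled_start: int, interval_s: int) -> int:
    """Hypothetical model: the next check is scheduled relative to the
    time the previous check was scheduled to START, not when it ended."""
    return prev_scheduled_start + interval_s


def idle_gap_after_check(prev_scheduled_start: int, runtime_s: int, interval_s: int) -> int:
    """Seconds between the end of the previous check and the start of the
    next one, assuming negligible check latency."""
    finished_at = prev_scheduled_start + runtime_s
    return next_scheduled_start(prev_scheduled_start, interval_s) - finished_at


# Assumed numbers: the check starts at t=0 and times out after 50 s;
# retry_interval is 60 s.
gap = idle_gap_after_check(prev_scheduled_start=0, runtime_s=50, interval_s=60)
print(gap)  # 10
```

Under this model the retry starts 60 seconds after the previous *start*, so a check that runs for most of the interval before timing out leaves the service only a short pause before the next attempt.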

(And I'm a firm nonbeliever in Unix-ish "load" figures, as opposed to
actual CPU usage etc., but that's a different rant.)

Kind regards,
J. Bern
--
Jochen Bern, Systemingenieur --- LINworks GmbH
Postfach 100121, 64201 Darmstadt | Robert-Koch-Str. 9, 64331 Weiterstadt
PGP (1024D/4096g) FP = D18B 41B1 16C0 11BA 7F8C DCF7 E1D5 FAF4 444E 1C27
Tel. +49 6151 9067-231, Zentr. -0, Fax -299 - Amtsg. Darmstadt HRB 85202
Unternehmenssitz Weiterstadt, Geschäftsführer Metin Dogan, Oliver Michel

This post was automatically imported from historical nagios-devel mailing list archives