Re: [Nagios-devel] freshness_threshold bug - big problem




On 12/16/2010 12:03 PM, Rodney Ramos wrote:
> As I've said before, I think that it is a Nagios Core bug. I've tested it
> with Nagios 3.2.1 and I found the same problem.
> I think it's a serious problem.


Oh, wow. 8-O I can confirm the effect on my 3.2.3, but there seems to be
*more* of a problem with host freshness checks. Test run with
check_interval 15, retry_interval 2, max_check_attempts 4; log excerpt:


18:23:55 Warning: Host 'Unfresh' has no services associated with it!
18:24:28 EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;Unfresh;0;Manual Init to UP|
18:24:35 PASSIVE HOST CHECK: Unfresh;0;Manual Init to UP

18:39:55 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 12s (threshold=0d 0h 15m 16s). I'm forcing an immediate check of the host.
18:40:05 HOST ALERT: Unfresh;DOWN;SOFT;1;(null)

18:51:12 Warning: Host 'Unfresh' has no services associated with it!

18:56:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 59s (threshold=0d 0h 15m 17s). I'm forcing an immediate check of the host.
18:56:23 HOST ALERT: Unfresh;DOWN;SOFT;2;(null)
19:00:12 Warning: Host 'Unfresh' has no services associated with it!
19:12:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 45s (threshold=0d 0h 15m 15s). I'm forcing an immediate check of the host.
19:12:23 HOST ALERT: Unfresh;DOWN;SOFT;2;CRITICAL: All life functions terminated
19:28:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 39s (threshold=0d 0h 15m 18s). I'm forcing an immediate check of the host.
19:28:23 HOST ALERT: Unfresh;DOWN;SOFT;3;CRITICAL: All life functions terminated
19:44:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 39s (threshold=0d 0h 15m 18s). I'm forcing an immediate check of the host.
19:44:23 HOST ALERT: Unfresh;DOWN;HARD;4;CRITICAL: All life functions terminated
20:00:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 39s (threshold=0d 0h 15m 18s). I'm forcing an immediate check of the host.
20:16:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 41s (threshold=0d 0h 15m 17s). I'm forcing an immediate check of the host.
20:32:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 39s (threshold=0d 0h 15m 18s). I'm forcing an immediate check of the host.
20:48:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 45s (threshold=0d 0h 15m 15s). I'm forcing an immediate check of the host.
21:04:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 45s (threshold=0d 0h 15m 15s). I'm forcing an immediate check of the host.
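
The spacing of the forced checks in the log excerpt above can be checked mechanically. The following is only a minimal sketch: the helper name stale_gaps is made up for illustration, and it assumes every log line starts with an HH:MM:SS timestamp and that all lines fall on the same day.

```shell
# Print the gap (mm:ss) between consecutive "stale" warnings read from stdin.
# Assumption: lines begin with HH:MM:SS and do not cross midnight.
stale_gaps() {
    awk '/are stale by/ {
        split($1, t, ":")                       # t[1]=hours, t[2]=minutes, t[3]=seconds
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (seen) {
            gap = now - prev
            printf "%s +%02d:%02d\n", $1, int(gap / 60), gap % 60
        }
        prev = now
        seen = 1
    }'
}
```

Fed the excerpt above (for example with stale_gaps < excerpt.log, where excerpt.log is a hypothetical file holding those lines), the first gap comes out as 16:18 and every later one as 16:00. That is close to the reported threshold of roughly 15m 15s to 15m 18s and nowhere near the retry_interval of 2.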


(The additional "no services" crud stems from my not getting the check
command right the first time 'round, and having to re-reload the config.)
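
For anyone wanting to reproduce the test: the "Manual Init to UP" entry in the log is a passive result submitted through the PROCESS_HOST_CHECK_RESULT external command. A rough sketch of submitting one from a shell follows. The function name submit_host_result is hypothetical, and the location of the external command file depends on the installation, so it is passed in as an argument rather than assumed.

```shell
# Append a passive host check result to a Nagios external command file.
# Usage: submit_host_result <command_file> <host> <return_code> <plugin_output>
# Assumption: <command_file> is whatever command_file is set to in nagios.cfg.
submit_host_result() {
    cmd_file=$1
    host=$2
    code=$3
    output=$4
    # External commands have the form "[epoch_seconds] COMMAND;arg1;arg2;..."
    printf '[%s] PROCESS_HOST_CHECK_RESULT;%s;%s;%s\n' \
        "$(date +%s)" "$host" "$code" "$output" >> "$cmd_file"
}
```

A call such as submit_host_result /usr/local/nagios/var/rw/nagios.cmd Unfresh 0 "Manual Init to UP" would match the log entry, assuming that path (a common default for source installs, not something stated in this thread) is where the command file lives.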


I took excerpts of status.dat and retention.dat initially and after each of
the first nine active checks. Look at these current_attempt numbers:


# for FIL in *.dat* ; do echo -n "${FIL}: " | \
> sed -e 's/_[a-z]*-/-/' -e 's/\.[a-z]*: */:/' ; \
> egrep '(current_attempt|state_type|(current|last_hard)_state=)' \
> $FIL | sed -e 's/\([a-z][a-z][a-z]\)[a-z]*\([_=]\)/\1\2/g' | \
> tr '\n\t' ' ' ; echo "" ; done
retention.dat-OK: cur_sta=0 las_har_sta=0 cur_att=1 sta_typ=1
retention.dat-1: cur_sta=0 las_har_sta=0 cur_att=1 sta_typ=1
retention.dat-2: cur_sta=1 las_har_sta=0 cur_att=1 sta_typ=0
retention.dat-3: cur_sta=1 las_har_sta=0 cur_att=2 sta_typ=0
retention.dat-4: cur_sta=1 las_har_sta=0 cur_att=2 sta_typ=0
retention.dat-5: cur_sta=1 las_har_sta=0 cur_att=2 sta_typ=0
retention.dat-6: cur_sta=1 las_har_sta=0 cur_att=4 sta_typ=1
retention.dat-7: cur_sta=1 las_har_sta=0 cur_att=4 sta_typ=1
retention.dat-8: cur_sta=1 las_har_sta=0 cur_att=4 sta_typ=1
retention.dat-9: cur_sta=3D1 las_har_sta=3D0 cur_att=3

...[email truncated]...


This post was automatically imported from historical nagios-devel mailing list archives
Original poster: [email protected]