Nagios Core 4.3.4 - retry_interval is getting ignored

Post by **charangandra** » Tue Jul 31, 2018 8:02 am

Hi,

I've configured retry_interval to check the host status every minute until the host state becomes HARD. However, Nagios is only performing the checks at the check_interval period.

Here is my Host Definition.

define host {
        name                       passive-host          ; The name of this host template
        active_checks_enabled      0                     ; Active host checks are disabled
        passive_checks_enabled     1                     ; Passive host checks are enabled
        flap_detection_enabled     1                     ; Flap detection is enabled
        check_freshness            1                     ; Freshness checks are enabled
        freshness_threshold        360                   ; Set the freshness threshold to 6 minutes (360 seconds)
        check_period               24x7                  ; Host checks can run at any time
        max_check_attempts         3                     ; Re-check the host up to 3 times in order to determine its final (hard) state
        check_interval             5                     ; Check the host every 5 minutes under normal conditions
        retry_interval             1                     ; Re-check the host every minute until a hard state can be determined
        contact_groups             admins,view_only      ; Notifications get sent to everyone in the 'admins' and 'view_only' groups
        check_command              check-host-alive      ; Default command to check remote hosts
        notifications_enabled      0                     ; Host notifications are disabled
        notification_options       d,u,r                 ; Send notifications about down, unreachable, and recovery events
        notification_interval      60                    ; Re-notify about host problems every hour
        notification_period        24x7                  ; Notifications can be sent out at any time
        register                   0                     ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
}

I am using NCPA passive checks and Nagios Core 4.3.4.

I've not changed interval_length, which is set to 60, so I'm not sure why checks are not being carried out every minute until the state becomes HARD.

Thanks

Post by **mcapra** » Tue Jul 31, 2018 2:17 pm

Unless my understanding of passive checks is outdated, retry_interval and check_interval mean absolutely nothing when active_checks_enabled is 0. If the checks are only received passively, Nagios Core has nothing to schedule. Your client, on the other hand, might have something to schedule: the interval at which it should send data back to Nagios Core.

More info on passive checks in Nagios Core:
https://assets.nagios.com/downloads/nag ... hecks.html

More info on passive checks in NCPA:
https://www.nagios.org/ncpa/help/2.1/passive.html

In NCPA, your check interval is baked into the NCPA-side configuration. NCPA has no knowledge of anything going on in Nagios Core.
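To illustrate why Nagios Core has nothing to schedule here: a passive host result simply arrives from outside, for example as a PROCESS_HOST_CHECK_RESULT external command. The sketch below only formats such a command line. The function name and the sample host "web01" are made up for illustration, and NCPA itself may deliver results through a different path (such as NRDP), so treat this as a rough picture rather than what NCPA does internally.

```python
import time


def format_passive_host_result(host_name, status_code, plugin_output, timestamp=None):
    """Build a PROCESS_HOST_CHECK_RESULT external command line.

    status_code follows the host convention: 0 = UP, 1 = DOWN, 2 = UNREACHABLE.
    The helper name and its arguments are hypothetical, for illustration only.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[{}] PROCESS_HOST_CHECK_RESULT;{};{};{}".format(
        timestamp, host_name, status_code, plugin_output
    )


if __name__ == "__main__":
    # "web01" is a made-up host name; the sender decides when this gets sent.
    print(format_passive_host_result("web01", 0, "Host is alive"))
```

The timing of such a submission is decided entirely by whoever sends it, which is why check_interval and retry_interval on the Nagios side do not affect it.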

Post by **charangandra** » Tue Jul 31, 2018 4:42 pm

Thanks @mcapra

I've updated my host definition now,

define host {
        name                       passive-host          ; The name of this host template
        active_checks_enabled      0                     ; Active host checks are disabled
        passive_checks_enabled     1                     ; Passive host checks are enabled
        flap_detection_enabled     1                     ; Flap detection is enabled
        check_freshness            1                     ; Freshness checks are enabled
        freshness_threshold        300                   ; Set the freshness threshold to 5 minutes (300 seconds)
        check_period               24x7                  ; Host checks can run at any time
        max_check_attempts         3                     ; Re-check the host up to 3 times in order to determine its final (hard) state
        contact_groups             admins,view_only      ; Notifications get sent to everyone in the 'admins' and 'view_only' groups
        check_command              check-host-alive      ; Default command to check remote hosts
        notifications_enabled      0                     ; Host notifications are disabled
        notification_options       d,u,r                 ; Send notifications about down, unreachable, and recovery events
        notification_interval      60                    ; Re-notify about host problems every hour
        notification_period        24x7                  ; Notifications can be sent out at any time
        register                   0                     ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
}

This is my use case: passive checks are submitted every 5 minutes. If the host is down and Nagios doesn't receive a passive check result within 5 minutes, it issues an active check because check_freshness is enabled.

The host state type becomes HARD after 3 attempts, and that is working well. However, the state type becomes HARD nearly 15 minutes after the host goes down, and I want to reduce this to 7 minutes. Is that possible? That is, the first active check (triggered by the missing passive result) would run after 5 minutes, followed by two more checks at one-minute intervals, so that the host state type becomes HARD by the 7th minute.

Is that really possible with passive checks and check freshness?

Thanks,

Post by **cdienger** » Wed Aug 01, 2018 3:59 pm

It would require separate "run active check X number of times at Y interval if not fresh" logic that isn't built in. I was under the impression that max_check_attempts shouldn't apply to passive checks, but I haven't been able to test that. That said, it does appear to be a factor in your environment, and lowering the freshness_threshold value could help the state switch to HARD sooner.
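As a rough back-of-the-envelope check of the numbers in this thread, here is a minimal sketch. It assumes each expired freshness window triggers exactly one active check and that each failed check counts as one attempt toward max_check_attempts. That model is inferred from the roughly 15 minutes reported above and is not confirmed Nagios behavior; the function names are made up for illustration.

```python
def seconds_until_hard(freshness_threshold, max_check_attempts):
    """Estimate the time to a HARD state, assuming one failed
    freshness-triggered check per expired threshold (an assumption)."""
    return freshness_threshold * max_check_attempts


def threshold_for_target(target_seconds, max_check_attempts):
    """Largest whole-second freshness_threshold that reaches HARD
    within target_seconds under the same assumption."""
    return target_seconds // max_check_attempts


if __name__ == "__main__":
    # Values taken from the updated host definition in this thread.
    print(seconds_until_hard(300, 3) / 60)   # minutes until HARD at 300 s
    # Target of 7 minutes (420 s) with 3 attempts.
    print(threshold_for_target(420, 3))      # candidate freshness_threshold
```

Under that assumption, a threshold shorter than the 5-minute passive submission interval would also trigger active checks while the host is healthy, so it trades extra checks for faster detection.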
