Notification count never resets -> escalations don't trigger

Post by **a7ger** » Fri Jul 13, 2018 11:20 am

I have set up a host escalation that triggers at the 3rd notification for a certain host. When the host was first defined, the notification count started at 0, so the escalation was triggered 3 notifications later and worked perfectly. But after that problem was resolved and the host stabilized, the notification count stayed at 5 and never reset. Even 3 weeks later, when we had another issue, the escalation happened immediately rather than waiting until the third notification, because the count had never reset after the last problem was resolved.

How do I get the notification count to zero out every time the host stabilizes in an UP state? I want the first 2 notifications to go to the default contact and the 3rd to escalate, FOR EACH PROBLEM. Problems should be separated by the host reaching a hard UP state.

Code:

define host {
        host_name                       devopsTestVm
        alias                           devopsTestVm
        display_name                    devopsTestVm
        address                         10.87.63.91
        parents
        check_command                   check-host-alive
        initial_state                   o
        max_check_attempts              1
        check_interval                  1
        retry_interval                  1
        active_checks_enabled           1
        passive_checks_enabled          0
        check_period                    24x7
        obsess_over_host                0
        check_freshness                 0
        freshness_threshold             0
        event_handler_enabled           0
        low_flap_threshold              1
        high_flap_threshold             5
        flap_detection_enabled          0
        flap_detection_options          o,d,u
        process_perf_data               0
        retain_status_information       0
        retain_nonstatus_information    0
        contacts                        slack-test-1
        notification_interval           1
        first_notification_delay        5
        notification_period             24x7
        notification_options            d,u,r,s
        notifications_enabled           1
        stalking_options
        notes                           no notes
        notes_url
        action_url
        icon_image
        icon_image_alt
        vrml_image
        statusmap_image
}

define hostescalation {
        host_name                       devopsTestVm
        contacts                        slack-test-2
        first_notification              3
        last_notification               0
        notification_interval           0
        escalation_period               24x7
        escalation_options              d,u,r
}
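To make the expected behavior concrete, here is a toy model of the two definitions above. It is only a sketch, not Nagios Core's actual logic: the function names and the event list are made up for illustration, and it assumes that an escalation with `last_notification 0` stays in effect until recovery and that escalated notifications go to the escalation's contacts instead of the host's.

```python
def escalation_applies(notification_number, first_notification=3, last_notification=0):
    """True if the escalation range covers this notification number.

    last_notification == 0 is treated as open-ended (assumption stated above).
    """
    if notification_number < first_notification:
        return False
    return last_notification == 0 or notification_number <= last_notification


def simulate(events, reset_on_recovery=True):
    """Return the contact that would receive each problem notification.

    events is a list of "problem" / "recovery" strings (hypothetical input).
    """
    count = 0
    recipients = []
    for event in events:
        if event == "problem":
            count += 1
            if escalation_applies(count):
                recipients.append("slack-test-2")
            else:
                recipients.append("slack-test-1")
        elif event == "recovery" and reset_on_recovery:
            count = 0  # the behavior I expect on a hard UP state
    return recipients


# Five notifications, a recovery, then a second problem with three notifications.
events = ["problem"] * 5 + ["recovery"] + ["problem"] * 3
expected = simulate(events, reset_on_recovery=True)
observed = simulate(events, reset_on_recovery=False)
```

With the reset, the second problem starts again at `slack-test-1`; without it, every notification of the second problem goes straight to `slack-test-2`, which matches what I am seeing.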

Post by **cdienger** » Fri Jul 13, 2018 4:05 pm

What version of Core is this? Do you see the current_state, last_state_change, last_hard_state_change, and last_time_* lines update accurately in /usr/local/nagios/var/status.dat when it goes back into an OK state?
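If it helps, a small script along these lines can pull those fields out for comparison before and after a recovery. It is a rough sketch that assumes the usual `hoststatus { key=value ... }` block layout in status.dat; `parse_hoststatus` and the sample text are illustrative only, and the sample values are invented.

```python
import re


def parse_hoststatus(text, host_name):
    """Return the key=value fields of the hoststatus block for host_name, or None."""
    for match in re.finditer(r"hoststatus\s*\{(.*?)\n\s*\}", text, re.DOTALL):
        fields = {}
        for line in match.group(1).splitlines():
            line = line.strip()
            if "=" in line:
                # Split on the first "=" only, since values may contain "=".
                key, _, value = line.partition("=")
                fields[key] = value
        if fields.get("host_name") == host_name:
            return fields
    return None


# Invented sample in the assumed layout; a real file has many more fields.
SAMPLE = """hoststatus {
    host_name=devopsTestVm
    current_state=0
    last_state_change=1531750000
    last_hard_state_change=1531750000
    }
"""

status = parse_hoststatus(SAMPLE, "devopsTestVm")
for key in ("current_state", "last_state_change", "last_hard_state_change"):
    print(key, status.get(key))
```

To run it against a live system, read the text from /usr/local/nagios/var/status.dat instead of `SAMPLE`, and add the notification counter field to the list of keys as it is named in your file.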

Post by **a7ger** » Mon Jul 16, 2018 11:22 am

I am running Nagios Core 4.4.0.

With my limited knowledge of what the values should be, all of the values you mentioned seem correct in `/usr/local/nagios/var/status.dat` before, during and after the very first problem my host has. However, after the first problem, the `current_notification_count` does not reset to zero. You can see this in the after portion here:

Attachment: beforeDuringAfterStatus.txt (host state info from `/usr/local/nagios/var/status.dat` from before, during, and after the host has a problem)

Note: I had to restart Nagios to get the notification count to zero out so I could provide "before" status info.

Thanks so much for the help

Post by **cdienger** » Wed Jul 18, 2018 4:01 pm

I've been able to reproduce this and am currently pinging dev about it. Thanks for bringing it to our attention. I'll try to have an update for you as soon as possible.

Post by **a7ger** » Sat Jul 21, 2018 11:21 pm

Great! Thank you!

Post by **cdienger** » Mon Jul 23, 2018 9:58 am

Here's a link to the issue on GitHub: https://github.com/NagiosEnterprises/na ... issues/557. It looks like the problem has been identified, and we'll be working on a fix.

Post by **a7ger** » Tue Jul 24, 2018 10:02 pm

Great! When the problem is fixed, will you notify me here? Or should I follow the GitHub issue?

Post by **cdienger** » Wed Jul 25, 2018 2:27 pm

Feel free to follow both, but we'll update this thread once a fix is available.

Post by **gornm565** » Mon Jul 30, 2018 3:02 pm

What's the ETA on this fix?

Post by **scottwilkerson** » Tue Jul 31, 2018 2:33 pm

gornm565 wrote: What's the ETA on this fix?

At present the solution is still being worked on; we do not have an ETA.
