
Recovery alert after no warning alert

Posted: Thu Sep 13, 2018 7:35 am
by amprantino
I have configured a service (definition below) so that it sends alerts only for critical and recovery states.
However, I am receiving numerous recovery alerts but no critical ones.
The service is indeed changing state between OK and Warning:
Snap1.png

Code:

define service{
        use                             generic-service
        host_name                       HOST
        service_description             Service_Name
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              2
        normal_check_interval           2
        retry_check_interval            2
        contact_groups                  net-admins
        notification_interval           240
        notification_period             24x7
        notification_options            c,r
        check_command                   COMMAND
}
Should I receive a recovery email even though I never received a critical alert for the service?

The documentation says:

notification_options: This directive is used to determine when notifications for the host should be sent out. Valid options are a combination of one or more of the following: d = send notifications on a DOWN state, u = send notifications on an UNREACHABLE state, r = send notifications on recoveries (OK state), f = send notifications when the host starts and stops flapping, and s = send notifications when scheduled downtime starts and ends. If you specify n (none) as an option, no host notifications will be sent out. If you do not specify any notification options, Nagios will assume that you want notifications to be sent out for all possible states. Example: If you specify d,r in this field, notifications will only be sent out when the host goes DOWN and when it recovers from a DOWN state.
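To make the expectation concrete, here is a minimal Python sketch of the behaviour I would expect from notification_options c,r. It is only a model of my understanding, not Nagios source code: the function name and state constants are made up for illustration, it looks only at hard state changes, and it ignores notification_interval, flapping and downtime. The assumption is that a recovery is sent only if a problem notification went out before it.

```python
# Hypothetical model of the expected service notification filter.
# Not Nagios code; names and simplifications are assumptions for illustration.
OK, WARNING, CRITICAL, UNKNOWN = "OK", "WARNING", "CRITICAL", "UNKNOWN"

# Service notification_options letters for the problem states.
OPTION_FOR_STATE = {WARNING: "w", CRITICAL: "c", UNKNOWN: "u"}


def expected_notifications(hard_states, options):
    """Return (index, kind) pairs for the notifications one would expect.

    hard_states: sequence of hard states, starting from an OK service.
    options: notification_options value, e.g. "c,r".
    """
    opts = {o.strip() for o in options.split(",")}
    sent = []
    previous = OK
    problem_notified = False  # has a problem notification gone out since the last OK?
    for i, state in enumerate(hard_states):
        if state == previous:
            continue  # no state change, so nothing new to report in this model
        if state == OK:
            # Assumption: recovery only follows an earlier problem notification.
            if problem_notified and "r" in opts:
                sent.append((i, "RECOVERY"))
            problem_notified = False
        elif OPTION_FOR_STATE[state] in opts:
            sent.append((i, state))
            problem_notified = True
        previous = state
    return sent


# The service in question only moves between OK and WARNING:
print(expected_notifications([WARNING, OK, WARNING, OK], "c,r"))
```

Under these assumptions the OK/Warning sequence produces no notifications at all, which is why the recovery emails look wrong to me.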

Re: Recovery alert after no warning alert

Posted: Thu Sep 13, 2018 11:28 am
by scottwilkerson
What version of Nagios Core are you running?

There were a few recent issues regarding this that should be fixed in the maint branch here:
https://github.com/NagiosEnterprises/na ... tree/maint

Re: Recovery alert after no warning alert

Posted: Thu Sep 13, 2018 5:12 pm
by amprantino
I am using 4.4.1 with some fixes from git, downloaded on 20180813, so it is probably version 4.4.2.
So have I hit a bug again?

Would you like me to install the current maint branch and check whether the problem is fixed?

Re: Recovery alert after no warning alert

Posted: Fri Sep 14, 2018 12:20 pm
by tgriep
Yes please, download and install the full Core 4.4.2 release.
One more thing: you will have to stop Nagios and delete the retention.dat file so that the system rebuilds the counters for the checks and notifications when it starts up.
Then see if it happens again.

Re: Recovery alert after no warning alert

Posted: Tue Sep 18, 2018 6:59 am
by amprantino
Sorry, but I cannot delete retention.dat! It would create chaos in my installation!

Wouldn't updating the state for only the specific services that have the problem fix it?

Re: Recovery alert after no warning alert

Posted: Tue Sep 18, 2018 8:46 am
by tgriep
Yes, you could edit just those entries in the retention.dat file instead of deleting them all.
That should work as well, since those services will still have to regenerate their counters.
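As a rough illustration of the per-service approach, the Python sketch below removes only the retained blocks for selected services from the text of a retention file. It assumes the file is made of blocks that open with a line such as "service {", contain key=value lines including host_name and service_description, and close with a line containing only "}". The function name is hypothetical, and which entries to drop is your call. Stop Nagios first and work on a backup copy of retention.dat.

```python
# Hypothetical helper, not part of Nagios. Assumes retention.dat consists of
# "<type> {" ... "}" blocks with one key=value pair per line.
def drop_service_blocks(retention_text, targets):
    """Return retention_text without the service blocks listed in targets.

    targets: set of (host_name, service_description) tuples to remove.
    """
    kept = []
    block = None  # lines of the block being read, or None when outside a block
    for line in retention_text.splitlines(keepends=True):
        stripped = line.strip()
        if block is None:
            if stripped.endswith("{"):
                block = [line]  # start of a new block
            else:
                kept.append(line)  # comments and blank lines pass through
            continue
        block.append(line)
        if stripped == "}":
            header = block[0].strip().rstrip("{").strip()
            fields = dict(
                entry.strip().split("=", 1)
                for entry in block[1:-1]
                if "=" in entry
            )
            key = (fields.get("host_name"), fields.get("service_description"))
            if not (header == "service" and key in targets):
                kept.extend(block)  # keep every block that is not a target
            block = None
    if block is not None:
        kept.extend(block)  # unterminated block: leave it untouched
    return "".join(kept)


sample = (
    "service {\n"
    "host_name=HOST\n"
    "service_description=Service_Name\n"
    "current_state=1\n"
    "}\n"
)
print(repr(drop_service_blocks(sample, {("HOST", "Service_Name")})))
```

All other blocks are written back unchanged, so only the listed services lose their retained state when Nagios starts again.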

Re: Recovery alert after no warning alert

Posted: Mon Oct 15, 2018 4:14 am
by amprantino
Unfortunately, v.4.4.2 didn't solve the problem :(

Re: Recovery alert after no warning alert

Posted: Mon Oct 15, 2018 1:00 pm
by tgriep
Was the host check in a down state when the service was in a down state as well?
Does this issue describe how your system is behaving?
https://github.com/NagiosEnterprises/na ... issues/572