Notification interval malfunctioning

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
frankwijers
Posts: 20
Joined: Fri Apr 09, 2021 6:22 am

Notification interval malfunctioning

Post by frankwijers »

Hello,

I'm currently finishing our Nagios XI setup, which has a lot of old cfg files imported from a (really old) version of Nagios.
The only thing I cannot figure out is the notification interval. For instance, I set the interval for the Veeam backup checks to 5 minutes, but I receive mails every 10 minutes. I cannot work out why this is happening. Can anybody explain?

Service definition:

define service {
    service_description      veeam_BACKUP_JOBS
    use                      TMPL_generic
    hostgroup_name           VEEAM_BACKUP_SERVERS
    check_command            CHK_VEEAM_JOBS!!!!!!!!
    max_check_attempts       3
    check_interval           5
    retry_interval           5
    notification_interval    5
    notification_period      24x7
    notification_options     w,c,
    notifications_enabled    1
    _service_id              1479
    register                 1
}

It uses a template, but the definitions in there are nothing unusual:

define service {
    name                            TMPL_generic
    service_description             TMPL_generic
    is_volatile                     0
    max_check_attempts              3
    check_interval                  15
    retry_interval                  5
    active_checks_enabled           1
    passive_checks_enabled          0
    check_period                    CHK_24x7
    register                        0
}

Any help would be appreciated.

EDIT:
Since posting this, I did some more testing. It seems that notifications are delayed by 5 minutes on top of the interval: I set the service above to notification_interval 15 and now I get the mails every 20 minutes.
I'm absolutely stunned that I cannot find where those 5 minutes are coming from.
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Notification interval malfunctioning

Post by vtrac »

Hi,
How are you doing?

Below is the definition of "notification interval":
Service - notification interval

This directive is used to define the number of "time units" to wait before re-notifying a contact that this service is still in a non-OK state. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. If you set this value to 0, Nagios will not re-notify contacts about problems for this service - only one problem notification will be sent out.

Parameter name: notification_interval
Required: yes
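
As a rough illustration of the "time units" wording above, the conversion can be sketched like this. The function name is made up for this example and is not part of Nagios; it only restates the arithmetic from the documentation text (time units multiplied by interval_length, with 0 meaning no re-notification).

```python
from typing import Optional


def renotify_delay_seconds(notification_interval: int,
                           interval_length: int = 60) -> Optional[int]:
    """Convert a notification_interval in "time units" to seconds.

    Hypothetical helper for illustration only. A value of 0 means
    "do not re-notify", which is represented here as None.
    """
    if notification_interval == 0:
        return None
    return notification_interval * interval_length


print(renotify_delay_seconds(5))   # 5 time units of 60 seconds each
print(renotify_delay_seconds(60))  # the "60 minutes" setting
print(renotify_delay_seconds(0))   # no re-notification
```

With the default interval_length of 60, a notification_interval of 5 works out to 300 seconds.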
I usually set this value to "60" minutes, just to re-notify that this service is still in a non-OK state.

Since you set this value so small (5 minutes), you will be re-notified just 5 minutes after the issue has been identified.


Best Regards,
Vinh
frankwijers
Posts: 20
Joined: Fri Apr 09, 2021 6:22 am

Re: Notification interval malfunctioning

Post by frankwijers »

Hi vtrac, thanks for taking the time to comment on this. It's appreciated.

You are correct. Usually, we also set these notifications to 60 minutes, but for now (while testing the mails and their layout) we decreased the notification interval (not the interval_length! That's still at 60 seconds.).
If I set the notification interval to 60 minutes, I receive mails every 65 minutes, which is not what we expect. I can set it to 55 to get them every 60 minutes, but I want to figure out where those 5 minutes are coming from.

To make it clear, let's put it on a timeline.
Service A is already in a HARD state, so during the notification hours (24x7), it should send out notifications to the contacts (me).
If notification1 is sent at a certain time (let's say 09:00) and the notification interval is 60 minutes, I would expect notification2 at 10:00, but instead I receive it at 10:05. Notification3 will be received at 11:10.
For every notification, there is an extra 5 minutes more than the interval.
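
The timeline above can be reproduced with a small simulation. This is only a sketch of one possible explanation, not something confirmed in this thread: it assumes that a re-notification can only go out when a check result is processed, that checks land on a fixed 5-minute grid, and that the next due time is computed a moment (lag_seconds, a made-up parameter) after the check that triggered the previous mail. The function name and the date are arbitrary.

```python
from datetime import datetime, timedelta


def simulate_notifications(first_sent, notification_interval, check_interval,
                           count, lag_seconds=1):
    """Return `count` notification times for a service stuck in a HARD state.

    Hypothetical model (an assumption, not confirmed Nagios behaviour):
    a re-notification is only evaluated when a check result arrives, and
    the next due time is stamped `lag_seconds` after that check.
    Intervals are in minutes.
    """
    step = timedelta(minutes=check_interval)
    wait = timedelta(minutes=notification_interval, seconds=lag_seconds)
    sent = [first_sent]
    check_time = first_sent
    next_due = first_sent + wait
    while len(sent) < count:
        check_time += step              # next check on the fixed grid
        if check_time >= next_due:      # due time has passed: send now
            sent.append(check_time)
            next_due = check_time + wait
    return sent


start = datetime(2021, 1, 1, 9, 0)  # arbitrary date, 09:00 as in the post
for t in simulate_notifications(start, notification_interval=60,
                                check_interval=5, count=3):
    print(t.strftime("%H:%M"))
```

Under these assumptions the check at 10:00 arrives just before the notification is due, so the mail waits for the 10:05 check, and the same thing repeats every cycle. The model also matches the 55-minute workaround giving a mail every 60 minutes.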
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Notification interval malfunctioning

Post by vtrac »

Hi,
Hope you are having a good day!!

I believe those "5 minutes" come from your "retry_interval = 5 minutes".

Can you please try setting "retry_interval = 1 minute" and see what you get?
Service - retry interval

This directive is used to define the number of "time units" to wait before scheduling a re-check of the service. Services are rescheduled at the retry interval when they have changed to a non-OK state. Once the service has been retried max_check_attempts times without a change in its status, it will revert to being scheduled at its "normal" rate as defined by the check_interval value. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.

Parameter name: retry_interval
Required: yes
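
To make the quoted scheduling rule concrete, here is a minimal sketch of which interval applies to the next check. The function and argument names are invented for this example; it only restates the documentation text above and is not taken from the Nagios source.

```python
def next_check_delay(state_ok, current_attempt, max_check_attempts,
                     check_interval, retry_interval):
    """Pick the interval (in time units) used to schedule the next check.

    Hypothetical helper: while the service is non-OK and has not yet been
    retried max_check_attempts times, the retry_interval applies;
    otherwise the normal check_interval is used.
    """
    if not state_ok and current_attempt < max_check_attempts:
        return retry_interval
    return check_interval


# Settings from the suggestion above: check_interval 5, retry_interval 1,
# max_check_attempts 3, service in a non-OK state.
for attempt in (1, 2, 3):
    print(attempt, next_check_delay(False, attempt, 3, 5, 1))
```

By this reading, once the service has used up its attempts, the retry_interval no longer plays a role and checks run at the check_interval again.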
Best Regards,
Vinh
frankwijers
Posts: 20
Joined: Fri Apr 09, 2021 6:22 am

Re: Notification interval malfunctioning

Post by frankwijers »

Hello Vinh,

As I understand it, check settings do not have much to do with notification settings. Per the documentation:
notification_interval: This directive is used to define the number of "time units" to wait before re-notifying a contact that this service is still down or unreachable. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. If you set this value to 0, Nagios will not re-notify contacts about problems for this host - only one problem notification will be sent out.

I tried the setting you suggested:
Check interval: 5 minutes
Retry interval: 1 minute
Notification period: 24x7
Notification interval: 5 minutes

But I still receive mails every 10 minutes. When the configurations are done, I want them every hour. As a workaround I can set it to 55 minutes to receive them every 60 minutes, but I want to know and understand where the extra 5 minutes come from.

Frank
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Notification interval malfunctioning

Post by vtrac »

Hi,
I understand your frustration and I'm sorry for all the trouble this has caused.

Could you please attach the "profile.zip"? ... just so I can check ... :-)

Please also let me know the name of the host & service you are having an issue with.


Best Regards,
Vinh
frankwijers
Posts: 20
Joined: Fri Apr 09, 2021 6:22 am

Re: Notification interval malfunctioning

Post by frankwijers »

Hi Vinh,

Sorry for my late reply. We went live with the current setup last week and had to add a lot of hosts/services to make it in time.
I attached the file you requested; I hope it gives you some information.
Again, it's not a big issue, I just want to know where the extra time comes from. Worst case, we can schedule it to notify every 59 minutes instead of 60, or something similar.

Frank
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Notification interval malfunctioning

Post by vtrac »

Hi Frank,
How are you doing?
I downloaded your profile and compared your services with mine, and did not notice many differences.

I noticed that some of your "notification_options" service settings are just "c" (CRITICAL only).

notification_options	c

This means you will not get notified until the service is CRITICAL, so "w" (WARNING) states are skipped. Could this be why your notifications are a few minutes off?


Regards,
Vinh
frankwijers
Posts: 20
Joined: Fri Apr 09, 2021 6:22 am

Re: Notification interval malfunctioning

Post by frankwijers »

Not entirely sure if that's the case, but I can live with the current interval, or at least, make use of the timing.
Topic can be locked.
vtrac
Posts: 903
Joined: Tue Oct 27, 2020 1:35 pm

Re: Notification interval malfunctioning

Post by vtrac »

Great!! ... locking thread ... :-)


Vinh