Notifications not sent when timeperiod begins

sasha777
Posts: 4
Joined: Thu Oct 24, 2013 7:50 am

Notifications not sent when timeperiod begins

Post by sasha777 »

Hello. Can Nagios send notifications of critical events that happened outside a timeperiod, when the timeperiod begins again?

I created a timeperiod so that Nagios doesn't send me SMS messages at night. But when a service enters a hard critical state during the night, Nagios doesn't notify me in the morning...

Thanks.
slansing
Posts: 7698
Joined: Mon Apr 23, 2012 4:28 pm
Location: Travelling through time and space...

Re: Notifications not sent when timeperiod begins

Post by slansing »

Yes, Nagios should notify you of state changes that occurred outside the specified time period. Did you make sure to assign the timeperiod as the notification period on the relevant objects and contacts? Can you share an example service definition and contact definition?
12csd
Posts: 10
Joined: Thu Oct 17, 2013 7:58 am

Re: Notifications not sent when timeperiod begins

Post by 12csd »

tmcdonald
Posts: 9117
Joined: Mon Sep 23, 2013 8:40 am

Re: Notifications not sent when timeperiod begins

Post by tmcdonald »

I was actually just about to make that connection, 12csd.

Awaiting a response from sasha777, but I suspect the issues may be related.
Former Nagios employee
sasha777
Posts: 4
Joined: Thu Oct 24, 2013 7:50 am

Re: Notifications not sent when timeperiod begins

Post by sasha777 »

OK, here's a part of our config.

From contacts.cfg:

Code: Select all

define contact{
        contact_name                    astojkovic-sms
        use                             generic-contact-sms-only
        alias                           Aleksandar Stojkovic (SMS)
        host_notification_period        let-me-sleep
        service_notification_period     let-me-sleep
        address1                        <my cellphone number>
        }
From timeperiods.cfg:

Code: Select all

define timeperiod{
        timeperiod_name let-me-sleep
        alias           Let me sleep
        sunday          07:00-22:00
        monday          07:00-22:00
        tuesday         07:00-22:00
        wednesday       07:00-22:00
        thursday        07:00-22:00
        friday          07:00-22:00
        saturday        07:00-22:00
        }
From templates.cfg (with entire inheritance line):

Code: Select all

define service{
        name                            generic-service
        active_checks_enabled           1
        passive_checks_enabled          1
        parallelize_check               1
        obsess_over_service             1
        check_freshness                 0
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        failure_prediction_enabled      1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              3
        normal_check_interval           10
        retry_check_interval            2
        contact_groups                  admins
        notification_options            w,u,c,r
        notification_interval           60
        notification_period             24x7
        register                        0
        }

define service{
        name                       generic-status-service
        use                        generic-service
        retry_check_interval       1
        notification_interval      1440
        contact_groups             networkadmins
        register                   0
        }

define service{
        name                       generic-status-critical
        use                        generic-status-service
        normal_check_interval      5
        contacts                   astojkovic-sms
        register                   0
        }
From host definition file:

Code: Select all

define service{
        use                     generic-status-critical
        host_name               C6500
        service_description     Gi 5/1
        check_command           check_cisco_port_status!49
        }
SMS notification during the timeperiod works just fine. Contact astojkovic-sms is not a member of networkadmins group. Any ideas? Thanks.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Notifications not sent when timeperiod begins

Post by sreinhardt »

I think we should clarify real quick. Nagios will never send past notifications from an excluded time. It should, however, send new notifications when a check is still failing after the excluded time ends. To confirm: what you are seeing is no notifications at all until a full state change, such as a recovery, and you are not just expecting several messages from when things were down?
Nagios-Plugins maintainer exclusively, unless you have other C language bugs with open-source nagios projects, then I am happy to help! Please pm or use other communication to alert me to issues as I no longer track the forum.
sasha777
Posts: 4
Joined: Thu Oct 24, 2013 7:50 am

Re: Notifications not sent when timeperiod begins

Post by sasha777 »

Here's what happened. As you can see, my timeperiod allows notifications from 7:00 to 22:00.

1) The service went hard critical around 6:30. Nagios didn't send me a problem notification - my timeperiod excludes that time. It did notify other contacts via e-mail.

2) My timeperiod started at 7:00. The service was still critical, but Nagios didn't notify me.

3) The service went hard OK around 7:30. Nagios DID send me a recovery notification. (Which is the second odd thing in this story, since the docs say "It doesn't make sense to get a recovery notification for something you never knew was a problem..." Btw, I think it DOES make sense - admins should know there was some problem.)

Is there a more detailed log somewhere in Nagios that records why Nagios did or didn't send a given notification?
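
For reference, Nagios Core 3.x has a debug log that can be enabled in the main configuration file (nagios.cfg). The following is a minimal sketch, not a tested setup: the file path is an assumption based on a default source install, and the size limit is arbitrary, so adjust both to your system.

```
# Log notification-related debug information only (32 = notifications)
debug_level=32
# 0 = brief, 1 = more detailed, 2 = very detailed
debug_verbosity=1
# Assumed path; use whatever location suits your install
debug_file=/usr/local/nagios/var/nagios.debug
# Rotate the debug file once it reaches roughly 1 MB
max_debug_file_size=1000000
```

Nagios has to be restarted for these settings to take effect. The debug file can grow quickly, so it is probably best to set debug_level back to 0 once the investigation is done.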

Thanks.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Notifications not sent when timeperiod begins

Post by sreinhardt »

I am going to quote a post I made on Friday for a very similar issue in another thread. I think it should clarify some things, and if it is still a problem, hopefully you can confirm the behavior as well.
After some discussion and code review: provided the state has not changed from a hard warning or critical, the notification will not happen until the next notification interval. The interval counter keeps counting down during excluded time periods. This means that if your host/service enters a hard state 1 minute before the excluded notification time ends, it will not notify until the full notification interval has passed, regardless of check results, provided they stay in the same state. However, if your host/service changes state after the excluded time, you will be notified, provided you are set to receive those notifications.

So what we need to validate is this: if your notification interval is 30 minutes, you should be sent a notification for the failing check at most 30 minutes after re-entering the notification period.
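
One way to act on this, sketched under the assumption that the behavior described above holds for your Nagios version, is to override notification_interval in the generic-status-critical template. With the posted config, that template inherits notification_interval 1440 from generic-status-service, so a problem that goes hard at 06:30 would not be re-notified until 06:30 the next day, which is again outside let-me-sleep. The value 30 below is only an illustration, not a recommendation.

```
define service{
        name                       generic-status-critical
        use                        generic-status-service
        normal_check_interval      5
        notification_interval      30    ; hypothetical override of the inherited 1440
        contacts                   astojkovic-sms
        register                   0
        }
```

Note that notification_interval is a service-level setting, so the shorter interval would apply to every contact of the service, including the other contacts in networkadmins, for as long as the problem persists.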
sasha777
Posts: 4
Joined: Thu Oct 24, 2013 7:50 am

Re: Notifications not sent when timeperiod begins

Post by sasha777 »

Yes, that clarifies things. But that also means the following sentences from http://nagios.sourceforge.net/docs/3_0/ ... tions.html should be corrected or at least clarified:

- "Note: If the time period filter is not passed, Nagios will reschedule the next notification for the host or service (if its in a non-OK state) for the next valid time present in the time period. This helps ensure that contacts are notified of problems as soon as possible when the next valid time in time period arrives." - This sentence treats the TIME PERIOD as a criterion for overriding the normal notification interval, in order to notify contacts ASAP! Is that a good intention that never came to life in the actual code, or a bug in the docs? Or does the sentence apply only to the service's time period, and not to the contact's time period as well?

- "It doesn't make sense to get a recovery notification for something you never knew was a problem..." - As I already said, Nagios sent me a recovery notification although it hadn't sent me a problem notification. But please leave it working that way - I believe admins SHOULD get a recovery notice in order to know there was a problem.

Thanks.
sreinhardt
-fno-stack-protector
Posts: 4366
Joined: Mon Nov 19, 2012 12:10 pm

Re: Notifications not sent when timeperiod begins

Post by sreinhardt »

That brings up an interesting point about how one person thought it should work versus how it is actually coded. It might be worthwhile to post a bug report to tracker.nagios.org for this particular circumstance. Based on that description, I would agree with your original logic, not what we saw in the code.