Recurring Downtime not working

Posted: Tue Jul 24, 2018 4:51 am
by proddan
Hi Everyone,

I'm running Nagios XI 5.5.1 (VMWare appliance) and have recently encountered a problem with recurring downtime.

I have a recurring schedule set up to suppress alerts during a nightly backup window. I can see that the recurring schedule has translated into scheduled downtime for the correct services, and if I look at an individual service I can see that downtime has been scheduled for it.
However, I still get warning emails overnight, telling me the service has a problem.

Where should I be looking to get this resolved?

Thanks,


Peter.

Re: Recurring Downtime not working

Posted: Tue Jul 24, 2018 9:56 am
by lmiltchev
Can you show us the actual email notification that you received?

Also, PM me (or any other member of the Nagios tech support team) your profile:

Admin > System Profile > Download Profile

Re: Recurring Downtime not working

Posted: Wed Jul 25, 2018 2:26 am
by proddan
Thanks for the reply - screenshot of email attached and profile on the way to you.

I've noticed that I don't seem to get the warnings, but I am getting the recoveries - I've checked and the downtime period was still in force when the recovery was generated.

Thanks,

Peter.

Re: Recurring Downtime not working

Posted: Wed Jul 25, 2018 3:40 am
by JGCG
Sorry to jump in, but it saves making another post for the same issue.

We're having the same problem.
Recurring downtimes have been created to put all hosts/services within a certain host group into scheduled downtime, but we still get notifications overnight.

I thought I fixed this last week by deleting all the scheduled downtime entries and manually running the perl script.
This worked for a few days and notification logs show the hosts were put into scheduled downtime, but as of last night (no changes were made on Nagios), notifications were emailed out and the notification logs show no mention of any of these hosts going into downtime.

Re: Recurring Downtime not working

Posted: Wed Jul 25, 2018 11:50 am
by lmiltchev
@proddan, for some reason the nagios.log was missing from the profile. Can you PM me the log? We need to see whether the service was actually in downtime.
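One way to check the log for this yourself is sketched below. It is only a rough illustration, not a supported tool: it assumes the usual Nagios Core log line shape of `[epoch] TYPE: field;field;...`, with `SERVICE DOWNTIME ALERT` entries marked `STARTED`, `STOPPED` or `CANCELLED`, and `SERVICE NOTIFICATION` entries laid out as contact;host;service;state;command;output. The function name, host, service and contact in the sample are made up for the example.

```python
import re
from collections import defaultdict

# Assumed log line shape: "[<epoch>] <ENTRY TYPE>: <semicolon-separated fields>"
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def notifications_during_downtime(lines):
    """Return (timestamp, host, service, state) for every service notification
    logged while a downtime for that host/service was open."""
    open_downtimes = defaultdict(int)  # (host, service) -> downtimes currently open
    flagged = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, kind = int(match.group(1)), match.group(2)
        fields = match.group(3).split(";")
        if kind == "SERVICE DOWNTIME ALERT" and len(fields) >= 3:
            key = (fields[0], fields[1])
            if fields[2] == "STARTED":
                open_downtimes[key] += 1
            elif open_downtimes[key] > 0:  # STOPPED or CANCELLED
                open_downtimes[key] -= 1
        elif kind == "SERVICE NOTIFICATION" and len(fields) >= 4:
            host, service, state = fields[1], fields[2], fields[3]
            # Skip the downtime start/end notices themselves (assumed to be
            # logged with a state beginning with "DOWNTIME").
            if state.startswith("DOWNTIME"):
                continue
            if open_downtimes[(host, service)] > 0:
                flagged.append((timestamp, host, service, state))
    return flagged


# Hypothetical sample lines, not taken from a real log.
sample = [
    "[1532473200] SERVICE DOWNTIME ALERT: web01;HTTP;STARTED; Service has entered a period of scheduled downtime",
    "[1532473500] SERVICE NOTIFICATION: peter;web01;HTTP;RECOVERY;notify-service-by-email;HTTP OK",
    "[1532476800] SERVICE DOWNTIME ALERT: web01;HTTP;STOPPED; Service has exited from a period of scheduled downtime",
    "[1532477000] SERVICE NOTIFICATION: peter;web01;HTTP;WARNING;notify-service-by-email;HTTP WARNING",
]

for hit in notifications_during_downtime(sample):
    print(hit)
```

To run it against a real log, pass an open file handle instead of `sample`. The log path varies by install, so it is left out here.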

@JGCG, you may or may not have the same issue. Please start a separate thread and reference this one, e.g. "I am having the same (or similar) issue as the one described in this post: <URL>". Thank you!

Re: Recurring Downtime not working

Posted: Mon Jul 30, 2018 2:43 am
by proddan
Hi lmiltchev,

I've now upgraded to 5.5.2, but am still seeing the same problem.
I also don't think it's related to recurring downtime, but to notifications in general. Over the weekend, I took some services down for planned maintenance. Rather than schedule downtime, I used Mass Acknowledge to acknowledge the problems before notifications were generated. Once the maintenance was complete and services were restored, Nagios generated recovery emails for all of the services concerned.

Thanks,

Peter.

Re: Recurring Downtime not working

Posted: Mon Jul 30, 2018 4:23 pm
by scottwilkerson
proddan wrote:I used Mass Acknowledge to acknowledge the problems, before notifications were generated. Once the maintenance was complete and services were restored, Nagios generated recovery emails for all of the services concerned.
If you ACK a problem it is normal to receive a recovery email.

Re: Recurring Downtime not working

Posted: Tue Jul 31, 2018 6:44 am
by proddan
Hi Scott,

Thanks for your input.

This isn't the behaviour I've seen up until recently.
If I used Mass Acknowledge to ACK a problem while it was still in a soft state (i.e. its retries before alerting had not all been completed), then I didn't see a recovery email when the service recovered.

This is behaviour that has definitely changed recently.
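For anyone unsure what "soft state" means here, the sketch below is a deliberately simplified model, not how Nagios Core is implemented. It assumes a problem stays SOFT until the number of consecutive non-OK results reaches `max_check_attempts`, and that a recovery before that point is a soft recovery. The function name `walk_states` is made up for the example, and changes between different non-OK states are ignored.

```python
def walk_states(results, max_check_attempts):
    """Label each check result as SOFT or HARD under a simplified retry model.

    results: check results in order, e.g. ["CRITICAL", "CRITICAL", "OK"].
    Returns a list of (result, state_type) tuples.
    """
    timeline = []
    attempt = 0           # consecutive non-OK results seen so far
    hard_problem = False  # True once the retries have been used up
    for result in results:
        if result != "OK":
            attempt = min(attempt + 1, max_check_attempts)
            if attempt >= max_check_attempts:
                hard_problem = True
            timeline.append((result, "HARD" if hard_problem else "SOFT"))
        else:
            # A recovery is only "soft" if the problem never went hard.
            was_soft_problem = attempt > 0 and not hard_problem
            timeline.append((result, "SOFT" if was_soft_problem else "HARD"))
            attempt = 0
            hard_problem = False
    return timeline


# Problem clears on the third check, before the retries run out.
print(walk_states(["CRITICAL", "CRITICAL", "OK"], max_check_attempts=3))

# Problem persists through all three attempts, then clears.
print(walk_states(["CRITICAL", "CRITICAL", "CRITICAL", "OK"], max_check_attempts=3))
```

In the first case the service never reaches a hard problem state, which is the situation described above where no recovery email used to arrive.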

Peter

Re: Recurring Downtime not working

Posted: Tue Jul 31, 2018 7:32 am
by scottwilkerson
proddan wrote:Hi Scott,

Thanks for your input.

This isn't the behaviour I've seen up until recently.
If I used Mass Acknowledge to ACK a problem while it was still in a soft state (i.e. its retries before alerting had not all been completed), then I didn't see a recovery email when the service recovered.

This is behaviour that has definitely changed recently.

Peter
Ah, you didn't mention they were in a soft state when you ack'd them.

I believe it is related to this bug in Core, and I am going to pass the info on to the developers:

https://github.com/NagiosEnterprises/na ... issues/557

Re: Recurring Downtime not working

Posted: Fri Aug 03, 2018 2:45 am
by proddan
Hi Scott,

Thanks for this - look forward to seeing it fixed in an upcoming release!


Peter