Hi Everyone,
I'm running Nagios XI 5.5.1 (VMware appliance) and have recently encountered a problem with recurring downtime.
I have a recurring schedule set up to suppress alerts during a nightly backup window. I can see that the recurring schedule has been translated into scheduled downtime for the correct services, and if I look at one of those services I can see that downtime has been scheduled for it.
However, I still get warning emails overnight, telling me the service has a problem.
Where should I be looking to get this resolved?
Thanks,
Peter.
Recurring Downtime not working
Re: Recurring Downtime not working
Can you show us the actual email notification that you received?
Also, PM me (or any other member of the Nagios tech support team) your profile:
Admin > System Profile > Download Profile
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Recurring Downtime not working
Thanks for the reply - screenshot of email attached and profile on the way to you.
I've noticed that I don't seem to get the warnings, but I am getting the recoveries - I've checked and the downtime period was still in force when the recovery was generated.
Thanks,
Peter.
Re: Recurring Downtime not working
Sorry to jump in, but it saves making another post for the same issue.
We're having the same problem.
Recurring downtimes have been created to put all hosts/services within a certain host group into scheduled downtime, but we still get notifications overnight.
I thought I fixed this last week by deleting all the scheduled downtime entries and manually running the perl script.
This worked for a few days and notification logs show the hosts were put into scheduled downtime, but as of last night (no changes were made on Nagios), notifications were emailed out and the notification logs show no mention of any of these hosts going into downtime.
Re: Recurring Downtime not working
@proddan, for some reason, the nagios.log was missing from the profile? Can you PM me the log - we need to see if the service was actually in downtime.
@JGCG, you may or may not have the same issue. Please start a separate thread and reference this one, e.g. "I am having the same (or similar) issue as the one described in this post: <URL>". Thank you!
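Whether a service was actually in downtime when a notification went out can be checked from nagios.log itself. The following is a minimal sketch, not an official tool: it assumes the usual Nagios Core log line shapes (`SERVICE DOWNTIME ALERT: host;service;STARTED|STOPPED|CANCELLED;...` and `SERVICE NOTIFICATION: contact;host;service;state;command;output`), and the function name and sample lines in the test data are hypothetical. Notifications whose state field starts with `DOWNTIME` (the downtime start/end notices themselves) are skipped.

```python
import re
from collections import defaultdict

# Matches lines such as "[1527800000] SERVICE NOTIFICATION: ..."
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def notifications_during_downtime(log_lines):
    """Return (timestamp, host, service, state) for every service
    notification logged while that service was flagged as in downtime."""
    in_downtime = defaultdict(bool)  # (host, service) -> currently in downtime?
    hits = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp = int(match.group(1))
        kind = match.group(2)
        fields = match.group(3).split(";")
        if kind == "SERVICE DOWNTIME ALERT" and len(fields) >= 3:
            host, service, event = fields[0], fields[1], fields[2]
            # STOPPED and CANCELLED both end the downtime window.
            in_downtime[(host, service)] = event == "STARTED"
        elif kind == "SERVICE NOTIFICATION" and len(fields) >= 4:
            _contact, host, service, state = fields[:4]
            if state.startswith("DOWNTIME"):
                continue  # downtime start/end notices, not problem alerts
            if in_downtime[(host, service)]:
                hits.append((timestamp, host, service, state))
    return hits
```

Feeding it the lines of the log for the night in question (for example `notifications_during_downtime(open(path))`, where `path` is wherever nagios.log lives on the appliance) lists any alerts or recoveries that were sent inside a downtime window. An empty result for a night when emails arrived suggests the downtime never started for that service.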
Re: Recurring Downtime not working
Hi lmiltchev,
I've now upgraded to 5.5.2, but am still seeing the same problem.
I also don't think it's related to recurring downtime, but to notifications in general. Over the weekend, I took some services down for planned maintenance. Rather than scheduling downtime, I used Mass Acknowledge to acknowledge the problems before notifications were generated. Once the maintenance was complete and services were restored, Nagios generated recovery emails for all of the services concerned.
Thanks,
Peter.
-
scottwilkerson
- DevOps Engineer
- Posts: 19396
- Joined: Tue Nov 15, 2011 3:11 pm
- Location: Nagios Enterprises
- Contact:
Re: Recurring Downtime not working
proddan wrote: "I used Mass Acknowledge to acknowledge the problems, before notifications were generated. Once the maintenance was complete and services were restored, Nagios generated recovery emails for all of the services concerned."
If you ACK a problem it is normal to receive a recovery email.
Re: Recurring Downtime not working
Hi Scott,
Thanks for your input.
This isn't the behaviour I've seen up until recently.
When I used Mass Acknowledge to ACK a problem while it was still in a soft state (i.e. its retries before alerting had not been exhausted), I didn't see a recovery email when the service recovered.
This is behaviour that has definitely changed recently.
Peter
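The soft-state case described above can be illustrated with a small model. This is a simplified sketch of the documented soft/hard state rules, not Nagios Core's actual code: a problem stays SOFT until it has failed `max_check_attempts` consecutive checks, and only hard problems and recoveries from hard problems trigger notifications. The function name is hypothetical, and acknowledgements, downtime and re-notification intervals are deliberately left out.

```python
def simulate_checks(results, max_check_attempts):
    """Label each check result as (state, state_type, notifies).

    `results` is a sequence of booleans: True for an OK check,
    False for a failed check.
    """
    events = []
    failed_attempts = 0    # consecutive non-OK results so far
    hard_problem = False   # has the problem reached a hard state?
    for ok in results:
        if ok:
            if hard_problem:
                events.append(("OK", "HARD", True))    # hard recovery: notify
            elif failed_attempts > 0:
                events.append(("OK", "SOFT", False))   # soft recovery: silent
            else:
                events.append(("OK", "HARD", False))   # steady OK state
            failed_attempts = 0
            hard_problem = False
        else:
            failed_attempts += 1
            if hard_problem:
                events.append(("PROBLEM", "HARD", False))  # already notified
            elif failed_attempts >= max_check_attempts:
                hard_problem = True
                events.append(("PROBLEM", "HARD", True))   # first hard problem
            else:
                events.append(("PROBLEM", "SOFT", False))  # still retrying
    return events
```

With `max_check_attempts` set to 3, two failures followed by an OK never leave the soft state, so under this model no recovery email would be expected, which matches the behaviour seen before the change.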
Re: Recurring Downtime not working
proddan wrote: "When I used Mass Acknowledge to ACK a problem while it was still in a soft state (i.e. its retries before alerting had not been exhausted), I didn't see a recovery email when the service recovered."
Ah, you didn't mention they were in a soft state when you ack'd them.
I believe it is related to this bug in Core, and I am going to pass the info on to the developers:
https://github.com/NagiosEnterprises/na ... issues/557
Re: Recurring Downtime not working
Hi Scott,
Thanks for this - look forward to seeing it fixed in an upcoming release!
Peter