No recovery sent

Posted: Tue Nov 17, 2020 6:18 am
by comfone
Dear Nagios Team

Sometimes Nagios XI fails to send a recovery notification. Here is an example from yesterday.

Status
pic1.png
Notifications
pic2.png
Configuration
pic4.png

Re: No recovery sent

Posted: Tue Nov 17, 2020 2:47 pm
by benjaminsmith
Hi @comfone

Just to clarify "sometimes": is this intermittent? Have you received recovery notifications for this service in the past, or is this consistent behavior? Notifications in Nagios XI are controlled at several layers: the host level, the contact level, and within the XI user account. Please double-check the service object in the CCM to make sure recovery notifications are enabled.

Additionally, if the contact is also an XI user (has access to the web interface), check the notification preferences.
notification-preferences.png
See: Nagios XI - Notification Problems

A handy way to check the object settings in XI is to open the objects.cache file and search for the object.

/usr/local/nagios/var/objects.cache
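
As a rough illustration of searching that file, here is a minimal Python sketch. It assumes objects.cache uses `define <type> { ... }` blocks with one whitespace-separated directive per line, and that recovery notifications for a service appear as the letter `r` in `notification_options`. The helper names (`iter_objects`, `find_service`, `recovery_enabled`) and the values in `SAMPLE` are hypothetical, made up for illustration, and not taken from the system discussed in this thread.

```python
from typing import Dict, Iterator, Optional, Tuple


def iter_objects(text: str) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (object_type, directives) for each 'define <type> { ... }' block."""
    obj_type: Optional[str] = None
    directives: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("define ") and line.endswith("{"):
            # Start of a new object block, e.g. "define service {"
            obj_type = line[len("define "):-1].strip()
            directives = {}
        elif line == "}" and obj_type is not None:
            yield obj_type, directives
            obj_type = None
        elif obj_type is not None and line and not line.startswith("#"):
            # Directive name and value are separated by whitespace
            parts = line.split(None, 1)
            directives[parts[0]] = parts[1] if len(parts) > 1 else ""


def find_service(text: str, host: str, description: str) -> Optional[Dict[str, str]]:
    """Return the directives of the first matching service object, if any."""
    for obj_type, d in iter_objects(text):
        if (obj_type == "service"
                and d.get("host_name") == host
                and d.get("service_description") == description):
            return d
    return None


def recovery_enabled(service: Dict[str, str]) -> bool:
    """True if 'r' (recovery) is listed in notification_options (assumed format)."""
    options = service.get("notification_options", "")
    return "r" in [o.strip() for o in options.split(",")]


# Hypothetical sample data; the option values are invented for this example.
SAMPLE = """
define service {
\thost_name\tcsvmo071
\tservice_description\tSSG-CHECF-PROBE-RUN
\tnotification_options\tw,c
\tnotifications_enabled\t1
\t}
"""

svc = find_service(SAMPLE, "csvmo071", "SSG-CHECF-PROBE-RUN")
if svc is not None:
    print("notification_options:", svc.get("notification_options"))
    print("recovery enabled:", recovery_enabled(svc))
```

On a live system, the same functions could be given the contents of /usr/local/nagios/var/objects.cache instead of `SAMPLE`.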

Re: No recovery sent

Posted: Tue Dec 08, 2020 8:52 am
by comfone
Hi Benjamin

By "sometimes" I mean that, for the same service on the same host and the same recipient, the recovery message is not always sent. I also didn't make any config changes.

The following screenshot shows this.

- The panel at the top shows the notifications for the service. Only one recovery was sent.
- The panel at the bottom shows the state changes for the service. There were two "Hard" recoveries.
dump.png
Kind Regards

Urs

Re: No recovery sent

Posted: Tue Dec 08, 2020 5:04 pm
by benjaminsmith
Hi Urs,

Got it, that makes sense. One possible difference is that on 12-7 flap detection may have kicked in (suppressing the notifications), but I'd have to sort through the Nagios log from that day to troubleshoot further.

Can you upload the nagios log from 12-7-20? It would likely be the last entry titled nagios-12-08..*.log in the archives folder by now (rotated every 24 hours).

/usr/local/nagios/var/archives
Also, please send me the system profile so I can review the configs as well. Thanks, Benjamin

To send us your system profile:
1. Log in to the Nagios XI GUI using a web browser.
2. Click the "Admin" > "System Profile" menu.
3. Click the "Download Profile" button.

Re: No recovery sent

Posted: Wed Dec 09, 2020 8:51 am
by comfone
Hi Benjamin

Will do. Is there a way to share this with you privately? I don't feel comfortable sharing this with everyone, especially the log file.

Kind Regards

Urs

Re: No recovery sent

Posted: Wed Dec 09, 2020 4:42 pm
by benjaminsmith
Hi,

Yeah, no problem. If you click the PM icon under my name, you can send me a Private Message and attach the logs.

Otherwise, you can open a support ticket for this issue and your information will be private between you and the support team.

Benjamin

Re: No recovery sent

Posted: Mon Dec 21, 2020 5:01 am
by comfone
Hi Benjamin

I've sent you a private message. Any news on this?

Kind Regards

Urs

Re: No recovery sent

Posted: Mon Dec 21, 2020 3:11 pm
by benjaminsmith
Hi,

Thanks for the system profile. The host was in a HARD down state when the service recovered, and under this condition service notifications are suppressed in Nagios Core (the monitoring engine):
[1607376345] HOST ALERT: csvmo071;DOWN;HARD;1;CRITICAL: No Data received for Host. Host might be down
[1607376422] SERVICE ALERT: csvmo071;SSG-CHECF-PROBE-RUN;OK;HARD;1;alive
If the host is in a hard non-OK state, notifications for services related to the host won't be sent out. This is actually true for a soft down state as well. For a closer look at state types and notifications, the Nagios Core docs are very helpful:

State Types
Notifications
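
To spot the same pattern across a whole log file, here is a minimal Python sketch that flags HARD service recoveries logged while the most recent alert for the same host was not UP. It only assumes the `[timestamp] HOST ALERT: ...` and `[timestamp] SERVICE ALERT: ...` line layout seen in the two entries above. The names `Candidate` and `recoveries_while_host_not_up` are hypothetical, and a host with no earlier alert in the input is assumed to be UP. It does not reproduce the notification logic in Nagios Core; it only lists candidates worth a closer look.

```python
import re
from typing import Dict, Iterable, List, NamedTuple

# Matches lines such as "[1607376345] HOST ALERT: host;DOWN;HARD;1;output"
ALERT_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) ALERT: (.*)$")


class Candidate(NamedTuple):
    timestamp: int
    host: str
    service: str
    host_state: str


def recoveries_while_host_not_up(lines: Iterable[str]) -> List[Candidate]:
    """List HARD service recoveries whose host was last logged as not UP."""
    last_host_state: Dict[str, str] = {}
    candidates: List[Candidate] = []
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if match is None:
            continue  # not a host or service alert line
        timestamp = int(match.group(1))
        kind = match.group(2)
        fields = match.group(3).split(";")
        if kind == "HOST" and len(fields) >= 2:
            # fields: host;state;state_type;attempt;output
            last_host_state[fields[0]] = fields[1]
        elif kind == "SERVICE" and len(fields) >= 4:
            # fields: host;service;state;state_type;attempt;output
            host, service, state, state_type = fields[:4]
            # Assumption: a host with no earlier alert in the input is UP
            host_state = last_host_state.get(host, "UP")
            if state == "OK" and state_type == "HARD" and host_state != "UP":
                candidates.append(Candidate(timestamp, host, service, host_state))
    return candidates


# The two log entries quoted in this post
LOG_LINES = [
    "[1607376345] HOST ALERT: csvmo071;DOWN;HARD;1;CRITICAL: No Data received for Host. Host might be down",
    "[1607376422] SERVICE ALERT: csvmo071;SSG-CHECF-PROBE-RUN;OK;HARD;1;alive",
]

for c in recoveries_while_host_not_up(LOG_LINES):
    print(c.timestamp, c.host, c.service, "host was", c.host_state)
```

For a real log, the lines of an archived log file could be passed in place of `LOG_LINES`.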

Let me know if you have any questions or need clarification on anything.

Best Regards,
Benjamin

Re: No recovery sent

Posted: Wed Dec 23, 2020 4:05 am
by comfone
Hi Benjamin

Okay, I see. So this is a feature, not a bug :D. This behaviour probably makes sense for a lot of customers. Unfortunately, not for us. Host recovery and service recovery notifications are sent to different people, so our NOC gets a service down notification but never receives a service recovery.

Any way to change this behaviour?

Kind Regards

Urs

Re: No recovery sent

Posted: Wed Dec 23, 2020 10:57 am
by benjaminsmith
Hi Urs,

Typically, a service is not going to recover while its host is down, and suppressing the service notifications when the host is down helps reduce the number of unwanted notifications.

One option is to restructure the host and service check commands to try to avoid this outcome. These are passive checks, so I am not sure exactly what the check commands are.

Regards,
Benjamin