Nagios Support Forum

Posted: **Wed Sep 18, 2019 1:02 pm**

Hello Nagios Support,

This morning a critical server that we monitor triggered some alerts when it went down. We got the initial Host Down and one Service Problem alert, but when everything recovered we never got an important Service Recovery alert. (We have automation hooked into the alerts, and to make a long story short, it's important that we get ALL alerts.)

Looking through the event log, I see some strange behavior. To summarize succinctly:

1. We had some Service checks associated with the critical server unexpectedly get marked as "CRITICAL;HARD;1", seemingly bypassing the max-check-attempt counter.

2. Upon recovery, all of these Services got marked as "OK;SOFT;1" -- bypassing any notification process.

3. Looking at the XI UI now, I see that these Services are set to HARD states. I don't see any event log entry where/when that took place.

The one Service that did alert had one expected "CRITICAL;SOFT;1" entry before it logged the abnormal "CRITICAL;HARD;1". It was set to alert after 2 max-attempts, so it makes some degree of sense that it sent a notification, but there is obviously still something wrong here.
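For anyone wanting to check their own logs for the pattern described above, here is a minimal Python sketch. It assumes the standard Nagios Core log line form `[epoch] SERVICE ALERT: host;service;STATE;STATE_TYPE;attempt;output`. The function name `find_soft_recoveries`, the host name `critsrv`, and the sample lines are hypothetical and made up for illustration; they are not taken from the actual event log in this thread.

```python
import re
from typing import Iterable, List, Tuple

# Assumed log line form:
# [epoch] SERVICE ALERT: host;service;STATE;STATE_TYPE;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);(\w+);(SOFT|HARD);(\d+);"
)


def find_soft_recoveries(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (timestamp, host, service) for each OK;SOFT entry that
    directly follows a HARD problem state for the same service."""
    last_seen = {}  # (host, service) -> (state, state_type) of previous alert
    suspicious = []
    for line in lines:
        match = ALERT_RE.match(line)
        if not match:
            continue  # not a service alert line
        ts, host, service, state, state_type, _attempt = match.groups()
        key = (host, service)
        previous = last_seen.get(key)
        if (
            state == "OK"
            and state_type == "SOFT"
            and previous is not None
            and previous[0] != "OK"
            and previous[1] == "HARD"
        ):
            suspicious.append((int(ts), host, service))
        last_seen[key] = (state, state_type)
    return suspicious


# Hypothetical sample lines, not real log output.
sample = [
    "[1568800000] SERVICE ALERT: critsrv;Ping;CRITICAL;HARD;1;timeout",
    "[1568800600] SERVICE ALERT: critsrv;Ping;OK;SOFT;1;PING OK",
    "[1568800000] SERVICE ALERT: critsrv;HTTP;CRITICAL;SOFT;1;refused",
    "[1568800060] SERVICE ALERT: critsrv;HTTP;OK;SOFT;2;HTTP OK",
]
print(find_soft_recoveries(sample))
```

Only the Ping service is flagged in the sample, because its OK;SOFT entry follows a HARD problem state; the HTTP service recovered from a SOFT state, which is ordinary behavior. To run this against a real log, pass an open file handle for the Nagios log (its location depends on the install) instead of `sample`.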

Have you seen this problem before, and do you know of a fix?

I am running XI 5.5.11 on a CentOS 7.6 box.

Posted: **Wed Sep 18, 2019 3:01 pm**

Hello @yo_marc,

Appreciate the detailed description of the issue. It looks like you are hitting this bug in Nagios Core.

https://github.com/NagiosEnterprises/na ... issues/651

Please upgrade to the latest version, as this has been patched in Core 4.4.4.

Posted: **Thu Sep 19, 2019 10:11 am**

Thank you! Glad to hear it's been addressed.

Am I missing something, or is Core 4.4.4 not yet included in the latest rev of XI? Looks like the latest bump was 4.4.3 in XI version 5.5.9?

https://assets.nagios.com/downloads/nag ... NGES-5.TXT

Posted: **Thu Sep 19, 2019 10:39 am**

Hello @yo_mar,

No, you are not missing anything; my mistake. Sorry about that. We typically wait some time before pulling the latest Core version into Nagios XI, for stability. We should have this updated soon (likely 5.6.7).

Posted: **Thu Sep 19, 2019 10:40 am**

Hello @yo_mar,

No, you are not missing anything; my mistake. Sorry about that. We typically wait some time before pulling the latest Core version into Nagios XI, for stability. We should have this updated soon (likely 5.6.7).

Posted: **Thu Sep 19, 2019 1:45 pm**

Thanks! I'll keep an eye out for that next release. Feel free to close this if/as needed.

Posted: **Thu Sep 19, 2019 2:05 pm**

Hi,

Sounds good. We'll close this up. If you have any new questions, feel free to open another thread.

Thank you for using the Nagios Support Forum.

No recovery alert; "OK;SOFT;1" state.
