Some notifications not firing after upgraded to 5.5.1

Posted: Thu Jul 26, 2018 2:13 pm
by gmackey
Last night we had a host and a service on another host go down and did not receive notifications for them from Nagios XI. I only found out about them because of another host that was affected in which case Nagios XI sent the notifications like it should have. I checked at the time and they were definitely in Unhandled status and had notifications enabled and contact groups assigned. Looking at the Notifications log in the XI web interface, it is clear that Nagios XI wasn't even trying to send notifications to anyone for this host or the service in question. We have far fewer notifications in that log than normal since the upgrade to 5.5.1 from 5.4. Our Linux admin also noticed the day before that he wasn't receiving notifications for something that exhibited the same behavior as this.

Could you tell me what the next troubleshooting step should be? I'm sure there's a log somewhere I need to check to find out what's going on. Thanks!

Re: Some notifications not firing after upgraded to 5.5.1

Posted: Fri Jul 27, 2018 2:05 pm
by jomann
You should definitely check the event log and search for the host/service that did not send a notification. Look for the alert and see whether Nagios tried to notify; if it did, though, it should have shown up in the Notifications section. The most obvious thing to check after that is whether the host/service actually has the settings it needs to alert. To view the flattened definition, check objects.cache (/usr/local/nagios/var/objects.cache) for the host/service definition and look at notification_options. There was no downtime or anything like that during this, right?
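
As a rough sketch of the objects.cache check, the helper below prints one service's flattened definition so notification_options can be read directly. The function name show_service_def is made up for illustration, and it assumes each block starts with a "define service {" line and ends with a line containing only "}"; the host and service names in the usage line are the ones from this thread.

```shell
# show_service_def FILE HOST SERVICE  (hypothetical helper, not part of Nagios)
# Prints the "define service" block whose host_name and service_description
# match HOST and SERVICE exactly.
show_service_def() {
    awk -v host="$2" -v svc="$3" '
        /^define service[ \t]*[{]/ { inblk = 1; buf = ""; h = ""; s = "" }
        inblk {
            buf = buf $0 "\n"
            line = $0
            sub(/^[ \t]+/, "", line)            # drop leading indentation
            if (line ~ /^host_name[ \t]/) {
                sub(/^host_name[ \t]+/, "", line); h = line
            } else if (line ~ /^service_description[ \t]/) {
                sub(/^service_description[ \t]+/, "", line); s = line
            }
        }
        inblk && /^[ \t]*[}]/ {                 # end of the current block
            if (h == host && s == svc) printf "%s", buf
            inblk = 0
        }
    ' "$1"
}

# Example:
#   show_service_def /usr/local/nagios/var/objects.cache Kane "Svc - SCCM"
```

Running it for a service that notifies and for one that does not makes it easy to diff the two definitions.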

Re: Some notifications not firing after upgraded to 5.5.1

Posted: Fri Jul 27, 2018 5:03 pm
by gmackey
There is clearly an issue after checking the event log.

Here is a service that sends notifications properly and is configured identically from what I can see:

SERVICE ALERT: Baubo;Svc - Apache Tomcat 8;OK;HARD;1;Tomcat8=running (auto)
SERVICE ALERT: Baubo;Svc - Apache Tomcat 8;CRITICAL;HARD;5;critical(Tomcat8=stopped (auto))
SERVICE ALERT: Baubo;Svc - Apache Tomcat 8;UNKNOWN;HARD;5;Failed to open service manager: 1115: A system shutdown is in progress.
SERVICE ALERT: Baubo;Svc - Apache Tomcat 8;CRITICAL;SOFT;1;critical(Tomcat8=stopping (auto))

And here is a service that does not trigger notifications to be sent:

SERVICE ALERT: Kane;Svc - SCCM;CRITICAL;SOFT;1;critical(SMS_EXECUTIVE=stopped (auto))

It never gets past that first service alert. The same goes for another service I tried. All of these services and hosts were sending notifications right before we upgraded from 5.4.13 to 5.5.1, and I have evidence of this: a temporary network misconfiguration on a core router brought every single host and service down, resulting in about 14,500 notifications. So notifications were working great before the upgrade and are not triggering after it. Some of them still work, though, and I can't see a difference in the config on those.
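
To look for other services in the same situation, something like the following could summarize the most recent SERVICE ALERT per service. This is only a sketch: last_alert_state is a made-up name, it assumes the input is in chronological order (oldest first), and the nagios.log path in the usage line is assumed to sit next to objects.cache.

```shell
# last_alert_state  (hypothetical helper, not part of Nagios)
# Reads log lines on stdin, oldest first, and prints for each host;service
# pair the state, state type and attempt number of its latest SERVICE ALERT.
last_alert_state() {
    awk -F';' '
        /SERVICE ALERT: / {
            host = $1
            sub(/^.*SERVICE ALERT: /, "", host)   # strip timestamp and label
            key = host ";" $2
            if (!(key in last)) order[++n] = key  # remember first-seen order
            last[key] = $3 ";" $4 ";" $5
        }
        END { for (i = 1; i <= n; i++) print order[i] ";" last[order[i]] }
    '
}

# Example: list services whose latest alert is still a SOFT state
#   last_alert_state < /usr/local/nagios/var/nagios.log | grep ';SOFT;'
```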

Re: Some notifications not firing after upgraded to 5.5.1

Posted: Mon Jul 30, 2018 4:24 pm
by scottwilkerson
Are the hosts down for the services that are not sending?

Re: Some notifications not firing after upgraded to 5.5.1

Posted: Mon Jul 30, 2018 7:04 pm
by gmackey
No, the hosts were left operational. I literally just picked a random service from a random host that was previously sending email notifications days before to test. The other service I tested was the one that I mentioned that was down recently when another related host (a shared database server) was completely down.

Re: Some notifications not firing after upgraded to 5.5.1

Posted: Tue Jul 31, 2018 11:36 am
by scottwilkerson
I have a feeling this may be due to a reported bug about some settings not getting refactored properly in Core
https://github.com/NagiosEnterprises/na ... issues/557

I added this thread to that ticket so when a fix is available they can notify this thread

Re: Some notifications not firing after upgraded to 5.5.1

Posted: Thu Aug 02, 2018 7:19 am
by scottwilkerson
I believe I found the cause in Core, and it is fixed in the maint branch on GitHub:
https://github.com/NagiosEnterprises/na ... ee/maint

Code:

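# Download and unpack the maint branch of Nagios Core
wget https://github.com/NagiosEnterprises/nagioscore/archive/maint.tar.gz
tar xzf maint.tar.gz
cd nagioscore-maint
configureflags="--with-command-group=nagcmd"
# Use a SysV init script when systemctl is absent or one is already installed
if ! command -v systemctl >/dev/null 2>&1 || [ -f /etc/init.d/nagios ]; then
    configureflags="--with-init-type=sysv $configureflags"
fi
# Left unquoted on purpose so each flag is passed as a separate argument
./configure $configureflags
make -j 2 all
make install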

service nagios restart

After this, once the services stuck in a SOFT state return to OK, either naturally or by stopping Nagios and removing retention.dat, they should no longer get stuck.
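
For the retention.dat route, a cautious variant is to move the file aside instead of deleting it, so it can be restored if needed. The helper name backup_retention is invented for this sketch, and the retention.dat path in the usage lines is an assumption (the same directory as objects.cache); check state_retention_file in nagios.cfg for the real location.

```shell
# backup_retention FILE  (hypothetical helper, not part of Nagios)
# Moves the retention file to FILE.bak; fails if FILE does not exist.
backup_retention() {
    [ -f "$1" ] || return 1
    mv "$1" "$1.bak"
}

# Intended use, as root, with Nagios stopped while the file is moved:
#   service nagios stop
#   backup_retention /usr/local/nagios/var/retention.dat
#   service nagios start
```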