
Nagios XI 5.7.2 - Event Queue Empty & Services Behaving Abno

Posted: Sun Aug 16, 2020 4:14 am
by bsanjay
Hello Team,
We recently upgraded to Nagios XI 5.7.2. Since the upgrade, the event queues are empty (ipcs -q) and services are behaving abnormally.
For example, the nagios and npcd services are running according to the backend CLI but are shown as down in the GUI (Monitoring Engine / Performance Grapher). Please find the screenshot and system profile attached privately for your reference.

Best Regards,
BSanjay

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Mon Aug 17, 2020 11:19 am
by bsanjay
Even though the nagios service is running in the background, we are seeing this error message in the availability reports. Please find the screenshots for your reference.

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Mon Aug 17, 2020 3:06 pm
by benjaminsmith
Hi @bsanjay,

That's odd. Do you see check results updating as expected in the user interface? What happens when you force an immediate check: does it update the status of the host or service? Try restarting Nagios and all of its services using the commands below (CentOS / RHEL 7):

Code:

systemctl stop npcd
systemctl stop nagios
systemctl stop crond
pkill -9 -u nagios
systemctl restart mariadb
rm -f /usr/local/nagios/var/rw/nagios.cmd
rm -f /usr/local/nagios/var/nagios.lock
rm -f /var/run/nagios.lock
rm -f /usr/local/nagios/var/ndo.sock
rm -f /usr/local/nagiosxi/var/*.lock
for i in $(ipcs -q | grep nagios | awk '{print $2}'); do ipcrm -q "$i"; done
systemctl restart httpd
systemctl start nagios
systemctl start npcd
systemctl start crond
If the issue persists, please send me a fresh system profile to further troubleshoot. Thanks, Benjamin

To send us your system profile:
Log in to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" menu.
Click the "Download Profile" button.
Save the profile.zip file and share it in a private message or upload it to the post/ticket, then reply to this post to bring it up in the queue.
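
After running the restart sequence above, one way to see whether the message queue is actually moving is to total the "messages" column of ipcs -q for the nagios user and compare the number across a few runs. The helper below is only a sketch: the name queue_depth is hypothetical (it is not part of Nagios XI), and it assumes the usual Linux ipcs -q column order (key, msqid, owner, perms, used-bytes, messages).

```shell
#!/bin/sh
# queue_depth: hypothetical helper, not shipped with Nagios XI.
# Reads `ipcs -q` output on stdin and prints the total number of
# queued messages for one owner (default: nagios).
# Assumes the Linux column order: key msqid owner perms used-bytes messages.
queue_depth() {
    owner="${1:-nagios}"
    awk -v owner="$owner" '$3 == owner { total += $6 } END { print total + 0 }'
}

# Usage on a live system (run it a few times and compare the numbers):
#   ipcs -q | queue_depth nagios
```

A value that stays at 0 while checks are scheduled, or one that stays at the same large number, would both suggest that nothing is reading from or writing to the queue.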

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Wed Aug 19, 2020 3:27 am
by WillemDH
Hello,

We also tried to update Nagios XI from 5.6.14 to 5.7.2, and the update failed. The message queue was stuck at zero. SELinux is disabled. My colleague Peter will create a support ticket with a system profile. In the meantime we restored a VM snapshot from before the update.

Grtz

Willem

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Wed Aug 19, 2020 3:30 am
by PeterDK
Hi,

We are having the same issue when upgrading from 5.6.14 to 5.7.2.
At first our message queue was stuck at 88000, but after a reboot it was empty.
The Nagios environment seems to be fine, but nothing is happening.

I also saw pending checks dated 1970 (see attachment).
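
A 1970 date usually means a Unix timestamp of 0 is being displayed, since 0 is the start of the Unix epoch. Whether Nagios XI stores 0 for checks that have never run is an assumption here, not something confirmed in this thread. A minimal sketch using GNU date (as found on CentOS / RHEL); the function name epoch_to_utc is made up for illustration:

```shell
#!/bin/sh
# epoch_to_utc: hypothetical helper for illustration only.
# Formats a Unix timestamp as a UTC date using GNU date's "@seconds" syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# A timestamp of 0 is the start of the Unix epoch.
epoch_to_utc 0    # prints 1970-01-01 00:00:00
```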

Kind regards,
Peter

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Wed Aug 19, 2020 11:15 am
by benjaminsmith
Hi Peter,


Sorry to hear you're having issues with the upgrade. Can you open a separate support ticket or forum thread for your issue and share your system profile so we can review the logs and troubleshoot this for you?

Thanks,
Benjamin

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Thu Aug 20, 2020 11:48 am
by nickap
We've had similar issues and have had a support ticket open for over 2 months with no resolution.

The 5.7 upgrade has been frustrating and disappointing. We're still running 5.6.14 because we cannot successfully upgrade without the Nagios monitoring engine crashing or failing to check properly.

Re: Nagios XI 5.7.2 - Event Queue Empty & Services Behaving

Posted: Fri Aug 21, 2020 10:28 am
by benjaminsmith
Hi @nickap,

Sorry to hear you've had a poor experience with the 5.7 upgrade. As you may know, we completely rewrote the backend database component for improved performance and scalability. We are working to resolve the issues that have impacted some customers, and we should have another maintenance release out soon.

One benefit of the Nagios XI license is that it allows for three installs: production, test, and backup. We do our best to QA new releases, but every environment is unique, so testing upgrades on the test server first is highly recommended for a smooth transition between major changes.

For now, I would recommend either staying on 5.6.14 until 5.7.3 is released or downgrading to the previous version of ndo (the backend database application).

Lastly, this thread has gotten off-topic as far as addressing @bsanjay's support issue. If you're still having issues, please open a support ticket at:

https://support.nagios.com/tickets/

and we'll help you get this taken care of.