Re: Issue after upgrade to 5.7.2 version

Posted: Thu Sep 10, 2020 5:17 pm
by benjaminsmith
Hi,

Please try running the following command to truncate a few of the Nagios tables.

Code:

echo "truncate table nagios_objects; truncate table nagios_hosts; truncate table nagios_hoststatus; truncate table nagios_services; truncate table nagios_servicestatus;" | mysql -u root -pnagiosxi nagios
And then restart the software stack.

Code:

service crond stop
service npcd stop
service nagios stop
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
service nagios start
service npcd start
service crond start
If the issue persists, let's get a support ticket opened for you, as we may need to set up a remote session to troubleshoot this further.
https://support.nagios.com/tickets/
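
Before removing message queues with the loop above, it can help to see which queue IDs would be affected. The following is a minimal sketch, not part of the original instructions: the function name `nagios_queue_ids` is hypothetical, and it assumes the util-linux `ipcs -q` column layout (key, msqid, owner, ...). It matches on the owner column only, which is slightly stricter than the `grep nagios` used above.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -q` style output on stdin and print the
# msqid (column 2) of every queue whose owner (column 3) is exactly "nagios".
# Assumes the util-linux column order: key msqid owner perms used-bytes messages.
nagios_queue_ids() {
    awk '$3 == "nagios" { print $2 }'
}

# Dry run on a real system (prints IDs only, removes nothing):
#   ipcs -q | nagios_queue_ids
```

Header lines in the `ipcs` output are skipped because their third field is never `nagios`.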

Re: Issue after upgrade to 5.7.2 version

Posted: Mon Sep 14, 2020 7:24 am
by bseuser
Thanks for your response. After executing the commands above, the behavior is still the same.

Is Nagios XI 5.7.3 a stable version? Are there any known issues with 5.7.3?
If we install 5.7.3, will the NDO3 utils also be updated automatically? As per your suggestion, we downgraded ndoutils to the 5.6.14 version.

Thanks,

Re: Issue after upgrade to 5.7.2 version

Posted: Mon Sep 14, 2020 4:51 pm
by benjaminsmith
Hi,
"If we install 5.7.3, will the NDO3 utils also be updated automatically? As per your suggestion, we downgraded ndoutils to the 5.6.14 version."
That's correct. If you upgrade to 5.7.3, it will automatically update the server to ndo3 (unless you comment out that part of the upgrade script).

Benjamin

Re: Issue after upgrade to 5.7.2 version

Posted: Tue Sep 15, 2020 8:34 am
by nickap
FYI, we installed 5.7.3 on a test clone and the issue persists, which is disappointing. Do we lose anything when downgrading NDO?

Re: Issue after upgrade to 5.7.2 version

Posted: Tue Sep 15, 2020 5:29 pm
by benjaminsmith
Hi,

Appreciate your update on this. It's always best to make a backup or snapshot before making any changes, but you will not lose anything by downgrading ndo on the system.

Re: Issue after upgrade to 5.7.2 version

Posted: Thu Sep 17, 2020 7:06 am
by bseuser
Hi,

We upgraded to 5.7.3 today and it broke again.
Nagios XI has some issues: connectivity between the GUI and the backend is broken.
Steps:
-------
1. Created hosts and services in the Nagios XI GUI. [I used Configuration Wizards - Select a Wizard --> Nagios XI Server (monitor a remote Nagios XI server)]
2. When I search for the newly created host using the search box in the top-right corner, the host does not show up. See screenshot 1 below for reference.
3. But it does show up in CCM. See screenshot 2 below for reference.
4. We have to wait ~20-30 minutes for the GUI to update.

5.7.3 broke our Nagios production instance, and we are not happy with the 5.7.x versions.

Thanks

Re: Issue after upgrade to 5.7.2 version

Posted: Thu Sep 17, 2020 5:24 pm
by benjaminsmith
Hi,

That's frustrating to hear. If you're not able to revert using a snapshot on this server, I would recommend downgrading to ndo2db to minimize any impact on the production server.

Alternatively, I have a patch for ndo3 that I could give you if you have test servers set up in your environment. Let me know how you would like to proceed.

Downgrade Instructions (Local Database)

Code:

systemctl stop nagios
cd /tmp
rm -rf /tmp/nagiosxi
wget https://assets.nagios.com/downloads/nagiosxi/5/xi-5.6.14.tar.gz
tar zxf xi-5.6.14.tar.gz
cd /tmp/nagiosxi/subcomponents/ndoutils
./install
systemctl enable ndo2db
Then edit your /usr/local/nagios/etc/nagios.cfg and make sure this line is uncommented:

Code:

broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
Make sure this line is commented out:

Code:

#broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
Then start the nagios service:

Code:

systemctl start nagios
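
To confirm which broker module the configuration will actually load after these edits, a quick check like the one below may help. This is only a sketch: the function name `active_broker_modules` is hypothetical, and the default path is the nagios.cfg location mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print every broker_module line that is NOT commented out.
# Takes the config path as its first argument; defaults to the path used above.
active_broker_modules() {
    cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Lines starting with '#' do not match, so only active entries are printed.
    grep -E '^[[:space:]]*broker_module=' "$cfg"
}

# Usage on the server:
#   active_broker_modules
```

After the downgrade, the output should list only the ndomod.o line and not the ndo.so line.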

Re: Issue after upgrade to 5.7.2 version

Posted: Fri Sep 18, 2020 3:01 am
by bseuser
Hi,

We have 4 Nagios instances (APAC, EMEA, PROD, TEST), and we see this behavior on only one prod server. The differences between prod and the other instances are the OS version (all CentOS, but different versions) and the number of hosts and services (~18k+ services in prod).

We have the Dev instance available to test ndo3, but I tested the 5.7.3 workflow in DEV before upgrading the prod instance and saw no issues on the dev server, only in prod. Are there any OS compatibility requirements for the ndo3 DB?

After upgrading to the 5.7.x version, the load and I/O wait are very high on the Nagios instances, and the GUI is extremely slow to access :oops: . I am sending the profile via PM.

We downgraded ndo2db to 5.6.14 on the prod instances and will observe the behavior in PROD.

Please let me know if you need any other information to resolve the issues.

Thanks,

Re: Issue after upgrade to 5.7.2 version

Posted: Fri Sep 18, 2020 11:00 am
by bseuser
I am very disappointed. After downgrading ndo2db to 5.6.14, the whole Nagios workflow is broken on our APAC Nagios instance :oops: .
The GUI does not open at all; it keeps loading and then times out. This upgrade is totally broken. I already sent you the profile of our APAC Nagios instance via PM.
Can you please provide the steps to downgrade Nagios XI as a whole to version 5.6.14 or 5.6.12?

Re: Issue after upgrade to 5.7.2 version

Posted: Fri Sep 18, 2020 3:12 pm
by benjaminsmith
Hi

Given the impact here, it would be best to open a support ticket so we can set up a remote session if necessary to troubleshoot the APAC instance.

Open a support ticket from the following page, and reference this post in the ticket.
https://support.nagios.com/tickets/

The ndo process writes the nagios results to a database, so as long as the nagios service is running it will continue to execute checks.

If you did downgrade to ndo2db on this instance, open the /usr/local/nagios/etc/nagios.cfg file and comment out this line. The configuration file is still loading the new ndo broker module.

Code:

# Added by NDO 'make install-broker-line' on Thu Sep 17 02:09:38 EDT 2020
#broker_module=/usr/local/nagios/bin/ndo.so /usr/local/nagios/etc/ndo.cfg
And uncomment the following line.

Code:

# Commented out by NDO 'make install-broker-line' on Thu Sep 17 02:09:38 EDT 2020
broker_module=/usr/local/nagios/bin/ndomod.o config_file=/usr/local/nagios/etc/ndomod.cfg
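
If you would rather script the two edits than make them by hand, something along these lines could work. It is a sketch under assumptions, not an official procedure: the function name `switch_to_ndomod` is hypothetical, it relies on `sed -E` (available in GNU and BSD sed), and it writes to a separate copy so the live nagios.cfg is never modified directly.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of a nagios.cfg-style file in which
#   - any active ndo.so broker_module line is commented out, and
#   - any commented ndomod.o broker_module line is uncommented.
# $1 = source config, $2 = destination copy. The source is left untouched.
switch_to_ndomod() {
    src="$1"
    dst="$2"
    sed -E \
        -e 's|^[[:space:]]*(broker_module=.*/ndo\.so([[:space:]].*)?)$|#\1|' \
        -e 's|^[[:space:]]*#+[[:space:]]*(broker_module=.*/ndomod\.o([[:space:]].*)?)$|\1|' \
        "$src" > "$dst"
}

# Usage on the server (review the diff before replacing the real file):
#   switch_to_ndomod /usr/local/nagios/etc/nagios.cfg /tmp/nagios.cfg.new
#   diff /usr/local/nagios/etc/nagios.cfg /tmp/nagios.cfg.new
```

Ordinary comment lines, such as the "Added by NDO" notes, are left unchanged because they do not contain `broker_module=` directly after the `#`.
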
Then go ahead and do a full restart of the Nagios service stack.

Code:

service crond stop
service npcd stop
service nagios stop
service ndo2db stop
pkill -9 -u nagios
for i in $(ipcs -q | grep nagios |awk '{print $2}'); do ipcrm -q $i; done
rm -rf /usr/local/nagiosxi/var/dbmaint.lock
rm -rf /usr/local/nagiosxi/var/event_handler.lock
rm -rf /usr/local/nagiosxi/scripts/reconfigure_nagios.lock
service mysqld restart
service httpd restart
service ndo2db start
service nagios start
service npcd start
service crond start
Please post the results to the support ticket, and let us know there whether you're able to get into the GUI after performing those steps.

Regards,
Benjamin