CentOS -> Ubuntu migration/upgrade issues

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
jacek
Posts: 248
Joined: Wed Sep 09, 2015 5:49 am

CentOS -> Ubuntu migration/upgrade issues

Post by jacek »

Hi,

I've recently been testing a migration of Nagios XI from CentOS 6 to Ubuntu 20.04 (x86_64), and I'm running into a few issues during the upgrade process.
Here is what I did:
  • I created a backup on the CentOS server running Nagios XI 5.7.5
  • I installed a fresh Ubuntu 20.04 and installed Nagios XI 5.7.5 on it
  • I ran the recovery script and also ran the repair script
  • The system started working as expected
  • Then I started an upgrade of XI from the Web UI
These are the warnings/errors I'm getting:

Code:

tar: Removing leading `/' from member names
Backing up MySQL databases...
mysqldump: [Warning] Using a password on the command line interface can be insecure.
mysqldump: Error: 'Access denied; you need (at least one of) the PROCESS privilege(s) for this operation' when trying to dump tablespaces
mysqldump: [Warning] Using a password on the command line interface can be insecure.
mysqldump: Error: 'Access denied; you need (at least one of) the PROCESS privilege(s) for this operation' when trying to dump tablespaces
mysqldump: [Warning] Using a password on the command line interface can be insecure.
mysqldump: Error: 'Access denied; you need (at least one of) the PROCESS privilege(s) for this operation' when trying to dump tablespaces
Backing up cronjobs for Apache...
Backing up logrotate config files...
Backing up Apache config files...
Compressing backup...
 
===============
BACKUP COMPLETE
===============

Code:

Things look okay - No serious problems were detected during the pre-flight check
> Return Code: 0
--------------------------------------
PHP Deprecated:  __autoload() is deprecated, use spl_autoload_register() instead in /usr/local/nagiosxi/html/includes/phpmailer/PHPMailerAutoload.php on line 45
PHP Deprecated:  Function get_magic_quotes_gpc() is deprecated in /usr/local/nagiosxi/html/includes/utils.inc.php on line 256
PHP Deprecated:  Function get_magic_quotes_gpc() is deprecated in /usr/local/nagiosxi/html/includes/utils.inc.php on line 256
PHP Deprecated:  Array and string offset access syntax with curly braces is deprecated in /usr/local/nagiosxi/html/includes/components/ldap_ad_integration/adLDAP/src/classes/adLDAPUsers.php on line 520
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
chown: invalid user: 'apache:nagios'
sh: 1: rpm: not found

Nagios XI Upgrade Complete!
The missing user and the MySQL permission errors look a little concerning, but the Nagios XI system itself looks fine?
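
The mysqldump error in the first log is commonly seen on newer MySQL 5.7 and 8.0 releases, where dumping tablespace information requires the global PROCESS privilege. A minimal sketch of one way to address it, assuming the backup runs under a MySQL account called nagiosxi on localhost (a placeholder, not confirmed in this thread; check which account your backup script actually uses):

```shell
#!/bin/sh
# Print a GRANT statement giving one MySQL account the global PROCESS
# privilege, which mysqldump needs in order to dump tablespace information.
# The account name and host are placeholders, not values from this thread.
process_grant_sql() {
    user="$1"
    host="${2:-localhost}"
    printf "GRANT PROCESS ON *.* TO '%s'@'%s';\n" "$user" "$host"
}

# Usage (review the statement first, then run it as a MySQL admin):
#   process_grant_sql nagiosxi | mysql -u root -p
#
# Alternative that needs no extra privilege: pass --no-tablespaces to
# mysqldump so it skips tablespace information entirely.
```

The function only prints the statement, so nothing changes on the server until the output is deliberately piped into the mysql client.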
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: CentOS -> Ubuntu migration/upgrade issues

Post by benjaminsmith »

Hi,

If you are changing OS families, there is an additional step to repair the restore on the new server. Did you run the restore_repair.sh script on the new server?

Code:

cd /tmp/
wget https://assets.nagios.com/downloads/nagiosxi/scripts/restore_repair.sh
chmod +x restore_repair.sh
./restore_repair.sh
This is detailed on page 13 of the following guide.
https://assets.nagios.com/downloads/nag ... ios-XI.pdf

If not, try running it and then upgrade once more. I recommend using the manual upgrade process.

Code:

cd /tmp
rm -rf nagiosxi xi*.tar.gz
wget http://assets.nagios.com/downloads/nagiosxi/xi-latest.tar.gz
tar xzf xi-latest.tar.gz
cd nagiosxi
./upgrade
If that's not the issue here, please send over the system profile from the new server. Thanks, Benjamin

To send us your system profile:
  • Log in to the Nagios XI GUI using a web browser
  • Click the "Admin" > "System Profile" menu
  • Click the "Download Profile" button
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.

Be sure to check out our Knowledgebase for helpful articles and solutions!
jacek
Posts: 248
Joined: Wed Sep 09, 2015 5:49 am

Re: CentOS -> Ubuntu migration/upgrade issues

Post by jacek »

I did run the restore_repair script:
"I ran the recovery script and also ran the repair script"

I then ran the upgrade from the CLI as root, as you asked, and it completed without the MySQL access-denied errors, but the other errors are still there.
I PM'd you the profile and the upgrade log file.

Thanks!
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: CentOS -> Ubuntu migration/upgrade issues

Post by benjaminsmith »

Hi,

The good news is that the upgrade did complete, the profile shows version 5.8.1, and the main services are running. Are you able to log in to the GUI? Are you able to see check results being updated? Let me know what specific problems remain and we can troubleshoot those.
Nagios XI Upgrade Complete!
---------------------------
You can access the Nagios XI web interface by visiting:
http://xx.xx.xx.xx/nagiosxi/
I noticed quite a few errors like this in the logs:
sda: failed to get sysfs uid: Invalid argument
I believe they are related to this issue, and the following solution may resolve those.

Ubuntu 20.04 on ESXi Generating multipathd Errors
jacek
Posts: 248
Joined: Wed Sep 09, 2015 5:49 am

Re: CentOS -> Ubuntu migration/upgrade issues

Post by jacek »

Hi,

I noticed that the apache:nagios user most likely came from the CentOS backup.
On Ubuntu this should be www-data:nagios, and the init script in the backup file did get that right.

So after restoring from the CentOS backup and running the repair script, I cleaned out the /usr/local/nagiosxi/tmp/ directory. The backup had been extracted there, and it contained the only file that still referenced the "apache" user in its variables.

Then I upgraded to 5.8.0 from the CLI, and then to the newest version, also from the CLI, and the errors no longer appeared.
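
A rough sketch of how leftover "apache" references like the one described above could be located before an upgrade. The function name is made up for illustration, and it assumes a grep that supports -r and -w (GNU grep on Ubuntu does):

```shell
#!/bin/sh
# List files under a directory that mention a given user name as a whole word.
# Handy for spotting CentOS-style "apache" references left behind on Ubuntu.
find_user_refs() {
    dir="$1"
    user="$2"
    # -r: recurse, -l: print matching file names only, -w: whole-word match
    grep -rlw -- "$user" "$dir" 2>/dev/null
}

# Usage (directory taken from the post above):
#   find_user_refs /usr/local/nagiosxi/tmp apache
```

The whole-word match keeps variable names such as apacheuser from counting as hits on their own; only a standalone "apache" value is reported.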

So I did the same for our production environment, and we are now running on Ubuntu 20.04 LTS with version 5.8.1.
The only remaining issue is npcd not starting after a reboot, which I'm still investigating. I have a ramdisk, but the setup looks OK...
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: CentOS -> Ubuntu migration/upgrade issues

Post by benjaminsmith »

Hi,

Sounds like you are getting pretty close. Are you able to start the npcd service on the system? If not, please post any error output.

Code:

sudo systemctl start npcd.service
Also, try enabling the service and let me know if it comes up after a reboot.

Code:

sudo systemctl enable npcd.service
If you want to send over the system profile from the server, I can check the logs for any other errors. You can download it from Admin > System Profile > Download Profile.
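
As a small sketch, the two checks above can be combined into one status line. The helper name unit_summary is made up for illustration; it relies only on the standard systemctl is-enabled and is-active subcommands:

```shell
#!/bin/sh
# Summarize whether a systemd unit is enabled at boot and currently running.
unit_summary() {
    unit="$1"
    # is-enabled / is-active print a one-word state; they exit non-zero for
    # states such as "disabled" or "inactive", so the exit status is ignored.
    enabled=$(systemctl is-enabled "$unit" 2>/dev/null || true)
    active=$(systemctl is-active "$unit" 2>/dev/null || true)
    echo "$unit: enabled=${enabled:-unknown} active=${active:-unknown}"
}

# Usage:
#   unit_summary npcd.service
#   journalctl -u npcd.service -b    # the unit's log messages from this boot
```

If the unit shows as enabled but inactive after a reboot, the journalctl output is usually the quickest place to look for the reason.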

Benjamin
jacek
Posts: 248
Joined: Wed Sep 09, 2015 5:49 am

Re: CentOS -> Ubuntu migration/upgrade issues

Post by jacek »

This must have been an intermittent issue. After a few reboots and some troubleshooting it went away, and I can't reproduce it anymore.

But I ran into a very weird situation, which I will try to describe (it might be material for a separate topic unless this is an easy fix).
I have a few service definitions in a config file called _Servers-Windows-General, which I assign to a hostgroup (so they are not tied to any host directly).
If I change anything in one of these services, I get an error:
Error: Service has no hosts and/or service_description (config file '/usr/local/nagios/etc/services/_Servers-Windows-General.cfg', starting on line 47)
which makes sense, because if I look at the file, the "register" parameter is set to "1" for all services.
But this doesn't make sense, as it has always worked in the past.
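
A rough sketch for checking a generated config file for the condition the error describes: service definitions with neither host_name nor hostgroup_name. The function name is made up for illustration, and the check is deliberately simple. It does not follow "use" inheritance, so a service that gets its host from a template would be reported as a false positive:

```shell
#!/bin/sh
# Print "file:line" for each service definition that has neither host_name
# nor hostgroup_name. Templates (register 0) are skipped because they are
# allowed to omit both.
find_orphan_services() {
    awk '
        /^[ \t]*define[ \t]+service[ \t]*[{]?[ \t]*$/ {
            in_svc = 1; start = FNR; has_target = 0; is_template = 0
            next
        }
        in_svc && ($1 == "host_name" || $1 == "hostgroup_name") { has_target = 1 }
        in_svc && $1 == "register" && $2 == "0" { is_template = 1 }
        in_svc && $1 == "}" {
            if (!has_target && !is_template) print FILENAME ":" start
            in_svc = 0
        }
    ' "$1"
}

# Usage (file path taken from the error message above):
#   find_orphan_services /usr/local/nagios/etc/services/_Servers-Windows-General.cfg
```

For an authoritative answer, the Nagios Core verification run (nagios -v against the main nagios.cfg) is the real test; this only narrows down which definitions to look at.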

Also, if I try to modify any of the localhost service parameters and update them, I run into the same issue.
I have a working "current users" service. If I go into CCM and hit apply without making any changes, the UI says that the host definition is missing for that service, but if I look at the file, the definition is there and perfectly fine.

This is a disaster in case we need to change something...
I think this might be the case for all services, so I PM'd you the profile once again.

PS - this might be a useful tip:
MySQL created a lot of binlog files, filling over 10 GB of disk space in a week!
It looks like the default settings don't have expiration set up, so we reduced the maximum file size and added the expiration option in:
/etc/mysql/mysql.conf.d/mysqld.cnf

Code:

max_binlog_size = 50M
expire_logs_days = 2
Now it is fine.
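
Ubuntu 20.04 ships MySQL 8.0, where expire_logs_days still works but is deprecated in favor of binlog_expire_logs_seconds. A small sketch for converting the retention period used above into the newer setting (the helper name is made up for illustration):

```shell
#!/bin/sh
# Convert a retention period in days to seconds, the unit used by
# binlog_expire_logs_seconds.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Usage: print the equivalent of "expire_logs_days = 2" as a config line.
#   echo "binlog_expire_logs_seconds = $(days_to_seconds 2)"
```

Only one of the two options should be set; MySQL 8.0 does not accept both at the same time.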
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: CentOS -> Ubuntu migration/upgrade issues

Post by benjaminsmith »

Hi @jacek,

Thank you for the system profile. I've imported your configurations on my test server here and made several random changes to those 4 services under _Servers-Windows-General, and I'm able to apply the configuration changes. Can you write down the exact changes you are making? I will then try to replicate this on my test system.

Regards,
Benjamin
jacek
Posts: 248
Joined: Wed Sep 09, 2015 5:49 am

Re: CentOS -> Ubuntu migration/upgrade issues

Post by jacek »

I took a second look, and the host disappears from all configs as soon as I hit the save button in the CCM service management.
Here is what I did:
  • I opened an OK looking and working service
    1-service-config-OK.png
  • I changed the ARG1 to 15% and hit Save
  • Without applying the config I opened it once again and the host in "Manage Hosts" is already gone
    2-service-config-NOK.png
So no wonder the config doesn't get applied.
I checked this on another service and the behavior is the same.
Changing the 20% to 15% via the XI UI also caused the same issue.
jacek
Posts: 248
Joined: Wed Sep 09, 2015 5:49 am

Re: CentOS -> Ubuntu migration/upgrade issues

Post by jacek »

Looks like we fixed the issue, at least the system looks OK now.

In the nagiosql.tbl_service table in localhost services, one record had the "host_name" value set to 0, and the rest had a 1, so we changed that value to a "1" and after that everything looks normal again.
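
A minimal sketch for listing the affected rows before changing anything, using only the database, table, and column names quoted above (the function name is made up for illustration). A value of 0 is not necessarily wrong: a service attached only through a host group may legitimately have no direct host link, so treat the output as a list of candidates to review:

```shell
#!/bin/sh
# Print a query listing services whose host_name flag is 0 in the CCM
# database. Table and column names are the ones quoted in the post.
unlinked_services_sql() {
    printf '%s\n' "SELECT * FROM tbl_service WHERE host_name = 0;"
}

# Usage (read-only; back up the database before changing any rows):
#   unlinked_services_sql | mysql -u root -p nagiosql
```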

Any clues why this happened?