Unable to backup server, DB issues

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Locked
dshearon
Posts: 64
Joined: Tue Nov 17, 2015 9:38 am

Unable to backup server, DB issues

Post by dshearon »

We were able to resolve the log rotate issue but our attempts to upgrade to 5.7 failed and it appears backups/DB corruption is the cause. Here is some of the info we've gathered so far. Any help you can provide with troubleshooting would be appreciated.

A manual backup starts and we see activity in top, but once it finishes no files are present. The output below is from the backup logs. There are more of the same lines, but this should show all the data available.

05-14-2017 00:00:03 DEBUG: Running scheduled local backup ...
05-14-2017 00:00:03 INFO: Creating a local backup: nagiosxi.1494734403
05-14-2017 00:00:03 DEBUG: Sending create local backup command to CmdSubsystem
05-14-2017 00:08:41 INFO: Too many backups! Limit is 4. Removing: nagiosxi.1492315202.tar.gz before proceeding with backup.
Error backing up MySQL database 'nagios' - check the password in this script!
Error backing up MySQL database 'nagios' - check the password in this script!
Error backing up MySQL database 'nagios' - check the password in this script!
Error backing up MySQL database 'nagios' - check the password in this script!


Running the database repair script fails with the output below. This is just a sample; many other MIBs are listed, but it gives you the idea.

Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringMaxCPSession ::= { jnxJsSPUMonitoringObjectsEntry 9 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringNodeIndex ::= { jnxJsSPUMonitoringObjectsEntry 10 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringNodeDescr ::= { jnxJsSPUMonitoringObjectsEntry 11 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringFlowSessIPv4 ::= { jnxJsSPUMonitoringObjectsEntry 12 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringFlowSessIPv6 ::= { jnxJsSPUMonitoringObjectsEntry 13 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringCPSessIPv4 ::= { jnxJsSPUMonitoringObjectsEntry 14 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringCPSessIPv6 ::= { jnxJsSPUMonitoringObjectsEntry 15 }
/usr/local/nagiosxi/scripts/repair_databases.lock already exists. Perhaps a repair is already in process ..aborting



Here is what we find in the UI
Nagios_Capture3.JPG
Thank you in advance for your help.
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Unable to backup server, DB issues

Post by scottwilkerson »

It thinks a database repair is already running, because the lock file exists.

Can you show the output of the following?

Code:

ls -l /usr/local/nagiosxi/scripts/repair_databases.lock
df -h
Former Nagios employee
Creator:
Human Design Website
Get Your Human Design Chart
dshearon
Posts: 64
Joined: Tue Nov 17, 2015 9:38 am

Re: Unable to backup server, DB issues

Post by dshearon »

Here is the output you requested.

-rwxr-xr-x 1 nagios nagios 0 Oct 20 2015 /usr/local/nagiosxi/scripts/repair_databases.lock


Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.8G 0 3.8G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 218M 3.6G 6% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/sda4 295G 30G 265G 11% /
/dev/sda2 1014M 201M 814M 20% /boot
/dev/sda1 200M 12M 189M 6% /boot/efi
tmpfs 779M 0 779M 0% /run/user/1000
tmpfs 779M 0 779M 0% /run/user/0
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Unable to backup server, DB issues

Post by scottwilkerson »

Ahh, this is a really old lock file. Let's just remove it:

Code:

rm -f /usr/local/nagiosxi/scripts/repair_databases.lock
Then try to proceed with your backup.
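
If you would rather not delete a lock blindly, here is a minimal sketch that removes it only when it is clearly old. The function name `remove_stale_lock` and the one-day default threshold are assumptions made for illustration, not part of Nagios XI. Age alone does not prove that no repair is running, so treat it as a convenience check only.

```shell
#!/bin/sh
# Hypothetical helper: remove a lock file only if it looks stale.
# Returns 0 if the lock was removed or did not exist, 1 if it looks recent.
remove_stale_lock() {
    lock="$1"
    max_age_days="${2:-1}"   # assumed default threshold, adjust as needed

    # Nothing to do if the lock does not exist.
    [ -e "$lock" ] || return 0

    # find prints the path only when the file was last modified more than
    # max_age_days whole days ago.
    if [ -n "$(find "$lock" -maxdepth 0 -mtime +"$max_age_days" 2>/dev/null)" ]; then
        rm -f "$lock"
        return 0
    fi

    echo "Lock $lock is recent; a repair may really be running." >&2
    return 1
}
```

For the lock in this thread the call would be `remove_stale_lock /usr/local/nagiosxi/scripts/repair_databases.lock 1`.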
dshearon
Posts: 64
Joined: Tue Nov 17, 2015 9:38 am

Re: Unable to backup server, DB issues

Post by dshearon »

The backup failed again after the removal of the lock file, but I ran the DB repair script again and it was able to complete. After doing that, the backup completed successfully. I went back to try to run the update again from the UI, and it just shows "Update in progress. Please wait. Update may take a few minutes." It continues to show the same thing even after a reboot.
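
For anyone repeating this, the order that worked (repair first, then backup) can be sketched as a small wrapper that stops at the first failing step. The helper name `run_in_order` is hypothetical, and the two script paths in the commented call are assumptions about a typical install, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run each command in order, stop at the first failure.
run_in_order() {
    for step in "$@"; do
        echo "Running: $step"
        if ! sh -c "$step"; then
            echo "Step failed: $step" >&2
            return 1
        fi
    done
    return 0
}

# Assumed script locations; adjust to match your installation.
# run_in_order \
#     "/usr/local/nagiosxi/scripts/repair_databases.sh" \
#     "/usr/local/nagiosxi/scripts/backup_xi.sh"
```

Stopping on the first failure means the backup is only attempted once the repair has completed.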
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Unable to backup server, DB issues

Post by scottwilkerson »

In some environments a manual upgrade is required:
https://assets.nagios.com/downloads/nag ... ctions.pdf

I would suggest doing this. Once it is complete, you can reset the upgrade status page by following these instructions:
https://support.nagios.com/kb/article/n ... e-851.html
dshearon
Posts: 64
Joined: Tue Nov 17, 2015 9:38 am

Re: Unable to backup server, DB issues

Post by dshearon »

I ran the upgrade script manually after posting that and everything appeared to work fine. I think we are back in business so you can lock the thread. Thank you for your help!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises
Contact:

Re: Unable to backup server, DB issues

Post by scottwilkerson »

Awesome!

Locking thread