
Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 7:55 am
by dshearon
We were able to resolve the log rotate issue, but our attempts to upgrade to 5.7 failed, and it appears that failing backups or DB corruption is the cause. Here is the info we've gathered so far. Any help with troubleshooting would be appreciated.

A manual backup starts and we see activity in top, but once it finishes no backup files are present. The output below is from the backup logs. There are more of the same lines, but this should show all the data available.

05-14-2017 00:00:03 DEBUG: Running scheduled local backup ...
05-14-2017 00:00:03 INFO: Creating a local backup: nagiosxi.1494734403
05-14-2017 00:00:03 DEBUG: Sending create local backup command to CmdSubsystem
05-14-2017 00:08:41 INFO: Too many backups! Limit is 4. Removing: nagiosxi.1492315202.tar.gz before proceeding with backup.
Error backing up MySQL database 'nagios' - check the password in this script!
Error backing up MySQL database 'nagios' - check the password in this script!
Error backing up MySQL database 'nagios' - check the password in this script!
Error backing up MySQL database 'nagios' - check the password in this script!


Running the database repair script fails with the output below. This is just a sample; many other MIBs are listed, but it gives you the idea.

Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringMaxCPSession ::= { jnxJsSPUMonitoringObjectsEntry 9 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringNodeIndex ::= { jnxJsSPUMonitoringObjectsEntry 10 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringNodeDescr ::= { jnxJsSPUMonitoringObjectsEntry 11 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringFlowSessIPv4 ::= { jnxJsSPUMonitoringObjectsEntry 12 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringFlowSessIPv6 ::= { jnxJsSPUMonitoringObjectsEntry 13 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringCPSessIPv4 ::= { jnxJsSPUMonitoringObjectsEntry 14 }
Cannot adopt OID in JUNIPER-SRX5000-SPU-MONITORING-MIB: jnxJsSPUMonitoringCPSessIPv6 ::= { jnxJsSPUMonitoringObjectsEntry 15 }
/usr/local/nagiosxi/scripts/repair_databases.lock already exists. Perhaps a repair is already in process ..aborting



Here is what we find in the UI:
Nagios_Capture3.JPG
Thank you in advance for your help.

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 8:20 am
by scottwilkerson
It thinks a database repair is already in progress.

Can you show the output of the following?

Code:

ls -l /usr/local/nagiosxi/scripts/repair_databases.lock
df -h

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 8:23 am
by dshearon
Here is the output you requested.

-rwxr-xr-x 1 nagios nagios 0 Oct 20 2015 /usr/local/nagiosxi/scripts/repair_databases.lock


Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.8G     0  3.8G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  218M  3.6G   6% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/sda4       295G   30G  265G  11% /
/dev/sda2      1014M  201M  814M  20% /boot
/dev/sda1       200M   12M  189M   6% /boot/efi
tmpfs           779M     0  779M   0% /run/user/1000
tmpfs           779M     0  779M   0% /run/user/0
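
For anyone who wants to script the disk-space part of this check, here is a minimal sketch. The function name over_threshold and the 90 percent limit are made up for illustration and are not part of Nagios XI. It reads df-style output on stdin and prints any mount point at or above the limit, so it works on live output or on a saved copy like the one above.

```shell
#!/bin/sh
# Hypothetical helper: print "mountpoint use%" for every filesystem whose
# usage is at or above the given percentage. Expects df-style columns
# (Filesystem Size Used Avail Use% Mounted-on) on stdin.
# Limitation: mount points that contain spaces are cut at the first space.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header row
            use = $5
            sub(/%/, "", use)         # "11%" -> "11"
            if (use + 0 >= limit) print $6 " " $5
        }'
}

# Live check; 90 is an arbitrary example threshold.
df -P | over_threshold 90
```

With the output posted above, nothing would be reported at a limit of 90, since the fullest filesystem is /boot at 20%.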

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 8:32 am
by scottwilkerson
Ahh, this is a really old lock file; let's just remove it.

Code:

rm -f /usr/local/nagiosxi/scripts/repair_databases.lock

Then try to proceed with your backup.
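
If you would rather not delete a lock blindly, a more cautious version of the same step could look like the sketch below. The lock path is the one from this thread. Everything else is an assumption for illustration: the one-day age limit, the helper names lock_is_stale and repair_running, and the guess that the repair script shows up in the process list as repair_databases.sh. Note that find -mmin is a GNU/BSD extension rather than strict POSIX.

```shell
#!/bin/sh
# Sketch: remove the repair lock only if it is old AND no repair seems to be running.
LOCK="${LOCK:-/usr/local/nagiosxi/scripts/repair_databases.lock}"
MAX_AGE_MIN="${MAX_AGE_MIN:-1440}"   # assumption: older than one day counts as stale

# Succeeds when file $1 exists and was last modified more than $2 minutes ago.
lock_is_stale() {
    [ -f "$1" ] && [ -n "$(find "$1" -mmin +"$2" 2>/dev/null)" ]
}

# Succeeds when a process matching the assumed repair script name is running.
# The [.] keeps the pattern from matching the command line of this check itself.
repair_running() {
    pgrep -f 'repair_databases[.]sh' >/dev/null 2>&1
}

if lock_is_stale "$LOCK" "$MAX_AGE_MIN" && ! repair_running; then
    echo "Removing stale lock: $LOCK"
    rm -f "$LOCK"
else
    echo "Lock is missing, recent, or a repair is running; leaving it alone"
fi
```

In this thread the lock was dated Oct 20 2015, so it would count as stale under any reasonable age limit.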

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 9:14 am
by dshearon
The backup failed again after removing the lock file, but I ran the DB repair script again and this time it completed. After that, the backup completed successfully. I then tried to run the update again from the UI, and it just shows "Update in progress. Please wait. Update may take a few minutes." It continues to show the same message even after a reboot.

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 9:17 am
by scottwilkerson
In some environments a manual upgrade is required:
https://assets.nagios.com/downloads/nag ... ctions.pdf

I would suggest doing that. Once it is complete, you can reset the upgrade status page by following these instructions:
https://support.nagios.com/kb/article/n ... e-851.html

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 9:30 am
by dshearon
I ran the upgrade script manually after posting that and everything appeared to work fine. I think we are back in business so you can lock the thread. Thank you for your help!

Re: Unable to backup server, DB issues

Posted: Fri Aug 07, 2020 9:46 am
by scottwilkerson
Awesome!

Locking thread