I made the change to the file on my Prod box so it wouldn't keep overwriting the file on the failover box. The restore ran this morning and now I'm unable to access my failover environment via the web UI.
The error I'm seeing on the service checks for that host is: Error: Could not parse XML from http://10.231.86.58/nagiosxi/ ()
I'm not sure what happened or where to even start to look to figure it out.
Unable to perform restore after 5.6.13
Re: Unable to perform restore after 5.6.13
Ok, I have a feeling that after the restore this morning, my failover host thought it was my prod host. There were muted service checks on the prod host sending out notifications and I think in actuality they were coming from the failover host.
To stop this as quickly as possible, I ended up reverting the changes to the restore_xi.sh files on both hosts and then restoring a backup to my failover host from two days ago prior to the change. Everything seems to be back to normal now.
Another thing I noticed is that when I run the failover_restore.sh script manually, it causes a lot of drive space to get used. I went from 40%+ free space to only 6% free. I don't know what the heck changed with this last update, but my environment is not happy.
Re: Unable to perform restore after 5.6.13
The restore file is big. The restore extracts it (uncompressed, and some of its contents then get extracted even further), and then MySQL temporary files are created to restore the DB. I usually recommend keeping free space of at least 4x the size of the restore file available for a proper restore.
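As a rough illustration of the 4x rule of thumb above, here is a minimal Python sketch that compares free disk space against the size of a backup file before a restore is attempted. The function name `has_room_for_restore` and the `factor` parameter are hypothetical and not part of Nagios XI; the only thing taken from the advice above is the 4x multiplier.

```python
import os
import shutil


def has_room_for_restore(backup_file, target_dir, factor=4):
    """Return True if target_dir's filesystem has at least
    factor x the backup file's size available as free space.

    The default factor of 4 mirrors the rule of thumb above;
    it is a guideline, not a hard requirement of the restore script.
    """
    needed = os.path.getsize(backup_file) * factor
    free = shutil.disk_usage(target_dir).free
    return free >= needed
```

It could be called with the path to the backup archive and the filesystem the restore extracts into, for example `has_room_for_restore(backup_path, "/store/backups/nagiosxi")`, where `backup_path` is whatever archive you plan to restore.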
That -e in the restore script causes issues, so the devs have reverted that change.
Re: Unable to perform restore after 5.6.13
Do the extracted files get removed once the restore is complete? I think if that does happen, it's not happening for some reason in my environment. It's filling up the drive and then leaving the redundant data.
Re: Unable to perform restore after 5.6.13
Yes, it should be cleaned up automatically (if it doesn't fail).
Check here:
/store/backups/nagiosxi
The leftover directories should be named XXXXXXX-restore. They can be safely deleted; if your restores failed, they were likely left behind.
Re: Unable to perform restore after 5.6.13
I'm seeing a nagiosxi directory in /tmp for some reason. It seems to have all the files in it that would typically be placed during a restore. I think that's what's taking up all the space.
Re: Unable to perform restore after 5.6.13
Likely, feel free to remove any non-backup files in /store/backups/nagiosxi.
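To see what a failed restore left behind before deleting anything, something like the following Python sketch could help. It only lists directories whose names end in `-restore` (the naming pattern mentioned above) and totals their size; it does not delete anything. The function names are hypothetical, and the assumption that leftovers are always directories with that suffix comes only from the description in this thread.

```python
from pathlib import Path


def find_leftover_restore_dirs(backup_dir):
    """Return directories directly under backup_dir whose names end in
    '-restore' (assumed naming of leftover restore working directories)."""
    root = Path(backup_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.name.endswith("-restore")
    )


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path."""
    return sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file())
```

For example, looping over `find_leftover_restore_dirs("/store/backups/nagiosxi")` and printing `dir_size_bytes(d)` for each entry shows how much space each leftover directory is holding.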
Re: Unable to perform restore after 5.6.13
I'd like to escalate this issue for next week. I need someone from Nagios Support to connect with me and figure out what's going on.
My restores aren't working at all since updating to 5.6.13. After a restore, all the daemons crash and I have to manually reboot the host. More importantly, the restore isn't actually occurring.
Thank you! Have a great weekend.
benjaminsmith
Re: Unable to perform restore after 5.6.13
Hi @rferebee,
Please open a ticket for this issue to get a remote session booked and reference this forum topic.
Thank you.
Benjamin
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Unable to perform restore after 5.6.13
This thread can be locked. The issue is resolved. Thank you.