Unable to perform restore after 5.6.13

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Unable to perform restore after 5.6.13

Post by rferebee »

I made the change to the file on my Prod box so it wouldn't keep overwriting the file on the failover box. The restore ran this morning and now I'm unable to access my failover environment via the web UI.

The error I'm seeing on the service checks for that host is: Error: Could not parse XML from http://10.231.86.58/nagiosxi/ ()

I'm not sure what happened or where to even start looking.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Unable to perform restore after 5.6.13

Post by rferebee »

OK, I have a feeling that after the restore this morning, my failover host thought it was my prod host. Muted service checks on the prod host were sending out notifications, and I think they were actually coming from the failover host.

To stop this as quickly as possible, I reverted the changes to the restore_xi.sh files on both hosts and then restored a backup from two days ago, taken before the change, to my failover host. Everything seems to be back to normal now.

Another thing I noticed is that when I run the failover_restore.sh script manually, it uses up a lot of drive space. I went from 40%+ free space to only 6% free. I don't know what the heck changed with this last update, but my environment is not happy.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Unable to perform restore after 5.6.13

Post by ssax »

The restore file is big. The restore first extracts it (uncompressed, and then some of its contents get extracted even further), and then MySQL temporary files are created to restore the DB. I usually recommend keeping free space of at least 4x the size of the restore file available for a proper restore.

That -e in the restore script causes issues, and the devs have reverted that change for that reason.
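
As a rough illustration of the 4x guideline above, here is a minimal Python sketch that compares the size of a backup file with the free space on the filesystem where the restore would run. It is not part of Nagios XI; the function names `required_free_bytes` and `has_room_for_restore` are made up for this example, and the 4x factor is only the rule of thumb mentioned in this post.

```python
import os
import shutil


def required_free_bytes(backup_size, factor=4):
    """Free space suggested for a restore: factor x the backup file size."""
    return factor * backup_size


def has_room_for_restore(backup_path, restore_dir, factor=4):
    """Return True if restore_dir's filesystem has enough free space.

    backup_path: path to the backup archive (hypothetical, supplied by caller).
    restore_dir: any directory on the filesystem the restore extracts to.
    """
    backup_size = os.path.getsize(backup_path)
    free_bytes = shutil.disk_usage(restore_dir).free
    return free_bytes >= required_free_bytes(backup_size, factor)
```

You would call `has_room_for_restore()` with the path to your own backup file and the directory the restore extracts into, and skip or postpone the restore if it returns False.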
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Unable to perform restore after 5.6.13

Post by rferebee »

Do the extracted files get removed once the restore is complete? If they're supposed to be, that isn't happening in my environment for some reason. The restore fills up the drive and then leaves the redundant data behind.
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Unable to perform restore after 5.6.13

Post by ssax »

Yes, it should be cleaned up automatically (as long as the restore doesn't fail).

Check here:

/store/backups/nagiosxi
Look for entries named XXXXXXX-restore. They can be safely deleted; if your restores failed, they were likely left behind.
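
If it helps, here is a small Python sketch for spotting those leftovers. It assumes the leftovers are directories whose names end in `-restore` directly under /store/backups/nagiosxi; the function name `find_leftover_restore_dirs` is hypothetical. It only lists candidates and does not delete anything, so you can review the result first.

```python
from pathlib import Path


def find_leftover_restore_dirs(backup_root="/store/backups/nagiosxi"):
    """List directories under backup_root whose names end in '-restore'.

    Assumption: failed restores leave directories (not plain files) behind.
    Returns a sorted list of Path objects; an empty list if backup_root
    does not exist.
    """
    root = Path(backup_root)
    if not root.is_dir():
        return []
    return sorted(
        entry
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.endswith("-restore")
    )
```

Printing the returned paths gives you a list to check by hand before removing anything.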
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Unable to perform restore after 5.6.13

Post by rferebee »

I'm seeing a nagiosxi directory in /tmp for some reason. It seems to have all the files in it that would typically be placed during a restore. I think that's what's taking up all the space.
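
To confirm whether that directory really is what's eating the space, something like this Python sketch could total up the size of a directory tree. The function name `dir_size_bytes` is made up for this example; it sums regular file sizes only, so the number may differ a bit from what `du` reports.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # File vanished or is unreadable; ignore it and keep going.
                continue
    return total
```

Running it against the nagiosxi directory in /tmp and comparing the result with the drop in free space would show whether the two line up.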
ssax
Dreams In Code
Posts: 7682
Joined: Wed Feb 11, 2015 12:54 pm

Re: Unable to perform restore after 5.6.13

Post by ssax »

Likely. Feel free to remove any non-backup files in /store/backups/nagiosxi.
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Unable to perform restore after 5.6.13

Post by rferebee »

I'd like to escalate this issue for next week. I need someone from Nagios Support to connect with me and figure out what's going on.

My restores haven't been working at all since updating to 5.6.13. After a restore, all the daemons crash and I have to manually reboot the host. More importantly, the restore isn't actually occurring.

Thank you! Have a great weekend.
benjaminsmith
Posts: 5324
Joined: Wed Aug 22, 2018 4:39 pm
Location: saint paul

Re: Unable to perform restore after 5.6.13

Post by benjaminsmith »

Hi @rferebee,

Please open a ticket for this issue to get a remote session booked and reference this forum topic.

Thank you.

Benjamin
rferebee
Posts: 733
Joined: Wed Jul 11, 2018 11:37 am

Re: Unable to perform restore after 5.6.13

Post by rferebee »

This thread can be locked. The issue is resolved. Thank you.
Locked