Page 3 of 3

Re: create_backup.sh hanging

Posted: Thu Apr 30, 2015 9:54 am
by jolson
I am able to reproduce this issue on my local system - though it's slightly different from what you describe.

In my testing, if the backup script is run on a single node at a time, the state will climb from 0 to a max of 2 and then descend back to 0 - the backup completes properly.

If the backup script is run on both nodes at once, the states will get stuck and the backups never finish. I spoke with a developer, and it looks like the backup job is currently running as an instance job - which means that every instance picks it up and runs it whenever a backup is scheduled and run from the command subsystem. I think the problem is that the 'state' is a cluster-wide variable, so the backups will inevitably get stuck if running on more than 1 instance at the same time.

This is my theory so far. I will do further testing and collaberation with the developers to see if this is the root cause, and we'll fix it accordingly if it is. Thanks for the information, it was very helpful in reproducing this bug. I'll keep you informed.

Best,


Jesse

Re: create_backup.sh hanging

Posted: Thu Apr 30, 2015 10:04 am
by jvestrum
Ah, good to hear that it's not just me. Now that I know what's going on I can work around it, so no rush. If/when there's an official fix I'd be glad to test it out. Thanks,

John

Re: create_backup.sh hanging

Posted: Thu Apr 30, 2015 10:16 am
by jolson
Sounds good! The official bug report has been filed, I'll let you know when progress is made.