[sheepdog] [PATCH] recovery: notify completion only when all objects are fresh

MORITA Kazutaka morita.kazutaka at gmail.com
Sun Jun 2 17:48:10 CEST 2013


At Sun, 02 Jun 2013 19:25:24 +0800,
Liu Yuan wrote:
> 
> On 06/02/2013 03:09 AM, MORITA Kazutaka wrote:
> > For the manual recovery, we have to read sheep.log carefully and
> > determine which of the stale objects is the correct one.
> > Actually, I did this several months ago in a user's environment.  At
> > that time, I could recover the objects because their sheepdog had
> > crashed and the stale directories were not cleaned up.  The cause of
> > the crash is fixed in the current sheepdog, so if they had been using
> > the latest version of sheepdog, I couldn't have fixed their
> > environment.
> > 
> > What this patch tries to do is only to give a chance to users
> > who have deep knowledge of sheepdog object recovery.  If you think we
> > shouldn't include such a feature, it's okay for me to keep this change
> > in my own tree.  Perhaps what I should add instead is documentation
> > about the risk of data loss.
> 
> So it is better to keep it as an out-of-tree patch, for users who are
> able to do advanced manual recovery and can tolerate cluster downtime
> just because of some unrecoverable broken or stale objects.
> 
> In my opinion, shutting down the cluster is the worst solution for
> production. Most users can tolerate a partially broken vdi and, in the
> worst case, remove the broken vdi, as long as the good vdis survive and
> the service isn't stopped at all. With backup tools in mind, such as
> vdi backup or cluster-wide backup, the unrecoverable objects can be
> recovered via more user-friendly backup tools.
> 
> Manual backup looks to me like a much more reliable solution to the
> problem of unrecoverable broken & stale objects, because service uptime
> is as important as data reliability.

Okay, I'll withdraw this patch.

Thanks,

Kazutaka


