[Sheepdog] Power supply interruption crashes data stored in sheepdog

Fri Aug 5 05:12:00 CEST 2011

I have already cleaned the damaged cluster. I guess it is possible to
reproduce the error and then capture the output of "collie cluster info".
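
If the failure is reproduced, the per-node output could be gathered with a small script along these lines. This is only a rough sketch: it assumes passwordless SSH to every node and that collie is on the remote PATH. The helper names (run_over_ssh, collect_cluster_info, format_report) are hypothetical, not part of Sheepdog.

```python
import subprocess


def run_over_ssh(host, command):
    """Run `command` on `host` via ssh and return stdout and stderr combined."""
    result = subprocess.run(
        ["ssh", host, command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=30,  # assumption: 30 s is enough for a node to answer
    )
    return result.stdout


def collect_cluster_info(nodes, runner=run_over_ssh):
    """Return a dict mapping each node to its "collie cluster info" output."""
    outputs = {}
    for host in nodes:
        try:
            outputs[host] = runner(host, "collie cluster info")
        except (OSError, subprocess.SubprocessError) as exc:
            # Keep going so one unreachable node does not hide the others.
            outputs[host] = "ERROR: {}".format(exc)
    return outputs


def format_report(outputs):
    """Join the per-node outputs into one text block that can be pasted in a mail."""
    parts = []
    for host in sorted(outputs):
        parts.append("==== {} ====\n{}".format(host, outputs[host].rstrip()))
    return "\n\n".join(parts)
```

Calling print(format_report(collect_cluster_info(["node1", "node2", "node3"]))) with the real host names (the ones here are placeholders) would print one labelled section per node.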

Anyway, the upcoming "collie cluster check" command is very good news.

Rubens de Souza Matos Júnior

2011/8/4 MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>

> At Thu, 4 Aug 2011 16:28:50 -0300,
> Rubens Matos wrote:
> > Hi everyone,
> >
> > I am testing sheepdog and everything was working, but after an interruption
> > in power supply, that affected all nodes, the cluster was damaged so that
> > the nodes didn't join again, and I can't recover the data that was stored in
> > a VDI.
> >
> > Have you already noticed a similar behavior? Is sheepdog protected against
> > such kind of failure, in which all nodes are abruptly disconnected?
>
> Sheepdog should handle the total node failure, but I think some bugs
> still exist in it.  The error handling has not been tested enough.
>
> If you have not cleaned the damaged cluster yet, can you give me the
> outputs of "collie cluster info" on all the nodes?  Those info would
> be helpful to find the error reason.
>
> I'm implementing a "collie cluster check" command, which works like
> fsck for Sheepdog.  This command would be helpful for recovering the
> damaged cluster.
>
>
> Thanks,
>
> Kazutaka
>