I have already cleaned the damaged cluster, but I think it is possible to reproduce the error and then capture the output of "collie cluster info".<div><br></div><div>Anyway, the upcoming "collie cluster check" command is very good news.</div>
<meta http-equiv="content-type" content="text/html; charset=utf-8"><div><br clear="all">Rubens de Souza Matos Júnior<br>
<br><br><div class="gmail_quote">2011/8/4 MORITA Kazutaka <span dir="ltr"><<a href="mailto:morita.kazutaka@lab.ntt.co.jp">morita.kazutaka@lab.ntt.co.jp</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
At Thu, 4 Aug 2011 16:28:50 -0300,<br>
<div class="im">Rubens Matos wrote:<br>
> Hi everyone,<br>
><br>
> I am testing sheepdog and everything was working, but after a power<br>
> outage that affected all nodes, the cluster was damaged: the nodes did<br>
> not rejoin, and I can't recover the data that was stored in<br>
> a VDI.<br>
><br>
> Have you seen similar behavior before? Is sheepdog protected against<br>
> this kind of failure, in which all nodes are abruptly disconnected?<br>
<br>
</div>Sheepdog should handle a total node failure, but I think some bugs<br>
still exist. The error handling has not been tested enough.<br>
<br>
If you have not cleaned the damaged cluster yet, can you give me the<br>
output of "collie cluster info" from all the nodes? That information would<br>
help find the cause of the error.<br>
<br>
I'm implementing a "collie cluster check" command, which works like<br>
fsck for Sheepdog. This command would be helpful for recovering the<br>
damaged cluster.<br>
<br>
<br>
Thanks,<br>
<font color="#888888"><br>
Kazutaka<br>
</font></blockquote></div><br></div>