<p dir="ltr"><br>

On 2013-7-24 at 5:13 PM, "Kai Zhang" <<a href="mailto:kyle@zelin.io">kyle@zelin.io</a>> wrote:<br>

><br>

><br>

> On Jul 24, 2013, at 2:53 PM, MORITA Kazutaka <<a href="mailto:morita.kazutaka@lab.ntt.co.jp">morita.kazutaka@lab.ntt.co.jp</a>> wrote:<br>

><br>

> > At Tue, 23 Jul 2013 17:30:03 +0800,<br>

> > Kai Zhang wrote:<br>

> >><br>

> >> On Jul 23, 2013, at 4:44 PM, MORITA Kazutaka <<a href="mailto:morita.kazutaka@lab.ntt.co.jp">morita.kazutaka@lab.ntt.co.jp</a>> wrote:<br>

> >><br>

> >>> Ah, sorry.  The node A doesn't start until the nodes B, C, and D come<br>

> >>> back.  It is because the latest epoch in the node A includes B, C, and<br>

> >>> D.<br>

> >><br>

> >> Well, it seems I didn't fully understand the current implementation of cluster driver.<br>

> >><br>

> >> A very silly question: if B, C come back but D does not, what is the status of<br>

> >> the cluster? It can work or just wait for D?<br>

> ><br>

> > The cluster status will be SD_STATUS_WAIT.  It will wait for the node<br>

> > D to join Sheepdog if you don't run "collie cluster recover force".<br>

> ><br>

><br>

> Does this mean that sheepdog is not self-healing?<br>

> Any persistent failure of sheep will be handled by administrator?</p>

<p dir="ltr">Sheepdog is indeed self-healing; manual recovery is needed only in corner cases like this one.</p>

<p dir="ltr">> If so, I think there is no need for auto-recover.<br>

> Recover should happen when administrator call "collie cluster recover force".<br>

><br>

> Thanks,<br>

> Kyle<br>

><br>

> --<br>

> sheepdog mailing list<br>

> <a href="mailto:sheepdog@lists.wpkg.org">sheepdog@lists.wpkg.org</a><br>

> <a href="http://lists.wpkg.org/mailman/listinfo/sheepdog">http://lists.wpkg.org/mailman/listinfo/sheepdog</a><br>

</p>