[Sheepdog] [PATCH v2] sheep: tame sheep to recover the

Tue Sep 27 05:43:27 CEST 2011

If the latest epoch is unrecoverable, or is a transient epoch,
should it fall back to the last available epoch?

Yibin Shen

On Tue, Sep 27, 2011 at 11:13 AM, MORITA Kazutaka <
morita.kazutaka at lab.ntt.co.jp> wrote:

> At Tue, 27 Sep 2011 09:45:49 +0800,
> Liu Yuan wrote:
> >
> > On 09/27/2011 06:09 AM, MORITA Kazutaka wrote:
> > > At Mon, 26 Sep 2011 11:43:34 -0700 (PDT),
> > > Ski Mountain wrote:
> > >> What happens if one of the nodes in the cluster is not recoverable at
> > >> all, i.e. a fried motherboard? Can you just start up the VMs that were
> > >> on the dead machine on another machine in the cluster?
> > > If the unrecoverable node doesn't have the latest epoch info, we need
> > > to do nothing special.  If you start the sheep daemon on all other
> > > machines, then the cluster will work again.
> > >
> > > But if the failed node has the latest epoch, this is the case where we
> > > need a manual recovery, because there is a risk of data loss in this
> > > case, though I think this rarely happens.
> > >
> > >
> >
> > Hi Kazutaka,
> >      I do have an idea like 'collie cluster recover' floating around in
> > my head. This kind of brute-force manual recovery would be the last
> > resort for handling the physical failure of the highest-epoch node in a
> > crashed cluster, or physical node failures in a shut-down cluster.
>
> Good point.
>
> >
> >      The implementation might be rather easy. I am thinking of adding a
> > new SD_MSG_RECOVERY event and broadcasting it to recover the
> > cluster with the epoch incremented by 1. What do you think?
>
> How about adding a new operation SD_OP_CLUSTER_RECOVERY and
> broadcasting it with SD_MSG_VDI_OP?  I think it should work like a
> "collie cluster format" command.
>
>
> Thanks,
>
> Kazutaka
> --
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog
>
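
To make the two cases discussed above concrete, here is a minimal Python
sketch of the decision. It is not sheepdog code: the Node type and both
function names are hypothetical. It also assumes that "the epoch
incremented by 1" means one more than the highest epoch any surviving node
has recorded, which the thread does not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    """Hypothetical model of a cluster member (not a sheepdog structure)."""
    name: str
    latest_epoch: int  # highest epoch this node has recorded
    alive: bool        # False for a physically failed node


def needs_manual_recovery(nodes: List[Node]) -> bool:
    """Return True if the highest epoch is recorded only on failed nodes.

    This mirrors the case described in the thread: if a surviving node
    still holds the latest epoch, restarting the daemons is enough.
    """
    cluster_epoch = max(n.latest_epoch for n in nodes)
    return not any(n.alive and n.latest_epoch == cluster_epoch for n in nodes)


def forced_recovery_epoch(nodes: List[Node]) -> int:
    """Epoch a forced recovery would move to (assumption: survivors' max + 1).

    Survivors cannot see what a dead node recorded, so only their own
    epochs are considered here.
    """
    survivors = [n for n in nodes if n.alive]
    if not survivors:
        raise ValueError("no surviving nodes to recover from")
    return max(n.latest_epoch for n in survivors) + 1


if __name__ == "__main__":
    # The failed node "c" is the only one that recorded epoch 5.
    cluster = [Node("a", 4, True), Node("b", 4, True), Node("c", 5, False)]
    print(needs_manual_recovery(cluster))
    print(forced_recovery_epoch(cluster))
```

The sketch only separates "restart and continue" from "needs a forced
recovery". It says nothing about which objects might be lost, which is the
data-loss risk mentioned above.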