[sheepdog] [PATCH v3] sheep: remove master node

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed Jul 24 11:40:10 CEST 2013


At Wed, 24 Jul 2013 17:20:51 +0800,
Kai Zhang wrote:
> 
> 
> On Jul 24, 2013, at 5:13 PM, Kai Zhang <kyle at zelin.io> wrote:
> 
> > 
> > On Jul 24, 2013, at 2:53 PM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:
> > 
> >> At Tue, 23 Jul 2013 17:30:03 +0800,
> >> Kai Zhang wrote:
> >>> 
> >>> On Jul 23, 2013, at 4:44 PM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:
> >>> 
> >>>> Ah, sorry.  The node A doesn't start until the nodes B, C, and D come
> >>>> back.  It is because the latest epoch in the node A includes B, C, and
> >>>> D.
> >>> 
> >>> Well, it seems I didn't fully understand the current implementation of cluster driver.
> >>> 
> >>> A very silly question: if B, C come back but D does not, what is the status of 
> >>> the cluster? It can work or just wait for D?
> >> 
> >> The cluster status will be SD_STATUS_WAIT.  It will wait for the node
> >> D to join Sheepdog if you don't run "collie cluster recover force".
> >> 
> > 
> > Does this mean that sheepdog is not self-healing?
> > Any persistent failure of sheep will be handled by administrator?
> 
> Sorry, my description is not correct.
> What I mean is that a sheepdog cluster cannot recover by itself in this scenario.
> And I'm a little disappointed with this.
> Is there a possibility to solve this?

If the number of redundancy (copies) is 1, it is possible that only
the node D has the latest data.  In that case, it's not safe to start
sheepdog automatically without the node D.

Sheepdog starts only when all the nodes in the previous epoch have
gathered.  This is necessary to keep the strong consistency that a
block storage system requires.  We can relax this rule a bit (e.g. it
is okay to start sheepdog in the above example if the number of
redundancy is larger than one).  It's on my TODO list.
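
To make the rule concrete, here is a minimal sketch in Python.  It is
not the actual sheep code: the function name can_start() and its
parameters are hypothetical.  The relaxed branch is only one possible
reading of the idea above; it assumes that every object is replicated
on `copies` distinct nodes, so that at least one replica survives as
long as fewer than `copies` nodes are missing.

```python
def can_start(prev_epoch_nodes, gathered_nodes, copies, relaxed=False):
    """Return True if the cluster may leave the waiting state.

    prev_epoch_nodes: nodes recorded in the latest epoch
    gathered_nodes:   nodes that have joined so far
    copies:           number of redundancy (replicas per object)
    relaxed:          apply the hypothetical relaxed rule
    """
    missing = set(prev_epoch_nodes) - set(gathered_nodes)

    # Current rule: every node of the previous epoch must be back.
    if not missing:
        return True
    if not relaxed:
        return False

    # Assumed relaxation: with `copies` replicas on distinct nodes,
    # losing fewer than `copies` nodes cannot remove every replica
    # of an object.
    return len(missing) < copies


prev = ["A", "B", "C", "D"]
back = ["A", "B", "C"]          # D has not come back

print(can_start(prev, back, copies=1))                # strict rule
print(can_start(prev, back, copies=1, relaxed=True))  # D may hold the only copy
print(can_start(prev, back, copies=2, relaxed=True))  # another replica exists
```

With copies=1 the cluster keeps waiting for D in both modes, which
matches the example above; only with two or more copies would the
relaxed rule let it start without D.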

Thanks,

Kazutaka


