[sheepdog] [PATCH v3] sheep: remove master node

Kai Zhang kyle at zelin.io
Wed Jul 24 11:20:51 CEST 2013


On Jul 24, 2013, at 5:13 PM, Kai Zhang <kyle at zelin.io> wrote:

> 
> On Jul 24, 2013, at 2:53 PM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:
> 
>> At Tue, 23 Jul 2013 17:30:03 +0800,
>> Kai Zhang wrote:
>>> 
>>> On Jul 23, 2013, at 4:44 PM, MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> wrote:
>>> 
>>>> Ah, sorry.  The node A doesn't start until the nodes B, C, and D come
>>>> back.  It is because the latest epoch in the node A includes B, C, and
>>>> D.
>>> 
>>> Well, it seems I didn't fully understand the current implementation of cluster driver.
>>> 
>>> A very silly question: if B and C come back but D does not, what is the
>>> status of the cluster? Can it work, or does it just wait for D?
>> 
>> The cluster status will be SD_STATUS_WAIT.  It will wait for the node
>> D to join Sheepdog if you don't run "collie cluster recover force".
>> 
> 
> Does this mean that sheepdog is not self-healing?
> Will any persistent failure of a sheep have to be handled by the administrator?

Sorry, my description was not correct.
What I meant is that the sheepdog cluster cannot recover by itself in this scenario.
I'm a little disappointed by this.
Is there a way to solve it?
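
For what it's worth, the manual workaround today looks roughly like this
(just a sketch; I'm assuming "collie cluster info" and "collie node list"
report the waiting state and the member list, the exact output may differ):

    $ collie cluster info            # should show the cluster waiting for node D
    $ collie node list               # confirm that D is really gone for good
    $ collie cluster recover force   # drop D from the epoch, recover with B and C

The question is whether sheep could reach the last step on its own after some
timeout, instead of always waiting for the administrator.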

> If so, I think there is no need for auto-recovery.
> Recovery should happen when the administrator runs "collie cluster recover force".
> 
> Thanks,
> Kyle
> 
