[Sheepdog] Nodes leaving and joining the cluster

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed Jun 16 20:31:27 CEST 2010


At Wed, 16 Jun 2010 18:57:38 +0200,
Wido den Hollander wrote:
> 
> Hi,
> 
> A few months ago when I was testing sheepdog there were some issues with
> nodes joining and leaving the cluster.
> 
> For example, if I turned my whole cluster off and turned it back on
> again, the cluster wouldn't come online and I would have to do a fresh
> mkfs.
> 
> Has this been addressed already?
> 

The current version should be much better than what you tested before.
I think nodes joining and leaving should work well now, though this
has not been tested enough yet.

Rebooting a Sheepdog cluster without first running the shutdown command
is not supported yet.  I think we should consider the following
situations:

 1) The administrator mistakenly shuts down all the nodes without
    running the shutdown command first

    In this case, the nodes do not all go down at the same time, so the
    internal membership information in the sheepdog daemons is updated
    incorrectly.  It is not easy to fix this automatically because
    Sheepdog does not keep static membership information.  I am
    thinking of providing a command to fix the membership information
    manually.

 2) A power failure occurs

    In this case, the membership information of the sheepdog daemons
    cannot become inconsistent because all nodes go down at the same
    time.  However, if VMs were writing data when the power failure
    occurred, the data objects may become inconsistent.  We need to
    fix them, and I think we can do it automatically.
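
The membership difference between the two cases can be sketched with a
toy model in Python.  This is not Sheepdog code: the function names, and
the assumption that each daemon keeps the last membership view it saw
before stopping, are hypothetical and only meant to illustrate the
reasoning above.

```python
def staggered_shutdown(nodes):
    """Case 1: nodes stop one by one; survivors drop each stopped node."""
    # Assumption: every daemon starts with the same, complete view.
    views = {n: set(nodes) for n in nodes}
    saved = {}
    alive = list(nodes)
    for n in nodes:
        alive.remove(n)
        saved[n] = frozenset(views[n])   # view n holds when it stops
        for other in alive:
            views[other].discard(n)      # survivors see n leave
    return saved


def power_failure(nodes):
    """Case 2: all nodes stop at once; nobody observes a departure."""
    return {n: frozenset(nodes) for n in nodes}


def is_consistent(saved):
    """True if every node remembers the same membership."""
    return len(set(saved.values())) == 1


if __name__ == "__main__":
    cluster = ["a", "b", "c"]
    print(is_consistent(staggered_shutdown(cluster)))  # False
    print(is_consistent(power_failure(cluster)))       # True
```

In the staggered case the last node to stop remembers a cluster that
contains only itself, so the saved views disagree.  In the power
failure case every node remembers the full cluster.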

Thanks,

Kazutaka


