[Sheepdog] partition recovery algorithm

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Nov 5 04:25:43 CET 2009


Chris Webb wrote:
 > Chris Webb <chris at arachsys.com> writes:
 >
 >> Dietmar Maurer <dietmar at proxmox.com> writes:
 >>
 >>> No, the whole cluster state resync is up to you - corosync does not help
 >>> you much. It only provides EVS.
 >> Corosync quorum service doesn't do the job completely for you out of the
 >> box, but it does make life significantly simpler.
 >
 > (Note that by this I mean that it can be used to prevent a non-quorate
 > fragment of the cluster from continuing at all rather than somehow resyncing
 > the system after two halves have evolved independently, which is a horrible
 > problem.)

Corosync looks helpful for dealing with the cluster partition problem
if Sheepdog uses a quorum approach: when the cluster is partitioned,
only one partition survives and keeps providing the Sheepdog service.
I will try to test the vote-based quorum service of corosync.
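
To make the quorum idea concrete, here is a minimal sketch of a
strict-majority check. It is not Sheepdog or corosync code; the function
name is_quorate and the vote counts are hypothetical, and I assume one
vote per node.

```python
def is_quorate(partition_votes: int, total_votes: int) -> bool:
    """Return True if a partition holds a strict majority of all votes.

    Hypothetical helper for illustration only: it assumes every node has
    one vote and that total_votes is the expected size of the full cluster.
    """
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    if not 0 <= partition_votes <= total_votes:
        raise ValueError("partition_votes must be between 0 and total_votes")
    # A strict majority means at most one partition can ever be quorate,
    # so two halves can never both keep running.
    return 2 * partition_votes > total_votes


# Example: a 5-node cluster splits into partitions of 3 and 2 nodes.
for size in (3, 2):
    state = "keeps running" if is_quorate(size, 5) else "must stop"
    print(f"partition with {size} of 5 nodes {state}")
```

Note that with an even split (for example 2 and 2 out of 4 nodes)
neither side is quorate under this rule, so the whole cluster stops
rather than diverging.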

However, Sheepdog still has some critical items to be done,
such as VDI deletion and reclaiming unused objects,
so moving to corosync might not be a top priority.

I'll first show a tiny subset of the dog program using corosync;
then let's consider whether moving to corosync is a good choice.

 > We had begun writing a distributed store ourselves, but I was very
 > interested to see sheepdog announced on the qemu list. This project is
 > further along than we were, and we're quite excited about the idea of
 > getting involved and contributing to sheepdog instead of continuing to
 > develop our own in-house solution.

I'm also excited to hear you are so interested in the Sheepdog project :)

Thanks,

MORITA Kazutaka


