At Tue, 13 Dec 2011 12:27:03 +0000,
Chris Webb wrote:
>
> MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:
>
> > This patchset makes collie's I/Os like QEMU's ones, and adds support
> > for automatic retry of collie commands.
> >
> > Chris, can you try the devel branch?
>
> Hi. This seems to work very nicely. I made a three node cluster, created and
> started writing to a VDI, and disconnected a node, whereupon a collie vdi
> list on one of the remaining nodes hung for the corosync-dependent timeout
> until the failed node was kicked out of the cluster, then sprang into life
> with a correct vdi listing. collie node list showed the cluster now had two
> nodes instead of three, and everything continued to work fine.
>
> Rebooting and reattaching the disconnected node, the cluster grew back to
> its full size again automatically. Very nice!

Thanks for your testing!

> If the failed node is just partitioned away from the rest of the cluster
> rather than failing, what's supposed to happen to the sheep instances and
> the qemus on it? I saw operations hang indefinitely, which is the intended
> behaviour I imagine?

Sheepdog cannot distinguish a temporarily disconnected node from a failed
one, so the sheep instances will abort and the qemus will hang forever.

> The case I wondered about is where the failed node is
> later reattached to the rest of the cluster. I think it continues to hang in
> that case rather than recovering and allowing the local VMs to proceed?

If sheep sleeps for a long time, all I/Os in the qemus would time out, which
is not good. But it is also difficult for Sheepdog to keep working while
avoiding recovery, because Sheepdog assumes that data is fully replicated
when sheep returns I/O responses to qemus.

If we give up dynamic node membership, I think we can handle the temporarily
disconnected node, but that would be future work.

Thanks,

Kazutaka
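The behaviour Chris observed, a command blocking through the membership
timeout and then "springing into life" once the failed node is kicked out,
is the usual retry-until-membership-settles pattern. A minimal sketch of
that pattern follows; note this is illustrative only, and the names
`retry_command` and `TransientClusterError` are assumptions for the sketch,
not collie's actual code:

```python
import time

# Hypothetical stand-in for the error a cluster request raises while
# membership is still settling (e.g. during the corosync-dependent timeout).
class TransientClusterError(Exception):
    pass

def retry_command(op, retry_interval=1.0, max_wait=30.0):
    """Retry op() until it succeeds or max_wait seconds have elapsed.

    Mirrors the observed behaviour: the call blocks while the failed
    node is being removed from the cluster, then returns normally once
    the remaining nodes agree on the new membership.
    """
    deadline = time.monotonic() + max_wait
    while True:
        try:
            return op()
        except TransientClusterError:
            if time.monotonic() >= deadline:
                raise  # membership never settled; give up
            time.sleep(retry_interval)
```

A caller would wrap each cluster request, e.g.
`retry_command(lambda: query_vdi_list(node))` (where `query_vdi_list` is a
hypothetical request function), so transient failures during a node's
removal are retried rather than surfaced to the user.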