[Sheepdog] [PATCH] collie: show better error message while node membership is changing

Chris Webb chris at arachsys.com
Sun Dec 11 18:57:54 CET 2011


MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:

> It looks a bit difficult to handle collie commands gracefully during
> node membership changes.  I think of showing an error message to force
> users to retry the commands, and leaving this problem as a future work.

Hi Kazutaka. If we defined an extra 'temporary failure; please retry' exit
code for collie, automated systems would be able to detect this case and
wait and retry by themselves too. If that's easier to implement, it's
probably fine to do that and rely on the layer calling collie to retry.
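
To make that concrete, here is a rough sketch of what a calling layer might
do. Everything in it is an assumption for illustration: the exit code value
(75, borrowed from EX_TEMPFAIL in sysexits.h) is not something collie defines,
and run_with_retry is just a hypothetical helper name.

```python
import subprocess
import time

# Hypothetical value: there is no agreed 'temporary failure' exit code for
# collie; 75 is borrowed from EX_TEMPFAIL in sysexits.h for illustration only.
TEMPFAIL = 75


def run_with_retry(argv, attempts=5, delay=1.0,
                   run=subprocess.call, sleep=time.sleep):
    """Run argv, retrying while it exits with TEMPFAIL.

    Returns the last exit code seen, so the caller can still tell a
    persistent temporary failure apart from success or a hard error.
    """
    code = TEMPFAIL
    for attempt in range(attempts):
        code = run(argv)
        if code != TEMPFAIL:
            # Success or a permanent error: retrying would not help.
            return code
        if attempt < attempts - 1:
            # Give the cluster time to settle before trying again.
            sleep(delay)
    return code
```

A caller would then use something like run_with_retry(["collie", "vdi",
"list"]) in place of invoking collie directly. The run and sleep parameters
are only there so the loop can be exercised without a real cluster.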

Just a thought, but what happens to qemu VMs accessing sheepdog block
devices when this happens? Presumably they hang, and then restart once
the node membership is sorted out? But (also presumably) new qemu VMs that
try to start during the change will fail? It would be nice to know that this
has happened for a temporary reason too, but that might be harder to
propagate out of qemu.

Cheers,

Chris.


