[sheepdog] [PATCH 0/9] revive VDI locking mechanism

Mon Jul 14 15:15:15 CEST 2014

At Fri, 27 Jun 2014 15:13:47 +0900,
Hitoshi Mitake wrote:
> 
> This patch revives VDI locking mechanism. When two or more clients
> (QEMU and tgt) try to open one VDI, sheep returns an error to the
> later one.
> 
> Example:
> $ sudo qemu-system-x86_64 -hda sheepdog:debian
> qemu-system-x86_64: -hda sheepdog:debian: could not open disk image sheepdog:debian: cannot get vdi info, VDI isn't locked, debian 0
> 
> This mechanism requires a change in QEMU, too. Counterpart QEMU can be
> found here:
> https://github.com/sheepdog/qemu/tree/vdi-locking
> 
> If consensus about the design can be achieved, I'll post the patches
> to the QEMU list.
> 
> Thanks,
> Hitoshi
> 
> Hitoshi Mitake (9):
>   sheep: change a prototype of process_main() for obtaining sender
>     information
>   sheep: revive lock operation
>   sheep: add a list for storing information of all clients
>   sheep: associate client info and locked vdi
>   sheep: add an API for releasing VDI
>   sheep: unlock existing lock in a case of double locking
>   dog: use GET_VDI_INFO unconditionally in dog
>   sheep: snapshot and collect vdi state during joining to cluster
>   sheep: log and play locking/unlock information in newly joining node
> 
>  dog/vdi.c                |    5 +-
>  include/internal_proto.h |    2 +
>  include/sheepdog_proto.h |    5 +
>  sheep/group.c            |  162 ++++++++++++++++++++++-
>  sheep/ops.c              |  215 +++++++++++++++++++++++++------
>  sheep/request.c          |   25 ++++-
>  sheep/sheep_priv.h       |   26 ++++-
>  sheep/vdi.c              |  322 ++++++++++++++++++++++++++++++++++++++++++++++
>  8 files changed, 705 insertions(+), 57 deletions(-)
> 

Valerio, Fabian, I'd like to hear your comments.

Under the VDI locking scheme implemented in this patchset, a VDI is
released in the following three cases:

1. qemu explicitly releases its VDI
2. qemu process dies
3. sheep process dies (in this case, VDIs opened by qemu processes
which connect to the sheep process as a gateway will be released)

On second thought, cases 2 and 3 are not good for the integrity of
VDI data, because in those cases write requests for objects of the
VDIs can be interrupted before completion. That can leave the objects
in an inconsistent state. For example, assume that before the
interrupted write request, the contents of the object are A, A, A (3
replicas). The sheep can die after the second write request is
issued, leaving the replicas as B, B, A. If a new qemu process then
tries to read the object, sheep can return either B or A, because
sheep issues the read request to one node chosen at random from the 3
nodes. This behavior breaks the semantics of a block device!
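
The scenario above can be modeled with a small simulation. This is only
an illustrative sketch, not sheep code: the function names, the in-order
replica writes, and the `fail_after` parameter are assumptions made for
the example.

```python
import random

def replicated_write(replicas, value, fail_after):
    """Write value to each replica in order, stopping once fail_after
    replicas have been written, to mimic the gateway dying mid-request.
    Returns True only if every replica was updated."""
    for i in range(len(replicas)):
        if i >= fail_after:
            return False  # gateway died; remaining replicas keep old data
        replicas[i] = value
    return True

def read_any(replicas, rng):
    """Read from one randomly chosen replica (hypothetical read path)."""
    return rng.choice(replicas)

replicas = ["A", "A", "A"]
completed = replicated_write(replicas, "B", fail_after=2)

rng = random.Random(0)
seen = {read_any(replicas, rng) for _ in range(100)}

print("replicas after interrupted write:", replicas)  # ['B', 'B', 'A']
print("write completed:", completed)                  # False
print("values observed by readers:", sorted(seen))
```

Repeated reads of the same object can observe both the old and the new
contents, which a real block device would never do.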

So I think it is safer to require "force unlocking + dog vdi check"
before launching a new qemu after the sudden death of a qemu process
or a node leaving. What do you think?
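
To make the proposed rule concrete, here is a rough model of the lock
state. It is a sketch under assumptions, not the patchset's
implementation: the `VdiLock` class and all of its method names are
hypothetical, `check()` merely stands in for running "dog vdi check",
and the model assumes force unlocking and checking may happen in either
order.

```python
class VdiLock:
    """Hypothetical model of one VDI's lock state."""

    def __init__(self):
        self.locked = False
        self.needs_check = False

    def open(self):
        # A new client may open the VDI only if it is unlocked and
        # no consistency check is pending.
        if self.locked or self.needs_check:
            raise RuntimeError("VDI is not available for opening")
        self.locked = True

    def release(self):
        # Case 1: qemu explicitly releases its VDI.
        self.locked = False

    def owner_died(self):
        # Cases 2 and 3: qemu or its gateway sheep died. The lock is
        # kept and a consistency check becomes mandatory.
        if self.locked:
            self.needs_check = True

    def force_unlock(self):
        # Administrator action: drop the stale lock.
        self.locked = False

    def check(self):
        # Stand-in for "dog vdi check" repairing the replicas.
        self.needs_check = False


def try_open(lock):
    try:
        lock.open()
        return True
    except RuntimeError:
        return False


lock = VdiLock()
lock.open()
lock.owner_died()

results = [try_open(lock)]      # stale lock still held
lock.force_unlock()
results.append(try_open(lock))  # unlocked, but check still pending
lock.check()
results.append(try_open(lock))  # both steps done
print(results)  # [False, False, True]
```

In this model an explicit release (case 1) leaves the VDI immediately
reusable, while cases 2 and 3 block new clients until both recovery
steps have been done.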

Thanks,
Hitoshi