[Sheepdog] Drive snapshots and metadata

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Fri Feb 4 18:37:05 CET 2011


Hi Chris,

At Fri, 4 Feb 2011 15:35:31 +0000,
Chris Webb wrote:
> 
> Hi. I'm looking at both Sheepdog and Ceph at the moment, and thinking about
> future directions for our hosting product. We run qemu-kvm virtual machines
> backed by LVM2 logical volumes as virtual drives, accessed either locally or
> over iscsi. I'm thinking of migrating in time to a distributed block store
> like Sheepdog or Ceph's rbd, and have a handful of questions which have come
> up while experimenting.
> 
> 
> The operation I would really like to be able to export to users (in addition
> to what we have already in our lvm2-based system) is an ability to make
> copy-on-write clones of virtual hard drives. I can create a snapshot of the
> source with qemu-img snapshot, and then do
> 
>   qemu-img create -b sheepdog:source:1 sheepdog:dest
> 
> However, I think that I can't then delete the snapshot source:1 and the
> original source drive without also deleting the dest drive? Am I right about
> this, or am I misunderstanding or out-of-date with the current state of
> sheepdog?

You are right.  Although you can delete the snapshot source:1 with
"collie vdi delete source -s 1", the data objects of the snapshot
aren't reclaimed until the dest image is deleted.  This should be
fixed, but it is somewhat difficult to implement.
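To make the dependency concrete, here is a sketch of the workflow
discussed above (the vdi names "source" and "dest" are the ones from
the thread; this assumes a running sheepdog cluster with the qemu and
collie tools installed, so it is illustrative rather than something to
paste blindly):

```shell
# Tag snapshot "1" of the source vdi.
qemu-img snapshot -c 1 sheepdog:source

# Create a copy-on-write clone backed by that snapshot.
qemu-img create -b sheepdog:source:1 sheepdog:dest

# This hides the snapshot, but its data objects are NOT reclaimed
# while dest still exists, because dest reads unmodified blocks
# through the snapshot.
collie vdi delete source -s 1
```

If you need a clone with no backing dependency at all, a full copy
(e.g. with "qemu-img convert") avoids the problem at the cost of
duplicating the data.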


> 
> 
> Something else I'm contemplating is storage of metadata associated with
> virtual drives, e.g. which user it belongs to, the user-provided drive name,
> and other management layer properties on the drive. Is there a way I can tag
> vdis in Sheepdog with a few short keys and values? (I know I could construct
> a separate simple distributed database for this on top of the same corosync
> backend as Sheepdog uses, but I'd like to avoid this if additional metadata
> would naturally fit within Sheepdog as the total amount of metadata I'm
> looking to store is very tiny!)

Having that feature in Sheepdog sounds good to me.  Storing something
like accounting information is a necessary feature for hosting use.
How about the following commands?

  $ collie vdi setdata [key] [value]
  $ collie vdi getdata [key]


> 
> 
> Finally, I see the rather intrusive qemu patch I contributed in the early
> days of sheepdog to allow locking and live-migration to coexist has been
> superseded by the total removal of the sheepdog locking requirement in
> fe14318e31d8. This is a much nicer solution to the problem than mine! Out of
> interest, what happens if several clients do access a vdi at the same time?
> Is it identical behaviour to accessing (say) an iscsi block device from 2
> hosts, e.g. cluster filesystems can be made to work, or are there weaker
> ordering guarantees on the sequencing of writes and/or problems with
> read-cache consistency that make it less useful?

Both the write-ordering problem and the read-cache consistency problem
would happen.  Sheepdog is not designed to support such a situation;
every object must be accessed in one of the following ways:
 - no writer and multiple readers
 - one writer and one reader
To use Sheepdog safely with shared access, something like a lock
system is necessary.

But this assumption makes Sheepdog much simpler and achieves low
latency.
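Since sheepdog itself no longer enforces exclusive access, the
management layer has to serialize writers.  As a minimal sketch, a
per-vdi lock file guarded with flock(1) can gate the qemu invocation
on a single host (the lock path and the "sheepdog:shared" vdi name are
illustrative, not part of the sheepdog tooling):

```shell
# Per-vdi lock file; only one holder may run qemu against the vdi.
LOCK=/tmp/shared-vdi.lock

out=$(
  (
    # Fail fast if another process already holds the lock.
    flock -n 9 || { echo "vdi busy"; exit 1; }
    echo "lock acquired"
    # qemu-system-x86_64 -drive file=sheepdog:shared ... would run here
  ) 9>"$LOCK"
)
```

Note that flock only serializes processes on one host; to protect a
vdi across the whole cluster you would need a distributed lock, for
example built on the same corosync layer Chris mentions above.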


Thanks,

Kazutaka
