[Sheepdog] Using snapshot to implement clone

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed Aug 3 18:36:34 CEST 2011


At Wed, 3 Aug 2011 13:46:55 +0100,
Chris Webb wrote:
> 
> I'm working on a release of our cloud infrastructure platform using
> Sheepdog to provide drive storage instead of LVM LVs + iscsi as at present.

Great!

> 
> Part of our system is a drive clone facility, which makes a new drive as a
> clone of the (current state of) an existing drive. We want to (continue to)
> expose a simple concept of a drive to users, rather than distinct drive and
> snapshot objects, with snapshots having a subset of the facilities of
> drives.
> 
> I have done a preliminary implementation of this as
> 
>   - qemu-img snapshot the source vdi
>   - find the snapshot id of this snapshot with collie vdi list
>   - qemu-img create a new vdi using the newly-created snapshot as a base
>   - collie vdi delete the temporary snapshot of the source vdi
> 
> It all seems to work quite nicely, but I wonder whether there are any
> potential issues with creating lots of very ephemeral snapshots like this?
> The vdi IDs will obviously get fairly large, but that is presumably not a
> disaster in itself, unless there are also internal resources being used very
> inefficiently by a pattern like this?


When many vdi IDs are allocated to the same vdi, looking up a snapshot
vdi can become slow.  Sheepdog searches for a snapshot vdi object
linearly, starting from the latest vdi id, so the older a snapshot is,
the longer its lookup takes.  But the current vdi and the latest
snapshot can be accessed in constant time, so I guess Sheepdog would
work fine with your usage.
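
To illustrate the lookup cost described above, here is a toy Python
model.  It is not Sheepdog's actual code: the function name
lookup_steps and the plain list of ids are assumptions made up for
this sketch.  It only counts how many entries a newest-first linear
scan inspects before it reaches the requested id.

```python
def lookup_steps(vdi_ids, target):
    """Count entries inspected when scanning from the newest id down.

    Hypothetical model only: vdi_ids is a plain list of integers
    standing in for the ids allocated to one vdi.
    """
    steps = 0
    # Scan newest-first, as in the linear search described above.
    for vid in sorted(vdi_ids, reverse=True):
        steps += 1
        if vid == target:
            return steps
    raise KeyError(target)


# 1000 ids allocated to the same vdi by repeated snapshot + clone.
ids = list(range(1, 1001))
newest_cost = lookup_steps(ids, 1000)  # latest id is found immediately
oldest_cost = lookup_steps(ids, 1)     # oldest id needs a full scan
```

In this model the cost grows with the distance from the latest id,
which is why a pattern that only ever touches the current vdi and the
most recent snapshot stays cheap.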

The second concern is that Sheepdog cannot yet reclaim unused data
objects well.  In your case, Sheepdog doesn't reclaim the data related
to the ephemeral snapshots.  I'd like to solve this problem as soon as
possible, but the implementation is a bit difficult.

Other than the above, I don't think there are any problems.


Thanks,

Kazutaka


