[sheepdog] [PATCH v5 04/14] sheep: introduce generational reference counting for object reclaim
Liu Yuan
namei.unix at gmail.com
Wed Mar 5 06:36:10 CET 2014
On Wed, Mar 05, 2014 at 02:13:57PM +0900, Hitoshi Mitake wrote:
> At Tue, 4 Mar 2014 21:28:07 +0800,
> Liu Yuan wrote:
> >
> > On Tue, Mar 04, 2014 at 02:42:48PM +0900, Hitoshi Mitake wrote:
> > > From: Hitoshi Mitake <mitake.hitoshi at gmail.com>
> > >
> > > Generational reference counting is an algorithm to reclaim data
> > > efficiently without race conditions in a distributed system. This
> > > patch extends the vdi object structure to store generational
> > > reference counts, and increments the counts when creating snapshots.
> > >
> > > Cc: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> > > Cc: Valerio Pachera <sirio81 at gmail.com>
> > > Cc: Alessandro Bolgia <alessandro at extensys.it>
> > > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > > ---
> > >
> > > v5:
> > > - update store version and break compatibility explicitly
> > > - rename data_ref -> gref
> > >
> > > v4:
> > > - remove a bug in snapshot_vdi(), storing an invalid number of references
> > >
> > > include/sheepdog_proto.h | 6 +++++
> > > sheep/config.c | 2 +-
> > > sheep/migrate.c | 8 +++++++
> > > sheep/vdi.c | 58 +++++++++++++++++++++++++++++++++++-----------
> > > 4 files changed, 59 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/include/sheepdog_proto.h b/include/sheepdog_proto.h
> > > index 9361bad..9937497 100644
> > > --- a/include/sheepdog_proto.h
> > > +++ b/include/sheepdog_proto.h
> > > @@ -212,6 +212,11 @@ struct sd_rsp {
> > >  	};
> > >  };
> > >
> > > +struct generation_reference {
> > > +	int32_t generation;
> > > +	int32_t count;
> > > +};
> > > +
> > >  struct sd_inode {
> > >  	char name[SD_MAX_VDI_LEN];
> > >  	char tag[SD_MAX_VDI_TAG_LEN];
> > > @@ -230,6 +235,7 @@ struct sd_inode {
> > >  	uint32_t child_vdi_id[MAX_CHILDREN];
> > >  	uint32_t data_vdi_id[SD_INODE_DATA_INDEX];
> > >  	uint32_t btree_counter;
> > > +	struct generation_reference gref[SD_INODE_DATA_INDEX];
> > >  };
> >
> > This patch set passes tests on my box, great!
> >
> > For better compatibility, I'd suggest
> >
> > making the gref array a special object, like a btree intermediate node object,
> > instead of embedding it in the inode, and putting 'btree_counter' in the unused
> > field (child_vdi_id)
> >
> > Then we can keep the current inode layout without modification of QEMU and TGT
> > backend code to support hyper volume later.
> >
> > This way
> >
> > - inode won't become cumbersome and too big as more and more fields
> >   are added.
>
> I think adding a new type of object for generation references is not
> needed. It increases the complexity of the code and consumes the bits
> that indicate object types. There are only 3 bits for this purpose.
>
> In addition, the gref array doesn't consume much disk space because
> of the sparse object scheme. And we can also reduce network traffic
> when transmitting the inode object by sending/receiving only
> offsetof(struct sd_inode, gref) bytes instead of sizeof(struct
> sd_inode). That can easily be done later.
>
> > - one for moving btree_counter since it is not currently used by client code but
> > will be in the future when we add hyper volume support.
>
> I agree with this proposal. But I think this change should be appended
> at the tail of the patchset so that the changes follow a natural order
> (child_vdi_id is removed in the 8th patch, and btree_counter should be
> moved after that).

Yes, this is what I meant. Append it at the tail of the patch set for easier
patching.

Thanks
Yuan