[sheepdog] [PATCH v5 04/14] sheep: introduce generational reference counting for object reclaim

Liu Yuan namei.unix at gmail.com
Wed Mar 5 06:36:10 CET 2014


On Wed, Mar 05, 2014 at 02:13:57PM +0900, Hitoshi Mitake wrote:
> At Tue, 4 Mar 2014 21:28:07 +0800,
> Liu Yuan wrote:
> > 
> > On Tue, Mar 04, 2014 at 02:42:48PM +0900, Hitoshi Mitake wrote:
> > > From: Hitoshi Mitake <mitake.hitoshi at gmail.com>
> > > 
> > > Generational reference counting is an algorithm to reclaim data
> > > efficiently without race conditions on distributed systems.  This
> > > extends the vdi object structure to store generational reference counts,
> > > and increments the counts when creating snapshots.
> > > 
> > > Cc: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> > > Cc: Valerio Pachera <sirio81 at gmail.com>
> > > Cc: Alessandro Bolgia <alessandro at extensys.it>
> > > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > > ---
> > > 
> > > v5:
> > >  - update store version and break compatibility explicitly
> > >  - rename data_ref -> gref
> > > 
> > > v4:
> > >  - remove a bug in snapshot_vdi(), storing an invalid number of references
> > > 
> > >  include/sheepdog_proto.h |    6 +++++
> > >  sheep/config.c           |    2 +-
> > >  sheep/migrate.c          |    8 +++++++
> > >  sheep/vdi.c              |   58 +++++++++++++++++++++++++++++++++++-----------
> > >  4 files changed, 59 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/include/sheepdog_proto.h b/include/sheepdog_proto.h
> > > index 9361bad..9937497 100644
> > > --- a/include/sheepdog_proto.h
> > > +++ b/include/sheepdog_proto.h
> > > @@ -212,6 +212,11 @@ struct sd_rsp {
> > >  	};
> > >  };
> > >  
> > > +struct generation_reference {
> > > +	int32_t generation;
> > > +	int32_t count;
> > > +};
> > > +
> > >  struct sd_inode {
> > >  	char name[SD_MAX_VDI_LEN];
> > >  	char tag[SD_MAX_VDI_TAG_LEN];
> > > @@ -230,6 +235,7 @@ struct sd_inode {
> > >  	uint32_t child_vdi_id[MAX_CHILDREN];
> > >  	uint32_t data_vdi_id[SD_INODE_DATA_INDEX];
> > >  	uint32_t btree_counter;
> > > +	struct generation_reference gref[SD_INODE_DATA_INDEX];
> > >  };
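
For readers unfamiliar with the technique named in the commit message, here
is a minimal sketch of textbook generational reference counting, reusing the
struct generation_reference layout from the hunk above. It is not taken from
this patch set: MAX_GEN, struct ledger and every function name below are
hypothetical, and where sheepdog actually keeps the per-object ledger is not
shown in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_GEN 8 /* hypothetical bound, for illustration only */

struct generation_reference {
	int32_t generation; /* how many copies deep this reference is */
	int32_t count;      /* how many copies were made from it */
};

/* Hypothetical per-object ledger: one slot per generation. */
struct ledger {
	int32_t slot[MAX_GEN];
};

/* A new object starts with exactly one generation-0 reference. */
static void ledger_init(struct ledger *l)
{
	for (int i = 0; i < MAX_GEN; i++)
		l->slot[i] = 0;
	l->slot[0] = 1;
}

/* Copying a reference only updates the copier; the object is not touched. */
static struct generation_reference gref_copy(struct generation_reference *src)
{
	struct generation_reference child = { src->generation + 1, 0 };

	src->count++;
	return child;
}

/* Dropping a reference reports its generation and copy count to the ledger. */
static void gref_drop(struct ledger *l, const struct generation_reference *r)
{
	assert(r->generation + 1 < MAX_GEN);
	l->slot[r->generation] -= 1;
	l->slot[r->generation + 1] += r->count;
}

/* The object can be reclaimed once every slot has returned to zero. */
static bool ledger_is_garbage(const struct ledger *l)
{
	for (int i = 0; i < MAX_GEN; i++)
		if (l->slot[i] != 0)
			return false;
	return true;
}
```

Because a copy never sends a message to the object, drops may arrive in any
order: a slot can go negative for a while, and the object only counts as
garbage once all slots are zero again.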
> > 
> > This patch set passes tests on my box, great!
> > 
> > For better compatibility, I'd suggest
> > 
> > making the gref array a special object, like the btree intermediate node
> > object, instead of embedding it into the inode, and putting 'btree_counter'
> > in the unused field (child_vdi_id).
> > 
> > Then we can keep the current inode layout without modification of QEMU and TGT 
> > backend code to support hyper volume later.
> > 
> > This way 
> > 
> > - the inode won't become cumbersome and too big as more and more fields
> >   are added.
> 
> I think adding a new object type for generation references is not
> needed. It increases the complexity of the code and consumes one of the
> bits that indicate object types. There are only 3 bits for this purpose.
> 
> In addition, the gref array doesn't consume much disk space because
> of the sparse object scheme. And we can also reduce network traffic for
> transmitting the inode object by sending/receiving only offsetof(struct
> sd_inode, gref) bytes instead of sizeof(struct sd_inode). It can be done
> later easily.
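
To illustrate the offsetof() idea above, here is a small sketch on a toy
struct. struct toy_inode, DATA_INDEX and inode_wire_len() are hypothetical
stand-ins, not the real struct sd_inode or any sheep function; the only
property carried over from the patch is that gref is the last member, so
everything before it forms one contiguous prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_INDEX 1024 /* hypothetical; stand-in for SD_INODE_DATA_INDEX */

struct generation_reference {
	int32_t generation;
	int32_t count;
};

/* Toy inode: gref is deliberately the last member. */
struct toy_inode {
	char name[256];
	uint32_t data_vdi_id[DATA_INDEX];
	uint32_t btree_counter;
	struct generation_reference gref[DATA_INDEX];
};

/* Bytes to transfer when the peer does not need the gref array. */
static size_t inode_wire_len(void)
{
	return offsetof(struct toy_inode, gref);
}

/* The prefix must be strictly shorter than the whole object. */
static_assert(offsetof(struct toy_inode, gref) < sizeof(struct toy_inode),
	      "gref must not be the first member");
```

A sender would then pass inode_wire_len() rather than sizeof(struct
toy_inode) as the transfer length. This only works while gref stays at the
tail of the struct.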
> 
> > - one for moving btree_counter since it is not currently used by client code but
> >   will be in the future when we add hyper volume support.
> 
> I agree with this proposal. But I think this change should be appended
> at the tail of the patchset to keep the changes in a natural order
> (child_vdi_id is removed in the 8th patch, so btree_counter should be
> moved after that).

Yes, this is what I meant. Append it at the tail of the patch set for easier
patching.

Thanks
Yuan


