[sheepdog] [PATCH v3 1/9] sheep: introduce generational reference counting for object reclaim

Hitoshi Mitake mitake.hitoshi at gmail.com
Thu Feb 27 13:20:21 CET 2014


At Thu, 27 Feb 2014 18:07:27 +0800,
Liu Yuan wrote:
> 
> On Sun, Feb 23, 2014 at 02:28:20PM +0900, Hitoshi Mitake wrote:
> > Generational reference counting is an algorithm to reclaim data
> > efficiently without race conditions in a distributed system.  This patch
> > extends the vdi object structure to store generational reference counts,
> > and increments the counts when creating snapshots.
> > 
> > Cc: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> > Cc: Valerio Pachera <sirio81 at gmail.com>
> > Cc: Alessandro Bolgia <alessandro at extensys.it>
> > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > ---
> >  include/sheepdog_proto.h |  6 +++++
> >  sheep/vdi.c              | 57 ++++++++++++++++++++++++++++++++++++------------
> >  2 files changed, 49 insertions(+), 14 deletions(-)
> > 
> > diff --git a/include/sheepdog_proto.h b/include/sheepdog_proto.h
> > index 1d17e8f..39d57aa 100644
> > --- a/include/sheepdog_proto.h
> > +++ b/include/sheepdog_proto.h
> > @@ -211,6 +211,11 @@ struct sd_rsp {
> >  	};
> >  };
> >  
> > +struct generation_reference {
> > +	int32_t generation;
> > +	int32_t count;
> > +};
> > +
> >  struct sd_inode {
> >  	char name[SD_MAX_VDI_LEN];
> >  	char tag[SD_MAX_VDI_TAG_LEN];
> > @@ -229,6 +234,7 @@ struct sd_inode {
> >  	uint32_t child_vdi_id[MAX_CHILDREN];
> >  	uint32_t data_vdi_id[SD_INODE_DATA_INDEX];
> >  	uint32_t btree_counter;
> > +	struct generation_reference data_ref[SD_INODE_DATA_INDEX];
> 
> data_ref -> gref looks better.

okay.
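
For reference, here is a rough sketch of the bookkeeping with the field
renamed to gref. It is not the code from this patch. struct toy_inode,
TOY_DATA_INDEX and toy_snapshot_grefs() are made-up names, and the rule
that the child's reference is one generation younger than the parent's is
taken from the general generational reference counting algorithm, not from
the diff above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_DATA_INDEX 4	/* stand-in for SD_INODE_DATA_INDEX */

struct generation_reference {
	int32_t generation;
	int32_t count;
};

/* Hypothetical cut-down inode; the real struct sd_inode has many more fields. */
struct toy_inode {
	uint32_t data_vdi_id[TOY_DATA_INDEX];
	struct generation_reference gref[TOY_DATA_INDEX];
};

/*
 * Assumed snapshot rule: the child inherits the parent's data object
 * references.  Each inherited reference is one generation younger, and the
 * parent records that it handed out one more copy.
 */
static void toy_snapshot_grefs(struct toy_inode *base, struct toy_inode *child)
{
	assert(base && child);
	memset(child, 0, sizeof(*child));

	for (int i = 0; i < TOY_DATA_INDEX; i++) {
		if (!base->data_vdi_id[i])
			continue;	/* no data object allocated at this index */

		child->data_vdi_id[i] = base->data_vdi_id[i];
		child->gref[i].generation = base->gref[i].generation + 1;
		child->gref[i].count = 0;
		base->gref[i].count++;
	}
}
```

Indexes without a data object are skipped, so only allocated objects get
their counts incremented.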

> 	
> It seems that with this change, it would be much harder to have hyper volume
> support snapshot/clone. It means we would need btree-like management for
> data_ref.
> 
> Or can we modularize the deletion path and make it support two different
> deletion algorithms, one the current stupid one and the other generational
> ref? That way hyper volume could use the old algorithm, and having it
> support snapshot looks easier.
> 
> Anyway, letting hyper volume support snapshot/clone is far-future work. I am
> not sure if we should consider it.

Yes, that feature will take time to implement. So I don't think we
need to keep the previous naive GC around for hypervolume as a
half-baked solution.

I think we should let "dog vdi snapshot" fail when it is asked to
create a snapshot of a hypervolume. That would be good enough as a
temporary solution.
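
A minimal sketch of such a check, just to show the idea. Everything in it
is hypothetical: struct toy_vdi, the store_policy field with its meaning,
and both function names are assumptions for illustration, not existing
sheepdog code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down inode; field name and values are assumptions. */
struct toy_vdi {
	uint8_t store_policy;	/* assumed: 0 = plain array, non-zero = hypervolume */
};

static bool toy_is_hypervolume(const struct toy_vdi *vdi)
{
	return vdi->store_policy != 0;
}

/* Returns 0 if a snapshot may proceed, -ENOTSUP if it must be refused. */
static int toy_check_snapshot_allowed(const struct toy_vdi *vdi)
{
	assert(vdi);
	if (toy_is_hypervolume(vdi))
		return -ENOTSUP;
	return 0;
}
```

The snapshot command would call the check before doing anything else and
bail out with an error message on a non-zero return.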

Thanks,
Hitoshi


