[sheepdog] [PATCH] dog: use a minimum number of copies and zones for vdi checking

Liu Yuan namei.unix at gmail.com
Mon Jan 20 05:47:07 CET 2014


On Mon, Jan 20, 2014 at 01:18:13PM +0900, Hitoshi Mitake wrote:
> At Fri, 17 Jan 2014 16:39:59 +0800,
> Liu Yuan wrote:
> > 
> > On Thu, Jan 16, 2014 at 06:10:04PM +0900, Hitoshi Mitake wrote:
> > > Current sheepdog allows the condition of a number of zones < a number
> > > of copies. But "dog vdi check" cannot handle the case well because it
> > > tries to create copies of lost/corrupted objects based on
> > > inode->nr_copies.
> > > 
> > > This patch lets "dog vdi check" create a minimum number of copies and
> > > zones.
> > > 
> > > Reported-by: Marcin Mirosław <marcin at mejor.pl>
> > > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > > ---
> > > 
> > > v3: handle a case of erasure coded VDIs
> > > 
> > > v2: renameing, zones_nr -> sd_zones_nr
> > > 
> > >  dog/vdi.c |   25 ++++++++++++++++++-------
> > >  1 file changed, 18 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/dog/vdi.c b/dog/vdi.c
> > > index 8fd4664..b6d382c 100644
> > > --- a/dog/vdi.c
> > > +++ b/dog/vdi.c
> > > @@ -1800,11 +1800,11 @@ static void vdi_check_object_main(struct work *work)
> > >  }
> > >  
> > >  static void queue_vdi_check_work(const struct sd_inode *inode, uint64_t oid,
> > > -				 uint64_t *done, struct work_queue *wq)
> > > +				 uint64_t *done, struct work_queue *wq,
> > > +				 int nr_copies)
> > >  {
> > >  	struct vdi_check_info *info;
> > >  	const struct sd_vnode *tgt_vnodes[SD_MAX_COPIES];
> > > -	int nr_copies = inode->nr_copies;
> > >  
> > >  	info = xzalloc(sizeof(*info) + sizeof(info->vcw[0]) * nr_copies);
> > >  	info->oid = oid;
> > > @@ -1830,6 +1830,7 @@ struct check_arg {
> > >  	const struct sd_inode *inode;
> > >  	uint64_t *done;
> > >  	struct work_queue *wq;
> > > +	int nr_copies;
> > >  };
> > >  
> > >  static void check_cb(void *data, enum btree_node_type type, void *arg)
> > > @@ -1844,7 +1845,8 @@ static void check_cb(void *data, enum btree_node_type type, void *arg)
> > >  			oid = vid_to_data_oid(ext->vdi_id, ext->idx);
> > >  			*(carg->done) = (uint64_t)ext->idx * SD_DATA_OBJ_SIZE;
> > >  			vdi_show_progress(*(carg->done), carg->inode->vdi_size);
> > > -			queue_vdi_check_work(carg->inode, oid, NULL, carg->wq);
> > > +			queue_vdi_check_work(carg->inode, oid, NULL, carg->wq,
> > > +					     carg->nr_copies);
> > >  		}
> > >  	}
> > >  }
> > > @@ -1855,8 +1857,15 @@ int do_vdi_check(const struct sd_inode *inode)
> > >  	uint64_t done = 0, oid;
> > >  	uint32_t vid;
> > >  	struct work_queue *wq;
> > > +	int nr_copies = min((int)inode->nr_copies, sd_zones_nr);
> > > +
> > > +	if (inode->copy_policy && sd_zones_nr < nr_copies) {
> > > +		sd_err("ABORT: Not enough active zones for consistency-checking"
> > > +		       " erasure coded VDI");
> > 
> > explicitly check inode->copy_policy == 1, since later we might add other policy.
> 
> OK.
> 
> > 
> > > +		return EXIT_FAILURE;
> > > +	}
> > >  
> > > -	if (sd_nodes_nr < inode->nr_copies) {
> > > +	if (sd_nodes_nr < nr_copies) {
> > 
> > we should check sd_zones_nr < nr_copies as my previous mail suggested, no?
> > 
> 
> The condition should be checked only for erasure coded VDIs, no? It is
> an ordinal condition for replicated VDis.
> 

For either EC or replicated, we use zone to represent the node identity. So e.g,
we have 6 nodes in 1 zone, it means we have 1 node for hash ring. In this case,
for 4:2 ec, we should abort, or we'll meet a panic.

I think we use sd_zones_nr check for safe guard since you already handle the
copy_nr beforehand. If you don't want this, we can safely remove this check.

Either way, I think you should leave it as (sd_zones_nr < nr_copies) or remove
this check,

Thanks
Yuan



More information about the sheepdog mailing list