[Sheepdog] [PATCH 2/2] always delete data objects when deleting an cloned vdi

MORITA Kazutaka morita.kazutaka at gmail.com
Mon May 7 17:59:41 CEST 2012


At Mon, 07 May 2012 09:51:00 +0800,
levin li wrote:
> 
> On 05/07/2012 02:58 AM, MORITA Kazutaka wrote:
> > At Fri, 04 May 2012 09:49:08 +0800,
> > levin li wrote:
> >> On 05/04/2012 03:46 AM, MORITA Kazutaka wrote:
> >>> At Mon, 23 Apr 2012 14:18:06 +0800,
> >>> Li Wenpeng wrote:
> >>>> From: levin li<xingke.lwp at taobao.com>
> >>>>
> >>>> When deleting a cloned vdi, sheep find the root vdi and then
> >>>> traverse the vdi chain(such as base -->   snapshot -->   clone) to
> >>>> check wheter there's an undeleted vdi in the chain, if some vdi
> >>>> in the chain isn't deleted, sheep just mark the cloned vdi as
> >>>> deleted by clear its vdi name.
> >>>>
> >>>> But in fact a cloned vdi may created its own objects by copy-on-write,
> >>>> these objects can be deleted when deleting the vdi, so we make
> >>>> the cloned vdi to be deleted as the root vdi, then we can deleting
> >>>> its data objects, in delete_one() we check whether the object belongs
> >>>> to itself to determine whether to delete the object.
> >>>>
> >>>> Signed-off-by: levin li<xingke.lwp at taobao.com>
> >>>> ---
> >>>>    sheep/vdi.c |   14 ++++++++++++++
> >>>>    1 files changed, 14 insertions(+), 0 deletions(-)
> >>>>
> >>>> diff --git a/sheep/vdi.c b/sheep/vdi.c
> >>>> index d2a522d..c8085c8 100644
> >>>> --- a/sheep/vdi.c
> >>>> +++ b/sheep/vdi.c
> >>>> @@ -478,6 +478,12 @@ static void delete_one(struct work *work)
> >>>>    		if (!inode->data_vdi_id[i])
> >>>>    			continue;
> >>>>
> >>>> +		if (inode->data_vdi_id[i] != inode->vdi_id) {
> >>>> +			dprintf("object %" PRIx64 " is base's data, would not be deleted.\n",
> >>>> +					vid_to_data_oid(inode->data_vdi_id[i], i));
> >>>> +			continue;
> >>>> +		}
> >>>> +
> >>>>    		ret = remove_object(dw->entries, dw->nr_vnodes, dw->nr_zones, dw->epoch,
> >>>>    			      vid_to_data_oid(inode->data_vdi_id[i], i),
> >>>>    			      inode->nr_copies);
> >>>> @@ -587,6 +593,14 @@ next:
> >>>>    			  vid_to_vdi_oid(vid), (char *)inode,
> >>>>    			  SD_INODE_HEADER_SIZE, 0, sys->nr_sobjs);
> >>>>
> >>>> +	if (vid == inode->vdi_id&&   inode->snap_id == 1
> >>> What does 'inode->snap_id == 1' mean here?  I think this patch is not
> >>> correct at all.
> >>>
> >>> Thanks,
> >>>
> >>> Kazutaka
> >>>
> >> When inode->snap_ctime is zero, it means this can not be a snapshot.
> >>
> >> Further more, in the current vdi chain, the base vdi that we see by
> >> 'collie vdi list' has parent id which may point to the snapshot vdi
> >> which is the previous base vdi, in the case that an snapshot exist, the
> >> snap_id of the current base vdi must be greater than 1.
> >>
> >> But the cloned vdi always has a snap_id to be 1, so we can determine that
> >> if inode->snap_id == 1&&  inode->parent_vdi_id != 0&&
> >> !inode->snap_ctime, then
> >> the vdi must be a cloned vdi.
> > For example:
> >
> >   $ collie cluster format
> >   $ collie vdi create base 1G
> >   $ collie vdi snapshot base
> >   $ collie vdi clone base cloned
> >   $ collie vdi clone -s 1 base cloned
> >   $ collie vdi snapshot cloned
> >   $ collie vdi list
> >      Name        Id    Size    Used  Shared    Creation time   VDI id  Tag
> >    s base         1  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:50   54c278
> >      base         2  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:50   54c279
> >    s cloned       1  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:50   c876b2
> >      cloned       2  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:51   c876b3
> >
> > The snap_id of the vdi c876b2 is 1, but its snap_ctime is not zero
> > because the vdi is a snapshot.  The snap_ctime of the vdi c876b3 is
> > zero, but its snap_is 2.  So, with your code, no vdi is detected as a
> > clonend vdi against the above example.
> >
> > Kazutaka
> Yes, I already considered this case, in your example,c876b2
> should be the cloned VDI, but now it becomes a snapshot, and c876b3
> becomes the base VDI of c876b2, it's OK.
> 
> The only reason we need to mark a VDI as cloned it for deletion work,
> let me explain why.
> 
> A cloned VDI I mean isn't always the VDI created by the clone operation,
> but a cloned VDI on the leaves of the VDI tree, if it has some objects
> which created by copy-on-write that not shared with any other VDI, then
> we can delete the objects when we deleting the VDI to avoid a leak so
> as to save disk space.
> 
> In the example, c876b3 becomes a base VDI, we can not delete it's data
> objects directly when deleting it, even if some of its objects are cloned
> by copy-on-write, since c876b2 may shares objects with it.
> 
> The same reason for c876b2, it becomes a snapshot VDI, no longer a simple
> cloned VDI any more.

So this patch deletes the vdi when it is a cloned one but doesn't have
any snapshots?  If yes, you should write so.

I wonder why you tried to handle the specific situation.  I think we
can generalize your code; Sheepdog can delete the vdi safely if its
all descendant vdis are already deleted.

Kazutaka



More information about the sheepdog mailing list