[Sheepdog] [PATCH 2/2] always delete data objects when deleting an cloned vdi

levin li levin108 at gmail.com
Mon May 7 03:51:00 CEST 2012


On 05/07/2012 02:58 AM, MORITA Kazutaka wrote:
> At Fri, 04 May 2012 09:49:08 +0800,
> levin li wrote:
>> On 05/04/2012 03:46 AM, MORITA Kazutaka wrote:
>>> At Mon, 23 Apr 2012 14:18:06 +0800,
>>> Li Wenpeng wrote:
>>>> From: levin li<xingke.lwp at taobao.com>
>>>>
>>>> When deleting a cloned vdi, sheep finds the root vdi and then
>>>> traverses the vdi chain (such as base --> snapshot --> clone) to
>>>> check whether there's an undeleted vdi in the chain. If some vdi
>>>> in the chain isn't deleted, sheep just marks the cloned vdi as
>>>> deleted by clearing its vdi name.
>>>>
>>>> But in fact a cloned vdi may have created its own objects by copy-on-write,
>>>> and these objects can be deleted when deleting the vdi. So we treat
>>>> the cloned vdi as the root vdi when deleting it, so that we can delete
>>>> its data objects. In delete_one() we check whether each object belongs
>>>> to the vdi itself to determine whether to delete the object.
>>>>
>>>> Signed-off-by: levin li<xingke.lwp at taobao.com>
>>>> ---
>>>>    sheep/vdi.c |   14 ++++++++++++++
>>>>    1 files changed, 14 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/sheep/vdi.c b/sheep/vdi.c
>>>> index d2a522d..c8085c8 100644
>>>> --- a/sheep/vdi.c
>>>> +++ b/sheep/vdi.c
>>>> @@ -478,6 +478,12 @@ static void delete_one(struct work *work)
>>>>    		if (!inode->data_vdi_id[i])
>>>>    			continue;
>>>>
>>>> +		if (inode->data_vdi_id[i] != inode->vdi_id) {
>>>> +			dprintf("object %" PRIx64 " is base's data, would not be deleted.\n",
>>>> +					vid_to_data_oid(inode->data_vdi_id[i], i));
>>>> +			continue;
>>>> +		}
>>>> +
>>>>    		ret = remove_object(dw->entries, dw->nr_vnodes, dw->nr_zones, dw->epoch,
>>>>    			      vid_to_data_oid(inode->data_vdi_id[i], i),
>>>>    			      inode->nr_copies);
>>>> @@ -587,6 +593,14 @@ next:
>>>>    			  vid_to_vdi_oid(vid), (char *)inode,
>>>>    			  SD_INODE_HEADER_SIZE, 0, sys->nr_sobjs);
>>>>
>>>> +	if (vid == inode->vdi_id && inode->snap_id == 1
>>> What does 'inode->snap_id == 1' mean here?  I think this patch is not
>>> correct at all.
>>>
>>> Thanks,
>>>
>>> Kazutaka
>>>
>> When inode->snap_ctime is zero, it means this cannot be a snapshot.
>>
>> Furthermore, in the current vdi chain, the base vdi that we see in
>> 'collie vdi list' has a parent id which may point to the snapshot vdi
>> that was the previous base vdi. In the case that a snapshot exists, the
>> snap_id of the current base vdi must be greater than 1.
>>
>> But a cloned vdi always has a snap_id of 1, so we can determine that
>> if inode->snap_id == 1 && inode->parent_vdi_id != 0 &&
>> !inode->snap_ctime, then the vdi must be a cloned vdi.
> For example:
>
>   $ collie cluster format
>   $ collie vdi create base 1G
>   $ collie vdi snapshot base
>   $ collie vdi clone base cloned
>   $ collie vdi clone -s 1 base cloned
>   $ collie vdi snapshot cloned
>   $ collie vdi list
>      Name        Id    Size    Used  Shared    Creation time   VDI id  Tag
>    s base         1  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:50   54c278
>      base         2  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:50   54c279
>    s cloned       1  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:50   c876b2
>      cloned       2  1.0 GB  0.0 MB  0.0 MB 2012-05-07 03:51   c876b3
>
> The snap_id of the vdi c876b2 is 1, but its snap_ctime is not zero
> because the vdi is a snapshot.  The snap_ctime of the vdi c876b3 is
> zero, but its snap_id is 2.  So, with your code, no vdi is detected as a
> cloned vdi in the above example.
>
> Kazutaka
Yes, I already considered this case. In your example, c876b2
would have been the cloned VDI, but now it has become a snapshot, and
c876b3 has become the base VDI of c876b2, so that's OK.

The only reason we need to mark a VDI as cloned is for the deletion work;
let me explain why.

By a cloned VDI I don't necessarily mean a VDI created by the clone
operation, but a cloned VDI on a leaf of the VDI tree. If it has some
objects that were created by copy-on-write and are not shared with any
other VDI, then we can delete those objects when deleting the VDI, to
avoid a leak and save disk space.
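
To make the per-object check concrete, here is a minimal sketch of the
ownership test that the delete_one() hunk above performs. The struct
toy_inode type, TOY_MAX_DATA_OBJS and collect_owned_objects() are
hypothetical names made up for illustration; only the vdi_id and
data_vdi_id fields come from the patch, and the real code calls
remove_object() instead of collecting indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, tiny object table; the real inode is much larger. */
#define TOY_MAX_DATA_OBJS 8

/* Simplified stand-in for the sheep inode (assumption, not the real layout). */
struct toy_inode {
	uint32_t vdi_id;                          /* id of this vdi */
	uint32_t data_vdi_id[TOY_MAX_DATA_OBJS];  /* owner vdi of each data object, 0 = unallocated */
};

/*
 * Collect the indexes of the data objects that this vdi created itself
 * (by copy-on-write). Objects that still point at another vdi belong to
 * a base or snapshot and are skipped, as in the patch.
 * Returns the number of indexes written to owned_idx (at most cap).
 */
static size_t collect_owned_objects(const struct toy_inode *inode,
				    uint32_t *owned_idx, size_t cap)
{
	size_t n = 0;

	assert(inode != NULL && owned_idx != NULL);

	for (uint32_t i = 0; i < TOY_MAX_DATA_OBJS; i++) {
		if (!inode->data_vdi_id[i])
			continue;	/* object never allocated */
		if (inode->data_vdi_id[i] != inode->vdi_id)
			continue;	/* base's data, must not be deleted */
		if (n < cap)
			owned_idx[n++] = i;
	}
	return n;
}
```

With this check only the copy-on-write objects of the leaf VDI are
candidates for removal; everything inherited from the base is left alone.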

In the example, c876b3 becomes a base VDI, so we cannot delete its data
objects directly when deleting it, even if some of its objects were created
by copy-on-write, since c876b2 may share objects with it.

The same reasoning applies to c876b2: it has become a snapshot VDI and is
no longer a simple cloned VDI.
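
As a rough sketch of the condition discussed in this thread, the check
could be written as below. struct toy_vdi and is_leaf_clone() are
hypothetical names for illustration; the field names snap_id,
parent_vdi_id and snap_ctime are the ones quoted above, and the
simplified struct is an assumption, not the real inode layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the fields of the inode used by the check. */
struct toy_vdi {
	uint32_t vdi_id;
	uint32_t parent_vdi_id;  /* 0 when the vdi has no parent */
	uint32_t snap_id;
	uint64_t snap_ctime;     /* 0 when the vdi is not a snapshot */
};

/*
 * True only for a "simple" cloned vdi: first in its own chain
 * (snap_id == 1), derived from another vdi (parent_vdi_id != 0),
 * and not itself turned into a snapshot (snap_ctime == 0).
 */
static bool is_leaf_clone(const struct toy_vdi *v)
{
	assert(v != NULL);
	return v->snap_id == 1 && v->parent_vdi_id != 0 && v->snap_ctime == 0;
}
```

Right after 'collie vdi clone -s 1 base cloned', c876b2 would have
snap_id 1, a non-zero parent and a zero snap_ctime, so the check matches.
After 'collie vdi snapshot cloned', c876b2 has a non-zero snap_ctime and
c876b3 has snap_id 2, so neither matches, which is the intended result.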

thanks,

levin.





