[Sheepdog] [PATCH 2/2] always delete data objects when deleting an cloned vdi
levin li
levin108 at gmail.com
Mon May 7 03:51:00 CEST 2012
On 05/07/2012 02:58 AM, MORITA Kazutaka wrote:
> At Fri, 04 May 2012 09:49:08 +0800,
> levin li wrote:
>> On 05/04/2012 03:46 AM, MORITA Kazutaka wrote:
>>> At Mon, 23 Apr 2012 14:18:06 +0800,
>>> Li Wenpeng wrote:
>>>> From: levin li<xingke.lwp at taobao.com>
>>>>
>>>> When deleting a cloned vdi, sheep find the root vdi and then
>>>> traverse the vdi chain(such as base --> snapshot --> clone) to
>>>> check wheter there's an undeleted vdi in the chain, if some vdi
>>>> in the chain isn't deleted, sheep just mark the cloned vdi as
>>>> deleted by clear its vdi name.
>>>>
>>>> But in fact a cloned vdi may created its own objects by copy-on-write,
>>>> these objects can be deleted when deleting the vdi, so we make
>>>> the cloned vdi to be deleted as the root vdi, then we can deleting
>>>> its data objects, in delete_one() we check whether the object belongs
>>>> to itself to determine whether to delete the object.
>>>>
>>>> Signed-off-by: levin li<xingke.lwp at taobao.com>
>>>> ---
>>>> sheep/vdi.c | 14 ++++++++++++++
>>>> 1 files changed, 14 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/sheep/vdi.c b/sheep/vdi.c
>>>> index d2a522d..c8085c8 100644
>>>> --- a/sheep/vdi.c
>>>> +++ b/sheep/vdi.c
>>>> @@ -478,6 +478,12 @@ static void delete_one(struct work *work)
>>>> if (!inode->data_vdi_id[i])
>>>> continue;
>>>>
>>>> + if (inode->data_vdi_id[i] != inode->vdi_id) {
>>>> + dprintf("object %" PRIx64 " is base's data, would not be deleted.\n",
>>>> + vid_to_data_oid(inode->data_vdi_id[i], i));
>>>> + continue;
>>>> + }
>>>> +
>>>> ret = remove_object(dw->entries, dw->nr_vnodes, dw->nr_zones, dw->epoch,
>>>> vid_to_data_oid(inode->data_vdi_id[i], i),
>>>> inode->nr_copies);
>>>> @@ -587,6 +593,14 @@ next:
>>>> vid_to_vdi_oid(vid), (char *)inode,
>>>> SD_INODE_HEADER_SIZE, 0, sys->nr_sobjs);
>>>>
>>>> + if (vid == inode->vdi_id&& inode->snap_id == 1
>>> What does 'inode->snap_id == 1' mean here? I think this patch is not
>>> correct at all.
>>>
>>> Thanks,
>>>
>>> Kazutaka
>>>
>> When inode->snap_ctime is zero, it means this can not be a snapshot.
>>
>> Further more, in the current vdi chain, the base vdi that we see by
>> 'collie vdi list' has parent id which may point to the snapshot vdi
>> which is the previous base vdi, in the case that an snapshot exist, the
>> snap_id of the current base vdi must be greater than 1.
>>
>> But the cloned vdi always has a snap_id to be 1, so we can determine that
>> if inode->snap_id == 1&& inode->parent_vdi_id != 0&&
>> !inode->snap_ctime, then
>> the vdi must be a cloned vdi.
> For example:
>
> $ collie cluster format
> $ collie vdi create base 1G
> $ collie vdi snapshot base
> $ collie vdi clone base cloned
> $ collie vdi clone -s 1 base cloned
> $ collie vdi snapshot cloned
> $ collie vdi list
> Name Id Size Used Shared Creation time VDI id Tag
> s base 1 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:50 54c278
> base 2 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:50 54c279
> s cloned 1 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:50 c876b2
> cloned 2 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:51 c876b3
>
> The snap_id of the vdi c876b2 is 1, but its snap_ctime is not zero
> because the vdi is a snapshot. The snap_ctime of the vdi c876b3 is
> zero, but its snap_is 2. So, with your code, no vdi is detected as a
> clonend vdi against the above example.
>
> Kazutaka
Yes, I already considered this case, in your example,c876b2
should be the cloned VDI, but now it becomes a snapshot, and c876b3
becomes the base VDI of c876b2, it's OK.
The only reason we need to mark a VDI as cloned it for deletion work,
let me explain why.
A cloned VDI I mean isn't always the VDI created by the clone operation,
but a cloned VDI on the leaves of the VDI tree, if it has some objects
which created by copy-on-write that not shared with any other VDI, then
we can delete the objects when we deleting the VDI to avoid a leak so
as to save disk space.
In the example, c876b3 becomes a base VDI, we can not delete it's data
objects directly when deleting it, even if some of its objects are cloned
by copy-on-write, since c876b2 may shares objects with it.
The same reason for c876b2, it becomes a snapshot VDI, no longer a simple
cloned VDI any more.
thanks,
levin.
More information about the sheepdog
mailing list