At Mon, 07 May 2012 09:51:00 +0800, levin li wrote: > > On 05/07/2012 02:58 AM, MORITA Kazutaka wrote: > > At Fri, 04 May 2012 09:49:08 +0800, > > levin li wrote: > >> On 05/04/2012 03:46 AM, MORITA Kazutaka wrote: > >>> At Mon, 23 Apr 2012 14:18:06 +0800, > >>> Li Wenpeng wrote: > >>>> From: levin li<xingke.lwp at taobao.com> > >>>> > >>>> When deleting a cloned vdi, sheep find the root vdi and then > >>>> traverse the vdi chain(such as base --> snapshot --> clone) to > >>>> check wheter there's an undeleted vdi in the chain, if some vdi > >>>> in the chain isn't deleted, sheep just mark the cloned vdi as > >>>> deleted by clear its vdi name. > >>>> > >>>> But in fact a cloned vdi may created its own objects by copy-on-write, > >>>> these objects can be deleted when deleting the vdi, so we make > >>>> the cloned vdi to be deleted as the root vdi, then we can deleting > >>>> its data objects, in delete_one() we check whether the object belongs > >>>> to itself to determine whether to delete the object. > >>>> > >>>> Signed-off-by: levin li<xingke.lwp at taobao.com> > >>>> --- > >>>> sheep/vdi.c | 14 ++++++++++++++ > >>>> 1 files changed, 14 insertions(+), 0 deletions(-) > >>>> > >>>> diff --git a/sheep/vdi.c b/sheep/vdi.c > >>>> index d2a522d..c8085c8 100644 > >>>> --- a/sheep/vdi.c > >>>> +++ b/sheep/vdi.c > >>>> @@ -478,6 +478,12 @@ static void delete_one(struct work *work) > >>>> if (!inode->data_vdi_id[i]) > >>>> continue; > >>>> > >>>> + if (inode->data_vdi_id[i] != inode->vdi_id) { > >>>> + dprintf("object %" PRIx64 " is base's data, would not be deleted.\n", > >>>> + vid_to_data_oid(inode->data_vdi_id[i], i)); > >>>> + continue; > >>>> + } > >>>> + > >>>> ret = remove_object(dw->entries, dw->nr_vnodes, dw->nr_zones, dw->epoch, > >>>> vid_to_data_oid(inode->data_vdi_id[i], i), > >>>> inode->nr_copies); > >>>> @@ -587,6 +593,14 @@ next: > >>>> vid_to_vdi_oid(vid), (char *)inode, > >>>> SD_INODE_HEADER_SIZE, 0, sys->nr_sobjs); > >>>> > >>>> + if (vid == inode->vdi_id&& inode->snap_id == 1 > >>> What does 'inode->snap_id == 1' mean here? I think this patch is not > >>> correct at all. > >>> > >>> Thanks, > >>> > >>> Kazutaka > >>> > >> When inode->snap_ctime is zero, it means this can not be a snapshot. > >> > >> Further more, in the current vdi chain, the base vdi that we see by > >> 'collie vdi list' has parent id which may point to the snapshot vdi > >> which is the previous base vdi, in the case that an snapshot exist, the > >> snap_id of the current base vdi must be greater than 1. > >> > >> But the cloned vdi always has a snap_id to be 1, so we can determine that > >> if inode->snap_id == 1&& inode->parent_vdi_id != 0&& > >> !inode->snap_ctime, then > >> the vdi must be a cloned vdi. > > For example: > > > > $ collie cluster format > > $ collie vdi create base 1G > > $ collie vdi snapshot base > > $ collie vdi clone base cloned > > $ collie vdi clone -s 1 base cloned > > $ collie vdi snapshot cloned > > $ collie vdi list > > Name Id Size Used Shared Creation time VDI id Tag > > s base 1 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:50 54c278 > > base 2 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:50 54c279 > > s cloned 1 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:50 c876b2 > > cloned 2 1.0 GB 0.0 MB 0.0 MB 2012-05-07 03:51 c876b3 > > > > The snap_id of the vdi c876b2 is 1, but its snap_ctime is not zero > > because the vdi is a snapshot. The snap_ctime of the vdi c876b3 is > > zero, but its snap_is 2. So, with your code, no vdi is detected as a > > clonend vdi against the above example. > > > > Kazutaka > Yes, I already considered this case, in your example,c876b2 > should be the cloned VDI, but now it becomes a snapshot, and c876b3 > becomes the base VDI of c876b2, it's OK. > > The only reason we need to mark a VDI as cloned it for deletion work, > let me explain why. > > A cloned VDI I mean isn't always the VDI created by the clone operation, > but a cloned VDI on the leaves of the VDI tree, if it has some objects > which created by copy-on-write that not shared with any other VDI, then > we can delete the objects when we deleting the VDI to avoid a leak so > as to save disk space. > > In the example, c876b3 becomes a base VDI, we can not delete it's data > objects directly when deleting it, even if some of its objects are cloned > by copy-on-write, since c876b2 may shares objects with it. > > The same reason for c876b2, it becomes a snapshot VDI, no longer a simple > cloned VDI any more. So this patch deletes the vdi when it is a cloned one but doesn't have any snapshots? If yes, you should write so. I wonder why you tried to handle the specific situation. I think we can generalize your code; Sheepdog can delete the vdi safely if its all descendant vdis are already deleted. Kazutaka |