Even if the VDI is a cloned VDI (not the VDI just created by 'clone' operation, but the cloned VDI on the leaf node of the VDI tree), we also should traverse the vdi tree from the root VDI. But if there's a VDI not deleted in the tree path, and the VDI to delete is a cloned VDI, we should delete it's data objects so as to save disk space. If all the VDIs have been deleted in the tree path, then we traverse the path to delete all the objects of the VDIs. I'd like to explain again why we need to deleted the objects created by copy-on-write when we try to delete a cloned VDI, in the old logic, objects of cloned VDI can only be deleted after all the VDI in the tree path has been deleted, but there's a problem, we may clone many VDIs from one snapshot, and these VDIs can be deleted frequently, but we may always not delete the snapshot VDI, so in this case, objects of cloned VDI would always stay in the disk, as we know, they're already useless and should be deleted, and waste too much disk space, but we can only make it deleted after all the VDI in the tree path, including the snapshot VDI have beed deleted, it's really not good. Signed-off-by: levin li <xingke.lwp at taobao.com> --- sheep/vdi.c | 26 ++++++++++++++++++-------- 1 file changed, 18 insertions(+), 8 deletions(-) diff --git a/sheep/vdi.c b/sheep/vdi.c index 5b96dbb..cd6b557 100644 --- a/sheep/vdi.c +++ b/sheep/vdi.c @@ -562,12 +562,14 @@ out: } static uint64_t get_vdi_root(struct vnode_info *vnode_info, uint32_t epoch, - uint32_t vid) + uint32_t vid, int *cloned) { int ret; struct sheepdog_inode *inode = NULL; int nr_copies = get_nr_copies(vnode_info); + *cloned = 0; + inode = malloc(SD_INODE_HEADER_SIZE); if (!inode) { eprintf("failed to allocate memory\n"); @@ -583,7 +585,7 @@ next: && !inode->snap_ctime) { dprintf("vdi %" PRIx32 " is a cloned vdi.\n", vid); /* current vdi is a cloned vdi */ - goto out; + *cloned = 1; } if (ret != SD_RES_SUCCESS) { @@ -607,7 +609,7 @@ out: int start_deletion(uint32_t vid, uint32_t epoch) { struct deletion_work *dw = NULL; - int ret = SD_RES_NO_MEM; + int ret = SD_RES_NO_MEM, cloned; uint32_t root_vid; dw = zalloc(sizeof(struct deletion_work)); @@ -628,7 +630,7 @@ int start_deletion(uint32_t vid, uint32_t epoch) dw->vnodes = get_vnode_info(); - root_vid = get_vdi_root(dw->vnodes, dw->epoch, dw->vid); + root_vid = get_vdi_root(dw->vnodes, dw->epoch, dw->vid, &cloned); if (!root_vid) { ret = SD_RES_EIO; goto err; @@ -636,10 +638,18 @@ int start_deletion(uint32_t vid, uint32_t epoch) ret = fill_vdi_list(dw, root_vid); if (ret) { - dprintf("snapshot chain has valid vdi, " - "just mark vdi %" PRIx32 " as deleted.\n", dw->vid); - delete_inode(dw); - return SD_RES_SUCCESS; + /* if the VDI is a cloned VDI, delete its objects + * no matter whether the VDI tree is clear. */ + if (cloned) { + dw->buf[0] = vid; + dw->count = 1; + } else { + dprintf("snapshot chain has valid vdi, " + "just mark vdi %" PRIx32 " as deleted.\n", + dw->vid); + delete_inode(dw); + return SD_RES_SUCCESS; + } } dprintf("%d\n", dw->count); -- 1.7.10 |