[Sheepdog] [PATCH 1/2] traverse the VDI tree before delete cloned VDI

levin li levin108 at gmail.com
Mon May 7 07:45:31 CEST 2012


Even if the VDI is a cloned VDI (not the VDI just created
by 'clone' operation, but the cloned VDI on the leaf node
of the VDI tree), we also should traverse the vdi tree from
the root VDI.

But if there's a VDI not deleted in the tree path, and the
VDI to delete is a cloned VDI, we should delete it's data
objects so as to save disk space. If all the VDIs have been
deleted in the tree path, then we traverse the path to delete
all the objects of the VDIs.

I'd like to explain again why we need to deleted the objects
created by copy-on-write when we try to delete a cloned VDI,
in the old logic, objects of cloned VDI can only be deleted
after all the VDI in the tree path has been deleted, but there's
a problem, we may clone many VDIs from one snapshot, and these
VDIs can be deleted frequently, but we may always not delete the
snapshot VDI, so in this case, objects of cloned VDI would
always stay in the disk, as we know, they're already useless
and should be deleted, and waste too much disk space, but we can
only make it deleted after all the VDI in the tree path, including
the snapshot VDI have beed deleted, it's really not good.

Signed-off-by: levin li <xingke.lwp at taobao.com>
---
 sheep/vdi.c |   26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/sheep/vdi.c b/sheep/vdi.c
index 5b96dbb..cd6b557 100644
--- a/sheep/vdi.c
+++ b/sheep/vdi.c
@@ -562,12 +562,14 @@ out:
 }
 
 static uint64_t get_vdi_root(struct vnode_info *vnode_info, uint32_t epoch,
-		uint32_t vid)
+		uint32_t vid, int *cloned)
 {
 	int ret;
 	struct sheepdog_inode *inode = NULL;
 	int nr_copies = get_nr_copies(vnode_info);
 
+	*cloned = 0;
+
 	inode = malloc(SD_INODE_HEADER_SIZE);
 	if (!inode) {
 		eprintf("failed to allocate memory\n");
@@ -583,7 +585,7 @@ next:
 			&& !inode->snap_ctime) {
 		dprintf("vdi %" PRIx32 " is a cloned vdi.\n", vid);
 		/* current vdi is a cloned vdi */
-		goto out;
+		*cloned = 1;
 	}
 
 	if (ret != SD_RES_SUCCESS) {
@@ -607,7 +609,7 @@ out:
 int start_deletion(uint32_t vid, uint32_t epoch)
 {
 	struct deletion_work *dw = NULL;
-	int ret = SD_RES_NO_MEM;
+	int ret = SD_RES_NO_MEM, cloned;
 	uint32_t root_vid;
 
 	dw = zalloc(sizeof(struct deletion_work));
@@ -628,7 +630,7 @@ int start_deletion(uint32_t vid, uint32_t epoch)
 
 	dw->vnodes = get_vnode_info();
 
-	root_vid = get_vdi_root(dw->vnodes, dw->epoch, dw->vid);
+	root_vid = get_vdi_root(dw->vnodes, dw->epoch, dw->vid, &cloned);
 	if (!root_vid) {
 		ret = SD_RES_EIO;
 		goto err;
@@ -636,10 +638,18 @@ int start_deletion(uint32_t vid, uint32_t epoch)
 
 	ret = fill_vdi_list(dw, root_vid);
 	if (ret) {
-		dprintf("snapshot chain has valid vdi, "
-				"just mark vdi %" PRIx32 " as deleted.\n", dw->vid);
-		delete_inode(dw);
-		return SD_RES_SUCCESS;
+		/* if the VDI is a cloned VDI, delete its objects
+		 * no matter whether the VDI tree is clear. */
+		if (cloned) {
+			dw->buf[0] = vid;
+			dw->count = 1;
+		} else {
+			dprintf("snapshot chain has valid vdi, "
+				"just mark vdi %" PRIx32 " as deleted.\n",
+				dw->vid);
+			delete_inode(dw);
+			return SD_RES_SUCCESS;
+		}
 	}
 
 	dprintf("%d\n", dw->count);
-- 
1.7.10




More information about the sheepdog mailing list