[sheepdog] [PATCH v2] sheep/http: make kv_delete_object safer

Liu Yuan namei.unix at gmail.com
Thu Dec 26 13:52:29 CET 2013


For object create/delete, we can't easily keep the bnode consistent just by
reordering the operations.

We should report a deletion failure to the user if bnode_update() fails, even
though the onode might have been deleted successfully. Then a subsequent
'delete' for the same object won't screw up the bnode metadata.
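
The ordering described above can be sketched with a toy model. This is only an
illustration under stated assumptions: every toy_* name, the RES_* codes and
the fail_update hook are hypothetical and are not part of sheep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for result codes; the values are illustrative. */
enum { RES_SUCCESS = 0, RES_EIO = 1 };

/* Toy bucket metadata holding the counters a bnode would cache. */
struct toy_bnode {
	uint64_t bytes_used;
	uint64_t object_count;
	bool fail_update;	/* test hook: force the update to fail */
};

/* Toy object metadata. */
struct toy_onode {
	uint64_t size;
	bool present;
};

/* Removing an object that is already gone is reported as an error. */
int toy_onode_delete(struct toy_onode *o)
{
	if (!o->present)
		return RES_EIO;
	o->present = false;
	return RES_SUCCESS;
}

/* Adjust the cached counters, or fail without touching them. */
int toy_bnode_update(struct toy_bnode *b, uint64_t used, bool create)
{
	if (b->fail_update)
		return RES_EIO;
	if (create) {
		b->bytes_used += used;
		b->object_count++;
	} else {
		b->bytes_used -= used;
		b->object_count--;
	}
	return RES_SUCCESS;
}

/* Delete the object first, then the accounting; report the first failure. */
int toy_delete_object(struct toy_bnode *b, struct toy_onode *o)
{
	int ret = toy_onode_delete(o);

	if (ret != RES_SUCCESS)
		return ret;
	return toy_bnode_update(b, o->size, false);
}
```

In this model, a failed counter update is returned to the caller, and a retry
stops at the missing object before it reaches the counters, so they are never
decremented twice. The counters can still be left stale, which is the gap a
consistency check would have to close.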

The true fix for the inconsistency (for whatever reason it happens) is a check
request that does a server-side consistency check. This is left for a future
patch.

Another fix is to drop the redundant bytes_used and object_counts data from
the bnode, so that a "HEAD" operation just iterates over all the objects. This
can't scale if we have a huge number of objects.
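
Both a consistency check and a counter-free "HEAD" come down to walking every
object in the bucket. A minimal sketch of that recount, assuming the object
sizes are available as a plain array (the toy_* names are hypothetical and do
not exist in sheep):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy copy of the counters cached in bucket metadata. */
struct toy_counters {
	uint64_t bytes_used;
	uint64_t object_count;
};

/* Recompute the counters by visiting every object: O(n) in object count. */
struct toy_counters toy_recount(const uint64_t *sizes, size_t n)
{
	struct toy_counters c = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		c.bytes_used += sizes[i];
		c.object_count++;
	}
	return c;
}

/*
 * Return true if the cached counters already match the recount.
 * Otherwise overwrite them with the recounted values and return false.
 */
bool toy_check_and_repair(struct toy_counters *cached,
			  const uint64_t *sizes, size_t n)
{
	struct toy_counters actual = toy_recount(sizes, n);
	bool ok = cached->bytes_used == actual.bytes_used &&
		  cached->object_count == actual.object_count;

	if (!ok)
		*cached = actual;
	return ok;
}
```

The loop in toy_recount() is the cost in question: it is acceptable for an
occasional check, but expensive if every "HEAD" has to pay it.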

Signed-off-by: Liu Yuan <namei.unix at gmail.com>
---
 sheep/http/kv.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/sheep/http/kv.c b/sheep/http/kv.c
index 3bb8a1d..679784d 100644
--- a/sheep/http/kv.c
+++ b/sheep/http/kv.c
@@ -373,6 +373,21 @@ out:
 	return ret;
 }
 
+/*
+ * For object create/delete, we can't easily keep the bnode consistent just by
+ * reordering the operations.
+ *
+ * We should report a deletion failure to the user if bnode_update() fails,
+ * even though the onode might have been deleted successfully. Then a
+ * subsequent 'delete' for the same object won't screw up the bnode metadata.
+ * The true fix for the inconsistency (for whatever reason it happens) is a
+ * check request that does a server-side consistency check. This is left for a
+ * future patch.
+ *
+ * Another fix is to drop the redundant bytes_used and object_counts data from
+ * the bnode, so that a "HEAD" operation just iterates over all the objects.
+ * This can't scale if we have a huge number of objects.
+ */
 static int bnode_update(const char *account, const char *bucket, uint64_t used,
 			bool create)
 {
@@ -1033,7 +1048,8 @@ int kv_create_object(struct http_request *req, const char *account,
 
 	ret = bnode_update(account, bucket, req->data_length, true);
 	if (ret != SD_RES_SUCCESS) {
-		ret = onode_delete(onode);
+		sd_err("failed to update bucket for %s", name);
+		onode_delete(onode);
 		goto out;
 	}
 out:
@@ -1087,17 +1103,15 @@ int kv_delete_object(const char *account, const char *bucket, const char *name)
 		goto out;
 
 	ret = onode_delete(onode);
-	if (ret != SD_RES_SUCCESS)
+	if (ret != SD_RES_SUCCESS) {
+		sd_err("failed to delete onode for %s", name);
 		goto out;
-
-	/*
-	 * If bnode is deleted successfully, we consider it successful deletion
-	 * even if bnode_update() fails.
-	 *
-	 * FIXME: make bnode metadata consistent
-	 */
-	if (bnode_update(account, bucket, onode->size, false) != SD_RES_SUCCESS)
-		sd_err("failed to update bnode for %s/%s", account, bucket);
+	}
+	ret = bnode_update(account, bucket, onode->size, false);
+	if (ret != SD_RES_SUCCESS) {
+		sd_err("failed to update bnode for %s", name);
+		goto out;
+	}
 out:
 	free(onode);
 	return ret;
-- 
1.8.1.2



