[sheepdog] [PATCH v2] sheep/http: make kv_delete_object more safe

Robin Dong robin.k.dong at gmail.com
Fri Dec 27 02:53:58 CET 2013


Reviewed-by: Robin Dong <sanbai at taobao.com>


2013/12/26 Liu Yuan <namei.unix at gmail.com>

> For object create/delete, we can't easily keep the bnode consistent by
> playing around with the order of operations.
>
> We should inform the user of the deletion failure if bnode_update() fails,
> even though we might have deleted the onode successfully. Then a subsequent
> 'delete' for the same object won't skew the bnode metadata.
>
> The true fix for the inconsistency (for whatever reason it happens) is a
> check request that does a server-side consistency check. This is left for a
> future patch.
>
> Another fix is to drop the redundant bytes_used and object_counts data from
> the bnode, so that for a "HEAD" operation we just iterate over all the
> objects. This can't scale if we have a huge number of objects.
>
> Signed-off-by: Liu Yuan <namei.unix at gmail.com>
> ---
>  sheep/http/kv.c | 36 +++++++++++++++++++++++++-----------
>  1 file changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/sheep/http/kv.c b/sheep/http/kv.c
> index 3bb8a1d..679784d 100644
> --- a/sheep/http/kv.c
> +++ b/sheep/http/kv.c
> @@ -373,6 +373,21 @@ out:
>         return ret;
>  }
>
> +/*
> + * For object create/delete, we can't easily keep the bnode consistent by
> + * playing around with the order of operations.
> + *
> + * We should inform the user of the deletion failure if bnode_update() fails,
> + * even though we might have deleted the onode successfully. Then a subsequent
> + * 'delete' for the same object won't skew the bnode metadata.
> + * The true fix for the inconsistency (for whatever reason it happens) is a
> + * check request that does a server-side consistency check. This is left for a
> + * future patch.
> + *
> + * Another fix is to drop the redundant bytes_used and object_counts data
> + * from the bnode, so that for a "HEAD" operation we just iterate over all
> + * the objects. This can't scale if we have a huge number of objects.
> + */
>  static int bnode_update(const char *account, const char *bucket, uint64_t used,
>                         bool create)
>  {
> @@ -1033,7 +1048,8 @@ int kv_create_object(struct http_request *req, const char *account,
>
>         ret = bnode_update(account, bucket, req->data_length, true);
>         if (ret != SD_RES_SUCCESS) {
> -               ret = onode_delete(onode);
> +               sd_err("failed to update bucket for %s", name);
> +               onode_delete(onode);
>                 goto out;
>         }
>  out:
> @@ -1087,17 +1103,15 @@ int kv_delete_object(const char *account, const char *bucket, const char *name)
>                 goto out;
>
>         ret = onode_delete(onode);
> -       if (ret != SD_RES_SUCCESS)
> +       if (ret != SD_RES_SUCCESS) {
> +               sd_err("failed to delete bnode for %s", name);
>                 goto out;
> -
> -       /*
> -        * If bnode is deleted successfully, we consider it successful deletion
> -        * even if bnode_update() fails.
> -        *
> -        * FIXME: make bnode metadata consistent
> -        */
> -       if (bnode_update(account, bucket, onode->size, false) != SD_RES_SUCCESS)
> -               sd_err("failed to update bnode for %s/%s", account, bucket);
> +       }
> +       ret = bnode_update(account, bucket, onode->size, false);
> +       if (ret != SD_RES_SUCCESS) {
> +               sd_err("failed to update bnode for %s", name);
> +               goto out;
> +       }
>  out:
>         free(onode);
>         return ret;
> --
> 1.8.1.2
>
>
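
To make the behaviour described in the commit message concrete, here is a
minimal standalone sketch of the new delete flow. It is not the sheepdog code:
every mock_* name, the MOCK_RES_* codes and the fault injection flag are
hypothetical stand-ins, and the on-disk onode is reduced to a single boolean.
It only illustrates that a failed bnode update is now reported to the caller,
and that a retried delete stops at the lookup step and never touches the
bucket counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes; the real code uses the SD_RES_* values. */
enum mock_res { MOCK_RES_SUCCESS, MOCK_RES_EIO, MOCK_RES_NO_OBJ };

/* Hypothetical stand-in for the per-bucket counters kept in the bnode. */
struct mock_bnode {
	uint64_t bytes_used;
	uint64_t object_count;
};

/* Fault injection switch: while true, mock_bnode_update() fails. */
static bool mock_bnode_update_fails;

static int mock_bnode_update(struct mock_bnode *bnode, uint64_t used,
			     bool create)
{
	if (mock_bnode_update_fails)
		return MOCK_RES_EIO;
	if (create) {
		bnode->bytes_used += used;
		bnode->object_count++;
	} else {
		bnode->bytes_used -= used;
		bnode->object_count--;
	}
	return MOCK_RES_SUCCESS;
}

/* The onode is modelled as a flag; deleting it always succeeds here. */
static int mock_onode_delete(bool *onode_exists)
{
	*onode_exists = false;
	return MOCK_RES_SUCCESS;
}

static int mock_delete_object(struct mock_bnode *bnode, bool *onode_exists,
			      uint64_t size)
{
	int ret;

	/* Lookup step: a missing object never reaches the bnode update. */
	if (!*onode_exists)
		return MOCK_RES_NO_OBJ;

	ret = mock_onode_delete(onode_exists);
	if (ret != MOCK_RES_SUCCESS)
		return ret;

	/* Propagate the bnode failure instead of only logging it. */
	return mock_bnode_update(bnode, size, false);
}
```

With mock_bnode_update_fails set, the first mock_delete_object() call returns
MOCK_RES_EIO even though the onode is gone, and a retry returns
MOCK_RES_NO_OBJ without touching the counters. The counters stay stale until
something like the server-side check mentioned above repairs them.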



-- 
Best Regards
Robin Dong