[sheepdog] [PATCH] sheep:Fix data wipe bug in recovery

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Thu Dec 4 08:05:32 CET 2014


At Wed, 3 Dec 2014 14:56:21 +0800,
徐小龙 wrote:
> 
> The epoch won't increase if there are only gateway nodes in the cluster.
> This way, when the cluster restarts, it will always recover from the
> latest epoch that had at least one non-gateway node in the cluster.
> This patch fixes bug "data wipe bug in the recovery" described here:
>         https://bugs.launchpad.net/sheepdog-project/+bug/1327037
> 
>         modified:   sheep/store.c
> Reported by: Xiaolong Xu <nxtxiaolong at gmail.com>; Yang Zhang <zhangyangdreamer at gmail.com>
> ---
>  sheep/store.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)

Hi guys (sorry, I cannot read your name),
Thanks a lot for reporting and analyzing the problem and for posting a
patch. I'm really sorry for the inconvenience.

I reviewed your solution, and not incrementing the epoch in this case
seems to work fine. But it makes the membership management logic a
little more complex. The essential problem is an invalid cleaning of
the stale dir. I have written a patch that fixes this in a different
way and will post it soon.
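
For reference, the check your patch relies on can be sketched roughly as
below. This is only an illustration: struct node_sketch and all_gateway()
are hypothetical names, not part of sheep/store.c, and the struct is a
simplified stand-in for struct sd_node. The one thing taken from the patch
is that a node with nr_vnodes == 0 is treated as a gateway.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sd_node. */
struct node_sketch {
	uint16_t nr_vnodes;	/* 0 is treated as "gateway", as in the patch */
};

/*
 * Return true when no node in the list stores data, i.e. every node
 * has nr_vnodes == 0. An empty list also returns true.
 */
bool all_gateway(const struct node_sketch *nodes, size_t nr_nodes)
{
	assert(nodes != NULL || nr_nodes == 0);

	for (size_t i = 0; i < nr_nodes; i++) {
		if (nodes[i].nr_vnodes)
			return false;	/* found a node that stores data */
	}
	return true;
}
```

With such a helper the epoch log would be written only when
all_gateway() returns false, which is the extra condition in the
membership logic that I would prefer to avoid.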

Thanks,
Hitoshi

> 
> diff --git a/sheep/store.c b/sheep/store.c
> index 8843fb8..ca16284 100644
> --- a/sheep/store.c
> +++ b/sheep/store.c
> @@ -19,9 +19,10 @@ LIST_HEAD(store_drivers);
> 
>  int update_epoch_log(uint32_t epoch, struct sd_node *nodes, size_t nr_nodes)
>  {
> -       int ret, len, nodes_len;
> +       int ret = 0, len, nodes_len;
>         time_t t;
>         char path[PATH_MAX], *buf;
> +       bool all_gateway = true;
> 
>         sd_debug("update epoch: %d, %zu", epoch, nr_nodes);
> 
> @@ -37,14 +38,19 @@ int update_epoch_log(uint32_t epoch, struct sd_node *nodes, size_t nr_nodes)
>          * rb field is unused in epoch file, zero-filling it
>          * is good for epoch file recovery because it is unified
>          */
> -       for (int i = 0; i < nr_nodes; i++)
> +       for (int i = 0; i < nr_nodes; i++){
> +               if((nodes + i)->nr_vnodes){
> +                       all_gateway = false;
> +               }
>                 memset(buf + i * sizeof(struct sd_node)
>                                 + offsetof(struct sd_node, rb),
>                                 0, sizeof(struct rb_node));
> +       }
> 
>         snprintf(path, sizeof(path), "%s%08u", epoch_path, epoch);
> -
> -       ret = atomic_create_and_write(path, buf, len, true);
> +       if(!all_gateway){
> +               ret = atomic_create_and_write(path, buf, len, true);
> +       }
> 
>         free(buf);
>         return ret;
> -- 
> 1.9.1