[sheepdog] [PATCH] sheep: add epoch recovery to reduce GET_EPOCH error

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Mon Jul 28 06:20:54 CEST 2014


At Mon, 28 Jul 2014 10:41:46 +0800,
Ruoyu wrote:
> 
> 
> On 2014年07月27日 16:59, Hitoshi Mitake wrote:
> > On Tue, Jul 8, 2014 at 12:04 PM, Ruoyu <liangry at ucweb.com> wrote:
> >> If cluster members change frequently, especially the first two or
> >> more nodes of the cluster leave and join frequently, the following
> >> error message will be found in sheep.log
> >>
> >> Failed to find requested tag, remote address: 127.0.0.1:7001,
> >> op name: GET_EPOCH
> >>
> >> This patch adds epoch recovery to reduce these errors. To achieve
> >> it, it is better to unify epoch file because rb_node is unused
> >> in it. And then, we can create a new epoch file as long as the file
> >> is not found in local node.
> > The essential problem of the bloated epoch file is the design of
> > struct sd_node. The data structure is sent/receive via networks and
> > stored on disks, but it contains struct rb_node which contains
> > pointers. Of course the pointers are completely meaningless in
> > different processes. I prefer redesigning the struct sd_node than
> > ad-hoc solution (the changes for sheep/store.c). How do you think?
> >
> > The changes for sheep/group.c looks good to me. If you can extract
> > this part and send v2, I'll apply it soon.
> Redesigning struct sd_node will affect many modules, therefore we need 
> to think over it's risk and workload. Moreover, it's size is only 
> hundreds of bytes currently and I think it is acceptable even if the 
> sheepdog cluster has hundreds of nodes.
> 
> On the other hand, the main purpose of this patch is to add epoch file 
> recovery when it is missing which group.c will do. If you accept it, why 
> don't you expect the content of the epoch file on the same epoch is 
> exactly same?

So do you mean that the zero-filling rb_node from epoch file is for
ensuring epoch files on the same epoch should be exactlly same? If it
is the purpose of the change for sheep/store.c, I agree with your
opinion. Could you rebase it on the latest master?

Thanks,
Hitoshi

> >
> > Thanks,
> > Hitoshi
> >
> >> Signed-off-by: Ruoyu <liangry at ucweb.com>
> >> ---
> >>   sheep/group.c |  7 +++++--
> >>   sheep/store.c | 11 ++++++++++-
> >>   2 files changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/sheep/group.c b/sheep/group.c
> >> index 27e3574..6e4f8ad 100644
> >> --- a/sheep/group.c
> >> +++ b/sheep/group.c
> >> @@ -359,7 +359,7 @@ int epoch_log_read_remote(uint32_t epoch, struct sd_node *nodes, int len,
> >>          rb_for_each_entry(node, &vinfo->nroot, rb) {
> >>                  struct sd_req hdr;
> >>                  struct sd_rsp *rsp = (struct sd_rsp *)&hdr;
> >> -               int nodes_len;
> >> +               int nodes_len, nr_nodes;
> >>
> >>                  if (node_is_local(node))
> >>                          continue;
> >> @@ -382,7 +382,10 @@ int epoch_log_read_remote(uint32_t epoch, struct sd_node *nodes, int len,
> >>                          memcpy(timestamp, buf + nodes_len, sizeof(*timestamp));
> >>
> >>                  free(buf);
> >> -               return nodes_len / sizeof(struct sd_node);
> >> +               nr_nodes = nodes_len / sizeof(struct sd_node);
> >> +               /* epoch file is missing in local node, try to create one */
> >> +               update_epoch_log(epoch, nodes, nr_nodes);
> >> +               return nr_nodes;
> >>          }
> >>
> >>          /*
> >> diff --git a/sheep/store.c b/sheep/store.c
> >> index 70fddb8..87913d4 100644
> >> --- a/sheep/store.c
> >> +++ b/sheep/store.c
> >> @@ -19,7 +19,7 @@ LIST_HEAD(store_drivers);
> >>
> >>   int update_epoch_log(uint32_t epoch, struct sd_node *nodes, size_t nr_nodes)
> >>   {
> >> -       int ret, len, nodes_len;
> >> +       int ret, len, nodes_len, i;
> >>          time_t t;
> >>          char path[PATH_MAX], *buf;
> >>
> >> @@ -33,6 +33,15 @@ int update_epoch_log(uint32_t epoch, struct sd_node *nodes, size_t nr_nodes)
> >>          memcpy(buf, nodes, nodes_len);
> >>          memcpy(buf + nodes_len, &t, sizeof(time_t));
> >>
> >> +       /*
> >> +        * rb_node is unused in epoch file,
> >> +        * unifying it is good for epoch recovery
> >> +        */
> >> +       for (i = 0; i < nr_nodes; i++)
> >> +               memset(buf + i * sizeof(struct sd_node)
> >> +                               + offsetof(struct sd_node, rb),
> >> +                               '\0', sizeof(struct rb_node));
> >> +
> >>          snprintf(path, sizeof(path), "%s%08u", epoch_path, epoch);
> >>
> >>          ret = atomic_create_and_write(path, buf, len, true);
> >> --
> >> 1.8.3.2
> >>
> >>
> >> --
> >> sheepdog mailing list
> >> sheepdog at lists.wpkg.org
> >> http://lists.wpkg.org/mailman/listinfo/sheepdog
> 
> 



More information about the sheepdog mailing list