[sheepdog] [PATCH] sheep: add epoch recovery to reduce GET_EPOCH error
Ruoyu
liangry at ucweb.com
Mon Jul 28 07:48:41 CEST 2014
On 2014年07月28日 12:20, Hitoshi Mitake wrote:
> At Mon, 28 Jul 2014 10:41:46 +0800,
> Ruoyu wrote:
>>
>> On 2014年07月27日 16:59, Hitoshi Mitake wrote:
>>> On Tue, Jul 8, 2014 at 12:04 PM, Ruoyu <liangry at ucweb.com> wrote:
>>>> If cluster members change frequently, especially the first two or
>>>> more nodes of the cluster leave and join frequently, the following
>>>> error message will be found in sheep.log
>>>>
>>>> Failed to find requested tag, remote address: 127.0.0.1:7001,
>>>> op name: GET_EPOCH
>>>>
>>>> This patch adds epoch recovery to reduce these errors. To achieve
>>>> it, it is better to unify epoch file because rb_node is unused
>>>> in it. And then, we can create a new epoch file as long as the file
>>>> is not found in local node.
>>> The essential problem of the bloated epoch file is the design of
>>> struct sd_node. The data structure is sent/receive via networks and
>>> stored on disks, but it contains struct rb_node which contains
>>> pointers. Of course the pointers are completely meaningless in
>>> different processes. I prefer redesigning the struct sd_node than
>>> ad-hoc solution (the changes for sheep/store.c). How do you think?
>>>
>>> The changes for sheep/group.c looks good to me. If you can extract
>>> this part and send v2, I'll apply it soon.
>> Redesigning struct sd_node will affect many modules, therefore we need
>> to think over it's risk and workload. Moreover, it's size is only
>> hundreds of bytes currently and I think it is acceptable even if the
>> sheepdog cluster has hundreds of nodes.
>>
>> On the other hand, the main purpose of this patch is to add epoch file
>> recovery when it is missing which group.c will do. If you accept it, why
>> don't you expect the content of the epoch file on the same epoch is
>> exactly same?
> So do you mean that the zero-filling rb_node from epoch file is for
> ensuring epoch files on the same epoch should be exactlly same? If it
> is the purpose of the change for sheep/store.c, I agree with your
> opinion. Could you rebase it on the latest master?
Yes, exactly :)
>
> Thanks,
> Hitoshi
>
>>> Thanks,
>>> Hitoshi
>>>
>>>> Signed-off-by: Ruoyu <liangry at ucweb.com>
>>>> ---
>>>> sheep/group.c | 7 +++++--
>>>> sheep/store.c | 11 ++++++++++-
>>>> 2 files changed, 15 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/sheep/group.c b/sheep/group.c
>>>> index 27e3574..6e4f8ad 100644
>>>> --- a/sheep/group.c
>>>> +++ b/sheep/group.c
>>>> @@ -359,7 +359,7 @@ int epoch_log_read_remote(uint32_t epoch, struct sd_node *nodes, int len,
>>>> rb_for_each_entry(node, &vinfo->nroot, rb) {
>>>> struct sd_req hdr;
>>>> struct sd_rsp *rsp = (struct sd_rsp *)&hdr;
>>>> - int nodes_len;
>>>> + int nodes_len, nr_nodes;
>>>>
>>>> if (node_is_local(node))
>>>> continue;
>>>> @@ -382,7 +382,10 @@ int epoch_log_read_remote(uint32_t epoch, struct sd_node *nodes, int len,
>>>> memcpy(timestamp, buf + nodes_len, sizeof(*timestamp));
>>>>
>>>> free(buf);
>>>> - return nodes_len / sizeof(struct sd_node);
>>>> + nr_nodes = nodes_len / sizeof(struct sd_node);
>>>> + /* epoch file is missing in local node, try to create one */
>>>> + update_epoch_log(epoch, nodes, nr_nodes);
>>>> + return nr_nodes;
>>>> }
>>>>
>>>> /*
>>>> diff --git a/sheep/store.c b/sheep/store.c
>>>> index 70fddb8..87913d4 100644
>>>> --- a/sheep/store.c
>>>> +++ b/sheep/store.c
>>>> @@ -19,7 +19,7 @@ LIST_HEAD(store_drivers);
>>>>
>>>> int update_epoch_log(uint32_t epoch, struct sd_node *nodes, size_t nr_nodes)
>>>> {
>>>> - int ret, len, nodes_len;
>>>> + int ret, len, nodes_len, i;
>>>> time_t t;
>>>> char path[PATH_MAX], *buf;
>>>>
>>>> @@ -33,6 +33,15 @@ int update_epoch_log(uint32_t epoch, struct sd_node *nodes, size_t nr_nodes)
>>>> memcpy(buf, nodes, nodes_len);
>>>> memcpy(buf + nodes_len, &t, sizeof(time_t));
>>>>
>>>> + /*
>>>> + * rb_node is unused in epoch file,
>>>> + * unifying it is good for epoch recovery
>>>> + */
>>>> + for (i = 0; i < nr_nodes; i++)
>>>> + memset(buf + i * sizeof(struct sd_node)
>>>> + + offsetof(struct sd_node, rb),
>>>> + '\0', sizeof(struct rb_node));
>>>> +
>>>> snprintf(path, sizeof(path), "%s%08u", epoch_path, epoch);
>>>>
>>>> ret = atomic_create_and_write(path, buf, len, true);
>>>> --
>>>> 1.8.3.2
>>>>
>>>>
>>>> --
>>>> sheepdog mailing list
>>>> sheepdog at lists.wpkg.org
>>>> http://lists.wpkg.org/mailman/listinfo/sheepdog
>>
More information about the sheepdog
mailing list