[sheepdog] [PATCH] sheep: don't clean stale dir if there are no enough nodes
long
nxtxiaolong at gmail.com
Sat Dec 6 06:53:39 CET 2014
Hi all,
Gateway nodes hold only the bitmap information of the cluster, so I think the epoch format should be changed to do one of the following:
either store only the information of nodes that hold objects,
or store a node's information differently depending on whether the node is a gateway or not.
Then, when a gateway node joins the cluster, the cluster will always treat it as a new node.
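
A minimal sketch of the first option, to make the idea concrete. It assumes a gateway can be recognised by having zero virtual nodes; struct epoch_node, is_gateway() and filter_storage_nodes() are hypothetical names for illustration, not sheepdog's real epoch code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node record; not sheepdog's real struct sd_node. */
struct epoch_node {
	uint32_t id;        /* illustrative node identifier */
	uint16_t nr_vnodes; /* assumed to be 0 for a gateway node */
};

/* Assumption: a gateway is a node that owns no virtual nodes. */
static int is_gateway(const struct epoch_node *n)
{
	return n->nr_vnodes == 0;
}

/*
 * Copy only the nodes that can hold objects into out[] and return how many
 * were copied. out[] must have room for nr entries.
 */
static size_t filter_storage_nodes(const struct epoch_node *nodes, size_t nr,
				   struct epoch_node *out)
{
	size_t i, kept = 0;

	assert(nodes != NULL && out != NULL);
	for (i = 0; i < nr; i++)
		if (!is_gateway(&nodes[i]))
			out[kept++] = nodes[i];
	return kept;
}
```

With a filter like this applied before an epoch is written, an epoch made up only of gateways would record an empty node list instead of looking like a valid set of storage nodes.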
thanks
xiao long
> On Dec 5, 2014, at 12:05 PM, Yang Zhang <3100100878 at zju.edu.cn> wrote:
>
> Hi Hitoshi,
>
> I've tested the patch. It didn't solve the problem: 'dog vdi list' still shows "object not found".
> Actually, it didn't clean the objects saved in the .stale dir, but it didn't recover them back to obj/ either.
>
> Also, I wonder: even if we recover the objects in the .stale dir, will they be the newest version?
>
> Thanks,
> Yang
>
>> -----Original Message-----
>> From: "Hitoshi Mitake" <mitake.hitoshi at lab.ntt.co.jp>
>> Sent: Thursday, December 4, 2014
>> To: sheepdog at lists.wpkg.org
>> Cc: mitake.hitoshi at gmail.com, "Hitoshi Mitake" <mitake.hitoshi at lab.ntt.co.jp>, duron800 at qq.com, "张扬" <3100100878 at zju.edu.cn>, "徐小�霜" <nxtxiaolong at gmail.com>
>> Subject: Re: [PATCH] sheep: don't clean stale dir if there are no enough nodes
>>
>> At Thu, 4 Dec 2014 16:05:39 +0900,
>> Hitoshi Mitake wrote:
>>>
>>> The current recovery process has a data-wipe bug. After an epoch that
>>> consists only of gateway nodes, objects stored on dying nodes are
>>> wiped when those nodes rejoin the cluster. This patch solves the
>>> problem by avoiding the invalid call to sd_store->cleanup() during
>>> recovery completion.
>>>
>>> Related issue:
>>> https://bugs.launchpad.net/sheepdog-project/+bug/1327037
>>>
>>> Cc: duron800 at qq.com
>>> Cc: 张扬 <3100100878 at zju.edu.cn>
>>> Cc: 徐小�霜 <nxtxiaolong at gmail.com>
>>> Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
>>> ---
>>> sheep/ops.c | 5 +++--
>>> sheep/sheep_priv.h | 1 +
>>> sheep/vdi.c | 12 ++++++++++++
>>> 3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> 张扬, 徐小�霜, could you test this patch if you have time? It would be
>> the simplest solution for the problem.
>>
>> Thanks,
>> Hitoshi
>>
>>>
>>> diff --git a/sheep/ops.c b/sheep/ops.c
>>> index a617a83..b418bda 100644
>>> --- a/sheep/ops.c
>>> +++ b/sheep/ops.c
>>> @@ -726,8 +726,9 @@ static int cluster_recovery_completion(const struct sd_req *req,
>>> sd_notice("all nodes are recovered, epoch %d", epoch);
>>> last_gathered_epoch = epoch;
>>> /* sd_store can be NULL if this node is a gateway */
>>> - if (vnode_info->nr_zones >= ec_max_data_strip &&
>>> - sd_store && sd_store->cleanup)
>>> + if (vnode_info->nr_zones >=
>>> + max(ec_max_data_strip, max_nr_copies)
>>> + && sd_store && sd_store->cleanup)
>>> sd_store->cleanup();
>>> }
>>> }
>>> diff --git a/sheep/sheep_priv.h b/sheep/sheep_priv.h
>>> index 5fc6b90..699f352 100644
>>> --- a/sheep/sheep_priv.h
>>> +++ b/sheep/sheep_priv.h
>>> @@ -357,6 +357,7 @@ int inode_coherence_update(uint32_t vid, bool validate,
>>> void remove_node_from_participants(const struct node_id *left);
>>>
>>> extern int ec_max_data_strip;
>>> +extern int max_nr_copies;
>>>
>>> int read_vdis(char *data, int len, unsigned int *rsp_len);
>>> int read_del_vdis(char *data, int len, unsigned int *rsp_len);
>>> diff --git a/sheep/vdi.c b/sheep/vdi.c
>>> index 1c8fb36..d815196 100644
>>> --- a/sheep/vdi.c
>>> +++ b/sheep/vdi.c
>>> @@ -40,6 +40,12 @@ static struct sd_rw_lock vdi_state_lock = SD_RW_LOCK_INITIALIZER;
>>> */
>>> int ec_max_data_strip;
>>>
>>> +/*
>>> + * max_nr_copies represent max number of copies of replicated VDIs. It is used
>>> + * for the same purpose of ec_max_data_strip.
>>> + */
>>> +int max_nr_copies;
>>> +
>>> int sheep_bnode_writer(uint64_t oid, void *mem, unsigned int len,
>>> uint64_t offset, uint32_t flags, int copies,
>>> int copy_policy, bool create, bool direct)
>>> @@ -171,6 +177,12 @@ int add_vdi_state(uint32_t vid, int nr_copies, bool snapshot, uint8_t cp)
>>> sd_mutex_lock(&m);
>>> ec_max_data_strip = max(d, ec_max_data_strip);
>>> sd_mutex_unlock(&m);
>>> + } else {
>>> + static struct sd_mutex m = SD_MUTEX_INITIALIZER;
>>> +
>>> + sd_mutex_lock(&m);
>>> + max_nr_copies = max(nr_copies, max_nr_copies);
>>> + sd_mutex_unlock(&m);
>>> }
>>>
>>> sd_debug("%" PRIx32 ", %d, %d", vid, nr_copies, cp);
>>> --
>>> 1.8.3.2
>>>
>