[sheepdog] Re: question about replica recovery failure caused by oid.tmp file
李贵宁(哀蝉)
guining.lgn at alibaba-inc.com
Thu Sep 18 11:20:18 CEST 2014
Bingpeng is right.
I am sure that default_init() calls unlink() to delete oid.tmp, but the
argument it passes is the bare file name, not the full path.
This should be a bug.
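
The point about the path can be illustrated with a minimal sketch. It is not the sheepdog code: remove_stale_tmp_files() and is_tmp_name() are hypothetical helpers, and the directory layout is assumed. It only shows that the name returned by readdir() has to be joined with the directory before it is handed to unlink(), because a bare name is resolved against the current working directory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Return 1 if name ends with ".tmp", 0 otherwise. */
static int is_tmp_name(const char *name)
{
	size_t len = strlen(name);

	return len > 4 && strcmp(name + len - 4, ".tmp") == 0;
}

/*
 * Hypothetical helper, not sheepdog's default_init(): remove every
 * "*.tmp" entry found directly in dir.
 * Returns the number of files removed, or -1 on error.
 */
static int remove_stale_tmp_files(const char *dir)
{
	char path[PATH_MAX];
	struct dirent *d;
	int removed = 0;
	DIR *dp;

	assert(dir != NULL);
	dp = opendir(dir);
	if (!dp)
		return -1;

	while ((d = readdir(dp)) != NULL) {
		int n;

		if (!is_tmp_name(d->d_name))
			continue;
		/*
		 * d->d_name is only the file name. unlink(d->d_name) would
		 * look for it in the current working directory, so build
		 * the full path first.
		 */
		n = snprintf(path, sizeof(path), "%s/%s", dir, d->d_name);
		if (n < 0 || (size_t)n >= sizeof(path) || unlink(path) < 0) {
			closedir(dp);
			return -1;
		}
		removed++;
	}
	closedir(dp);
	return removed;
}
```

Calling remove_stale_tmp_files() on the object directory at startup would leave the regular object files untouched and drop only the leftover .tmp entries. An alternative with the same effect is unlinkat() with the directory's file descriptor.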
From: sheepdog [mailto:sheepdog-bounces at lists.wpkg.org] On Behalf Of Bingpeng Zhu
Sent: September 18, 2014 16:31
To: Hitoshi Mitake; Ruoyu
Cc: sheepdog
Subject: Re: [sheepdog] question about replica recovery failure caused by
oid.tmp file
Thank you for the advice.
default_init() of sheep/store.c already has logic for unlinking oid.tmp
files. I'm not sure why the oid.tmp file still exists in the system.
------------------ Original ------------------
From: "Hitoshi Mitake" <mitake.hitoshi at lab.ntt.co.jp>
Date: Sep 18, 2014
To: "Ruoyu" <liangry at ucweb.com>
Cc: "Bingpeng Zhu" <nkuzbp at foxmail.com>; "sheepdog" <sheepdog at lists.wpkg.org>
Subject: Re: [sheepdog] question about replica recovery failure caused by
oid.tmp file
At Tue, 16 Sep 2014 10:10:32 +0800,
Ruoyu wrote:
>
> Thanks Bingpeng.
> I also encountered this problem.
> I suggest sheep should scan oid.tmp files and remove them when it is
> being started.
I agree with Ruoyu's opinion. .tmp files should be deleted at
initialization time. e.g. default_init() of sheep/store.c would be a
good place for it.
Thanks,
Hitoshi
>
> On 2014-09-15 00:14, Bingpeng Zhu wrote:
> > Hi, all:
> > I have a problem using sheepdog. I created an erasure-coded VDI and
> > wrote some data to it. Then I unplugged a disk and stopped/restarted
> > one sheep within a short time. After recovery completed in the latest
> > epoch, I found that some replicas were lost and only the corresponding
> > oid.tmp files existed in the data directory. I tried to rebuild the
> > replica using "dog vdi check", but it didn't work. I think this is
> > caused by the oid.tmp file: I had to delete the oid.tmp file manually,
> > and then "dog vdi check" successfully recovered the lost replica.
> > In function default_create_and_write() of sheep/plain_store.c, it
> > returns success directly if the oid.tmp file exists. I have read the
> > comment in this function carefully; it says the gateway and the
> > recovery thread may try to write the SAME data, so it is okay to
> > simply return success here. To solve this problem, I want to change
> > the code of default_create_and_write() so that replica data is written
> > even if the oid.tmp file exists. If oid.tmp exists, the function
> > should overwrite it.
> > I am not sure whether this change will work well in all scenarios.
> > In particular, I wonder whether it could lead to old data overwriting
> > new data, but I haven't been able to think of a scenario in which that
> > happens. Can someone give me some advice on solving this problem?
> >
> >
> >
>
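
The change proposed in the quoted message (overwrite a leftover oid.tmp instead of returning success) could look roughly like the sketch below. It is an illustration under assumptions, not sheepdog's default_create_and_write(): the function name create_and_write(), the ".tmp" suffix handling, and the file mode are made up for the example, and it does not settle the open question of concurrent writers or old data overwriting new data.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical sketch: write buf to "<path>.tmp", then rename it to path.
 * A leftover .tmp file is truncated and rewritten rather than being
 * treated as "someone else already wrote this object".
 * Returns 0 on success, -1 on error.
 */
static int create_and_write(const char *path, const void *buf, size_t len)
{
	char tmp[PATH_MAX];
	const char *p = buf;
	int fd, n;

	assert(path != NULL && buf != NULL);
	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	/* O_TRUNC instead of O_EXCL: a stale .tmp no longer blocks the write. */
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0640);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t w = write(fd, p, len);

		if (w < 0) {
			if (errno == EINTR)
				continue;
			close(fd);
			unlink(tmp);
			return -1;
		}
		p += w;
		len -= (size_t)w;
	}

	if (fsync(fd) < 0) {
		close(fd);
		unlink(tmp);
		return -1;
	}
	if (close(fd) < 0) {
		unlink(tmp);
		return -1;
	}

	/* rename() replaces path atomically; readers never see a partial file. */
	if (rename(tmp, path) < 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

With this shape, a .tmp file left behind by an interrupted write is simply replaced on the next attempt. Whether that is safe when the gateway and the recovery thread write the same object at the same time is exactly the concern raised in the message and is not addressed here.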