[sheepdog] [PATCH v1] sheep/md: fix dogcluster info command show wrong disk number

Fri Sep 12 07:15:40 CEST 2014

At Fri, 12 Sep 2014 12:31:02 +0800,
=?utf-8?B?QmluZ3BlbmcgWmh1?= wrote:
> 
> [1  <multipart/alternative (8bit)>]
> [1.1  <text/plain; utf-8 (base64)>]
> Hi, Hitoshi:
>     I think you are right.  But maybe this bug is hard to trigger. Here is my analysis:â€
>     If md.nr_disks changes from 1 to 0, kick_recover() is not called immediately. In mostâ€
> case, there are more than one io thread. If other io thread access the corrupted directory,
> it will get SD_RES_EIO since md.nr_disks is 0 (See md_handle_eio() functionâ€). Then, mainâ€â€â€
> thread will call leave_cluster() (See io_op_done() function).â€
>     I think maybe we can change the code like this:
> 
> 
> 	if (disk) {
> 		if (nr > 0) {
> 			update_node_disks();
> 			kick_recover();
> 		} else {
> 			leave_cluster();
> 		}
> 	}â€
> 
> 
> â€> How do you think?

Thanks for your analysis! Your solution seems to be good. Could you
send a patch?

Thanks,
Hitoshi

> 
> 
> Thanks!
> Bingpeng
> 
> 
> ------------------ åŽŸå§‹é‚®ä»¶ ------------------
> å‘ä»¶äºº: "Bingpeng Zhu";<nkuzbp at foxmail.com>;
> å‘é€æ—¶é—´: 2014å¹´9æœˆ11æ—¥(æ˜ŸæœŸå››) ä¸‹åˆ2:37
> æ”¶ä»¶äºº: "Hitoshi Mitake"<mitake.hitoshi at lab.ntt.co.jp>; 
> 
> ä¸»é¢˜: Re:  [sheepdog]    [PATCH v1] sheep/md: fix dogcluster info command show	wrong disk number
> 
> 
> 
> I see. But it works OK in my experimental environment:
> 
> 
> #dog cluster info
> Cluster status: running, auto-recovery enabled
> Cluster created at Thu Sep 11 14:07:48 2014
> 
> 
> Epoch Time           Version
> 2014-09-11 14:26:51      5 [127.0.0.1:7000(3), 127.0.0.1:7001(3), 127.0.0.1:7002(3), 127.0.0.1:7003(3), 127.0.0.1:7004(3), 127.0.0.1:7005(3), 127.0.0.1:7007(3)]
> 2014-09-11 14:26:51      4 [127.0.0.1:7000(3), 127.0.0.1:7001(3), 127.0.0.1:7002(3), 127.0.0.1:7003(3), 127.0.0.1:7004(3), 127.0.0.1:7005(3), 127.0.0.1:7006(0), 127.0.0.1:7007(3)]
> 2014-09-11 14:26:27      3 [127.0.0.1:7000(3), 127.0.0.1:7001(3), 127.0.0.1:7002(3), 127.0.0.1:7003(3), 127.0.0.1:7004(3), 127.0.0.1:7005(3), 127.0.0.1:7006(1), 127.0.0.1:7007(3)]
> 2014-09-11 14:26:20      2 [127.0.0.1:7000(3), 127.0.0.1:7001(3), 127.0.0.1:7002(3), 127.0.0.1:7003(3), 127.0.0.1:7004(3), 127.0.0.1:7005(3), 127.0.0.1:7006(2), 127.0.0.1:7007(3)]
> 2014-09-11 14:07:49      1 [127.0.0.1:7000(3), 127.0.0.1:7001(3), 127.0.0.1:7002(3), 127.0.0.1:7003(3), 127.0.0.1:7004(3), 127.0.0.1:7005(3), 127.0.0.1:7006(3), 127.0.0.1:7007(3)]â€
> 
> 
> 
> Seems it will notify other nodes even md.nr_disks equals zero. I'll read the code carefully to figure it out.
> 
> 
> Thanks.
> 
> 
> ------------------ åŽŸå§‹é‚®ä»¶ ------------------
> å‘ä»¶äºº: "Hitoshi Mitake";<mitake.hitoshi at lab.ntt.co.jp>;
> å‘é€æ—¶é—´: 2014å¹´9æœˆ11æ—¥(æ˜ŸæœŸå››) ä¸åˆ12:55
> æ”¶ä»¶äºº: "Bingpeng Zhu"<nkuzbp at foxmail.com>; 
> æŠ„é€: "Hitoshi Mitake"<mitake.hitoshi at lab.ntt.co.jp>; "sheepdog"<sheepdog at lists.wpkg.org>; "Yu Fang"<bingpeng.zbp at alibaba-inc.com>; 
> ä¸»é¢˜: Re: [sheepdog]    [PATCH v1] sheep/md: fix dogcluster info command show	wrong disk number
> 
> 
> 
> At Thu, 11 Sep 2014 10:53:42 +0800,
> =?utf-8?B?QmluZ3BlbmcgWmh1?= wrote:
> > 
> > [1  <multipart/alternative (8bit)>]
> > [1.1  <text/plain; utf-8 (base64)>]
> > If md.nr_disks changes from 1 to 0, the node is pure gateway then. Seems it doesn't need
> > to do any recovery work since no object will locate in the pure gateway.â€
> 
> But other nodes must execute recovery because redundancy is decreased, no?
> 
> Thanks,
> Hitoshi
> 
> > 
> > 
> > 
> > 
> > 
> > ------------------ Original ------------------
> > From:  "Hitoshi Mitake";<mitake.hitoshi at lab.ntt.co.jp>;
> > Date:  Thu, Sep 11, 2014 10:34 AM
> > To:  "Bingpeng Zhu"<nkuzbp at foxmail.com>; 
> > Cc:  "sheepdog"<sheepdog at lists.wpkg.org>; "Bingpeng Zhu"<bingpeng.zbp at alibaba-inc.com>; 
> > Subject:  Re: [sheepdog] [PATCH v1] sheep/md: fix dog cluster info command show	wrong disk number
> > 
> > 
> > 
> > At Wed, 10 Sep 2014 20:08:32 +0800,
> > Bingpeng Zhu wrote:
> > > 
> > > If we enable diskvnodes mode and encounter EIO, running dog
> > > cluster info will show wrong disk number in the new epoch.
> > > We should update node disks when handling EIO and unplug the
> > > disk.
> > > 
> > > Signed-off-by: Bingpeng Zhu <bingpeng.zbp at alibaba-inc.com>
> > > ---
> > >  sheep/md.c |    3 +++
> > >  1 files changed, 3 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/sheep/md.c b/sheep/md.c
> > > index b13d30a..378d1f1 100644
> > > --- a/sheep/md.c
> > > +++ b/sheep/md.c
> > > @@ -541,6 +541,9 @@ static void md_do_recover(struct work *work)
> > >  out:
> > >  	sd_rw_unlock(&md.lock);
> > >  
> > > +	if (disk)
> > > +		update_node_disks();
> > > +
> > 
> > The change looks reasonable. But I have a question:
> > 
> > >  	if (nr > 0)
> > >  		kick_recover();
> > 
> > Does the above conditional branch (nr > 0) required? When a node has
> > single disk and faces EIO, md.nr_disks can change from 1 to 0. In such
> > a case, the code will not call kick_recover(). I think it is a
> > problematic behavior. How do you think?
> > 
> > Thanks,
> > Hitoshi
> > 
> > >  
> > > -- 
> > > 1.7.1
> > > 
> > > 
> > > 
> > > 
> > > -- 
> > > sheepdog mailing list
> > > sheepdog at lists.wpkg.org
> > > http://lists.wpkg.org/mailman/listinfo/sheepdog
> > [1.2  <text/html; utf-8 (base64)>]
> > 
> > [2  <text/plain; us-ascii (7bit)>]
> > -- 
> > sheepdog mailing list
> > sheepdog at lists.wpkg.org
> > http://lists.wpkg.org/mailman/listinfo/sheepdog
> [1.2  <text/html; utf-8 (base64)>]
> 
> [2  <text/plain; us-ascii (7bit)>]
> -- 
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog