[Sheepdog] panic in get_nth_node

Fri Mar 2 02:47:51 CET 2012

-----Original Message-----
> From: Liu Yuan [mailto:namei.unix at gmail.com]
> Sent: Wednesday, February 29, 2012 6:16 PM
> To: huxinwei
> Cc: sheepdog at lists.wpkg.org; Liu Jiang
> Subject: Re: [Sheepdog] panic in get_nth_node
> 
> On 02/29/2012 04:30 PM, huxinwei wrote:
> 
> > Hi list:
> >
> >   In my environment (2 sheep only), sheep always panics while recovering
> > after a node that had left rejoins.
> >
> > It turns out to be intended behavior in get_nth_node:
> >
> > =========================================
> >         if (idx == base) {
> >                 panic("bug"); /* not found */
> > =========================================
> >
> >   While I agree this is correct in most scenarios, it does seem to be too
> > intrusive while recovering in my trivial test.
> > To be specific, find_tgt_node calls get_nth_node
> >
> >   I don't have a lot of faith in my own workaround either. Let me know what
> > you think ;)
> >
> >   Thanks.
> 
> 
> How to reproduce this issue in your case?

Let's say we have 2 nodes, running 3 sheep instances:

Node1 # sheep /home/sheep1 -p 7000
Node2 # sheep /home/sheep1 -p 7000
Node2 # sheep /home/sheep2 -p 7001

Node1 # collie cluster format -c 2 # farm or simple doesn't matter here
Node1 # collie vdi create ss1 1G
... Keep ss1 busy writing ...
Then kill sheep2 on Node2; this will almost surely panic sheep1 on Node2 too.

FYI.
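
For illustration only, here is a minimal sketch of how a ring walk of this kind
can wrap back to its starting index. It assumes get_nth_node looks for the nth
distinct node on a ring of entries, which the excerpt quoted above suggests but
does not show. The struct, the function name, MAX_COPIES and the -1 return
value are all hypothetical and are not taken from the sheepdog source; the
sketch does not claim to identify the actual cause of the panic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_COPIES 8	/* hypothetical upper bound on n */

/* Hypothetical, simplified ring entry; not the real sheepdog structure. */
struct vnode_entry {
	int node_id;	/* which sheep owns this ring slot */
};

/*
 * Walk the ring starting at 'base' and return the index of the slot owned
 * by the nth distinct node (n == 0 is the node at 'base' itself).
 * Returns -1 when the walk wraps around to 'base' before finding it, which
 * is the condition the quoted excerpt treats as a panic.
 */
static int nth_distinct_node(const struct vnode_entry *ring, int nr,
			     int base, int n)
{
	int seen[MAX_COPIES];
	int nr_seen = 0;
	int idx = base;

	assert(ring != NULL && nr > 0);
	assert(base >= 0 && base < nr);
	assert(n >= 0 && n < MAX_COPIES);

	for (;;) {
		int i, dup = 0;

		/* Skip slots whose owner has already been counted. */
		for (i = 0; i < nr_seen; i++)
			if (seen[i] == ring[idx].node_id)
				dup = 1;

		if (!dup) {
			if (nr_seen == n)
				return idx;	/* nth distinct node found */
			seen[nr_seen++] = ring[idx].node_id;
		}

		idx = (idx + 1) % nr;
		if (idx == base)
			return -1;	/* wrapped around: not found */
	}
}
```

With a ring whose slots all belong to one node, asking for a second distinct
node (n == 1) can never succeed, so the walk returns to base. Returning an
error there instead of panicking is one possible design choice; whether that is
safe for the recovery path is a separate question.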

> Thanks,
> Yuan