[Sheepdog] [PATCH 0/2] fix collie command errors during node member changes

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Dec 15 22:58:14 CET 2011


At Fri, 16 Dec 2011 06:00:02 +0900,
MORITA Kazutaka wrote:
> 
> At Thu, 15 Dec 2011 20:42:09 +0000,
> Chris Webb wrote:
> > 
> > MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:
> > 
> > > Probably, it is a bug in Sheepdog.  Is there an easy way to reproduce
> > > it with a small cluster?  I'd like to try to test it, too.
> > 
> > Hi Kazutaka. I just started a small cluster of three machines (with three
> > sheep per machine on three different drives, but I'm sure it would work just
> > as well with only one), did a cluster format with --copies=2, and wrote a
> > vdi to the cluster so I had something to test with.
> > 
> > I then (effectively---actually did an ip link set ethX down) unplugged the
> > network to one of the machines. When I did a collie vdi list on one of the
> > machines in the remaining cluster, it paused until it noticed the machine
> > had gone, then continued correctly. However, the sheep daemon never seemed
> > to exit on the machine that had been disconnected, and collie vdi list just
> > hung forever. It seems to happen this way every time so is probably very
> > easy to reproduce.
> 
> I tried it with the master branch just now, but the sheep exited
> correctly in my environment.  Can you give me the sheep.log of the
> disconnected sheep?

I've sent some fixes related to network failure.  Can you try with the
devel branch again?

Thanks,

Kazutaka


