[sheepdog-users] Single-node sheepdog for testing

Fri Apr 18 05:01:34 CEST 2014

Thanks, that's what I suspected. I'll use the shutdown command from now on.
:-)

To your other question, I'm running the 0.6.0 version--I know, it's way out
of date! But I am trying to get sheepdog integrated as an optional
configuration for Devstack, and I don't think the Devstack team likes
including alternate PPAs or building other projects from source. That's the
version that Ubuntu 13.10 supports; hopefully Devstack will bump itself up
to Trusty soon.

~ Scott

On Thu, Apr 17, 2014 at 9:41 PM, Hitoshi Mitake <mitake.hitoshi at gmail.com> wrote:

> On Fri, Apr 18, 2014 at 10:16 AM, Scott Devoid <devoid at anl.gov> wrote:
> > Thanks Hitoshi,
> >
> > So I am seeing some interesting behavior when I try to shut down and
> > restart my 3-node cluster:
> >
> > $ for pid in `pgrep sheep`; do kill -15 $pid; sleep 2; done
> > $ for i in 0 1 2; do sheep -c local -d /path/to/store/$i -z $i -p 700$i; sleep 1; done
> >
> > Shutdown works fine, but when I go to start the cluster up, the first
> > member fails when the second joins. I think this is because the second
> > member has a later epoch than the first.
> >
> > Here is the tail of the first member logs:
> > http://paste.openstack.org/show/76195/
> >
> > Let me know if I am doing things incorrectly.
>
> A little follow-up:
>
> As you say, the problem is caused by the difference in epoch numbers.
>
> In detail, the problem goes like this:
> 1. Sheep A, B, and C form a cluster.
> 2. The kill command kills A, and B and C are notified. B and C then
> update their membership information (called the epoch).
> 3. The first for loop kills B and C at 2-second intervals.
> 4. On restart, the second for loop starts A first.
> 5. The for loop then restarts B. B's membership is newer than A's, so A
> exits voluntarily because it doesn't know the latest membership.
>
> Thanks,
> Hitoshi
>
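
The epoch mismatch in steps 1 to 5 above can be sketched as a toy model in
plain shell. This is only an illustration under stated assumptions: it starts
no sheep daemons, the epoch values (1 and 2) are made up, and join_check is a
hypothetical helper, not part of Sheepdog. Any further epoch changes while B
and C are being killed are left out.

```shell
#!/bin/sh
# Toy model of the epoch mismatch; no real sheep daemons are involved.
# Epoch numbers and the join_check helper are illustrative assumptions.

# Step 1: A, B and C form a cluster and share the same epoch.
epoch_A=1
epoch_B=1
epoch_C=1

# Step 2: A is killed. B and C are notified and update their epoch;
# A is already gone, so its stored epoch stays where it was.
epoch_B=$((epoch_B + 1))
epoch_C=$((epoch_C + 1))

# join_check RUNNING_EPOCH JOINING_EPOCH
# Prints what the already-running node does when another node joins.
join_check() {
    if [ "$2" -gt "$1" ]; then
        echo "exit"   # running node lacks the latest membership
    else
        echo "stay"
    fi
}

# Steps 4 and 5: A is started first, then B joins with a newer epoch.
echo "A sees B join: $(join_check "$epoch_A" "$epoch_B")"
```

In this model the result depends only on which node stopped first, which is
why killing the daemons one at a time leaves the first-restarted node behind.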