<p dir="ltr">As Andy noted, try 'dog node md info --all' to make sure your disks are properly added to sheep.</p>

<p dir="ltr">Yuan</p>

<div class="gmail_quote">On 2014-3-15 at 5:27 AM, "Andrew J. Hobbs" &lt;<a href="mailto:ajhobbs@desu.edu">ajhobbs@desu.edu</a>&gt; wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

As a follow-up, the problem is specifically the line after your zookeeper entry.<br>

<br>

/meta /var/lib/sheepdog/disc0,...<br>

<br>

should read<br>

<br>

/meta,/var/lib/sheepdog/disc0,...<br>

<br>
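As a rough illustration of the difference, here is a minimal Python sketch of a check for this mistake. The helper split_store_arg is hypothetical and not part of sheepdog; it assumes that the first path is the metadata directory, that the remaining paths are disks, and that whitespace is never valid in this argument.<br>

```python
# Hypothetical helper, not part of sheepdog: it only illustrates why the
# paths must be joined with commas rather than spaces.
# Assumption: the first path is the metadata directory, the rest are disks.
def split_store_arg(arg):
    """Return (meta_dir, disks); raise ValueError on a malformed argument."""
    if any(ch.isspace() for ch in arg):
        raise ValueError("whitespace in store argument; use commas between paths")
    paths = arg.split(",")
    if any(not p for p in paths):
        raise ValueError("empty path in store argument")
    return paths[0], paths[1:]


if __name__ == "__main__":
    # Corrected form: one meta directory and one disk.
    print(split_store_arg("/meta,/var/lib/sheepdog/disc0"))
    # Original form with a space: rejected.
    try:
        split_store_arg("/meta /var/lib/sheepdog/disc0")
    except ValueError as err:
        print("rejected:", err)
```

With the comma-separated form the sketch returns the meta directory and the disk list; the space-separated form raises ValueError.<br>

<br>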

As for why corosync vs zookeeper:  Two primary reasons come to mind.<br>

The first is that corosync is based on multicast, which is not always<br>

supported on the switch.  In our case, it works fine on edge switches<br>

but is explicitly disabled at the network core.  The second reason is that it's not<br>

unusual for corosync packets to drop under load (such as during a node<br>

rebuild if it's running on the same interface as sheepdog), and enough<br>

dropped packets result in a partition, which causes sheepdog to panic and<br>

halt.  If you're running on IB, you are probably much more resilient to<br>

this due to the nature of how IPoIB works.  I've also noticed that<br>

both corosync and zookeeper have little storms when you perform a<br>

snapshot of a sheepdog vdi, which is another thing that can trigger a<br>

partition.<br>

<br>

I'm very interested in your performance numbers once you iron out<br>

this hiccup, as it's very much the direction I'm hoping to take our<br>

cluster going forward, given the price advantage of IB over 10G or<br>

40G cards and switches.<br>

<br>

Best of luck!<br>

<br>

Andrew<br>

<br>


<br></blockquote></div>