<p dir="ltr">As andy noted, try 'dog node md info --all' to make sure your disks are properly added to sheep</p>
<p dir="ltr">Yuan</p>
<div class="gmail_quote">2014-3-15 AM5:27于 "Andrew J. Hobbs" <<a href="mailto:ajhobbs@desu.edu">ajhobbs@desu.edu</a>>写道:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
As a follow-up, it's specifically the line after your zookeeper entry.<br>
<br>
/meta /var/lib/sheepdog/disc0,...<br>
<br>
should read<br>
<br>
/meta,/var/lib/sheepdog/disc0,...<br>
<br>
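For context, a full sheep invocation with that comma-separated list would look roughly like the sketch below (the zookeeper hosts and the second data disk are made up for illustration; as far as I remember the md syntax, the first path holds the metadata and the remaining paths hold the objects):<br>
<br>
sheep -c zookeeper:192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181 /meta,/var/lib/sheepdog/disc0,/var/lib/sheepdog/disc1<br>
<br>
The point being the whole list has to be one comma-separated argument, which is presumably why the space after /meta breaks it.<br>
<br>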
As for why corosync vs zookeeper: two primary reasons come to mind.<br>
The first is that corosync is based on multicast, which is not always<br>
supported on the switch. In our case it works fine on edge switches,<br>
but is explicitly disabled at the network core. The second is that it's<br>
not unusual for corosync packets to drop under load (such as during a<br>
node rebuild if corosync is running on the same interface as sheepdog),<br>
and enough dropped packets result in a partition, which causes sheepdog<br>
to panic and halt. If you're running on IB, you are probably much more<br>
resilient to this due to the nature of how IPoIB works. I've also<br>
noticed that both corosync and zookeeper have little traffic storms<br>
when you perform a snapshot of a sheepdog vdi, which is another thing<br>
that can trigger a partition.<br>
<br>
I'm very interested in your performance numbers once you iron out this<br>
hiccup, as it's very much the direction I'm hoping to take our cluster<br>
going forward, given the price advantage of IB over 10G or 40G cards<br>
and switches.<br>
<br>
Best of luck!<br>
<br>
Andrew<br>
<br>
<br></blockquote></div>