[sheepdog-users] Corosync Vs Zookeeper Backends

Andrew J. Hobbs ajhobbs at desu.edu
Fri Mar 14 22:26:50 CET 2014


As a follow-up: the problem is specifically the line after your zookeeper entry.

/meta /var/lib/sheepdog/disc0,...

should read

/meta,/var/lib/sheepdog/disc0,...
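To illustrate why the space matters, here is a minimal sketch in Python. It assumes the path list is read as one shell-style argument; I haven't checked how your startup script actually passes it to sheep. The helper name split_store_paths is made up for this example, and only the first two paths from the line above are used, since the rest of your list is elided.

```python
import shlex


def split_store_paths(text):
    """Split a store-path string the way a shell would, then on commas.

    Hypothetical helper for illustration only; it is not part of sheepdog.
    """
    tokens = shlex.split(text)
    if len(tokens) != 1:
        # A space inside the list makes the shell see several arguments.
        raise ValueError(
            f"expected one comma-separated argument, got {len(tokens)}: {tokens}"
        )
    paths = tokens[0].split(",")
    if any(p == "" for p in paths):
        raise ValueError("empty path entry in list")
    return paths


if __name__ == "__main__":
    # Corrected form: a single argument holding two paths.
    print(split_store_paths("/meta,/var/lib/sheepdog/disc0"))
    # Original form: the space splits it into two separate arguments.
    try:
        split_store_paths("/meta /var/lib/sheepdog/disc0")
    except ValueError as err:
        print("rejected:", err)
```

With the comma in place the whole list stays together as a single argument; with the space, the first path is cut off from the rest.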

As for why corosync vs zookeeper: two primary reasons come to mind. 
The first is that corosync is based on multicast, which is not always 
supported on the switch. In our case, it works fine on the edge 
switches but is explicitly disabled at the network core. The second is 
that it's not unusual for corosync packets to be dropped under load 
(such as during a node rebuild, if corosync runs on the same interface 
as sheepdog). Enough dropped packets result in a partition, which 
causes sheepdog to panic and halt. If you're running on IB, you are 
probably much more resilient to this due to the nature of how IPoIB 
works. I've also noticed that both corosync and zookeeper have little 
storms when you take a snapshot of a sheepdog vdi, which is another 
thing that can trigger a partition.

I'm very interested in your performance numbers once you iron out 
this hiccup, as it's very much the direction I'm hoping to take our 
cluster going forward, given the price advantage of IB over 10G or 
40G cards and switches.

Best of luck!

Andrew
