[sheepdog-users] Corosync Vs Zookeeper Backends
Andrew J. Hobbs
ajhobbs at desu.edu
Fri Mar 14 22:26:50 CET 2014
As a follow-up: the problem is specifically on the line after your zookeeper entry, where a space sits in place of a comma.
/meta /var/lib/sheepdog/disc0,...
should read
/meta,/var/lib/sheepdog/disc0,...
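
A typo like this can be caught before starting sheep by checking the store argument for stray whitespace. What follows is only a sketch in Python: the helper name split_store_spec is made up for illustration and is not part of sheepdog, and it assumes the argument is a single comma-separated list of paths. The paths in the demo are shortened versions of the ones above.

```python
def split_store_spec(spec):
    """Split a comma-separated store argument into its paths.

    Hypothetical helper, not part of sheepdog. Raises ValueError if the
    argument contains whitespace (usually a space typed where a comma
    belongs) or an empty path (a doubled or trailing comma).
    """
    if any(ch.isspace() for ch in spec):
        raise ValueError("whitespace in store argument: %r" % spec)
    paths = spec.split(",")
    if any(p == "" for p in paths):
        raise ValueError("empty path in store argument: %r" % spec)
    return paths


if __name__ == "__main__":
    # First candidate has the space typo, second is the corrected form.
    for candidate in ("/meta /var/lib/sheepdog/disc0",
                      "/meta,/var/lib/sheepdog/disc0"):
        try:
            print(candidate, "->", split_store_spec(candidate))
        except ValueError as err:
            print(candidate, "-> rejected:", err)
```
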
As for why corosync vs zookeeper: two primary reasons come to mind.
The first is that corosync is based on multicast, which is not always
supported on the switch. In our case, it works fine on the edge switches
but is explicitly disabled at the network core. The second is that it's
not unusual for corosync packets to drop under load (such as during a
node rebuild, if corosync is running on the same interface as sheepdog),
and enough dropped packets result in a partition, which causes sheepdog
to panic and halt. If you're running on IB, you are probably much more
resilient to this due to the nature of how IPoIB works. I've also noticed
that both corosync and zookeeper have little storms when you take a
snapshot of a sheepdog vdi, which is another thing that can trigger a
partition.
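
If you want to check whether multicast actually makes it between two nodes before committing to corosync, something like the following rough Python sketch can help. Run it with "listen" on one node and "send" on another. The group address and port here are placeholders I picked for illustration, so substitute the values from your own corosync configuration. Note that a TTL of 1 only covers the local subnet, so raise it if the nodes are separated by a router.

```python
import socket
import struct
import sys

GROUP = "239.192.0.1"   # placeholder multicast group, replace with your own
PORT = 5405             # placeholder UDP port, replace with your own


def membership_request(group):
    """Pack an ip_mreq: the group address plus INADDR_ANY as interface."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))


def listen(group=GROUP, port=PORT):
    """Join the group and print every datagram received (runs forever)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    while True:
        data, sender = sock.recvfrom(1024)
        print("got %r from %s" % (data, sender[0]))


def send(group=GROUP, port=PORT, ttl=1):
    """Send one test datagram to the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                    struct.pack("b", ttl))
    sock.sendto(b"multicast test", (group, port))
    sock.close()


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else ""
    if mode == "listen":
        listen()
    elif mode == "send":
        send()
    else:
        print("usage: mcast_check.py listen|send")
```

If the listener prints nothing while the sender runs, multicast is being filtered or blocked somewhere between the two nodes (or a host firewall is in the way).
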
I'm very interested in your performance numbers once you iron out
this hiccup, as it's very much the direction I'm hoping to take our
cluster going forward, given the price advantage of IB over 10G or
40G cards and switches.
Best of luck!
Andrew