[sheepdog] [PATCH] zookeeper: fix cluster hang by giving priority to process LEAVE event

Liu Yuan namei.unix at gmail.com
Thu Jul 19 05:20:08 CEST 2012


On 07/19/2012 11:10 AM, Yunkai Zhang wrote:
> That is the point, they are different!
> 
> Zookeeper driver *just* giving priority to process LEAVE event, only
> when there is unfinished BLOCK event. By this difference, all sheep
> will process each message in the same order, but this rule will be
> broken in corosync driver.

They differ in how far the ordering is relaxed, but compared with strict
ordering they are the same: the order is relaxed. You relax the order
when the queue is blocked.

We don't need this rule in the corosync driver, and it is not a rule for
sheepdog: we don't blindly stick to any stereotype unless it is proved
necessary. Relaxed ordering is very common in distributed systems; e.g.,
eventual consistency relaxes the read/write ordering.

What really matters is whether the relaxed ordering still provides
correct behavior. As far as corosync is concerned, this relaxation is
correct and gives us a benefit: once confchg is handled at the highest
priority, we reduce the number of wrong read/write requests with epoch
mismatch to a very low level, compared with strict ordering.
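
To make the epoch argument concrete, here is a toy model, not sheepdog
code. It assumes a node at epoch 1 whose peers have already moved to
epoch 2, so their I/O requests reach this node before it has handled its
own confchg. The function name, the event tuples and the single pending
list are all made up for illustration.

```python
def count_epoch_mismatches(events, confchg_first, epoch=1):
    """Count I/O requests whose epoch differs from the node's local epoch.

    events is a list of (kind, epoch) pairs in arrival order, where kind
    is "confchg" (membership change that moves the node to that epoch)
    or "io" (read/write request stamped with the sender's epoch).
    This is a hypothetical model, not the sheepdog event queue.
    """
    if confchg_first:
        # Stable sort: confchg events move to the front, everything
        # else keeps its arrival order.
        events = sorted(events, key=lambda ev: ev[0] != "confchg")

    mismatches = 0
    for kind, ev_epoch in events:
        if kind == "confchg":
            epoch = ev_epoch          # node catches up with the cluster
        elif ev_epoch != epoch:
            mismatches += 1           # request would fail and be retried
    return mismatches


# Peers are already at epoch 2; our own confchg is stuck behind two requests.
pending = [("io", 2), ("io", 2), ("confchg", 2), ("io", 2)]

print(count_epoch_mismatches(pending, confchg_first=False))  # strict order: 2
print(count_epoch_mismatches(pending, confchg_first=True))   # confchg first: 0
```

In this model strict ordering serves two requests against a stale epoch,
while handling confchg first serves none. Whether the real numbers look
like this depends on timing that the sketch does not capture.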

Thanks,
Yuan


