[sheepdog] [PATCH v4 1/5] zookeeper: fixed concurrent startup error

Liu Yuan namei.unix at gmail.com
Tue Jun 18 10:06:44 CEST 2013


On 06/18/2013 02:15 PM, Kai Zhang wrote:
> The current implementation of the zookeeper driver has a race when multiple
> sheep start up concurrently.
> 
> Consider the following situation:
> 1. There is a 3 node cluster: sheep1, sheep2, sheep3.
> 2. Both sheep1 and sheep2 leave the cluster.
> 3. Both sheep1 and sheep2 start up again after their previous zookeeper
>    sessions have timed out.
> 4. Sheep3 leaves the cluster before sheep1 and sheep2 receive their join
>    requests from zookeeper.
> 5. When sheep1 and sheep2 receive the join requests, both of them assume they
>    are master because zk_member_empty() returns true.

Could you first write a test that demonstrates this actually happens in
real life?

> 
> The new implementation avoids this problem because a sheep assumes it is
> master only if it successfully creates the master node.

If you describe how the new implementation works in the commit log, we
will spend less time reading the code to figure it out.
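
For illustration only, the difference between the two decisions in the
quoted description can be sketched as below. This is a minimal
in-memory model, not the patch and not the ZooKeeper C API: struct
fake_zk, fake_zk_create_master() and both is_master_by_*() helpers are
hypothetical names. It assumes only what the description states, namely
that creating the master node succeeds for exactly one sheep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the coordination service: one "master" node. */
struct fake_zk {
	bool master_exists;
	char master_owner[32];
};

/*
 * Models an atomic create-if-absent: the first caller succeeds, every
 * later caller fails (comparable to a "node already exists" error).
 */
static bool fake_zk_create_master(struct fake_zk *zk, const char *name)
{
	assert(zk != NULL && name != NULL);

	if (zk->master_exists)
		return false;

	zk->master_exists = true;
	snprintf(zk->master_owner, sizeof(zk->master_owner), "%s", name);
	return true;
}

/*
 * Decision as described for the current driver: a sheep that sees an
 * empty member list assumes it is master. Two sheep that both observe
 * an empty list both get "true" here.
 */
static bool is_master_by_empty_check(bool member_list_empty)
{
	return member_list_empty;
}

/*
 * Decision as described for the new driver: a sheep is master only if
 * its attempt to create the master node succeeds.
 */
static bool is_master_by_create(struct fake_zk *zk, const char *name)
{
	return fake_zk_create_master(zk, name);
}
```

With a zero-initialized struct fake_zk, calling is_master_by_create()
for "sheep1" and then for "sheep2" makes only sheep1 master, whereas
is_master_by_empty_check(true) returns true for both.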

Thanks,
Yuan
