[sheepdog] [PATCH] sheep: remove master node

Sun Jul 14 08:25:12 CEST 2013

On Sun, Jul 14, 2013 at 12:08:46AM +0900, MORITA Kazutaka wrote:
> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> 
> The current procedure to handle sheep join is as follows.
> 
>  1. The joining node sends a join request.
>  2. The master node accepts the request.
>  3. All the nodes update cluster members.
> 
> This procedure has some problems:
> 
>  - The master election is too complex to maintain.
>    It is very difficult to make sure that the implementation is
>    correct.
> 
>  - The master node can fail while it is accepting the joining node.
>    The newly elected master has to take over the process, but it's
>    usually difficult to implement because we have to know what the
>    previous master did and what it did not before its failure.
> 
> This patch changes the sheep join procedure to the following.
> 
>  1. The joining node sends a join request.
>  2. Some of the existing nodes accept the request.

It seems that all the nodes in the cluster accept the request, no?

>  3. All the nodes update cluster members.
> 
> Multiple nodes are allowed to call sd_accept_handler() against the same
> join request, but at least one node must do it.  With this change, we
> can eliminate the master, and a node failure while accepting a node join
> is also tolerated.
> 

Why is sd_accept_handler() reentrant from the cluster's point of view? I
noticed that, e.g., push_join_response() of the zk driver is called on all
the nodes too. So if the following case happens, can sheep handle it?

There are 2 nodes in the cluster, {A, B}, and C is joining the cluster.

A -> push_join_response() returns quickly; the watchers of A, B and C are
     called to handle EVENT_ACCEPT from A.
B -> push_join_response() returns slowly because of the network; A, B and C
     then handle EVENT_ACCEPT from B.

Simply put, can sheep handle multiple EVENT_ACCEPT events for the same node?
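
To make the concern concrete, here is a minimal sketch in C of what handling
a duplicated accept could look like. Everything in it is hypothetical: struct
cluster_view, is_member() and handle_accept() are names made up for
illustration, not sheep or zk driver code. It only assumes that an accept
event identifies the joining node, so a second accept for a node that is
already a member can be recognized and ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES     16
#define NODE_NAME_LEN 32

/* Hypothetical membership view; not sheepdog's real data structure. */
struct cluster_view {
	char members[MAX_NODES][NODE_NAME_LEN];
	size_t nr_members;
};

/* Return true if 'node' is already part of the view. */
static bool is_member(const struct cluster_view *v, const char *node)
{
	for (size_t i = 0; i < v->nr_members; i++)
		if (strcmp(v->members[i], node) == 0)
			return true;
	return false;
}

/*
 * Handle one accept event for 'joiner'.
 * Returns true if the view changed, false if the event was ignored
 * (duplicate accept, view full, or name too long).
 */
static bool handle_accept(struct cluster_view *v, const char *joiner)
{
	assert(v != NULL && joiner != NULL);

	if (is_member(v, joiner))
		return false;	/* second accept for the same join: no-op */
	if (v->nr_members >= MAX_NODES || strlen(joiner) >= NODE_NAME_LEN)
		return false;

	strcpy(v->members[v->nr_members++], joiner);
	return true;
}
```

In the {A, B} + C case above, the accept pushed by A would add C to the view
and the later accept pushed by B would hit the is_member() check and change
nothing. Whether the patch does something equivalent is what I am asking.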

Thanks
Yuan