[sheepdog] [PATCH v2] sheep: remove master node
Liu Yuan
namei.unix at gmail.com
Thu Jul 18 12:37:16 CEST 2013
On Thu, Jul 18, 2013 at 06:59:54PM +0900, MORITA Kazutaka wrote:
> From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
>
> The current procedure to handle sheep join is as follows.
>
> 1. The joining node sends a join request.
> 2. The master node accepts the request.
> 3. All the nodes update cluster members.
>
> This procedure has some problems:
>
> - The master election is too complex to maintain.
> It is very difficult to make sure that the implementation is
> correct.
>
> - The master node can fail while it is accepting the joining node.
> The newly elected master has to take over the process, but it's
> usually difficult to implement because we have to know what the
> previous master had and had not done before it failed.
>
> This patch changes the sheep join procedure to the following.
>
> 1. The joining node sends a join request.
> 2. Some of the existing nodes accept the request.
> 3. All the nodes update cluster members.
>
> Multiple nodes are allowed to call sd_join_handler() for the same
> join request, but at least one node must do so. With this change, we
> can eliminate the master, and a node failing while it accepts a join
> is tolerated.
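
For readers following the thread, the property the new procedure relies on is that accepting the same join request more than once must be harmless. Below is a minimal sketch of that idea, not the code from the patch: struct cluster_view, apply_accept() and demo_duplicate_accepts() are hypothetical names invented for illustration, and sheepdog's real sd_join_handler() and membership structures are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical, simplified membership view; not sheepdog's real structure. */
struct cluster_view {
	int members[MAX_NODES];
	size_t nr_members;
};

/* Return true if node_id is already part of the view. */
static bool is_member(const struct cluster_view *v, int node_id)
{
	for (size_t i = 0; i < v->nr_members; i++)
		if (v->members[i] == node_id)
			return true;
	return false;
}

/*
 * Apply an "accept" for a joining node.  Several existing nodes may
 * issue an accept for the same request, so this must be idempotent:
 * the first accept adds the node, later duplicates change nothing.
 * Returns true only if the view was modified.
 */
static bool apply_accept(struct cluster_view *v, int joiner)
{
	assert(v != NULL);

	if (is_member(v, joiner) || v->nr_members >= MAX_NODES)
		return false;

	v->members[v->nr_members++] = joiner;
	return true;
}

/* Two existing nodes both accept node 4; it is added only once. */
static size_t demo_duplicate_accepts(void)
{
	struct cluster_view v = { .members = { 1, 2, 3 }, .nr_members = 3 };

	apply_accept(&v, 4);	/* accept issued by one existing node */
	apply_accept(&v, 4);	/* duplicate accept from another node */

	return v.nr_members;
}
```

Because duplicates are ignored, it does not matter which existing node's accept arrives first, or whether one of the accepting nodes fails partway through, as long as at least one accept is delivered.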
>
> Removing the master from zookeeper is not easy since it doesn't
> expect multiple nodes to send EVENT_ACCEPT. I'll leave this for
> another day.
>
> Signed-off-by: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> ---
> include/shepherd.h | 3 --
> sheep/cluster.h | 2 +-
> sheep/cluster/corosync.c | 60 +++--------------------
> sheep/cluster/local.c | 38 +++++++--------
> sheep/cluster/shepherd.c | 43 +++++------------
> sheep/cluster/zookeeper.c | 75 ++++++++++++++---------------
zk code wasn't removed.
Thanks
Yuan