[sheepdog] [PATCH 3/3] sheep: remove check_majority()
Liu Yuan
namei.unix at gmail.com
Wed May 16 11:08:55 CEST 2012
On 05/16/2012 05:04 PM, Yunkai Zhang wrote:
> From: Yunkai Zhang <qiushu.zyk at taobao.com>
>
> When sheep receives a LEAVE event, check_majority() is executed in
> __sd_leave(). This makes the network very busy, because each sheep
> tries to connect to the others.
>
> I don't think this check is necessary; that is the driver's job. The
> driver tells us which sheep are alive and which have left, so let's
> remove the check.
>
> Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
> ---
> sheep/group.c | 36 ++----------------------------------
> 1 files changed, 2 insertions(+), 34 deletions(-)
>
> diff --git a/sheep/group.c b/sheep/group.c
> index e37e049..69c6b71 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -668,39 +668,6 @@ void sd_notify_handler(struct sd_node *sender, void *msg, size_t msg_len)
>  	process_request_event_queues();
>  }
>
> -/*
> - * Check whether the majority of Sheepdog nodes are still alive or not
> - */
> -static int check_majority(struct sd_node *nodes, int nr_nodes)
> -{
> -	int nr_majority, nr_reachable = 0, fd, i;
> -	char name[INET6_ADDRSTRLEN];
> -
> -	nr_majority = nr_nodes / 2 + 1;
> -
> -	/* we need at least 3 nodes to handle network partition
> -	 * failure */
> -	if (nr_nodes < 3)
> -		return 1;
> -
> -	for (i = 0; i < nr_nodes; i++) {
> -		addr_to_str(name, sizeof(name), nodes[i].addr, 0);
> -		fd = connect_to(name, nodes[i].port);
> -		if (fd < 0)
> -			continue;
> -
> -		close(fd);
> -		nr_reachable++;
> -		if (nr_reachable >= nr_majority) {
> -			dprintf("the majority of nodes are alive\n");
> -			return 1;
> -		}
> -	}
> -	dprintf("%d, %d, %d\n", nr_nodes, nr_majority, nr_reachable);
> -	eprintf("the majority of nodes are not alive\n");
> -	return 0;
> -}
> -
>  static void __sd_join(struct event_struct *cevent)
>  {
>  	struct work_join *w = container_of(cevent, struct work_join, cev);
> @@ -731,7 +698,8 @@ static void __sd_leave(struct event_struct *cevent)
>  {
>  	struct work_leave *w = container_of(cevent, struct work_leave, cev);
>
> -	if (!check_majority(w->member_list, w->member_list_entries)) {
> +	/* we need at least 3 nodes to handle network partition failure */
> +	if (w->member_list_entries < 3) {
>  		eprintf("perhaps a network partition has occurred?\n");
>  		abort();
>  	}
Some people might run sheep with fewer than 3 nodes, so I think this
check would break that setup: with fewer than 3 members, any LEAVE
event would hit abort().
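
To make the difference concrete, here is a minimal sketch of the two
decisions side by side. It is not code from the tree: majority_of(),
old_leave_ok() and new_leave_ok() are hypothetical names, and the
connect_to() probing of the removed function is replaced by a
caller-supplied nr_reachable so the sketch needs no network. A return
value of 1 means sheep keeps running, 0 means it would abort().

```c
/* Majority threshold, as computed in the removed check_majority(). */
static int majority_of(int nr_nodes)
{
	return nr_nodes / 2 + 1;
}

/*
 * Decision of the removed check_majority(). The real function counted
 * successful connect_to() calls; here the count is passed in as
 * nr_reachable (an assumption made for illustration).
 */
static int old_leave_ok(int nr_nodes, int nr_reachable)
{
	/* fewer than 3 nodes: a partition cannot be detected, keep running */
	if (nr_nodes < 3)
		return 1;

	return nr_reachable >= majority_of(nr_nodes);
}

/* Decision of the patched __sd_leave(): only the member count matters. */
static int new_leave_ok(int nr_nodes)
{
	/* fewer than 3 members: the patched code calls abort() */
	return nr_nodes >= 3;
}
```

For a 2-node cluster, old_leave_ok(2, 2) returns 1 while
new_leave_ok(2) returns 0, so the small-cluster case is inverted by the
patch.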
Thanks,
Yuan