[sheepdog] [PATCH 3/3] sheep: remove check_majority()

Yunkai Zhang yunkai.me at gmail.com
Wed May 16 11:37:36 CEST 2012


On Wed, May 16, 2012 at 5:08 PM, Liu Yuan <namei.unix at gmail.com> wrote:
> On 05/16/2012 05:04 PM, Yunkai Zhang wrote:
>
>> From: Yunkai Zhang <qiushu.zyk at taobao.com>
>>
>> When sheep receives a LEAVE event, check_majority() is executed in
>> __sd_leave(). This makes the network very busy, because every sheep
>> tries to connect to every other sheep.
>>
>> I don't think this check is necessary; that is the driver's job. The
>> driver tells us which sheep are alive and which have left. So let's
>> remove this check.
>>
>> Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com>
>> ---
>>  sheep/group.c |   36 ++----------------------------------
>>  1 files changed, 2 insertions(+), 34 deletions(-)
>>
>> diff --git a/sheep/group.c b/sheep/group.c
>> index e37e049..69c6b71 100644
>> --- a/sheep/group.c
>> +++ b/sheep/group.c
>> @@ -668,39 +668,6 @@ void sd_notify_handler(struct sd_node *sender, void *msg, size_t msg_len)
>>       process_request_event_queues();
>>  }
>>
>> -/*
>> - * Check whether the majority of Sheepdog nodes are still alive or not
>> - */
>> -static int check_majority(struct sd_node *nodes, int nr_nodes)
>> -{
>> -     int nr_majority, nr_reachable = 0, fd, i;
>> -     char name[INET6_ADDRSTRLEN];
>> -
>> -     nr_majority = nr_nodes / 2 + 1;
>> -
>> -     /* we need at least 3 nodes to handle network partition
>> -      * failure */
>> -     if (nr_nodes < 3)
>> -             return 1;
>> -
>> -     for (i = 0; i < nr_nodes; i++) {
>> -             addr_to_str(name, sizeof(name), nodes[i].addr, 0);
>> -             fd = connect_to(name, nodes[i].port);
>> -             if (fd < 0)
>> -                     continue;
>> -
>> -             close(fd);
>> -             nr_reachable++;
>> -             if (nr_reachable >= nr_majority) {
>> -                     dprintf("the majority of nodes are alive\n");
>> -                     return 1;
>> -             }
>> -     }
>> -     dprintf("%d, %d, %d\n", nr_nodes, nr_majority, nr_reachable);
>> -     eprintf("the majority of nodes are not alive\n");
>> -     return 0;
>> -}
>> -
>>  static void __sd_join(struct event_struct *cevent)
>>  {
>>       struct work_join *w = container_of(cevent, struct work_join, cev);
>> @@ -731,7 +698,8 @@ static void __sd_leave(struct event_struct *cevent)
>>  {
>>       struct work_leave *w = container_of(cevent, struct work_leave, cev);
>>
>> -     if (!check_majority(w->member_list, w->member_list_entries)) {
>> +     /* we need at least 3 nodes to handle network partition failure */
>> +     if (w->member_list_entries < 3) {
>>               eprintf("perhaps a network partition has occurred?\n");
>>               abort();
>>       }
>
>
> Some people might run sheep with fewer than 3 nodes, so I think this
> check would break it.

Oh, I was misled by the comment in check_majority().

Maybe dropping the check entirely would be best for us. I'll send a V2.
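
For reference, here is a minimal sketch of the difference Yuan points out.
The helper names old_check_passes() and v1_check_aborts() are hypothetical,
made up only for illustration. They mirror the arithmetic of the removed
check_majority() and the "< 3" test from this patch, with the connect_to()
probing replaced by a caller-supplied nr_reachable count (an assumption so
that the snippet needs no network).

```c
#include <stdbool.h>

/*
 * Hypothetical helper: same arithmetic as the removed check_majority(),
 * but the number of reachable nodes is passed in instead of being probed.
 */
static bool old_check_passes(int nr_nodes, int nr_reachable)
{
	int nr_majority = nr_nodes / 2 + 1;

	/* with fewer than 3 nodes the old code skipped the check and passed */
	if (nr_nodes < 3)
		return true;

	return nr_reachable >= nr_majority;
}

/*
 * Hypothetical helper: the condition this patch (v1) places in
 * __sd_leave() in front of abort().
 */
static bool v1_check_aborts(int nr_member_list_entries)
{
	return nr_member_list_entries < 3;
}
```

With a 2-entry member list, old_check_passes(2, 0) is true, so the old code
never aborted in small clusters, while v1_check_aborts(2) is also true, so
the v1 patch would abort on every LEAVE there.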

>
> Thanks,
> Yuan



-- 
Yunkai Zhang
Work at Taobao


