[Sheepdog] [PATCH 1/2] sheep: handle node change event first
MORITA Kazutaka
morita.kazutaka at gmail.com
Sat Mar 31 11:42:24 CEST 2012
At Sat, 31 Mar 2012 16:55:40 +0800,
Liu Yuan wrote:
>
> From: Liu Yuan <tailai.ly at taobao.com>
>
> In a multiple-node-failure scenario, a later node event is supposed to
> replace the previous one; this way we reduce the overhead of wasteful
> repeated recovery processes.
>
> But currently we insert node events at the tail of the cpg event queue, so if
> there is in-flight IO in the queue, the node event is blocked until all that
> IO completes.
>
> The symptom can be depicted by the following case: drop 15 nodes in a 360-node
> cluster while some guests are doing 'dd' to produce in-flight IOs:
>
> without the patch:
> ...
> Mar 28 17:10:24 sd_leave_handler(1350) leave ip: 10.232.97.101, port: 7049
> Mar 28 17:16:22 sd_leave_handler(1350) leave ip: 10.232.97.101, port: 7029
> Mar 28 17:22:19 sd_leave_handler(1350) leave ip: 10.232.97.101, port: 7047
> Mar 28 17:28:16 sd_leave_handler(1350) leave ip: 10.232.97.101, port: 7025
> Mar 28 17:33:52 sd_leave_handler(1350) leave ip: 10.232.97.101, port: 7001
> ...
>
> we can see that the leave handlers are invoked one by one, at roughly
> 5-minute intervals
>
> with the patch:
>
> Mar 30 17:11:00 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7003
> Mar 30 17:11:12 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7013
> Mar 30 17:11:45 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7011
> Mar 30 17:11:54 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7007
> Mar 30 17:11:54 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7004
> Mar 30 17:11:54 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7009
> Mar 30 17:11:54 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7001
> Mar 30 17:11:57 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7012
> Mar 30 17:11:57 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7010
> Mar 30 17:11:58 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7002
> Mar 30 17:11:58 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7005
> Mar 30 17:11:58 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7006
> Mar 30 17:11:58 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7014
> Mar 30 17:11:58 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7000
> Mar 30 17:11:59 sd_leave_handler(1354) leave ip: 10.232.97.105, port: 7008
>
> we see the leave handler is called 15 times, and from the first to the
> fifteenth it takes less than one minute.
>
> Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
> ---
> sheep/group.c | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/sheep/group.c b/sheep/group.c
> index 1e840c8..a4ddc73 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -670,7 +670,7 @@ static void sd_notify_handler(struct sd_node *sender,
> list_del(&w->req->pending_list);
> }
>
> - list_add_tail(&cevent->cpg_event_list, &sys->cpg_event_siblings);
> + list_add(&cevent->cpg_event_list, &sys->cpg_event_siblings);
Sheepdog assumes that all nodes receive the events in the same order.
But this change breaks that assumption, because sheep uses this list as
a FIFO queue.
Thanks,
Kazutaka
>
> start_cpg_event_work();
>
> @@ -1252,7 +1252,7 @@ static void sd_join_handler(struct sd_node *joined,
> panic("failed to allocate memory\n");
> memcpy(w->jm, opaque, size);
>
> - list_add_tail(&cevent->cpg_event_list, &sys->cpg_event_siblings);
> + list_add(&cevent->cpg_event_list, &sys->cpg_event_siblings);
> start_cpg_event_work();
>
> unregister_event(cdrv_fd);
> @@ -1373,7 +1373,7 @@ static void sd_leave_handler(struct sd_node *left,
>
> w->left = *left;
>
> - list_add_tail(&cevent->cpg_event_list, &sys->cpg_event_siblings);
> + list_add(&cevent->cpg_event_list, &sys->cpg_event_siblings);
> start_cpg_event_work();
>
> unregister_event(cdrv_fd);
> --
> 1.7.8.2
>
> --
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog