[sheepdog] leave event is not dispatched in the corosync driver
Hitoshi Mitake
mitake.hitoshi at lab.ntt.co.jp
Fri Sep 12 09:57:00 CEST 2014
At Fri, 12 Sep 2014 13:00:58 +0900,
tuji wrote:
>
> Hi
>
> I found a problem: a node does not leave the cluster when one of the nodes is
> stopped while recovery is running.
> I reported it on Launchpad (https://bugs.launchpad.net/sheepdog-project/+bug/1368503).
>
> To solve this problem, I made a patch for corosync.c:
>
> [root at node001 BUILD]# diff -u sheepdog-0.7.6-org/sheep/cluster/corosync.c sheepdog-0.7.6/sheep/cluster/corosync.c
> --- sheepdog-0.7.6-org/sheep/cluster/corosync.c 2013-12-22 18:07:34.000000000 +0900
> +++ sheepdog-0.7.6/sheep/cluster/corosync.c 2014-09-12 09:47:37.840975169 +0900
> @@ -368,8 +368,9 @@
> * number of alive nodes correctly, we postpone
> * processsing events if there are incoming ones.
> */
> - sd_debug("wait for a next dispatch event");
> - return;
> + sd_debug("wait for a next dispatch event.not return");
> + //sd_debug("wait for a next dispatch event");
> + //return;
> }
>
> nr_majority = 0;
>
> This patch solved the problem.
> I know the patch is insufficient, because it disables the behavior described in the comment:
>
> /*
> * Corosync dispatches leave events one by one even
> * when network partition has occured. To count the
> * number of alive nodes correctly, we postpone
> * processsing events if there are incoming ones.
> */
>
> I don't understand this comment.
> Could anyone give me advice about it?
Thanks a lot for your analysis! The delay in delivering the node leave
message seems to be caused by a past commit
(15df161958a38cf3f7bc83b5bc2c8a1817b3072e). The intention of the
commit was to handle network partitions well, but it appears to be the
root cause of this problem.
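For illustration, the postponement described in the quoted comment can be sketched roughly as below. This is only a sketch under assumptions, not the actual code in sheep/cluster/corosync.c: the names `struct event`, `count_alive`, `has_majority`, and `decide`, the `more_incoming` flag, and the "strictly more than half" majority rule are all hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds; not the actual sheepdog definitions. */
enum ev_type { EV_JOIN, EV_LEAVE };

struct event {
	enum ev_type type;
	int node_id;
};

/* Possible outcomes of one dispatch attempt (hypothetical). */
enum decision { POSTPONE, PROCESS, ABORT_MINORITY };

/* Count the members that remain once every queued leave is applied. */
static size_t count_alive(size_t nr_members, const struct event *queue,
			  size_t nr_queued)
{
	size_t alive = nr_members;

	for (size_t i = 0; i < nr_queued; i++)
		if (queue[i].type == EV_LEAVE && alive > 0)
			alive--;
	return alive;
}

/* Assumed rule: strictly more than half of the previous members remain. */
static bool has_majority(size_t alive, size_t nr_members)
{
	return alive * 2 > nr_members;
}

/*
 * Decide what to do with the queued events.
 *
 * more_incoming is an assumed flag meaning "the cluster layer still has
 * undelivered events". While it is set, nothing is processed, so that
 * leave events delivered one by one are counted together.
 */
static enum decision decide(size_t nr_members, const struct event *queue,
			    size_t nr_queued, bool more_incoming)
{
	assert(nr_members > 0);

	if (more_incoming)
		return POSTPONE;
	if (!has_majority(count_alive(nr_members, queue, nr_queued),
			  nr_members))
		return ABORT_MINORITY;
	return PROCESS;
}
```

With three members and two queued leave events, `decide()` returns `POSTPONE` as long as `more_incoming` is true and `ABORT_MINORITY` once it is false. With a single queued leave it returns `PROCESS`. In this sketch, a queued leave event is never processed while `more_incoming` stays true, which is one plausible way for a leave to be delayed.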
I created a branch that reverts the above commit. Could
you test it?
https://github.com/sheepdog/sheepdog/tree/corosync-leave
# I cannot test it because I don't have a corosync cluster now,
# sorry...
Thanks,
Hitoshi
>
>
>
> --------------------------
> Masahiro Tsuji
>
> A.T.WORKS, INC
> URL http://www.atworks.co.jp
>