[sheepdog] leave event is not dispatched in the corosync driver
tuji
tuji at atworks.co.jp
Fri Sep 12 06:00:58 CEST 2014
Hi
I found a problem: a node is not removed from the cluster when it is stopped
while recovery is running.
I reported it on Launchpad (https://bugs.launchpad.net/sheepdog-project/+bug/1368503).
To work around this problem, I made the following patch for corosync.c:
[root at node001 BUILD]# diff -u sheepdog-0.7.6-org/sheep/cluster/corosync.c sheepdog-0.7.6/sheep/cluster/corosync.c
--- sheepdog-0.7.6-org/sheep/cluster/corosync.c 2013-12-22 18:07:34.000000000 +0900
+++ sheepdog-0.7.6/sheep/cluster/corosync.c 2014-09-12 09:47:37.840975169 +0900
@@ -368,8 +368,9 @@
* number of alive nodes correctly, we postpone
* processsing events if there are incoming ones.
*/
- sd_debug("wait for a next dispatch event");
- return;
+ sd_debug("wait for a next dispatch event.not return");
+ //sd_debug("wait for a next dispatch event");
+ //return;
}
nr_majority = 0;
This patch solved the problem for me.
I know it is not a proper fix, because it disables the behavior described in this comment:
/*
* Corosync dispatches leave events one by one even
* when network partition has occured. To count the
* number of alive nodes correctly, we postpone
* processsing events if there are incoming ones.
*/
I don't understand this comment.
Could anyone give me advice about it?
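
In case it helps the discussion, here is my rough guess at what the comment intends, written as a small standalone C sketch. It is not the sheepdog code: struct leave_batch, has_majority(), on_leave_event() and the incoming_pending argument are all hypothetical names I made up, and I am only assuming that the driver can tell whether more events are still waiting to be dispatched. Please correct me if this reading is wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one burst of leave events (not sheepdog code). */
struct leave_batch {
	size_t nr_members;	/* cluster size before any node left */
	size_t nr_left;		/* leave events received so far */
};

/* True if the nodes that are still alive form a strict majority. */
static bool has_majority(const struct leave_batch *b)
{
	size_t alive = b->nr_members - b->nr_left;

	return alive * 2 > b->nr_members;
}

/*
 * Handle one leave event.
 * incoming_pending is an assumed count of events that have arrived
 * but have not been dispatched yet.
 * Returns -1 if processing is postponed, 1 if the majority is kept,
 * 0 if the majority is lost.
 */
static int on_leave_event(struct leave_batch *b, size_t incoming_pending)
{
	assert(b->nr_left < b->nr_members);
	b->nr_left++;

	if (incoming_pending > 0)
		return -1;	/* wait for the next dispatch event */

	return has_majority(b) ? 1 : 0;
}
```

For example, if 3 nodes of a 5-node cluster disappear at once and the leave events arrive one by one, checking the majority after the first event would see 4 of 5 nodes alive, while checking after the last one sees only 2 of 5. Postponing until nothing else is pending gives the second, correct answer. If this guess is right, the postponed events would only be processed once no further event is waiting.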
--------------------------
Masahiro Tsuji
A.T.WORKS, INC
URL http://www.atworks.co.jp