[Sheepdog] PATCH S003: Handle master crashing before sending JOIN request

Liu Yuan namei.unix at gmail.com
Mon Apr 30 12:47:54 CEST 2012


On Mon, Apr 30, 2012 at 5:23 PM, MORITA Kazutaka
<morita.kazutaka at gmail.com> wrote:
> At Sun, 29 Apr 2012 02:05:33 +0800,
> Liu Yuan wrote:
>>
>> No, your patch is completely wrong workaround, not a fix. You just set
>> every node as 'first node' and all nodes become the master and send the
>> response to each other.
>
> I think you misunderstand Shevek's patch.  IIUC, there is no situation
> that multiple nodes become the master at the same time with his patch.
>
> Shevek's patch is relatively straightforward.  If you think it is
> hacky, what should be blamed is 'first_node' in the corosync_event
> structure, which is introduced by me.
>
> I like Christoph's version:
>  http://lists.wpkg.org/pipermail/sheepdog/2012-April/003186.html
> Yuan, what do you think of it?
>

Okay, I think Christoph's patch is easier to understand. I am fine
with it, but it needs the following changes:

diff --git a/sheep/cluster/corosync.c b/sheep/cluster/corosync.c
index b22ed87..2d73038 100644
--- a/sheep/cluster/corosync.c
+++ b/sheep/cluster/corosync.c
@@ -37,6 +37,7 @@ static LIST_HEAD(corosync_block_list);
 static struct cpg_node cpg_nodes[SD_MAX_NODES];
 static size_t nr_cpg_nodes;
 static int self_elect;
+static int join_finished;

 /* event types which are dispatched in corosync_dispatch() */
 enum corosync_event_type {
@@ -378,7 +379,6 @@ static int __corosync_dispatch_one(struct corosync_event *cevent)
 static void __corosync_dispatch(void)
 {
 	struct corosync_event *cevent;
-	static int join_finished;
 	int done;

 	while (!list_empty(&corosync_event_list)) {
@@ -620,27 +620,29 @@ static void cdrv_cpg_confchg(cpg_handle_t handle,
 		list_add_tail(&cevent->list, &corosync_event_list);
 	}

-	/*
-	 * Exactly one non-master member has seen join events for all other
-	 * members, because events are ordered.
-	 */
-	for (i = 0; i < member_list_entries; i++) {
-		cevent = find_block_event(COROSYNC_EVENT_TYPE_JOIN,
-					  &member_sheep[i]);
-		if (!cevent) {
-			dprintf("Not promoting because member is not in our "
-				"event list.\n");
-			promote = 0;
+	if (!join_finished) {
+		/*
+		 * Exactly one non-master member has seen join events for all other
+		 * members, because events are ordered.
+		 */
+		for (i = 0; i < member_list_entries; i++) {
+			cevent = find_block_event(COROSYNC_EVENT_TYPE_JOIN,
+					&member_sheep[i]);
+			if (!cevent) {
+				dprintf("Not promoting because member is not in our "
+						"event list.\n");
+				promote = 0;
+				break;
+			}
 		}
-	}
-
-	/*
-	 * If we see the join events for all nodes promote ourself to master
-	 * right here.
-	 */
-	if (promote)
-		self_elect = 1;

+		/*
+		 * If we see the join events for all nodes promote ourself to master
+		 * right here.
+		 */
+		if (promote)
+			self_elect = 1;
+	}
 	__corosync_dispatch();
 }

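These changes are needed because only nodes that have not yet finished
joining (join_finished is still 0) may legitimately self-elect.
Without the check, a node that is already the master still runs the
promotion loop. The added break also keeps the "Not promoting" message
from being printed once for every member that is missing from the
event list. Otherwise we get confusing log output like this: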

Apr 30 18:14:47 cdrv_cpg_confchg(565) mem:5, joined:0, left:1
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 sd_check_join_cb(774) 0, 2 <--- this actually means it is already the master.
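
As a rough standalone illustration of the guard (not the actual
sheepdog code): the sketch below isolates the decision made in
cdrv_cpg_confchg(). The function name should_self_elect() and the
seen_join array are hypothetical; seen_join[i] stands in for
"find_block_event(COROSYNC_EVENT_TYPE_JOIN, &member_sheep[i]) found an
event".

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * join_finished: nonzero once this node has completed its own join.
 * seen_join[i]:  nonzero if a blocked JOIN event exists for member i.
 *
 * Returns 1 if this node may promote itself to master, 0 otherwise.
 */
static int should_self_elect(int join_finished, const int *seen_join,
			     size_t nr_members)
{
	size_t i;

	assert(seen_join != NULL || nr_members == 0);

	/* A node that has already joined must never self-elect. */
	if (join_finished)
		return 0;

	for (i = 0; i < nr_members; i++) {
		/* One missing JOIN event is enough; stop looking. */
		if (!seen_join[i])
			return 0;
	}

	return 1;
}
```

With this shape, a node that is already in the cluster skips the
member scan entirely, so it can neither set self_elect nor log "Not
promoting" on every membership change.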

Thanks,
Yuan


