[sheepdog] [PATCH V2] sheep: remove check_majority()

Christoph Hellwig hch at infradead.org
Wed May 16 14:27:45 CEST 2012


On Wed, May 16, 2012 at 08:54:01PM +0900, MORITA Kazutaka wrote:
> I also think it's the right way to go to check network partition in
> cluster drivers, but the corosync driver doesn't support it yet.  Is
> it possible to implement a network partition handling in the corosync
> driver before removing the code from __sd_leave()?  There are already
> some users who use Sheepdog with corosync.

I don't even think the current code works in practice, as it probes
only the nodes in w->member_list, which doesn't include the nodes that
left with the current confchg event.

The untested patch below implements what I think the intention of the
check was, can you confirm that?


Index: sheepdog/sheep/cluster/corosync.c
===================================================================
--- sheepdog.orig/sheep/cluster/corosync.c	2012-05-16 13:46:25.207717214 +0200
+++ sheepdog/sheep/cluster/corosync.c	2012-05-16 14:18:11.747699030 +0200
@@ -541,11 +541,22 @@ static void cdrv_cpg_confchg(cpg_handle_
 	int i;
 	struct cpg_node joined_sheep[SD_MAX_NODES];
 	struct cpg_node left_sheep[SD_MAX_NODES];
+	int nr_total = member_list_entries + left_list_entries;
 
 	dprintf("mem:%zu, joined:%zu, left:%zu\n",
 		member_list_entries, joined_list_entries,
 		left_list_entries);
 
+	/*
+	 * Abort as quickly as we can to save ourselves from running into
+	 * a split brain scenario in case of cluster partition.  We can
+	 * only reasonably handle this with at least three nodes.
+	 */
+	if (nr_total >= 3 && member_list_entries < nr_total / 2 + 1) {
+		eprintf("the majority of nodes are not alive\n");
+		abort();
+	}
+
 	/* convert cpg_address to cpg_node */
 	for (i = 0; i < left_list_entries; i++) {
 		left_sheep[i].nodeid = left_list[i].nodeid;
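
For illustration, the majority test from the patch above can be pulled out
and exercised on its own.  This is only a sketch: lost_majority() is a
hypothetical helper name, not a function in sheepdog or corosync, and it
assumes the pre-change cluster size is the surviving members plus the nodes
reported as left in the same confchg event.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the condition in the patch.
 * nr_members: nodes still in the group after the confchg event.
 * nr_left:    nodes reported as having left in that same event.
 * Returns true when this side no longer holds a strict majority of
 * the previous membership and the previous membership had at least
 * three nodes.
 */
static bool lost_majority(size_t nr_members, size_t nr_left)
{
	size_t nr_total = nr_members + nr_left;

	/* With fewer than three nodes a majority check is not meaningful. */
	if (nr_total < 3)
		return false;

	/* A strict majority needs at least nr_total / 2 + 1 survivors. */
	return nr_members < nr_total / 2 + 1;
}
```

With three nodes, losing one keeps the remaining two running, while a
lone survivor aborts.  Note that an even split (for example two survivors
and two departed nodes) fails the test on both sides, so under this rule
both halves would abort.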


