On Mon, Apr 30, 2012 at 5:23 PM, MORITA Kazutaka <morita.kazutaka at gmail.com> wrote:
> At Sun, 29 Apr 2012 02:05:33 +0800,
> Liu Yuan wrote:
>>
>> No, your patch is completely wrong workaround, not a fix. You just set
>> every node as 'first node' and all nodes become the master and send the
>> response to each other.
>
> I think you misunderstand Shevek's patch. IIUC, there is no situation
> that multiple nodes become the master at the same time with his patch.
>
> Shevek's patch is relatively straightforward. If you think it is
> hacky, what should be blamed is 'first_node' in the corosync_event
> structure, which is introduced by me.
>
> I like Christoph's version:
> http://lists.wpkg.org/pipermail/sheepdog/2012-April/003186.html
> Yuan, what do you think of it?
>

Okay, I think Christoph's patch is easier to understand. I am fine with
it, but it needs the following changes:

diff --git a/sheep/cluster/corosync.c b/sheep/cluster/corosync.c
index b22ed87..2d73038 100644
--- a/sheep/cluster/corosync.c
+++ b/sheep/cluster/corosync.c
@@ -37,6 +37,7 @@ static LIST_HEAD(corosync_block_list);
 static struct cpg_node cpg_nodes[SD_MAX_NODES];
 static size_t nr_cpg_nodes;
 static int self_elect;
+static int join_finished;
 
 /* event types which are dispatched in corosync_dispatch() */
 enum corosync_event_type {
@@ -378,7 +379,6 @@ static int __corosync_dispatch_one(struct corosync_event *cevent)
 static void __corosync_dispatch(void)
 {
 	struct corosync_event *cevent;
-	static int join_finished;
 	int done;
 
 	while (!list_empty(&corosync_event_list)) {
@@ -620,27 +620,29 @@ static void cdrv_cpg_confchg(cpg_handle_t handle,
 		list_add_tail(&cevent->list, &corosync_event_list);
 	}
 
-	/*
-	 * Exactly one non-master member has seen join events for all other
-	 * members, because events are ordered.
-	 */
-	for (i = 0; i < member_list_entries; i++) {
-		cevent = find_block_event(COROSYNC_EVENT_TYPE_JOIN,
-					  &member_sheep[i]);
-		if (!cevent) {
-			dprintf("Not promoting because member is not in our "
-				"event list.\n");
-			promote = 0;
+	if (!join_finished) {
+		/*
+		 * Exactly one non-master member has seen join events for all other
+		 * members, because events are ordered.
+		 */
+		for (i = 0; i < member_list_entries; i++) {
+			cevent = find_block_event(COROSYNC_EVENT_TYPE_JOIN,
+						  &member_sheep[i]);
+			if (!cevent) {
+				dprintf("Not promoting because member is not in our "
+					"event list.\n");
+				promote = 0;
+				break;
+			}
 		}
-	}
-
-	/*
-	 * If we see the join events for all nodes promote ourself to master
-	 * right here.
-	 */
-	if (promote)
-		self_elect = 1;
 
+		/*
+		 * If we see the join events for all nodes promote ourself to master
+		 * right here.
+		 */
+		if (promote)
+			self_elect = 1;
+	}
 	__corosync_dispatch();
 }

These changes are needed because only nodes that have not yet finished
joining (join_finished is not set) may legitimately self-elect. Without
them, we get confusing log output like the following:

Apr 30 18:14:47 cdrv_cpg_confchg(565) mem:5, joined:0, left:1
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 cdrv_cpg_confchg(632) Not promoting because member is not in our event list.
Apr 30 18:14:47 sd_check_join_cb(774) 0, 2   <--- this actually means it is already the master.

Thanks,
Yuan