[Sheepdog] PATCH S003: Handle master crashing before sending JOIN request

Shevek shevek at anarres.org
Sat Apr 28 11:49:13 CEST 2012


On Sat, 2012-04-28 at 17:34 +0800, Liu Yuan wrote:
> On 04/28/2012 05:22 PM, Shevek wrote:
> 
> >> > -               if (join_finished)
> >> > +               if (join_finished || is_master(&cevent->sender))
> > I believe this patch does NOT work:
> > 
> > 1) is_master returns -1 for not-master, 0 means the given node is
> > master, so this is apparently a misuse of is_master.
> > 
> > However, even worse:
> > 
> > 2) If we never got a join response, then line 388:
> >     388 if (!cevent->blocked && cpg_node_equal(&cevent->sender,
> > &this_node)) {
> > never triggered, therefore we have never set join_finished, and
> > nr_cpg_nodes is always 0 (look for every place where it got set), so
> > is_master() is always 0.
> > 
> > So, by inspection, this patch can never work, for both reasons.
> > 
> 
> 
> Oh, yes. Thanks for pointing it out.
> 
> But what if set set
> 
>                 if (join_finished || (is_master(&cevent->sender >= 0)))
>                         done = __corosync_dispatch_one(cevent);
>                 else
>                         done = !cevent->blocked;
> 
> Then, the master will be called for sure, and would give other nodes a
> join response.

As I said before under (2), is_master() will always return 0, since the
only assignments to nr_cpg_nodes are within __corosync_dispatch_one()
(which has never yet been called) or when first_node == 1 (which never
happened).

So this new patch is still invalid.

S.




More information about the sheepdog mailing list