[Sheepdog] [PATCH S003 v2] sheep: handle master crashing before sending JOIN request

Liu Yuan namei.unix at gmail.com
Fri May 11 04:32:07 CEST 2012


On 05/11/2012 06:40 AM, Shevek wrote:

> There are a number of races around join/leave.
> 
> This patch fixes a case described by Huxinwei and Liu Yuan where
> sheepdog fails to elect a master. A longer description is in the patch.
> 
> A problem arises if a node joins the cluster and generates a
> confchg event, then crashes or leaves without sending a join
> request and receiving a join response. The second node to join
> never becomes master, and the entire cluster hangs.
> 
> This patch allows a node to detect whether it should promote itself
> to master after an arbitrary confchg event. Every node except the
> master creates a blocked JOIN event for every node that joined
> after itself; the master is therefore the node that has a JOIN
> event for every node in the members list.
> 
> A following patch will handle the case where a join request
> is sent, but the master crashes before sending a join response.
> 
> Changes to this patch have been made as requested by Liu Yuan.
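
For readers following the thread, the promotion rule quoted above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the code from the patch: node_id,
should_promote_to_master() and the flat arrays are hypothetical names,
and skipping the node's own entry in the members list is an assumption
about how "every node in the members list" is meant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node identifier, used here only for illustration. */
typedef unsigned int node_id;

/*
 * Return true if 'self' holds a blocked JOIN event for every other node
 * in the members list, which is the condition the patch description
 * gives for promoting a node to master.
 */
bool should_promote_to_master(node_id self,
			      const node_id *members, size_t nr_members,
			      const node_id *join_events, size_t nr_events)
{
	assert(members != NULL || nr_members == 0);

	for (size_t i = 0; i < nr_members; i++) {
		bool found = false;

		/* Assumption: a node keeps no JOIN event for itself. */
		if (members[i] == self)
			continue;

		for (size_t j = 0; j < nr_events; j++) {
			if (join_events[j] == members[i]) {
				found = true;
				break;
			}
		}
		if (!found)
			return false; /* someone joined before us */
	}
	return true;
}
```

A node would run such a check after each confchg event. With an empty
members list (or only itself in it) the check trivially succeeds, which
matches the case of the second node taking over after the first one has
crashed.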


Hi Shevek,

  I ran script/checkpatch.pl against your patch and got 3
warnings. Please fix them.

Thanks,
Yuan


