[sheepdog] [PATCH v5 2/2] sheep/cluster: re-create znode of 'lock-id' if it isn't exists

Liu Yuan namei.unix at gmail.com
Mon Jan 20 06:22:34 CET 2014


On Sun, Jan 19, 2014 at 10:39:09PM +0800, Robin Dong wrote:
> From: Robin Dong <sanbai at taobao.com>
> 
> When a sheep daemon call zk_unlock(), it will remove the whole znode of 'lock-id' and
> make zoo_create() fail in zk_lock() which called by another sheep daemon.
> To avoid this, we should try to create a znode of 'lock-id' if zoo_create() fail.
> 
> Signed-off-by: Robin Dong <sanbai at taobao.com>
> ---
>  sheep/cluster/zookeeper.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/sheep/cluster/zookeeper.c b/sheep/cluster/zookeeper.c
> index fd6dcaf..d5cb4ce 100644
> --- a/sheep/cluster/zookeeper.c
> +++ b/sheep/cluster/zookeeper.c
> @@ -1240,6 +1240,7 @@ static void zk_lock(uint64_t lock_id)
>  	int rc, len = MAX_NODE_STR_LEN;
>  	char *my_path;
>  	char parent[MAX_NODE_STR_LEN];
> +	char parent_node[MAX_NODE_STR_LEN];
>  	char lowest_seq_path[MAX_NODE_STR_LEN];
>  	char owner_name[MAX_NODE_STR_LEN];
>  	struct cluster_lock *cluster_lock;
> @@ -1248,24 +1249,38 @@ static void zk_lock(uint64_t lock_id)
>  
>  	my_path = cluster_lock->lock_path;
>  
> -	/* compete owner of lock is just like zk_compete_master() */
>  	snprintf(parent, MAX_NODE_STR_LEN, LOCK_ZNODE "/%"PRIu64"/",
>  		 cluster_lock->id);
> +	/*
> +	 * It need using path without end of '/' to create node of lock_id in
> +	 * zookeeper's API, so we use 'parent_node'.
> +	 */
> +	snprintf(parent_node, MAX_NODE_STR_LEN, LOCK_ZNODE "/%"PRIu64,
> +		 cluster_lock->id);
> +	/* compete owner of lock is just like zk_compete_master() */

Applied, but I think this fix is not ideal. There is still visible problems in 
current design:

- lock starvation. Every ->unlock() will wipe out the lock-id/, so all the other
lock contenders will race for the lock again in an arbitrary order. So there might
be possibility that one lock contender will never have the chance to grab the 
lock. We already have a ticket like sequence after every lock contender create
a EPHEMERAL node in lock_id/. So we might better build a ticket lock on this
sequence to avoid lock starvation (though we really have a very low possiblity
of lock starvation)

Also we should refine the function to avoid use parent and parent_node to
represent the same thing.

Thanks
Yuan



More information about the sheepdog mailing list