[sheepdog] [PATCH v3 5/5] zookeeper: handle session timeout for all zookeeper operations
MORITA Kazutaka
morita.kazutaka at gmail.com
Mon Jun 17 18:22:13 CEST 2013
At Mon, 17 Jun 2013 05:28:46 -0700,
Kai Zhang wrote:
>
> The idea is: when a zk_* API returns ZINVALIDSTATE, it means the connection
> and session to zookeeper have been lost.
> At this point, callers of zk_* APIs should do nothing but drop control as
> soon as possible.
> Another thread will be responsible for cleaning up memory state, re-connecting
> to zookeeper and re-sending the join request.
>
> Signed-off-by: Kai Zhang <kyle at zelin.io>
> ---
> sheep/cluster/zookeeper.c | 245 ++++++++++++++++++++++++++++++---------------
> 1 file changed, 163 insertions(+), 82 deletions(-)
>
> diff --git a/sheep/cluster/zookeeper.c b/sheep/cluster/zookeeper.c
> index ca113dc..eec1e2e 100644
> --- a/sheep/cluster/zookeeper.c
> +++ b/sheep/cluster/zookeeper.c
> @@ -33,8 +33,7 @@
>
> /* iterate child znodes */
> #define FOR_EACH_ZNODE(parent, path, strs) \
> - for (zk_get_children(parent, strs), \
> - (strs)->data += (strs)->count; \
> + for ((strs)->data += (strs)->count; \
> (strs)->count-- ? \
> snprintf(path, sizeof(path), "%s/%s", parent, \
> *--(strs)->data) : (free((strs)->data), 0); \
> @@ -76,6 +75,7 @@ static LIST_HEAD(zk_block_list);
> static uatomic_bool is_master;
> static uatomic_bool stop;
> static bool first_push = true;
> +static uint64_t zk_flying_ops;
>
> static void zk_compete_master(void);
>
> @@ -140,28 +140,39 @@ static inline struct zk_node *zk_tree_search(const struct node_id *nid)
> static zhandle_t *zhandle;
> static struct zk_node this_node;
>
> +#define check_zk_rc(rc, path) \
> + if (rc != ZOK && rc != ZNONODE && rc != ZNODEEXISTS && \
> + rc != ZINVALIDSTATE) \
> + panic("failed, path:%s, %s", path, zerror(rc));
> +
In my environment, zoo_exists() can return ZSESSIONEXPIRED, which I
think should also pass this check.
> @@ -191,11 +204,14 @@ zk_create_seq_node(const char *path, const char *value, int valuelen,
> char *path_buffer, int path_buffer_len)
> {
> int rc;
> + uatomic_inc(&zk_flying_ops);
> rc = zoo_create(zhandle, path, value, valuelen, &ZOO_OPEN_ACL_UNSAFE,
> ZOO_SEQUENCE, path_buffer, path_buffer_len);
> + uatomic_dec(&zk_flying_ops);
> + check_zk_rc(rc, path);
> +
This causes a panic when rc is ZOPERATIONTIMEOUT or ZCONNECTIONLOSS.
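
A minimal sketch of a check that lets the codes mentioned in this review
through, assuming ZSESSIONEXPIRED, ZOPERATIONTIMEOUT and ZCONNECTIONLOSS are
meant to be handed back to the caller rather than treated as fatal. The helper
name zk_rc_is_tolerated() is hypothetical and not part of the patch, and the
enum is only a stand-in that mirrors the values in the ZooKeeper C client
header so the snippet compiles on its own:

```c
#include <stdbool.h>

/*
 * Stand-in for the error codes normally provided by the ZooKeeper C
 * client header (zookeeper.h). Real code would include that header
 * instead of redefining them; ZNOAUTH is listed only as an example of
 * a code that stays fatal.
 */
enum zk_rc_sketch {
	ZOK               = 0,
	ZCONNECTIONLOSS   = -4,
	ZOPERATIONTIMEOUT = -7,
	ZINVALIDSTATE     = -9,
	ZNONODE           = -101,
	ZNOAUTH           = -102,
	ZNODEEXISTS       = -110,
	ZSESSIONEXPIRED   = -112,
};

/*
 * Hypothetical helper: returns true when rc should be passed back to
 * the caller, false when it is unexpected enough to warrant panic().
 */
static bool zk_rc_is_tolerated(int rc)
{
	switch (rc) {
	case ZOK:
	case ZNONODE:
	case ZNODEEXISTS:
	case ZINVALIDSTATE:
	case ZSESSIONEXPIRED:   /* seen from zoo_exists() */
	case ZOPERATIONTIMEOUT: /* seen from zoo_create() */
	case ZCONNECTIONLOSS:   /* seen from zoo_create() */
		return true;
	default:
		return false;
	}
}
```

With something like this, check_zk_rc() could reduce to a single
"if (!zk_rc_is_tolerated(rc)) panic(...)" wrapped in do { } while (0), which
also avoids the dangling-else hazard of a bare if inside a macro. Whether
timeouts and connection loss should be retried instead of returned is a
separate decision that this sketch does not settle.
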
Thanks,
Kazutaka