[sheepdog] [PATCH 1/2] zookeeper: handling lost block/notify events during session timeout
Kai Zhang
kyle at zelin.io
Tue Jul 2 08:31:56 CEST 2013
If the zookeeper session has timed out, zk_block() and zk_notify() do
nothing, and the block/notify event is lost.
However, these events have already been added to pending_block_list and
pending_notify_list. If a new cluster operation then arrives, this leads
to undefined behavior.
This patch fixes the problem by re-issuing the unhandled block/notify
events once the connection to zookeeper has been re-established.
Signed-off-by: Kai Zhang <kyle at zelin.io>
---
sheep/group.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/sheep/group.c b/sheep/group.c
index 2d4a25c..de315c3 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -1101,13 +1101,32 @@ static int send_join_request(struct sd_node *ent)
 int sd_reconnect_handler(void)
 {
+	struct request *req;
+
 	sys->status = SD_STATUS_WAIT_FOR_JOIN;
 	sys->join_finished = false;
+
 	if (sys->cdrv->init(sys->cdrv_option) != 0)
 		return -1;
 	if (send_join_request(&sys->this_node) != 0)
 		return -1;
+	list_for_each_entry(req, main_thread_get(pending_notify_list),
+			    pending_list) {
+		struct vdi_op_message *msg;
+		size_t size;
+		msg = prepare_cluster_msg(req, &size);
+		msg->rsp.result = SD_RES_SUCCESS;
+		sys->cdrv->notify(msg, size);
+		free(msg);
+	}
+
+	list_for_each_entry(req, main_thread_get(pending_block_list),
+			    pending_list) {
+		sys->cdrv->block();
+	}
+	cluster_op_running = false;
+
 	return 0;
 }
--
1.7.9.5
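
For readers who want to see the replay idea in isolation, here is a
minimal standalone sketch. It is not sheepdog code: pending_queue,
fake_driver, fake_notify(), queue_request() and fake_reconnect() are
hypothetical stand-ins invented for illustration, and the model assumes
that a dead session simply drops the event, as the commit message
describes for zk_notify(). Like the loops in the patch, the reconnect
step re-sends every entry that is still pending.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in sheepdog. */
enum { MAX_PENDING = 8 };

struct pending_queue {
	int ids[MAX_PENDING];	/* requests still waiting for their event */
	size_t count;
};

struct fake_driver {
	bool session_alive;	/* false models an expired session */
	int delivered[2 * MAX_PENDING];
	size_t ndelivered;
};

/* Drops the event when the session is gone (assumed behaviour). */
static int fake_notify(struct fake_driver *drv, int id)
{
	if (!drv->session_alive)
		return -1;	/* event is lost */
	if (drv->ndelivered == 2 * MAX_PENDING)
		return -1;	/* no room left in the log */
	drv->delivered[drv->ndelivered++] = id;
	return 0;
}

/* The request is queued first, so it stays pending even if the send fails. */
static bool queue_request(struct pending_queue *q, struct fake_driver *drv,
			  int id)
{
	if (q->count == MAX_PENDING)
		return false;
	q->ids[q->count++] = id;
	(void)fake_notify(drv, id);	/* may be dropped silently */
	return true;
}

/* After reconnecting, re-send every request that is still pending. */
static size_t fake_reconnect(struct pending_queue *q, struct fake_driver *drv)
{
	size_t replayed = 0;

	drv->session_alive = true;
	for (size_t i = 0; i < q->count; i++)
		if (fake_notify(drv, q->ids[i]) == 0)
			replayed++;
	return replayed;
}
```

Queuing two requests while session_alive is false leaves both in the
queue with nothing delivered; calling fake_reconnect() afterwards
delivers them in their original order.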