[sheepdog] [PATCH] sheep: don't let outstanding IO requests block confchg events

Liu Yuan namei.unix at gmail.com
Thu May 17 14:10:59 CEST 2012


From: Liu Yuan <tailai.ly at taobao.com>

We already treat an object with in-flight IO as a busy object while its
request sits on sys->outstanding_req_list. A recovery request for such an
object is therefore queued on sys->req_wait_for_obj_list and resumed later.

So there is no need to block confchg events on outstanding IO, and confchg
can be processed as soon as possible. Confchg should take precedence over IO
requests because:

Suppose every node is doing heavy IO while the cluster is in recovery, so
each node issues IO requests and recovery requests at the same time.
Outstanding IO and the unfinished confchg event block each other (nearly a
deadlock), all nodes are busy retrying the pending IOs (a livelock), and
recovery requests are mostly denied service. Neither the outstanding IO nor
the recovery makes progress toward completion.

Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
---
 sheep/group.c      |    3 +--
 sheep/sdnet.c      |    3 ---
 sheep/sheep_priv.h |    1 -
 3 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/sheep/group.c b/sheep/group.c
index 399c0c4..ada9e55 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -1121,7 +1121,6 @@ static void process_request_queue(void)
 		if (is_io_op(req->op)) {
 			list_add_tail(&req->request_list,
 				      &sys->outstanding_req_list);
-			sys->nr_outstanding_io++;
 
 			if (need_consistency_check(req))
 				set_consistency_check(req);
@@ -1150,7 +1149,7 @@ static inline void process_event_queue(void)
 	 * we need to serialize events so we don't call queue_work
 	 * if one event is running by executing event_fn() or event_done().
 	 */
-	if (event_running || sys->nr_outstanding_io)
+	if (event_running)
 		return;
 
 	cevent = list_first_entry(&sys->event_queue,
diff --git a/sheep/sdnet.c b/sheep/sdnet.c
index a13b3e3..13e8030 100644
--- a/sheep/sdnet.c
+++ b/sheep/sdnet.c
@@ -97,7 +97,6 @@ static void io_op_done(struct work *work)
 	struct sd_req *hdr = &req->rq;
 
 	list_del(&req->request_list);
-	sys->nr_outstanding_io--;
 
 	switch (req->rp.result) {
 	case SD_RES_OLD_NODE_VER:
@@ -202,7 +201,6 @@ static int check_request(struct request *req)
 		int ret = check_epoch(req);
 		if (ret != SD_RES_SUCCESS) {
 			req->rp.result = ret;
-			sys->nr_outstanding_io++;
 			req->work.done(&req->work);
 			return -1;
 		}
@@ -215,7 +213,6 @@ static int check_request(struct request *req)
 		if (req->rq.flags & SD_FLAG_CMD_IO_LOCAL) {
 			/* Sheep peer request */
 			req->rp.result = SD_RES_NEW_NODE_VER;
-			sys->nr_outstanding_io++;
 			req->work.done(&req->work);
 		} else {
 			/* Gateway request */
diff --git a/sheep/sheep_priv.h b/sheep/sheep_priv.h
index 157bb45..092bdb6 100644
--- a/sheep/sheep_priv.h
+++ b/sheep/sheep_priv.h
@@ -136,7 +136,6 @@ struct cluster_info {
 	struct list_head request_queue;
 	struct list_head event_queue;
 	struct event_struct *cur_cevent;
-	int nr_outstanding_io;
 	int nr_outstanding_reqs;
 	unsigned int outstanding_data_size;
 
-- 
1.7.8.2
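For readers following the reasoning in the commit message, here is a minimal
standalone sketch of the gating change in process_event_queue() and of the
busy-object deferral the patch relies on. It is a toy model, not sheep code:
struct toy_cluster, toy_can_run_event_old(), toy_can_run_event_new() and
toy_route_recovery() are hypothetical names, and the busy-object lookup is
reduced to a linear scan over object IDs instead of the real request lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields of struct cluster_info
 * that matter here. */
struct toy_cluster {
	bool event_running;    /* event_fn() or event_done() is in progress */
	int nr_outstanding_io; /* the counter this patch removes */
	int nr_queued_events;  /* pending confchg events */
};

/* Gate before the patch: any outstanding IO holds back the event queue. */
static bool toy_can_run_event_old(const struct toy_cluster *c)
{
	if (c->event_running || c->nr_outstanding_io)
		return false;
	return c->nr_queued_events > 0;
}

/* Gate after the patch: only a running event serializes the queue. */
static bool toy_can_run_event_new(const struct toy_cluster *c)
{
	if (c->event_running)
		return false;
	return c->nr_queued_events > 0;
}

enum toy_disposition {
	TOY_RUN_NOW,      /* object is idle, recovery may proceed */
	TOY_WAIT_FOR_OBJ, /* object is busy, park the request and resume later */
};

/* Assumed model of the busy-object check: a recovery request for an object
 * that still has in-flight IO is deferred instead of racing with that IO. */
static enum toy_disposition toy_route_recovery(const uint64_t *busy_oids,
					       size_t nr_busy, uint64_t oid)
{
	for (size_t i = 0; i < nr_busy; i++)
		if (busy_oids[i] == oid)
			return TOY_WAIT_FOR_OBJ;
	return TOY_RUN_NOW;
}

/* Small self-check exercising both gates and the deferral. */
static void toy_self_check(void)
{
	struct toy_cluster c = {
		.event_running = false,
		.nr_outstanding_io = 3,
		.nr_queued_events = 1,
	};
	const uint64_t busy[] = { 7, 9 };

	assert(!toy_can_run_event_old(&c)); /* old gate stalls behind IO */
	assert(toy_can_run_event_new(&c));  /* new gate lets confchg run */
	assert(toy_route_recovery(busy, 2, 9) == TOY_WAIT_FOR_OBJ);
	assert(toy_route_recovery(busy, 2, 8) == TOY_RUN_NOW);
}
```

Calling toy_self_check() from a main() shows the difference: with three
outstanding IOs and one queued event, the old gate refuses to start the event
while the new gate starts it, and safety for the busy object comes from
deferring its recovery request rather than from stalling the whole event queue.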



