[sheepdog] [PATCH v2 6/7] recovery: fix a race condition in recovery
levin li
levin108 at gmail.com
Wed May 23 09:02:28 CEST 2012
From: levin li <xingke.lwp at taobao.com>
Consider this scenario:
Nodes A and B are both in recovery.
A is recovering object x from B,
and B has not yet recovered object x.
B is recovering object y from A,
and A has not yet recovered object y.
B then responds to A with SD_RES_NEW_NODE_VER, and
A likewise responds to B with SD_RES_NEW_NODE_VER. A and B keep
retrying to recover objects x and y, but every attempt gets
SD_RES_NEW_NODE_VER again and never succeeds. This is a deadlock
that stops recovery from completing.
Fix this by skipping the is_recoverying_oid() check in check_request()
for requests that carry SD_FLAG_CMD_RECOVERY, so that recovery requests
are no longer bounced with SD_RES_NEW_NODE_VER.
Signed-off-by: levin li <xingke.lwp at taobao.com>
---
sheep/sdnet.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sheep/sdnet.c b/sheep/sdnet.c
index 45f2f30..03c6f78 100644
--- a/sheep/sdnet.c
+++ b/sheep/sdnet.c
@@ -224,7 +224,8 @@ static int check_request(struct request *req)
if (!req->local_oid)
return 0;
- if (is_recoverying_oid(req->local_oid)) {
+ if (is_recoverying_oid(req->local_oid) &&
+ !(req->rq.flags & SD_FLAG_CMD_RECOVERY)) {
if (req->rq.flags & SD_FLAG_CMD_IO_LOCAL) {
/* Sheep peer request */
if (is_recovery_init()) {
--
1.7.10