[sheepdog] [PATCH] sheep: fix handling of too old epoch in check_request
Christoph Hellwig
hch at infradead.org
Tue May 29 11:33:56 CEST 2012
If we hit a too old epoch in check_request (which can only happen
when resume_pending_requests) we currently try to use io_op_done
to decide what to do. For objects that are store locally on a node
that also acts as a gateway this is currently fatal:
(a) io_op_done tries to remove the request from a list it has never
been added to
(b) io_op_done completes the request despite adding it to the
wait_rw_queue list
In addition it seems that
(c) io_op_done was checking for a too large epoch, not a too small one
If that last one was intentional it should at least be documented in there,
but I can't come up with a good explanation for it.
Fix these issues by opencoding the action we want in check_request.
Signed-off-by: Christoph Hellwig <hch at lst.de>
diff --git a/sheep/sdnet.c b/sheep/sdnet.c
index 7fcff4b..7f501aa 100644
--- a/sheep/sdnet.c
+++ b/sheep/sdnet.c
@@ -188,10 +188,18 @@ static int check_request(struct request *req)
if (before(req->rq.epoch, sys->epoch)) {
eprintf("old node version %u, %u, %x\n",
sys->epoch, req->rq.epoch, req->rq.opcode);
- /* ask gateway to retry. */
+
+ /* ask gateway to retry unless this is a local object */
+ if (req->rq.flags & SD_FLAG_CMD_IO_LOCAL) {
+ req->rp.result = SD_RES_OLD_NODE_VER;
+ req->rp.epoch = sys->epoch;
+ req_done(req);
+ return -1;
+ }
+
req->rp.result = SD_RES_OLD_NODE_VER;
- req->rp.epoch = sys->epoch;
- req->work.done(&req->work);
+ list_add_tail(&req->request_list,
+ &sys->wait_rw_queue);
return -1;
} else if (after(req->rq.epoch, sys->epoch)) {
eprintf("new node version %u, %u, %x\n",
More information about the sheepdog
mailing list