[sheepdog] [PATCH 2/2] farm: drop fcntl lock
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Tue Jun 5 16:38:19 CEST 2012
At Tue, 05 Jun 2012 21:30:58 +0800,
Liu Yuan wrote:
>
> On 06/05/2012 09:23 PM, Christoph Hellwig wrote:
>
> > On Tue, Jun 05, 2012 at 09:20:30PM +0800, Liu Yuan wrote:
> >> You meant the requests from Guest could be overlapped? Disk is seen by
> >> guest OS as an array of blocks, I can't imagine any sane driver issue
> >> overlapped requests to disk drive controller.
> >
> > Yes, we've seen overlapping or duplicate requests from the OS.
> >
> > Assuming any sort of sanity from the OS is unfortunately not an option
> > in virtualized environments.
>
>
> This would be something odd, but we have to face up with. Well, then
> there might be problems with overlapped requests:
>
> Ordering:
> 1 over network, the requests seen by Sheepdog might not be the same as
> is from Guest.
> 2 how do we deal with overlapped read/write, write/write pair?
>
> Looks to me, even with locks, we can't cope with it.
The below change to the qemu block driver would fix the problem.
I think this kind of ordering should be done in the gateway node,
though.
diff --git a/block/sheepdog.c b/block/sheepdog.c
index 8877f45..68b35c0 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -645,6 +645,9 @@ static void coroutine_fn send_pending_req(BDRVSheepdogState *s, uint64_t oid, ui
if (QLIST_EMPTY(&acb->aioreq_head)) {
sd_finish_aiocb(acb);
}
+ } else {
+ /* send pending requests one by one */
+ return;
}
}
}
@@ -710,13 +713,6 @@ static void coroutine_fn aio_read_response(void *opaque)
s->inode.data_vdi_id[idx] = s->inode.vdi_id;
s->max_dirty_data_idx = MAX(idx, s->max_dirty_data_idx);
s->min_dirty_data_idx = MIN(idx, s->min_dirty_data_idx);
-
- /*
- * Some requests may be blocked because simultaneous
- * create requests are not allowed, so we search the
- * pending requests here.
- */
- send_pending_req(s, vid_to_data_oid(s->inode.vdi_id, idx), rsp.id);
}
break;
case AIOCB_READ_UDATA:
@@ -729,6 +725,13 @@ static void coroutine_fn aio_read_response(void *opaque)
break;
}
+ /*
+ * Some requests may be blocked because simultaneous requests to
+ * the same object are not allowed, so we search the pending
+ * requests here.
+ */
+ send_pending_req(s, vid_to_data_oid(s->inode.vdi_id, idx), rsp.id);
+
if (rsp.result != SD_RES_SUCCESS) {
acb->ret = -EIO;
error_report("%s", sd_strerror(rsp.result));
@@ -1580,7 +1583,7 @@ static int coroutine_fn sd_co_rw_vector(void *p)
aio_req = alloc_aio_req(s, acb, oid, len, offset, flags, old_oid, done);
- if (create) {
+ if (1) {
AIOReq *areq;
QLIST_FOREACH(areq, &s->outstanding_aio_head,
outstanding_aio_siblings) {
@@ -1589,10 +1592,9 @@ static int coroutine_fn sd_co_rw_vector(void *p)
}
if (areq->oid == oid) {
/*
- * Sheepdog cannot handle simultaneous create
- * requests to the same object. So we cannot send
- * the request until the previous request
- * finishes.
+ * Sheepdog cannot handle simultaneous requests to
+ * the same object. So we cannot send the request
+ * until the previous request finishes.
*/
aio_req->flags = 0;
aio_req->base_oid = 0;
More information about the sheepdog
mailing list