[sheepdog] [Qemu-devel] [PATCH] sheepdog: serialize requests to overwrapping area
Liu Yuan
namei.unix at gmail.com
Tue Jul 28 16:31:32 CEST 2015
On Mon, Jul 27, 2015 at 11:23:02AM -0400, Jeff Cody wrote:
> On Sat, Jul 18, 2015 at 01:44:24AM +0900, Hitoshi Mitake wrote:
> > Current sheepdog driver only serializes create requests in oid
> > unit. This mechanism isn't enough for handling requests to
> > overwrapping area spanning multiple oids, so it can result bugs like
> > below:
> > https://bugs.launchpad.net/sheepdog-project/+bug/1456421
> >
> > This patch adds a new serialization mechanism for the problem. The
> > difference from the old one is:
> > 1. serialize entire aiocb if their targetting areas overwrap
> > 2. serialize all requests (read, write, and discard), not only creates
> >
> > This patch also removes the old mechanism because the new one can be
> > an alternative.
> >
Okay, I figured out what the problem is myself and allow me to try to make it
clear to non-sheepdog devs:
sheepdog volume is thin-provision, so for the first write, we create the object
internally, meaning that we need to handle write in two case:
1. write to non-allocated object, create it then update inode, so in this case
two request will be generated: create(oid), update_inode(oid_to_inode_idx)
2. write the allocated object, just write(oid).
Current sheepdog driver use a range update_inode(min_idx, max_idx) for batching
the updates. But there is subtle problem by determining min_idx and max_idx:
for a single create request, min_idx == max_idx, so actually we just update one
one bit as expected.
Suppose we have 2 create request, create(10) and create(20), then min == 10,
max==20 even though we just need to update index 10 and index 20, update_inode(10,20)
will actually update range from 10 to 20. This would work if all the update_inode()
requests won't overlap. But unfortunately, this is not true for some corner case.
So the problem arise as following:
req 1: update_inode(10,20)
req 2: update_inode(15,22)
req 1 and req 2 might have different value between [15,20] and cause problems.
Based on above analysis, I think the real fix is to fix update_inode(), not
serialize all the requests in overkill way. The fix would be easy, considering
most update_inode() update only 1 index, we could just make update_inode a
single bit updater, not a range one, in which way we don't affect performance
as the above patch did. (I actually suspect that the above patch might not solve
the problem because update_inode() can overlap even with the patch).
If everyone agrees with my analysis, I'll post the fix.
Thanks,
Yuan
More information about the sheepdog
mailing list