[Sheepdog] [PATCH v5 4/8] sheep: teach sheep to use object cache
Liu Yuan
namei.unix at gmail.com
Mon Mar 26 09:07:20 CEST 2012
On 03/26/2012 01:39 PM, MORITA Kazutaka wrote:
> At Mon, 26 Mar 2012 12:15:29 +0800,
> Liu Yuan wrote:
>>
>> On 03/26/2012 04:41 AM, MORITA Kazutaka wrote:
>>
>>> Consider that the object may be dirty if the VDI was previously
>>> opened in writeback mode. I think we need to check the cache
>>> here.
>>
>>
>> For sane usage, VMs should not alternate between non-cache and
>> cache mode. Even if a VM first opens the VDI in cache mode and
>> later wants to use non-cache mode, the current implementation would be
>> okay, since after the guest is closed, the dirty bits are flushed
>> to the cluster.
>>
>> So the problem is precisely that we can't handle the situation where a VM
>> runs with cache first, then crashes, then boots up again without cache.
>> We'd better use qemu-io to issue a flush request manually first; then
>> we can safely boot the VM without cache, not losing any dirty data.
>>
>> So I don't think it is worth introducing complexity into the object cache
>> to handle a corner case that should already be handled by other means.
>
> We can more easily hit this problem. For example:
>
> $ qemu-img convert linux.raw sheepdog:linux
>
> where linux.raw is a linux image file, and
>
> $ qemu sheepdog:linux
>
> We cannot boot the linux VDI because qemu-img always opens with
> BDRV_O_CACHE_WB, but qemu opens without it by default.
>
I think this is more a bug in qemu-img than in the object cache. We should
fix it at the qemu-img level.

If qemu-img opens the file with BDRV_O_CACHE_WB by default, it is
qemu-img's responsibility to flush the dirty bits. As a temporary fix, we
can run

$ qemu-io -c "flush" sheepdog:linux

before launching the VM.

> I think you are mixing two things in this patchset.
> 1. use a disk cache (this should be disabled by BDRV_O_NOCACHE)
> 2. support a writeback mode (this should be enabled by BDRV_O_CACHE_WB)
>
This depends on how you define what a disk cache does. I don't follow qemu's
definition of cache modes (none, writethrough, writeback), and I think
it simply complicates the usage and the code. Even
http://www.linux-kvm.org/page/Tuning_KVM suggests using cache=none
mode without a convincing explanation.

So for the object cache, the notion is simple: either there is a cache or
there is none. cache=writeback is somewhat misleading and would be better
spelled 'cache=on', but I can't find any better flag to activate the object
cache from qemu.

> If you think it doesn't make sense to support a writethrough mode with
> a disk cache, I'm not against the current approach in this
> patchset. But at least, please flush the dirty cache before handling
> non-cache requests. Otherwise, we can easily hit a cache coherency
> problem (e.g. 'collie vdi list' doesn't show the correct used size).
>
Yes, for VDI operations, we need to flush first.
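
To make the coherency issue concrete, here is a minimal toy sketch of
"flush before handling a non-cache request". It is not the code from this
patchset: the struct and every function name (toy_store, cached_write,
flush_dirty, noncache_read_raw, noncache_read_flushed) are hypothetical,
and each object is modelled as a single integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_OBJECTS 4

/* Toy stand-in (hypothetical): one integer per object instead of real data. */
struct toy_store {
	uint32_t cluster[NR_OBJECTS]; /* what the cluster holds */
	uint32_t cache[NR_OBJECTS];   /* local object cache copy */
	bool dirty[NR_OBJECTS];       /* cached copy not yet written back */
};

/* Cached (writeback) write: update the cache only and mark it dirty. */
static void cached_write(struct toy_store *s, unsigned idx, uint32_t val)
{
	assert(idx < NR_OBJECTS);
	s->cache[idx] = val;
	s->dirty[idx] = true;
}

/* Write every dirty object back to the cluster; return how many were flushed. */
static unsigned flush_dirty(struct toy_store *s)
{
	unsigned flushed = 0;

	for (unsigned i = 0; i < NR_OBJECTS; i++) {
		if (!s->dirty[i])
			continue;
		s->cluster[i] = s->cache[i];
		s->dirty[i] = false;
		flushed++;
	}
	return flushed;
}

/* Non-cache read that ignores the cache: may return stale data. */
static uint32_t noncache_read_raw(const struct toy_store *s, unsigned idx)
{
	assert(idx < NR_OBJECTS);
	return s->cluster[idx];
}

/* Non-cache read that flushes dirty objects first, then reads the cluster. */
static uint32_t noncache_read_flushed(struct toy_store *s, unsigned idx)
{
	assert(idx < NR_OBJECTS);
	flush_dirty(s);
	return s->cluster[idx];
}
```

Starting from a zeroed toy_store, cached_write(&s, 1, 42) followed by
noncache_read_raw(&s, 1) still sees the old value 0, while
noncache_read_flushed(&s, 1) returns 42.
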
Thanks,
Yuan