[Sheepdog] [PATCH v5 4/8] sheep: teach sheep to use object cache

Liu Yuan namei.unix at gmail.com
Mon Mar 26 09:07:20 CEST 2012


On 03/26/2012 01:39 PM, MORITA Kazutaka wrote:

> At Mon, 26 Mar 2012 12:15:29 +0800,
> Liu Yuan wrote:
>>
>> On 03/26/2012 04:41 AM, MORITA Kazutaka wrote:
>>
>>> Consider that the object may be dirty if the VDI was previously
>>> opened in writeback mode.  I think we need to check the cache
>>> here.
>>
>>
>> For sane usage, a VM should not be opened alternately in non-cache and
>> cache mode. Even if a VM first opens the VDI in cache mode
>> and then wants to use non-cache mode, the current implementation would be
>> okay, since after the guest is closed, the dirty data will have been
>> flushed to the cluster.
>>
>> So the problem is precisely that we can't handle the situation where a VM
>> runs with cache first, then crashes, then boots up again without cache.
>> We'd better use qemu-io to issue a flush request manually first; then
>> we can safely boot the VM without cache, without losing any dirty data.
>>
>> So I don't think it is worth adding complexity to the object cache to
>> handle this corner case, which should already be handled by other means.
> 
> We can more easily hit this problem.  For example:
> 
>  $ qemu-img convert linux.raw sheepdog:linux
> 
> where linux.raw is a linux image file, and
> 
>  $ qemu sheepdog:linux
> 
> We cannot boot the linux VDI because qemu-img always opens with
> BDRV_O_CACHE_WB, but qemu opens without it by default.
> 


I think this is more a bug in qemu-img than in the object cache. We should
fix it at the qemu-img level.

If qemu-img opens the file with BDRV_O_CACHE_WB by default, it is qemu-img's
responsibility to flush the dirty data. As a temporary fix, we can run

 $ qemu-io -c "flush" sheepdog:linux

before launching the VM.

> I think you are mixing two things in this patchset.
>   1. use a disk cache (this should be disabled by BDRV_O_NOCACHE)
>   2. support a writeback mode (this should be enabled by BDRV_O_CACHE_WB)
> 


This depends on how you define what a disk cache does. I don't follow qemu's
definition of cache modes (none, writethrough, writeback); I think
it simply complicates both the usage and the code. Even
http://www.linux-kvm.org/page/Tuning_KVM suggests using cache=none
without a convincing explanation.

So for the object cache the notion is simple: either there is no cache or
there is one. cache=writeback is somewhat misleading and would be better
spelled 'cache=on', but I can't find a better flag for activating the
object cache from qemu.


> If you think it doesn't make sense to support a writethrough mode with
> a disk cache, I'm not against the current approach in this
> patchset.  But at least, please flush the dirty cache before handling
> non-cache requests.  Otherwise, we can easily hit a cache coherency
> problem (e.g. 'collie vdi list' doesn't show the correct used size).
> 


Yes, for VDI operations, we need to flush first.
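
To make the "flush first" idea concrete, here is a rough, self-contained
sketch. It is not the sheep code from this patchset: struct toy_store and
the toy_* functions are names made up for illustration, and the "cluster"
is just an in-memory array standing in for the real backing store. The
only point is that a non-cache request pushes out dirty cached objects
before it touches the cluster copy.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_OBJECTS 4
#define OBJ_SIZE   8

/* Hypothetical model: 'cluster' is the backing store, 'cache' the
 * local object cache, 'dirty' marks objects not yet written back. */
struct toy_store {
	uint8_t cluster[NR_OBJECTS][OBJ_SIZE];
	uint8_t cache[NR_OBJECTS][OBJ_SIZE];
	bool dirty[NR_OBJECTS];
};

static void toy_init(struct toy_store *s)
{
	memset(s, 0, sizeof(*s));
}

/* Cached write: the data lands in the cache only and is marked dirty. */
static void toy_cached_write(struct toy_store *s, unsigned idx,
			     const uint8_t *buf)
{
	memcpy(s->cache[idx], buf, OBJ_SIZE);
	s->dirty[idx] = true;
}

/* Write every dirty object back to the cluster.
 * Returns the number of objects flushed. */
static int toy_flush(struct toy_store *s)
{
	int flushed = 0;

	for (unsigned i = 0; i < NR_OBJECTS; i++) {
		if (!s->dirty[i])
			continue;
		memcpy(s->cluster[i], s->cache[i], OBJ_SIZE);
		s->dirty[i] = false;
		flushed++;
	}
	return flushed;
}

/* Non-cache read: flush first, so the cluster copy is never stale. */
static void toy_uncached_read(struct toy_store *s, unsigned idx,
			      uint8_t *buf)
{
	toy_flush(s);
	memcpy(buf, s->cluster[idx], OBJ_SIZE);
}
```

With this ordering, a toy_cached_write() followed by a
toy_uncached_read() of the same object returns the newly written data
rather than the old cluster contents, which is the coherency property
asked for above.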

Thanks,
Yuan


