[Sheepdog] [PATCH v2] sheepdog: implement SD_OP_FLUSH_VDI operation

MORITA Kazutaka morita.kazutaka at gmail.com
Sat Mar 31 07:03:36 CEST 2012


At Sat, 31 Mar 2012 11:48:07 +0800,
Liu Yuan wrote:
> 
> On 03/31/2012 12:17 AM, MORITA Kazutaka wrote:
> 
> > It might be better to ignore BDRV_O_NOCACHE here because:
> > 
> >  - When writeback is enabled, we always use a cache.  And when
> >    writeback is disabled, we don't use a cache at all.  This means
> >    that users cannot specify whether to use a cache.
> > 
> >  - I think qemu users expect a better performance if cache=none, which
> >    means BDRV_O_NOCACHE | BDRV_O_CACHE_WB, is specified
> > 
> 
> 
> I have to admit that this is my first time understanding that
> cache=none, means a cache with DIO mode.
> 
> So my question is what is a cache with DIO mode?

E.g. a volatile write cache of the physical disk.

> 
> I took a glance at the code:
> 
>     /* Use O_DSYNC for write-through caching, no flags for write-back
>      * caching, and O_DIRECT for no caching. */
>     if ((bdrv_flags & BDRV_O_NOCACHE))
>         s->open_flags |= O_DIRECT;
>     if (!(bdrv_flags & BDRV_O_CACHE_WB))
>         s->open_flags |= O_DSYNC;
> 
> For BDRV_O_NOCACHE, it means no need of kernel's page cache. I don't
> think there is any 'writeback' cache existing with cache=none mode, so
> 'better performance' doesn't make sense if we have extra memory in host
> that can be used as page cache.

O_DIRECT bypasses the page cache, but not other caches such as a disk
write cache.  We need to add O_DSYNC to flush data all the way to
stable storage.

I can confirm this in my environment:

* benchmark with a disk write cache

  # hdparm -W 1 /dev/sdb
  
  /dev/sdb:
   setting drive write-caching to 1 (on)
   write-caching =  1 (on)

  # dd if=/dev/zero of=/dev/sdb5 bs=1M count=64 oflag=direct
  64+0 records in
  64+0 records out
  67108864 bytes (67 MB) copied, 0.974981 s, 68.8 MB/s

  # dd if=/dev/zero of=/dev/sdb5 bs=1M count=64 oflag=direct,dsync
  64+0 records in
  64+0 records out
  67108864 bytes (67 MB) copied, 1.62426 s, 41.3 MB/s


* benchmark without a disk write cache

  # hdparm -W 0 /dev/sdb
  
  /dev/sdb:
   setting drive write-caching to 0 (off)
   write-caching =  0 (off)

  # dd if=/dev/zero of=/dev/sdb5 bs=1M count=64 oflag=direct
  64+0 records in
  64+0 records out
  67108864 bytes (67 MB) copied, 2.13579 s, 31.4 MB/s

  # dd if=/dev/zero of=/dev/sdb5 bs=1M count=64 oflag=direct,dsync
  64+0 records in
  64+0 records out
  67108864 bytes (67 MB) copied, 2.1628 s, 31.0 MB/s

> 
> Furthermore, for users, setting cache=none or cache=off (yes, the code
> tells me that we can pass 'off' to qemu) means our object cache is
> enabled! Do you ever expect this behaviour as an ordinary user?

cache=none and cache=off only mean that QEMU doesn't use a page cache.
It is a bit confusing, but that is how QEMU uses these options.

> 
> I don't think QEMU's cache mode is well received, especially cache=none
> means 'BDRV_O_NOCACHE | BDRV_O_CACHE_WB'. What does it mean literally?
> Hmm, do not gimme a cache but a writeback cache please?
> 
> >  - I guess qemu users expect that if BDRV_O_NOCACHE is set, O_DIRECT
> >    is used for file I/Os.
> > 
> >  - If we ignore BDRV_O_NOCACHE here, we can use qemu-iotests for
> >    Sheepdog cache tests with the following command:
> > 
> >      $ check -sheepdog -nocache
> > 
> >    where -nocache means BDRV_O_NOCACHE | BDRV_O_CACHE_WB.
> > 
> 
> 
> I am confused by 'ignoring BDRV_O_NOCACHE'. Actually, I don't care about
> qemu's cache mode flags. All I want is a control with binary semantics
> that can disable/enable object cache in sheepdog.
> 
> Any better scheme?
> currently, cache=writeback enables it, and others disables it.

My suggestion is:
 - use a sheepdog object cache when cache=writeback or cache=none,
   that is, when BDRV_O_CACHE_WB is set
 - don't use an object cache when cache=directsync or cache=writethrough,
   that is, when BDRV_O_CACHE_WB is not set
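
To make the rule concrete, here is a minimal sketch of it, not the
actual patch.  The helper names cache_mode_to_flags() and
sd_use_object_cache() are hypothetical, and the flag values are only
illustrative stand-ins for the definitions in QEMU's block layer
headers.  The point is that only BDRV_O_CACHE_WB decides whether the
object cache is used; BDRV_O_NOCACHE is ignored.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative values only; the real definitions live in QEMU's headers. */
#define BDRV_O_NOCACHE  0x0020  /* do not use the host page cache */
#define BDRV_O_CACHE_WB 0x0040  /* write-back caching requested */

/*
 * Hypothetical helper: map a cache= mode string to block driver flags.
 * Returns -1 for an unknown mode, which the caller must check.
 */
int cache_mode_to_flags(const char *mode)
{
    if (!strcmp(mode, "none") || !strcmp(mode, "off"))
        return BDRV_O_NOCACHE | BDRV_O_CACHE_WB;
    if (!strcmp(mode, "writeback"))
        return BDRV_O_CACHE_WB;
    if (!strcmp(mode, "directsync"))
        return BDRV_O_NOCACHE;
    if (!strcmp(mode, "writethrough"))
        return 0;
    return -1;
}

/*
 * Hypothetical helper implementing the suggested rule: the object cache
 * is enabled exactly when BDRV_O_CACHE_WB is set.  BDRV_O_NOCACHE is
 * deliberately not consulted.
 */
bool sd_use_object_cache(int bdrv_flags)
{
    return (bdrv_flags & BDRV_O_CACHE_WB) != 0;
}
```

With this, cache=none and cache=writeback both enable the object cache,
while cache=directsync and cache=writethrough both disable it.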

Thanks,

Kazutaka


