[sheepdog] [PATCH 0/2] add cache options 'page' and 'unsafe'

迪八哥 xiaoxichen at qq.com
Mon Oct 22 14:35:05 CEST 2012


------------------
From a user's point of view, I am totally confused about how many kinds of caches there are in sheepdog and how many configuration options can be used for caching...


We are exploring Ceph now, and it shows better performance than sheepdog, especially for sequential read/write and random write. I think Sheepdog and Ceph share a similar internal design (i.e. both split the image into 4 MB objects and use some kind of consistent hashing). What we think is most important is the **journal disk** in the storage node: Ceph's performance improved by up to 3x when an extra 7200 rpm SATA disk was used for the journal. Will sheepdog consider a similar mechanism?


Thanks. 




------------------ Original ------------------
From:  "MORITA Kazutaka"<morita.kazutaka at lab.ntt.co.jp>;
Date:  Mon, Oct 22, 2012 02:26 PM
To:  "Liu Yuan"<namei.unix at gmail.com>; 
Cc:  "sheepdog"<sheepdog at lists.wpkg.org>; 
Subject:  Re: [sheepdog] [PATCH 0/2] add cache options 'page' and 'unsafe'



At Mon, 22 Oct 2012 13:49:43 +0800,
Liu Yuan wrote:
> 
> On 10/22/2012 01:33 PM, MORITA Kazutaka wrote:
> > I don't intend to avoid a flush completely with 'unsafe' mode.  The
> > difference between 'page' and 'unsafe' is that sheep doesn't call
> > syncfs even if a VM sends a flush request.
> > 
> 
> If the disk has failed, I don't think buffered reads/writes will succeed
> because we will fail to open the fd. So your rationale for 'unsafe'
> seems useless: no one will actually use it.

Actually, we at NTT would use it.  We have data centers that supply
reliable power, so we can regard data as safe once it has been
replicated to the memory of multiple nodes.  If a disk fails in 'unsafe'
mode, the node will be completely replaced with a new machine, so
there is no risk of reading invalid data.
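
To make the distinction concrete, here is a minimal sketch of the flush
handling described in this thread: in 'page' mode a flush request from
the VM triggers a sync, while in 'unsafe' mode the flush is acknowledged
without syncing.  The function name and the sync_fn callback are
hypothetical and not taken from the sheepdog source; in sheep the sync
would be the syncfs call mentioned above.

```python
def handle_flush(mode, sync_fn):
    """Handle a VM flush request for a (hypothetical) cache mode.

    Returns True if data was synced to disk, False if the flush was
    acknowledged without syncing.
    """
    if mode == "page":
        # The page cache is used, but a flush still forces data to disk.
        sync_fn()
        return True
    if mode == "unsafe":
        # The flush is acknowledged without syncing; safety relies on
        # replication to other nodes and on reliable power.
        return False
    raise ValueError("unknown cache mode: %r" % (mode,))
```

With this sketch, handle_flush("unsafe", sync_fn) never invokes sync_fn,
which is the only behavioural difference from 'page'.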

> 
> I think 'page' and 'unsafe' can be merged into one mode, which indicates
> that the page cache is used for storage backend I/O.
> 
> >> > 
> >> > I think your patch set aims at finer control of the fd flags for the
> >> > page cache. So I think we can control it via disk cache, like disk:pagecache.
> >> > 
> >> > I am more concerned that the cache mode setting might look too
> >> > complicated to end users.
> > 'disk' means a disk cache of a local disk, so disk:pagecache looks
> > strange to me.  I think we should use more descriptive names for caches.
> > There are two kinds of sheepdog caches; one uses memory on the storage
> > nodes and the other uses memory on the gateway.  How about the
> > following?:
> > 
> 
> I think 'pagecache' is fairly straightforward and descriptive. Both
> 'page' and 'unsafe' will need additional explanation of what they really
> control.
> 
> Why is attaching pagecache to disk strange? I think most Linux users
> will be much more familiar with the page cache than with other caches,
> e.g. the QEMU cache modes. If you are concerned that users might
> confuse 'disk cache' with 'page cache', I'd suggest a new naming:
> 'gateway' for the client side cache, as you suggest, and 'backend' for
> the cluster side cache.

A disk cache is controlled by hardware (the local disk) and the page
cache is controlled by Linux.  I thought that combining two different
things into one name, 'disk:pagecache', looked strange.

Anyway, using 'gateway'/'backend' and controlling fd flags with
backend options (e.g. backend:pagecache) looks better to me.
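
As a rough illustration of that idea, the sketch below parses an option
string of the form side:option and derives open(2) flags from it.  The
parsing rules, the function names, and the choice of O_DIRECT as the flag
that bypasses the page cache are assumptions made for illustration; the
exact syntax and flags have not been decided in this thread.

```python
import os

# Cache sides proposed in this thread.
VALID_SIDES = ("gateway", "backend")


def parse_cache_option(spec):
    """Split a hypothetical 'side:option' string into (side, option)."""
    side, sep, option = spec.partition(":")
    if side not in VALID_SIDES:
        raise ValueError("unknown cache side: %r" % (side,))
    # The option part is None when no ':' was given.
    return side, (option if sep else None)


def backend_open_flags(option):
    """Return open(2) flags for backend object files (assumed mapping)."""
    flags = os.O_RDWR | os.O_CREAT
    if option != "pagecache":
        # Assumption: without 'pagecache', bypass the page cache.
        # O_DIRECT is Linux-specific, so fall back to 0 elsewhere.
        flags |= getattr(os, "O_DIRECT", 0)
    return flags
```

For example, parse_cache_option("backend:pagecache") yields
("backend", "pagecache"), and backend_open_flags("pagecache") leaves
O_DIRECT unset so that backend I/O goes through the page cache.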

Thanks,

Kazutaka

> 
> I think directly exposing which side a setting takes effect on is
> better than lumping all the settings under one umbrella.
> 
> Thanks,
> Yuan
> 
> -- 
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog