On 08/18/2012 11:36 PM, Hitoshi Mitake wrote:
> Thanks for your comments, Liu and Christoph.
>
> On Sat, Aug 18, 2012 at 3:09 AM, Christoph Hellwig <hch at infradead.org> wrote:
>> On Fri, Aug 17, 2012 at 11:01:41AM +0800, Liu Yuan wrote:
>>>> +static int syncfs(int fd)
>>>> +{
>>>> +	return syscall(__NR_syncfs, fd);
>>>> +}
>>>> +
>>>
>>> syncfs only appeared in Linux 2.6.39, rather new; I think most
>>> systems won't support it.
>>
>> Yeah, we probably need to test that it works during startup, similar to
>> what we do for xattrs.
>
> Yes. I have to modify the build process to check for the existence of syncfs().
> It seems that a check for ENOSYS is also required.
>
>>
>> Some systems use a plain sync for fallback, and we could also do that.
>> The only issue is that it can lead to deadlocks if the client is running
>> on the same system as the actual backend storage.
>
> I don't know much about this problem. How can the deadlocks be caused?
>
>>
>>> With the current implementation, you only flush the inode object, not all the
>>> data objects belonging to the targeted VDI. I think this is the hardest part to
>>> implement. Probably you need to call exec_local_request() to take
>>> advantage of the retry mechanism.
>>
>> Yes, that was also the issue the last time I looked into it. One simple
>> approach could be to simply forward the flush to all nodes.
>>
>
> Thanks for your advice. I'll try to solve this problem in the next version.
>

I am skeptical that this approach is really useful:

1. Suppose we can only call sync() to flush the page cache on each node. With a cluster that runs hundreds of images, sync() requests will be issued almost every second; this kind of request storm makes the idea useless compared to the SYNC open flag.

2. Even if we are working with syncfs(), the benefit will be offset by the complexity of finding all the locations of the specified VDI and sending requests one by one.

Thanks,
Yuan
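
P.S. For reference, a minimal sketch of the startup/runtime check discussed above, assuming a syscall(__NR_syncfs, fd) wrapper that falls back to a plain sync() on ENOSYS. The helper name flush_filesystem() and the fallback policy are hypothetical, for illustration only, and are not part of the posted patch:

#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Hypothetical helper: try syncfs() via syscall() and fall back to a
 * plain sync() when the running kernel predates 2.6.39 (ENOSYS), or
 * when __NR_syncfs is missing from the build headers.
 */
static int flush_filesystem(int fd)
{
#ifdef __NR_syncfs
	int ret = syscall(__NR_syncfs, fd);

	/* Any result other than ENOSYS (success, EBADF, ...) is final. */
	if (ret == 0 || errno != ENOSYS)
		return ret;
#endif
	/*
	 * Fallback to sync().  As noted in the thread, this flushes the
	 * whole host and can deadlock if the client runs on the same
	 * host as the backend storage.
	 */
	sync();
	return 0;
}

Doing the ENOSYS probe once at startup and caching the result would avoid paying for the failed syscall on every flush.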