At Tue, 15 Nov 2011 07:58:30 -0500, Christoph Hellwig wrote:
>
> On Tue, Nov 15, 2011 at 08:47:24PM +0900, MORITA Kazutaka wrote:
> > The key idea in the above link is that, when writeback is enabled, a
> > gateway node can send write responses to VMs before replicating data
> > to storage nodes.  Note that a VM sends write requests to one of the
> > Sheepdog nodes (the gateway node) first, and that node then replicates
> > the data to multiple nodes (the storage nodes).  Even with this
> > approach, the gateway node can send the unstable write requests to the
> > storage nodes ASAP, before receiving flush requests.  I think this
> > reduces the write latency when we use Sheepdog in a WAN environment.
>
> Okay, now I understand the idea.  Yes, this sounds like a useful idea
> to me.
>
> > If the gateway node writes data to the mmapped area before sending
> > responses to VMs, we can regard the local mmapped file as a Sheepdog
> > disk cache - this is what I meant in the above link.  This approach
> > may also reduce the read latency in a WAN environment.
>
> Any idea why you care about an mmapped area specifically?  Shared
> writable mmaps are a horrible I/O interface; most notably they don't
> allow for any kind of error handling.  I would absolutely advise
> against using them for clustered storage.

It just looked simple to create a whole disk image file and use it with
mmap() as a disk cache, but it is probably a bad idea.

> Except for that the idea sounds fine - I suspect making the gateway
> node use the same storage mechanism as "normal" endpoint nodes is going
> to make the code both simpler and easier to debug.

It is difficult to use the gateway node as a "normal" storage node,
because a VM can use a different gateway after restarting or migrating,
and then it cannot find the previous gateway that holds one of the
replicas.  If we use the gateway only as a "temporary" storage node that
keeps one extra replica for caching, it becomes much easier.

Thanks,

Kazutaka
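
P.S.  To make the writeback idea above a bit more concrete, here is a
rough, self-contained sketch of a gateway write/flush path.  All names
(gateway_write, gateway_flush, send_to_storage_node, etc.) are made up
for illustration - this is not the actual Sheepdog code, just one way
the "ack first, replicate in the background, flush waits" behaviour
could look:

  /* Rough sketch only - hypothetical names, not the actual Sheepdog
   * code.  The gateway acknowledges a write to the VM as soon as the
   * data is in its local cache, pushes it to the storage nodes in the
   * background, and a flush waits for every outstanding replication. */

  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/types.h>

  #define CACHE_SIZE  (1 << 20)  /* 1 MiB toy cache on the gateway */
  #define BLOCK_SIZE  4096
  #define NR_COPIES   3          /* number of storage-node replicas */

  static char cache[CACHE_SIZE];

  static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t drained = PTHREAD_COND_INITIALIZER;
  static int nr_unstable;        /* writes not yet on the storage nodes */

  /* Stand-in for sending one copy to a storage node over the network. */
  static void send_to_storage_node(int copy, off_t off)
  {
      printf("copy %d: %d bytes stored at offset %lld\n",
             copy, BLOCK_SIZE, (long long)off);
  }

  static void *replicate(void *arg)
  {
      off_t off = *(off_t *)arg;
      free(arg);

      /* Push the unstable write to the storage nodes ASAP, without
       * making the VM wait for it. */
      for (int i = 0; i < NR_COPIES; i++)
          send_to_storage_node(i, off);

      pthread_mutex_lock(&lock);
      if (--nr_unstable == 0)
          pthread_cond_broadcast(&drained);
      pthread_mutex_unlock(&lock);
      return NULL;
  }

  /* Write path: copy into the local cache, ack to the VM immediately,
   * replicate in the background. */
  static void gateway_write(const void *buf, off_t off)
  {
      pthread_t tid;
      off_t *arg = malloc(sizeof(*arg));

      memcpy(cache + off, buf, BLOCK_SIZE);
      *arg = off;

      pthread_mutex_lock(&lock);
      nr_unstable++;
      pthread_mutex_unlock(&lock);

      pthread_create(&tid, NULL, replicate, arg);
      pthread_detach(tid);

      /* Returning here corresponds to sending the write response to
       * the VM before the data is stable on the storage nodes. */
  }

  /* Flush path: block until every unstable write has reached the
   * storage nodes. */
  static void gateway_flush(void)
  {
      pthread_mutex_lock(&lock);
      while (nr_unstable > 0)
          pthread_cond_wait(&drained, &lock);
      pthread_mutex_unlock(&lock);
  }

  int main(void)
  {
      char data[BLOCK_SIZE] = "guest data";

      gateway_write(data, 0);
      gateway_flush();
      return 0;
  }

The point of gateway_flush() here is that a guest flush still gives the
usual durability guarantee, even though individual write responses are
returned before the data reaches the storage nodes.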