On 12/16/2011 04:57 PM, John Ryan wrote:
> Ok I got it. All writes have all or nothing semantics. What happens to
> writes that straddle chunks? How is that case handled? Because you could
> be writing to two different sets of replicas.
>

Read/write requests are first handled in the QEMU block layer. I guess
requests that straddle chunks will be split into several smaller ones.

Kazum, would you confirm this?

Thanks,
Yuan

> On Thu, Dec 15, 2011 at 11:52 PM, Liu Yuan <namei.unix at gmail.com> wrote:
>
> On 12/16/2011 03:12 PM, John Ryan wrote:
> > Trying to get my feet wet with Sheepdog. I understand that all vdisks
> > are chunked at a 4MB boundary. Suppose a write of 64 bytes needs to
> > happen at offset 390. This maps to the first 4MB chunk. In traditional
> > block devices it is a read-modify-write: a read of 512 bytes (assuming
> > this is the sector size) fetches the sector that contains the offset,
> > the value is updated with the 64 bytes from offset 390, and then the
> > buffer is written out to disk. How does this map to the Sheepdog case?
> > What is the equivalent of sector size and how many bytes are
> > accumulated before a write is pushed out to disk?
> >
>
> Sheepdog uses the system calls pread/pwrite to do the object
> read/write. So in your case, the sheep daemon just finds the target
> replicas and sends the requests to those nodes. The sheep daemons
> running on those nodes will parse the requests and finally call
> pread/pwrite.
>
> So the read/modify/write operation for a partial read/write relies on
> the underlying file system, I guess normally as you described above.
>
> Thanks,
> Yuan
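
For reference, the splitting guessed at above (a request that straddles
4MB chunks being cut into one request per chunk) can be sketched roughly
as follows. This is only an illustration of the arithmetic, not the actual
QEMU or Sheepdog code: split_request and CHUNK_SIZE are hypothetical names,
and the 4MB chunk size is taken from the question in this thread.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4MB chunk size mentioned in the thread


def split_request(offset, length, chunk_size=CHUNK_SIZE):
    """Split a byte range into per-chunk pieces.

    Returns a list of (chunk_index, offset_in_chunk, piece_length) tuples.
    Hypothetical helper for illustration; not the QEMU or Sheepdog code.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    end = offset + length
    while offset < end:
        chunk_index = offset // chunk_size
        offset_in_chunk = offset % chunk_size
        # A piece never extends past the end of its chunk or of the request.
        piece_length = min(chunk_size - offset_in_chunk, end - offset)
        pieces.append((chunk_index, offset_in_chunk, piece_length))
        offset += piece_length
    return pieces


if __name__ == "__main__":
    # The 64-byte write at offset 390 from the thread stays in chunk 0.
    print(split_request(390, 64))
    # A 64-byte write starting 16 bytes before the first 4MB boundary
    # straddles chunks 0 and 1, so it becomes two pieces.
    print(split_request(CHUNK_SIZE - 16, 64))
```

Each piece could then go to the replicas of its own chunk. Whether the
all-or-nothing semantics still hold across the pieces is exactly the open
question here; the sketch does not answer it.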