[Sheepdog] Block write size

Liu Yuan namei.unix at gmail.com
Fri Dec 16 10:10:19 CET 2011


On 12/16/2011 04:57 PM, John Ryan wrote:

> Ok, I got it. All writes have all-or-nothing semantics. What happens to
> writes that straddle chunks? How is that case handled? Because you could
> be writing to two different sets of replicas.
> 


Read/write requests are first handled in the qemu block layer. I guess
requests that straddle chunks will be split into several smaller ones,
one per chunk. Kazum, would you confirm this?
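
To make the guess above concrete, here is a minimal sketch of how a request
could be split at 4MB chunk boundaries. It is only an illustration, not the
actual qemu or sheep code; the name split_request and the tuple layout are
made up for this example, and the 4MB size is the chunk size mentioned in
this thread.

```python
CHUNK_SIZE = 4 * 1024 * 1024  # 4MB chunk size, as mentioned in the thread


def split_request(offset, length, chunk_size=CHUNK_SIZE):
    """Split a byte range into per-chunk pieces.

    Returns a list of (chunk_index, offset_in_chunk, piece_length) tuples.
    Hypothetical helper: the name and return shape are assumptions.
    """
    pieces = []
    end = offset + length
    while offset < end:
        chunk_index = offset // chunk_size
        offset_in_chunk = offset % chunk_size
        # Never go past the end of the current chunk or of the request.
        piece_length = min(chunk_size - offset_in_chunk, end - offset)
        pieces.append((chunk_index, offset_in_chunk, piece_length))
        offset += piece_length
    return pieces
```

With this sketch, a 64-byte write at offset 390 stays in chunk 0 as a single
piece, while a 300-byte write that starts 100 bytes before the end of chunk 0
becomes two pieces, one for chunk 0 and one for chunk 1. Each piece could
then go to a different set of replicas.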

Thanks,
Yuan

> On Thu, Dec 15, 2011 at 11:52 PM, Liu Yuan <namei.unix at gmail.com> wrote:
> 
>     On 12/16/2011 03:12 PM, John Ryan wrote:
> 
>     > Trying to get my feet wet with Sheepdog. I understand that all vdisks
>     > are chunked at a 4MB boundary. Suppose a write of 64 bytes needs to
>     > happen at offset 390. This maps to the first 4MB chunk. In traditional
>     > block devices it is a read-modify-write: the 512-byte sector (assuming
>     > this is the sector size) that contains the offset is read, the 64 bytes
>     > at offset 390 are updated, and then the buffer is written out to disk.
>     > How does this map to the Sheepdog case? What is the equivalent of
>     > sector size and how many bytes are accumulated before a write is
>     > pushed out to disk?
>     >
> 
> 
>     Sheepdog uses the pread/pwrite system calls to do object reads and
>     writes. So in your case, the sheep daemon just finds the target replicas
>     and sends the requests to the nodes. The sheep daemons running on those
>     nodes will parse the requests and finally call pread/pwrite.
> 
>     So these operations (read/modify/write) for partial reads and writes
>     rely on the underlying file system, I guess normally as you described
>     above.
> 
>     Thanks,
>     Yuan
> 
> 
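
For the 64-byte write at offset 390 discussed in the quoted message, the
following sketch shows what a pwrite-based partial write looks like from user
space. It assumes a POSIX system and uses an ordinary temporary file as a
stand-in for an object file; write_partial, read_partial and demo are
hypothetical names for this example, not sheep daemon functions.

```python
import os
import tempfile


def write_partial(path, offset, data):
    """Write data at a byte offset, leaving the rest of the file untouched."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        written = 0
        # pwrite may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.pwrite(fd, data[written:], offset + written)
    finally:
        os.close(fd)


def read_partial(path, offset, length):
    """Read up to length bytes starting at a byte offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def demo():
    """Fill one 512-byte 'sector' with zeros, then overwrite 64 bytes at 390."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "object")  # stand-in for an object file
        write_partial(path, 0, b"\0" * 512)
        write_partial(path, 390, b"x" * 64)
        return read_partial(path, 0, 512)
```

The caller never reads the surrounding sector itself. It passes only the 64
bytes and the offset, and any sector-level read-modify-write is left to the
kernel and the underlying file system.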




