MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:

> I'm not familiar with btrfs mount options, but if a raw image show a
> good performance on the same file system, I think this is a problem of
> Sheepdog. To be honest, I don't have the slightest idea why Sheepdog
> shows such a bad results in your environment.
[...]
> I think a barrier option makes a huge difference.

Hi Kazutaka.

I've just reformatted with ext4 and run two initial tests, with and
without barrier=0, using the existing triple-replicated configuration
and the default ext4 data=ordered mode.

With barriers on (default ext4), I still see 5-6MB/s write performance
on unallocated blocks, and around 10MB/s when rewriting those blocks
once they've been allocated.

With barriers off, this becomes more like 53MB/s on unallocated blocks
and 60MB/s when rewriting already allocated blocks.

This is a very dramatic difference, as you predicted! The btrfs results
are presumably 5-6MB/s because barriers are enabled by default there
too. (There was no difference between allocated and unallocated blocks
in btrfs, presumably because CoW behaviour means blocks always have to
be allocated afresh when writing to btrfs files, with the exception of
O_DIRECT access, which sheepdog doesn't use.)

Turning off barriers essentially means that write ordering to disk is
no longer guaranteed in a power-failure situation. Isn't this likely to
cause corruption on a sheepdog cluster in the same way as on a
traditional filesystem, or are there mechanisms in place to avoid this
effect at a higher level?

What is it that sheepdog does that makes barriers so prohibitively
expensive (a factor of 10+) on sheep file stores? Very heavy filesystem
metadata updates?

Cheers,

Chris.
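
For anyone wanting to reproduce the allocated-versus-unallocated comparison on a plain file, here is a rough sketch. The message above doesn't say which benchmark tool was used, so everything here is an assumption: the function names `write_pass` and `compare`, the file name `barrier_probe.dat`, and the 64MB / 1MB sizes are all hypothetical, and calling fsync after every block is a deliberately pessimistic pattern that does not model what sheepdog actually does. It only shows how the first pass (blocks being allocated) and the second pass (blocks already allocated) can be timed separately.

```python
import os
import time


def write_pass(path, total_bytes, block_size=1 << 20):
    """Write total_bytes to path in block_size chunks, fsync after each.

    Returns throughput in MB/s. The file is opened without O_TRUNC, so a
    second call on the same path rewrites blocks the first call allocated.
    """
    buf = b"\xa5" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            chunk = buf[: min(block_size, total_bytes - written)]
            written += os.write(fd, chunk)
            # fsync forces the data to stable storage; with barriers
            # enabled this is typically where the flush cost is paid.
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return (written / (1 << 20)) / max(elapsed, 1e-9)


def compare(directory, total_bytes=64 << 20):
    """Return (first_pass_MBps, rewrite_MBps) for a scratch file."""
    path = os.path.join(directory, "barrier_probe.dat")
    try:
        first = write_pass(path, total_bytes)   # blocks get allocated
        second = write_pass(path, total_bytes)  # blocks already allocated
    finally:
        if os.path.exists(path):
            os.remove(path)
    return first, second


if __name__ == "__main__":
    first, second = compare(".")
    print(f"first pass: {first:.1f} MB/s, rewrite: {second:.1f} MB/s")
```

Running it once on the filesystem mounted with default options and once with barrier=0 should give a comparable pair of numbers for each case, although the absolute figures will depend on the hardware and will not match a sheepdog workload.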