[sheepdog-users] Sheepdog image fills only one node
Christoph Hellwig
hch at infradead.org
Mon Jul 2 09:18:57 CEST 2012
On Sat, Jun 30, 2012 at 12:53:13AM +0800, Liu Yuan wrote:
> On 06/29/2012 11:05 PM, Stefan Priebe - Profihost AG wrote:
> > # kvm ... -drive
> > file=sheepdog:10.0.255.100:7000:testsheep,if=none,id=drive-virtio0,cache=writeback,aio=native
> > -device virtio-blk-pci,drive=drive-virtio0,id=virtio0
>
> cache=writeback enables the object cache. See the wiki about the object
> cache, which only flushes dirty bits to the cluster.
This is just another reason why enabling the object cache by default is
wrong. In my opinion making sure it is not enabled by default is a high
priority for the 0.4.0 release.
Reasons:
- While qemu still defaults to cache=writethrough, all management tools
  that people actually use (most importantly libvirt) change that to
  cache=none (see the example invocation below)
- With cache=none the new sheepdog version gets semantics that
  people absolutely do not expect from a distributed block storage
  system:
      (1) data is not striped over different nodes for the actual
          write, so big streaming writes do not get any scale out
      (2) data is not written back to the cluster until a cache
          flush happens, causing havoc when a node crashes and the
          VM has to be restarted on a different node
That being said, I really like the object cache for some specific
workloads: mostly in complete read-only mode for VDI COW base images,
and even in write mode for cloud deployments, for example as a replacement
for the default OpenStack semantics where images get downloaded to
a local host and executed there. But to make it useful for these
use cases the cache needs to default to off, and there needs to be a
sheep-side configuration to enable it per VDI. A good example is
when we want to enable it for the base image but not the overlay, which
isn't even possible from qemu even if we wanted to go through all the
hoops instead of making things work out of the box.
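To make the base image/overlay split concrete, the usual sheepdog clone
workflow looks roughly like the sketch below (assuming the host and port
from the example above and hypothetical VDI names 'base' and 'overlay');
the object cache would ideally be enabled only for the read-only
snapshot side:

   # freeze the current state of the base image as a read-only snapshot
   qemu-img snapshot -c base-snap sheepdog:10.0.255.100:7000:base

   # create a writable overlay (copy-on-write clone) backed by that snapshot
   qemu-img create -b sheepdog:10.0.255.100:7000:base:base-snap sheepdog:10.0.255.100:7000:overlay

Writes go to the overlay VDI only; the snapshot is read-mostly and is the
part that would actually benefit from a node-local object cache.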