[sheepdog-users] object cache questions

Maxim Terletskiy
Wed Feb 5 11:19:39 CET 2014

We are using the cache on separate devices. The root partition is mounted
without "user_xattr"; the cache is now mounted with "noatime,user_xattr".
I checked once again on another compute node: VMs still die (qemu
segfaults) if the object cache is mounted with the
"noatime,nodiratime,user_xattr" options, even with qemu 1.7.0. Very strange.
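
For anyone who wants to compare nodes, here is a minimal sketch of how the
effective mount options could be read from text in /proc/mounts format. The
mount point /mnt/cache and the sample line are hypothetical, not our real
layout. The sketch only reports whether "nodiratime" is listed; "user_xattr"
may not be shown on kernels where it is the filesystem default, so its
absence from the list proves nothing.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of mount options for mount_point, or None if absent.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last matching entry: later mounts shadow earlier ones.
            found = set(fields[3].split(","))
    return found


def has_nodiratime(mounts_text, mount_point):
    """True if the mount point is listed with the nodiratime option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "nodiratime" in opts


# Hypothetical sample in /proc/mounts format, for illustration only.
SAMPLE = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /mnt/cache ext4 rw,noatime,nodiratime,user_xattr 0 0\n"
)

print(sorted(mount_options(SAMPLE, "/mnt/cache")))
print(has_nodiratime(SAMPLE, "/mnt/cache"))
```

On a real node the first argument would be the contents of /proc/mounts,
e.g. open("/proc/mounts").read(), and the second the actual cache path.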

Thank you very much for the information about auto reconnect. We will
update qemu to 1.7.0 now.

What type of cache are you using (wb/wt)? Are you using SSDs for the
object cache in raid1/10 or as separate disks/raid0? Have you dealt with
a cache device failure?

03.02.2014 18:29, Andrew J. Hobbs wrote:
> http://lwn.net/Articles/245002/
> noatime automatically implies nodiratime.  I'm curious as to what else
> might be going on.  Are you running the cache on the same drive as the
> sheepdog store?  Also, updating to Qemu 1.7 will support the automatic
> reconnect so your VMs don't spontaneously die.
> Our production cluster uses ext4 with noatime, user_xattr on 0.7.6 with
> Qemu 1.7.  I only run object cache on machines with an SSD (3 of the 6
> nodes), and separate from the sheepdog data disk. Performance is very good.
> I have a dedicated virtual machine for user home directories over nfs,
> which hits 50MB/s write, 80MB/s read over gigabit interconnects.
> What I definitely recommend is not to use btrfs with sheepdog: the IO
> ops per second fall through the floor while writing data.  I'd look to
> other potential issues as to why you have nodes dropping out.
> On 02/01/2014 03:09 PM, Maxim Terletskiy wrote:
>> Hi everybody!
>> First of all I'd like to share my experience with object cache usage.
>> I noticed that with the object cache on an ext4 volume mounted with
>> the "noatime,nodiratime,user_xattr" mount options, my VMs die shortly
>> after start (qemu processes crash with segfaults). I spent some
>> days reading the wiki, searching the net and rebuilding qemu in
>> attempts to understand the reason before I found that "nodiratime" is
>> the root of the evil. Now I am using ext4 with "noatime,user_xattr" and
>> it looks like everything is ok. I hope this information will be useful
>> for somebody.
>> Now I'm curious about object cache failover. What happens if the volume
>> holding the cache fails? Will sheep and the VMs live or will they die?
>> If they live, is there any way to replace the failed volume without
>> restarting the sheep process (I found that when the sheep process is
>> restarted, the VMs connected to it die) or migrating the VMs to another
>> host?
>> Is there any difference in behavior in these cases between 0.7.6 and
>> 0.8.0?
>> P.S.: Using sheep version 0.7.6 and qemu 1.6.x.

