[sheepdog-users] object cache questions

Maxim Terletskiy terletskiy at emu.ru
Mon Feb 17 13:51:13 CET 2014


I ran some more tests. A VM using writethrough (WT) cache dies with this
kernel message:
kernel: qemu-kvm[12449] general protection ip:7fdb692af44c sp:7fdb73b6d4e0 error:0 in qemu-kvm[7fdb690ce000+431000]
There are no messages in the sheep log.
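
For anyone who wants to compare crashes across hosts, the fault line above
can be reduced to an offset inside the qemu-kvm mapping by subtracting the
mapping base from the instruction pointer. The following is only a rough
sketch: the function name fault_offset and the regular expression are my
own illustration, and it assumes the kernel line has the same layout as the
one quoted here.

```python
import re

# Kernel fault line as quoted above, joined onto a single line.
LINE = ("kernel: qemu-kvm[12449] general protection ip:7fdb692af44c "
        "sp:7fdb73b6d4e0 error:0 in qemu-kvm[7fdb690ce000+431000]")

# Hypothetical pattern: it only covers the "ip:... in name[base+size]" layout
# seen in this message, not every form the kernel can print.
PATTERN = re.compile(
    r"ip:(?P<ip>[0-9a-f]+)\s.*\sin\s"
    r"(?P<name>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (mapping name, offset of the faulting ip inside that mapping)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a fault message")
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    # Sanity check: the ip must fall inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("ip is outside the reported mapping")
    return match.group("name"), offset


if __name__ == "__main__":
    name, offset = fault_offset(LINE)
    print(name, hex(offset))
```

If two crashes report the same offset, they most likely hit the same
instruction in the same build of the binary, which is useful to mention in
a bug report.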

VMs with writeback cache work fine even under heavy load.

Setup: sheep version 0.7.6, qemu 1.7.0, using zookeeper; the cluster was
formatted with "--copies=2" and uses dual NICs.

Andrew J. Hobbs wrote to me that his tests with WT cache were also
unsuccessful and ended with similar results.

As Andrew observed, "nodiratime" is implied by the "noatime" option of
ext4. So it looks like my previous assumption that the "nodiratime" mount
option was the problem was wrong. The problem is in writethrough cache mode.
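
To make the point about the mount options concrete, here is a small sketch
that treats "nodiratime" as redundant whenever "noatime" is also present,
so both option strings from this thread reduce to the same thing. The
function name normalize_atime_options is hypothetical and the code only
manipulates option strings; it does not inspect or change any real mount.

```python
def normalize_atime_options(options):
    """Drop "nodiratime" when "noatime" is present, since noatime implies it.

    Returns (normalized option string, whether nodiratime was redundant).
    """
    opts = [o.strip() for o in options.split(",") if o.strip()]
    redundant = "noatime" in opts and "nodiratime" in opts
    if redundant:
        opts = [o for o in opts if o != "nodiratime"]
    return ",".join(opts), redundant


if __name__ == "__main__":
    for text in ("noatime,nodiratime,user_xattr", "noatime,user_xattr"):
        print(text, "->", normalize_atime_options(text))
```

Both option strings normalize to "noatime,user_xattr", which is consistent
with the crashes not being caused by "nodiratime" itself.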

I'm curious whether anyone is successfully using WT cache with
sheepdog. Maybe I'm doing something wrong?

P.S.: Updating to sheep 0.7.7 changes nothing. :(


05.02.2014 14:19, Maxim Terletskiy wrote:
> We are using the cache on separate devices. The root partition is
> mounted without "user_xattr"; the cache is now mounted with
> "noatime,user_xattr". I checked once again on another compute node. VMs
> still die (qemu segfaults) if the object cache is mounted with
> "noatime,nodiratime,user_xattr" options, even with qemu 1.7.0. Very strange.
>
> Thank you very much for the information about auto reconnect. We will
> update qemu to 1.7.0 now.
>
> What type of cache are you using (wb/wt)? Are you using SSDs for the
> object cache in raid1/10 or as separate disks/raid0? Have you dealt with
> a cache device failure?
>
> 03.02.2014 18:29, Andrew J. Hobbs wrote:
>> http://lwn.net/Articles/245002/
>>
>> noatime automatically implies nodiratime.  I'm curious as to what else
>> might be going on.  Are you running the cache on the same drive as the
>> sheepdog store?  Also, updating to Qemu 1.7 will support the automatic
>> reconnect so your VMs don't spontaneously die.
>>
>> Our production cluster uses ext4 with noatime, user_xattr on 0.7.6 with
>> Qemu 1.7.  I only run object cache on machines with a SSD (3 of the 6
>> nodes), and separate from sheepdog data disk. Performance is very good.
>> I have a dedicated virtual machine for user home directories over nfs,
>> which hits 50MB/s write, 80MB/s read over gigabit interconnects.
>>
>> What I definitely recommend is do not use btrfs with sheepdog, the IO
>> ops per second fall through the floor while writing data.  I'd look to
>> other potential issues as to why you have nodes dropping out.
>>
>> On 02/01/2014 03:09 PM, Maxim Terletskiy wrote:
>>> Hi everybody!
>>>
>>> First of all I'd like to share my experience with object cache usage.
>>> I noticed that with the object cache on an ext4 volume mounted with
>>> "noatime,nodiratime,user_xattr" mount options, my VMs die shortly
>>> after start (qemu processes crash with segfaults). I spent some
>>> days reading the wiki, searching the net and rebuilding qemu in attempts
>>> to understand the reason before I found that "nodiratime" is the root
>>> of the evil. Now I'm using ext4 with "noatime,user_xattr" and it looks
>>> like everything is OK. I hope this information will be useful to somebody.
>>>
>>> Now I'm curious about object cache failover. What happens if the volume
>>> with the cache fails? Will sheep and the VMs live, or will they die?
>>>
>>> If they live, is there any way to replace the failed volume without
>>> restarting the sheep process (I found that when the sheep process
>>> restarts, the VMs connected to it die) or migrating the VMs to another host?
>>>
>>> Is there any difference in behavior in these cases between 0.7.6 and
>>> 0.8.0?
>>>
>>> P.S.: Using sheep version 0.7.6 and qemu 1.6.x.
>>
>>
>
>
>
