[sheepdog-users] Stability problems with kvm using a remote sheepdog volume

David Douard david.douard at logilab.fr
Wed Jun 13 15:37:52 CEST 2012


On 12/06/2012 07:04, MORITA Kazutaka wrote:
> At Mon, 11 Jun 2012 15:22:44 +0200,
> David Douard wrote:
>>
>> On 09/06/2012 12:39, David Douard wrote:
>>> On 08/06/2012 16:48, MORITA Kazutaka wrote:
>>>> On Fri, Jun 8, 2012 at 9:41 PM, David Douard <david.douard at logilab.fr> wrote:
>>>>> Hi,
>>>>>
>>>>> I still have very serious stability problems with kvm when using remote
>>>>> sheepdog access.
>>>>>
>>>>> I filed a bug on github about this:
>>>>>
>>>>>  https://github.com/collie/sheepdog/issues/26
>>>>>
>>>>> Are there any other people having similar problems? What can I do to
>>>>> identify the problem and try to fix it?
>>>> Hi David,
>>>>
>>> Hi,
>>>
>>>> I'm working on fixing a race condition in the qemu sheepdog block driver.
>>>> I guess you are hitting the same problem.  I've pushed some half baked fixes to
>>>>   git://github.com/kazum/qemu.git
>>>>
>>>> Can you try this tree?
>>> I will.
>>>
>>> Thanks,
>>> David
>>
>> Humm, spoke a bit too quick.
>>
>> The kvm does not segfault any more, but the sheepdog volume generates
>> errors (in the guest) when writing. I have many
>>
>>   end_request: I/O error, dev vdc, sector 0
>>
>> in the syslog of the guest (vdc being the block device served by sheepdog).
>>
>> Running "zcav -w", the guest froze for a while, and finally produced
>> the traceback below.
> 
> I updated the qemu tree, can you try again?  I also recommend
> updating your sheepdog code to the latest version, because a fatal
> network I/O problem was fixed last week.


My objective is to be able to propose a patch to debian and ubuntu so
they can fix the qemu-kvm they ship (patched 1.0.1), so it can be
quickly made available to everyone, and sheepdog can actually be used on
these platforms (which it cannot for now).

So I'd like to fix this issue in the kvm 1.0.1 tree.

Regarding sheepdog itself, I have a work in progress to make sheepdog
0.3.0 (and I expect 0.4.0 soon) available in debian (in backports ASAP).
If I find time, I'll try to see if I can backport the network I/O fix to
0.3.0 to provide more stable debian packages.

>>
>> If I can, I'd like to try to rebuild the kvm binary from the ubuntu
>> package, just applying the required patches to fix the race condition.
>> Kazataka, can you please point me to the strictly required changesets in
>> your git repo that I must apply as patches?

Sorry for the typo in your name,

> The required patches are:
>   54de366 sheepdog: avoid sleep while traversing pending_list
>   3585170 sheepdog: split outstanding list into inflight and pending
>   b319e0a sheepdog: create all aio_reqs before sending I/Os
>   fead1e7 sheepdog: restart I/O when socket becomes ready in do_co_req()
>   72eafcf sheepdog: fix dprintf format strings
> 

thanks,

> But I'm not sure you can apply them cleanly.  I think it is easier to
> copy block/sheepdog.c to your source tree.

I can't do that given my objective (a patch for kvm 1.0). As
expected, they do not apply, and I'm not sure I can find by myself
(having never dug into the kvm/qemu source code) a way to apply this fix to
the 1.0 tree.

I'll try, but I'd appreciate any help.
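
For anyone following along, here is a rough sketch of how the five commits
listed above could be exported as individual patch files before trying them
against the 1.0 tree. It only builds the git commands, it does not run them.
The function name, the output directory "sheepdog-patches" and the assumption
that the list is given newest first (as "git log" would print it) are mine,
not something confirmed in this thread.

```python
"""Build (but do not run) git commands to export the sheepdog commits."""

# Commits exactly as listed in the thread, in the order they were given.
REQUIRED_COMMITS = [
    ("54de366", "sheepdog: avoid sleep while traversing pending_list"),
    ("3585170", "sheepdog: split outstanding list into inflight and pending"),
    ("b319e0a", "sheepdog: create all aio_reqs before sending I/Os"),
    ("fead1e7", "sheepdog: restart I/O when socket becomes ready in do_co_req()"),
    ("72eafcf", "sheepdog: fix dprintf format strings"),
]


def export_commands(commits, out_dir="sheepdog-patches", newest_first=True):
    """Return one 'git format-patch' command per commit, oldest commit first.

    ASSUMPTION: newest_first=True means the input list is in 'git log' order,
    so it is reversed to get the order in which patches would be applied.
    """
    ordered = list(reversed(commits)) if newest_first else list(commits)
    commands = []
    for number, (sha, _subject) in enumerate(ordered, start=1):
        commands.append([
            "git", "format-patch", "-1",      # export exactly one commit
            "--start-number", str(number),    # keep files in apply order
            "-o", out_dir,                    # hypothetical output directory
            sha,
        ])
    return commands


if __name__ == "__main__":
    # To be run from inside a clone of git://github.com/kazum/qemu.git
    for command in export_commands(REQUIRED_COMMITS):
        print(" ".join(command))
```

Each resulting file can then be tried against the unpacked 1.0.1 source with
"git apply --check <file>" or "patch -p1 --dry-run < <file>". Rejected hunks
are to be expected, since the patches were written against a newer
block/sheepdog.c.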

Thanks,

David

> 
> Thanks,
> 
> Kazutaka
