[sheepdog-users] Stability problems with kvm using a remote sheepdog volume

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Jun 14 12:37:18 CEST 2012


At Thu, 14 Jun 2012 00:19:06 +0200,
David Douard wrote:
> 
> On 13/06/2012 19:28, David Douard wrote:
> [snip]
> >>>
> >>> I updated the qemu tree, can you try again?  I also recommend
> >>> updating your sheepdog code to the latest one because a fatal network
> >>> I/O problem was fixed last week.
> >>
> >>
> >> My objective is to be able to propose a patch to debian and ubuntu so
> >> they can fix the qemu-kvm they ship (patched 1.0.1), so it can be
> >> quickly made available to everyone, so sheepdog can actually be used on
> >> these platforms (which it cannot for now).
> >>
> >> So I'd like to fix this issue in the kvm 1.0.1 tree.
> >>
> >> Regarding sheepdog itself, I have a work in progress to make sheepdog
> >> 0.3.0 (and I expect 0.4.0 soon) available in debian (in backports ASAP).
> >> If I find time, I'll try to see if I can backport the network IO fix to
> >> 0.3.0 to provide more stable debian packages.
> >>
> >>>>
> >>>> If I can, I'd like to try to rebuild the kvm binary from the ubuntu
> >>>> package, just applying the required patches to fix the race condition.
> >>>> Kazataka, can you please point me to the strictly required changesets in
> >>>> your git repo that I must apply as patches?
> >>
> >> Sorry for the typo in your name,
> >>
> >>> The required patches are:
> >>>   54de366 sheepdog: avoid sleep while traversing pending_list
> >>>   3585170 sheepdog: split outstanding list into inflight and pending
> >>>   b319e0a sheepdog: create all aio_reqs before sending I/Os
> >>>   fead1e7 sheepdog: restart I/O when socket becomes ready in do_co_req()
> >>>   72eafcf sheepdog: fix dprintf format strings
> >>>
> >>
> >> thanks,
> >>
> >>> But I'm not sure you can apply them cleanly.  I think it is easier to
> >>> copy block/sheepdog.c to your source tree.
> >>
> >> I can't do that according to my objective (a patch for kvm 1.0). As
> >> expected, they do not apply, and I'm not sure I can find by myself
> >> (having never dug into the kvm/qemu source code) a way to apply this fix in
> >> the 1.0 tree.
> >>
> >> I'll try; however, I'd appreciate any help.
> > 
> > Ok I've applied b319e0a, 3585170 and 54de366 on the sources from ubuntu
> > qemu-kvm package (qemu-kvm-1.0+noroms-0ubuntu13) and rebuilt deb
> > packages for precise. I have a running kvm that seems stable for now
> > (dbench test OK, bonnie++ currently in progress) running on my laptop.
> > I've also launched these tests on my openstack cluster.
> 
> Here is the situation:
> 
> the kvm guest seems much more stable (sheepdog at 0ae0ad, kvm using
> ubuntu precise source package (1.0+noroms-0ubuntu13) on which I applied
> the 3 patches listed above). I could create a filesystem on it, I could
> run a complete bonnie++ test on it, but a pgbench (postgresql 9.1 with a
> cluster created on the sheepdog device) made the kvm segfault.

Are there any errors in sheep.log?  Could you generate and send a
stack trace of the kvm segfault?
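
One common way to get such a trace is to run gdb in batch mode against
the core file. The sketch below only assembles that command line; the
binary and core paths are placeholders rather than paths from this
thread, and it assumes core dumps are enabled (for example with
"ulimit -c unlimited") and that gdb is installed. Debug symbols for
the kvm binary make the resulting trace much more useful.

```python
import shlex


def gdb_backtrace_command(binary, core):
    """Build a gdb invocation that prints backtraces for all threads.

    Both arguments are hypothetical placeholders; adjust them to the
    actual kvm binary and the core file it left behind.
    """
    return [
        "gdb",
        "-batch",                      # run the given commands, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread
        binary,
        core,
    ]


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    cmd = gdb_backtrace_command("/usr/bin/kvm", "core")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on the host where kvm
crashed, with its output redirected to a file for sending.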

> 
> In this test, the sheepdog and the kvm are on the same machine.

I'll create the same environment and give it a try this weekend.  (I
don't have enough time to do it today or tomorrow, sorry.)

Thanks,

Kazutaka
