[Sheepdog] [Qemu-devel] coroutine bug?, was Re: [PATCH] sheepdog: use coroutines

Stefan Hajnoczi stefanha at gmail.com
Mon Jan 2 23:38:11 CET 2012


On Mon, Jan 2, 2012 at 3:39 PM, Christoph Hellwig <hch at lst.de> wrote:
> On Fri, Dec 30, 2011 at 10:35:01AM +0000, Stefan Hajnoczi wrote:
>> If you can reproduce this bug and suspect coroutines are involved then I
>
> It's entirely reproducible.
>
> I've played around a bit and switched from the ucontext to the gthreads
> coroutine implementation.  The result seems odd, but starts to make sense.
>
> Running the workload I now get the following message from qemu:
>
> Co-routine re-entered recursively
>
> and the gdb backtrace looks like:
>
> (gdb) bt
> #0  0x00007f2fff36f405 in *__GI_raise (sig=<optimized out>)
>    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
> #1  0x00007f2fff372680 in *__GI_abort () at abort.c:92
> #2  0x00007f30019a6616 in qemu_coroutine_enter (co=0x7f3004d4d7b0, opaque=0x0)
>    at qemu-coroutine.c:53
> #3  0x00007f30019a5e82 in qemu_co_queue_next_bh (opaque=<optimized out>)
>    at qemu-coroutine-lock.c:43
> #4  0x00007f30018d5a72 in qemu_bh_poll () at async.c:71
> #5  0x00007f3001982990 in main_loop_wait (nonblocking=<optimized out>)
>    at main-loop.c:472
> #6  0x00007f30018cf714 in main_loop () at /home/hch/work/qemu/vl.c:1481
> #7  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
>    at /home/hch/work/qemu/vl.c:3479
>
> Adding some printks suggests this happens when calling add_aio_request from
> aio_read_response when either delaying creates or updating metadata,
> although not every time one of these cases happens.
>
> I've tried to understand how the recursive calling happens, but unfortunately
> the whole coroutine code lacks any sort of documentation about how it should
> behave or what it asserts about its callers.
>
>> I don't have a sheepdog setup here but if there's an easy way to
>> reproduce please let me know and I'll take a look.
>
> With the small patch below applied to the sheepdog source I can reproduce
> the issue on my laptop using the following setup:
>
> for port in 7000 7001 7002; do
>    mkdir -p /mnt/sheepdog/$port
>    /usr/sbin/sheep -p $port -c local /mnt/sheepdog/$port
>    sleep 2
> done
>
> collie cluster format
> collie vdi create test 20G
>
> then start a qemu instance that uses the sheepdog volume with the
> following drive and device lines:
>
>        -drive if=none,file=sheepdog:test,cache=none,id=test \
>        -device virtio-blk-pci,drive=test,id=testdev \
>
> finally, in the guest run:
>
>        dd if=/dev/zero of=/dev/vdX bs=67108864 count=128 oflag=direct

Thanks for these instructions.  I can reproduce the issue here.

The way BDRVSheepdogState->co_recv and ->co_send work seems suspicious.
The code adds select(2) read/write callback functions on the sheepdog
socket file descriptor.  When the socket becomes writeable or readable,
the co_send or co_recv coroutine is entered.  So far, so good: this is
how a coroutine is integrated into QEMU's main loop.
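
For reference, the shape of that integration looks roughly like this (a
minimal sketch from memory, not the literal driver code; recv_ready()
is a stand-in name for the read callback the patch installs):

    /* Sketch: resuming a coroutine from the main loop's fd callback.
     * recv_ready() is hypothetical; the sheepdog patch registers an
     * equivalent read callback for the socket with the main loop. */
    static void recv_ready(void *opaque)
    {
        BDRVSheepdogState *s = opaque;

        if (!s->co_recv) {
            /* first wakeup: create the response-parsing coroutine */
            s->co_recv = qemu_coroutine_create(aio_read_response);
        }

        /* resume it each time the socket becomes readable */
        qemu_coroutine_enter(s->co_recv, opaque);
    }

The coroutine runs until it yields (normally because the socket would
block) and the read callback re-enters it on the next readable event.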

The problem is that this patch mixes the two directions.  The co_recv
coroutine runs aio_read_response(), which invokes send_pending_req(),
which in turn invokes add_aio_request().  That function isn't suitable
for co_recv's context because it actually sends data and hits a few
blocking (yield) points.  It takes a coroutine mutex - but the
select(2) read callback is still in place.  We're still in the
aio_read_response() call chain except we're not reading at all, we're
trying to write!  And we'll get spurious wakeups whenever there is
data readable on the socket.
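
In other words, the failing path is roughly this call chain (heavily
condensed; the coroutine mutex stands in for any of the yield points
in add_aio_request()):

    /* running inside the co_recv coroutine: */
    aio_read_response()
      -> send_pending_req()
        -> add_aio_request()
          -> qemu_co_mutex_lock(&s->lock);   /* may yield here */

    /* co_recv is now suspended on the mutex, but the select(2) read
     * callback for the socket is still registered.  Readable data
     * makes the main loop enter s->co_recv again, racing with the
     * mutex unlock path that will also try to enter it. */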

So the co_recv coroutine has two things in the system that will try to
enter it:
1. The select(2) read callback on the sheepdog socket.
2. The add_aio_request() blocking operations, including the coroutine
   mutex.

This is bad: a yielded coroutine should have exactly one thing that
will enter it.  It's rare that it makes sense to have multiple things
entering a coroutine.
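
That is also exactly what the abort message is telling us:
qemu_coroutine_enter() records a caller when it enters a coroutine and
refuses a second entry while the first is still outstanding.  From
memory, the check at qemu-coroutine.c:53 is essentially:

    void qemu_coroutine_enter(Coroutine *co, void *opaque)
    {
        Coroutine *self = qemu_coroutine_self();

        if (co->caller) {
            /* a second wakeup source entered us while suspended */
            fprintf(stderr, "Co-routine re-entered recursively\n");
            abort();
        }
        co->caller = self;
        /* ... save opaque and switch to co's stack ... */
    }

With two wakeup sources racing, sooner or later one of them enters the
coroutine while co->caller is still set, and we hit the abort above.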

It's late here but I hope this gives Kazutaka some thoughts on what is
causing the issue with this patch.

Stefan


