[Sheepdog] coroutine bug?, was Re: [PATCH] sheepdog: use coroutines

Christoph Hellwig hch at lst.de
Thu Dec 29 13:06:26 CET 2011


On Fri, Dec 23, 2011 at 02:38:50PM +0100, Christoph Hellwig wrote:
> FYI, this causes segfaults when doing large streaming writes when
> running against a sheepdog cluster which:
> 
>   a) has relatively fast SSDs
> 
> and
> 
>   b) uses buffered I/O.
> 
> Unfortunately I can't get a useful backtrace out of gdb.  When running just
> this commit I at least get some debugging messages:
> 
> qemu-system-x86_64: failed to recv a rsp, Socket operation on non-socket
> qemu-system-x86_64: failed to get the header, Socket operation on non-socket
> 
> but on least qemu these don't show up either.

s/least/latest/

Some more debugging.  Just for the call that eventually segfaults, s->fd
turns from its normal value (normally 13 for me) into 0.  This is entirely
reproducible in my testing, and given that the sheepdog driver never
assigns to that value except when opening the device, this seems to point
to an issue in the coroutine code to me.
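
For reference, the "Socket operation on non-socket" text in the messages
quoted above is what glibc prints for ENOTSOCK, which would be consistent
with recv() being handed a descriptor such as 0 that is not a socket.  A
minimal standalone sketch of that failure mode follows; it is not taken
from the sheepdog driver, and the helper names are made up for
illustration.  A pipe stands in for the non-socket descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call recv() on a descriptor that is known not to
 * be a socket (the read end of a pipe) and return the resulting errno,
 * or 0 if recv() unexpectedly succeeded.
 */
static int recv_errno_on_nonsocket(void)
{
    int pipefd[2];
    char buf[16];
    int err = 0;

    if (pipe(pipefd) < 0)
        return errno;

    /* MSG_DONTWAIT so the call cannot block on the empty pipe. */
    if (recv(pipefd[0], buf, sizeof(buf), MSG_DONTWAIT) < 0)
        err = errno;

    close(pipefd[0]);
    close(pipefd[1]);
    return err;
}

/* Hypothetical check: recv() on a non-socket fails with ENOTSOCK. */
static void expect_enotsock(void)
{
    assert(recv_errno_on_nonsocket() == ENOTSOCK);
}
```

Calling expect_enotsock() from a small main() should pass silently on
Linux.  This only shows why a zeroed fd would produce the messages above;
it says nothing about where the value gets overwritten.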


