[sheepdog-users] sheepdog replication got stuck
Liu Yuan
namei.unix at gmail.com
Fri Dec 27 08:50:36 CET 2013
On Mon, Dec 23, 2013 at 02:15:55PM +0100, Gerald Richter - ECOS wrote:
> Hi,
>
> >
> > So what is the problem? Does 'qemu-img convert' hang and never finish?
> >
>
> On the first host (where the qemu-img runs) I have:
>
> vm-61025-disk-1 0 20 GB 15 GB 0.0 MB 2013-12-19 09:52 bb7a25 3
>
> on the second one I have:
>
> vm-61025-disk-1 0 20 GB 36 MB 0.0 MB 2013-12-19 09:52 bb7a25 3
>
> Regardless of whether qemu-img hangs, I expect the second machine to show the same "Used" value as the first one (after the time it takes to push the cached content over the network).
>
> The other question is why qemu-img hangs. I guess (but this may be wrong) that it issued a flush at the end of the import and is now waiting until the cache has been flushed to all nodes. That is how I understand from the docs that it should work.
>
> At least, running strace and lsof on the qemu-img process shows that it is waiting for the sheepdog server (select on the sheepdog socket connection).
>
> Maybe it's relevant that I run qemu 1.4, because that is what the distribution I use (Proxmox) ships; it contains a bunch of patches, so it's not easy to compile from source.
>
This QEMU version is very old and its sheepdog code has known bugs. I'd suggest
compiling QEMU yourself; it will be much more stable and, more importantly,
include auto-reconnection support for sheepdog. Use the following commands:
$ git clone https://github.com/sheepdog/qemu.git # this is actually based on the latest official QEMU with minor fixes
$ cd qemu
$ ./configure --target-list=x86_64-softmmu
$ make # build as a normal user so the source tree does not end up owned by root
$ sudo make install
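After installing, it may be worth confirming that the qemu-img found on the PATH is the freshly built one and that it was built with the sheepdog driver. The following is only a sketch: it assumes that "qemu-img --help" prints a "Supported formats:" line (as QEMU releases of this era do), and has_format is a hypothetical helper name, not part of QEMU or sheepdog.

```shell
# Sketch only: check whether a qemu-img binary lists a given block driver.
# Assumption: "qemu-img --help" prints a line starting with
# "Supported formats:" followed by space-separated driver names.

# has_format is a hypothetical helper: $1 is the driver name, and the
# help text is read from stdin. Returns 0 if the driver is listed.
has_format() {
    grep '^Supported formats:' | tr ' ' '\n' | grep -qx "$1"
}

if command -v qemu-img >/dev/null 2>&1; then
    # Show which binary is picked up (a default build installs under /usr/local).
    command -v qemu-img
    if qemu-img --help | has_format sheepdog; then
        echo "sheepdog driver present"
    else
        echo "sheepdog driver missing"
    fi
fi
```

If the distribution's binary is still found first, adjusting PATH or calling the new binary by its full path should pick up the rebuilt version.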
Thanks
Yuan