[sheepdog-users] sheepdog replication got stuck
Gerald Richter - ECOS
richter at ecos.de
Mon Dec 23 14:15:55 CET 2013
> So what is the problem? Does 'qemu-img convert' hang so that it never finishes?
On the first host (where qemu-img runs) I have:
vm-61025-disk-1 0 20 GB 15 GB 0.0 MB 2013-12-19 09:52 bb7a25 3
On the second one I have:
vm-61025-disk-1 0 20 GB 36 MB 0.0 MB 2013-12-19 09:52 bb7a25 3
Regardless of whether qemu-img hangs, I expect the second machine to show the same "Used" value as the first one (after the time it takes to push the cached content over the network).
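To make the mismatch concrete, here is a minimal Python sketch that parses the two listing lines quoted above and compares their "Used" values. It assumes the columns are name, id, size, used, shared, creation time, VDI id and copies, exactly as pasted here (no leading flag column); the names parse_vdi_line and UNITS are made up for this illustration and are not part of sheepdog.

```python
# Sketch only: compares the "Used" column of the two lines quoted in this mail.
# Assumed column layout: name, id, size, used, shared, date, time, VDI id, copies.
UNITS = {"MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_vdi_line(line):
    """Parse one listing line in the layout quoted above (hypothetical helper)."""
    tok = line.split()

    def size(i):
        # A size is two tokens: a number followed by a unit, e.g. "15 GB".
        return int(float(tok[i]) * UNITS[tok[i + 1]])

    return {
        "name": tok[0],
        "size": size(2),
        "used": size(4),
        "shared": size(6),
        "created": tok[8] + " " + tok[9],
        "vdi_id": tok[10],
        "copies": int(tok[11]),
    }


host1 = parse_vdi_line("vm-61025-disk-1 0 20 GB 15 GB 0.0 MB 2013-12-19 09:52 bb7a25 3")
host2 = parse_vdi_line("vm-61025-disk-1 0 20 GB 36 MB 0.0 MB 2013-12-19 09:52 bb7a25 3")

# Same VDI on both hosts, but the "Used" values differ widely.
same_vdi = host1["vdi_id"] == host2["vdi_id"]
used_gap = host1["used"] - host2["used"]
print("same VDI:", same_vdi, "gap in bytes:", used_gap)
```

Running the same comparison again after some time would show whether the gap is shrinking, i.e. whether the cache is being pushed to the second node at all.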
The other question is why qemu-img hangs. My guess (which may be wrong) is that it issued a flush at the end of the import and is now waiting until the cache has been flushed to all nodes. That is how I understand the docs to describe it.
At least, running strace and lsof on the qemu-img process shows that it is waiting for the sheepdog server (a select on the sheepdog socket connection).
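For readers unfamiliar with what such a wait looks like, the following Python sketch reproduces the general pattern: a client sends a request and then sits in select() until the peer answers. This is not qemu's or sheepdog's code; the socket pair, the b"flush" payload and the wait_for_reply helper are all stand-ins invented for the illustration.

```python
import select
import socket


def wait_for_reply(sock, timeout):
    """Return True if sock becomes readable within timeout seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


# A connected socket pair stands in for the client/server connection.
client, server = socket.socketpair()

client.sendall(b"flush")  # hypothetical request, not the real sheepdog protocol

# The "server" stays silent, so the client's select() times out.
silent = wait_for_reply(client, 0.2)

# Once the server finally answers, the same select() call returns immediately.
server.sendall(b"ok")
answered = wait_for_reply(client, 0.2)

client.close()
server.close()
```

Without a timeout, the first select() would block indefinitely, which matches the state that strace reports for the qemu-img process.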
It may be relevant that I run qemu 1.4, because that is the version shipped with the distribution I use (Proxmox), and that build contains a bunch of patches, so it is not easy to compile from source.
But regardless of whether the qemu-img hang is due to an old qemu, I would expect the cache to get flushed to the second node over time, or am I wrong?