[sheepdog-users] sheepdog replication got stuck

Hitoshi Mitake mitake.hitoshi at gmail.com
Mon Jan 6 07:41:36 CET 2014


Hi Gerald,

I have a few questions about your setup that would help narrow down the problem.

At Sat, 4 Jan 2014 20:20:04 +0100,
Gerald Richter - ECOS wrote:
> 
> Hi,
> 
> I have done further investigation on that issue.
> 
> As long as I import the images one at a time, everything works as expected. But when I try to import all images at the same time, either the replication to the other node gets stuck and the import never finishes, or sheep gets a segmentation fault (see below).
> 
> Looks to me like some kind of race condition in the thread handling.
> 
> Upgrading to 0.7.6 doesn't change anything.
> 
> Sheep is running with the following options:
> 
> /usr/sbin/sheep --pidfile /var/run/sheep.pid -l 6 --nosync /var/lib/sheepdog/ /var/lib/sheepdog//disc1/data,/var/lib/sheepdog//disc2/data -w dir=/var/lib/sheepdog//cache size=100000
> 
> Regards
> 
> Gerald
> 
> Crash from 0.7.5:
> 
> Jan 01 11:03:58  EMERG [gway 82801] crash_handler(250) sheep exits unexpectedly (Segmentation fault).
> Jan 01 11:04:03  EMERG [gway 82801] sd_backtrace(843) sheep.c:252: crash_handler
> Jan 01 11:04:03  EMERG [gway 82801] sd_backtrace(857) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7f491ba3502f]
> Jan 01 11:04:03  EMERG [gway 82801] sd_backtrace(843) object_cache.c:643: find_object_cache
> Jan 01 11:04:03  EMERG [gway 82801] sd_backtrace(843) object_cache.c:1098: bypass_object_cache
> Jan 01 11:04:04  EMERG [gway 82801] sd_backtrace(843) gateway.c:39: gateway_read_obj
> Jan 01 11:04:04  EMERG [gway 82801] sd_backtrace(843) ops.c:1337: do_process_work
> Jan 01 11:04:04  EMERG [gway 82801] sd_backtrace(843) work.c:294: worker_routine
> Jan 01 11:04:04  EMERG [gway 82801] sd_backtrace(857) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7f491ba2cb4f]
> Jan 01 11:04:04  EMERG [gway 82801] sd_backtrace(857) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7f491b0e9a7c]
> Jan 01 11:04:19  ERROR [main] crash_handler(490) sheep pid 81829 exited unexpectedly.
> 

1. Can you reproduce this problem without the object cache (the -w option in your command line)?
2. How do you "import all images at the same time"? By executing qemu-img in parallel?
3. Can you check the CPU usage of the sheep daemon when it hangs?
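
To make question 2 concrete, here is a minimal sketch of what a parallel import could look like. It is only an assumption about your procedure: it supposes each image is imported with "qemu-img convert <image> sheepdog:<vdi>", and the helper names (build_import_command, import_all) and the image paths in the usage note are hypothetical. Setting max_workers to 1 gives the sequential case that works for you; a larger value gives the parallel case.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_import_command(image_path, vdi_name):
    # Assumption: images are imported with "qemu-img convert" into a
    # "sheepdog:<vdi>" target, which requires QEMU built with sheepdog support.
    return ["qemu-img", "convert", image_path, "sheepdog:" + vdi_name]


def import_all(images, max_workers=1, runner=subprocess.call):
    """Run one import per (image_path, vdi_name) pair.

    Returns a dict mapping each VDI name to the exit code of its import.
    max_workers=1 imports sequentially; larger values import in parallel.
    """
    commands = [(vdi, build_import_command(path, vdi)) for path, vdi in images]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so codes line up with commands.
        codes = list(pool.map(lambda item: runner(item[1]), commands))
    return {vdi: code for (vdi, _), code in zip(commands, codes)}
```

For example, import_all([("/tmp/vm1.img", "vm1"), ("/tmp/vm2.img", "vm2")], max_workers=2) would start both imports at once. If your procedure differs from this, please describe it.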

Thanks,
Hitoshi


