[sheepdog-users] cache crash test

Tue Jun 25 15:17:57 CEST 2013

I got sheep to crash.

Jun 25 15:00:32 [main] main(797) shutdown
Jun 25 15:04:23 [main] md_add_disk(161) /mnt/sheep/dsk02, nr 1
Jun 25 15:04:23 [main] send_join_request(1095) IPv4 ip:192.168.2.44 port:7000
Jun 25 15:04:23 [main] for_each_object_in_stale(403) /mnt/sheep/dsk02/.stale
Jun 25 15:04:23 [main] check_host_env(396) WARN: Allowed open files 1024 too small, suggested 1024000
Jun 25 15:04:23 [main] check_host_env(405) Allowed core file size 0, suggested unlimited
Jun 25 15:04:23 [main] main(790) sheepdog daemon (version 0.6.0_49_g00d4b07) started
Jun 25 15:04:23 [main] update_cluster_info(871) status = 4, epoch = 4, finished: 0
Jun 25 15:04:31 [main] sd_check_join_cb(1055) 192.168.2.45:7000: ret = 0x0, cluster_status = 0x4
Jun 25 15:04:31 [main] update_cluster_info(871) status = 4, epoch = 4, finished: 1
Jun 25 15:04:35 [main] sd_check_join_cb(1055) 192.168.2.47:7000: ret = 0x0, cluster_status = 0x1
Jun 25 15:04:35 [main] update_cluster_info(871) status = 1, epoch = 4, finished: 1
Jun 25 15:08:32 [gway 2820] wait_forward_request(211) fail a34c67000003a8, No object found
Jun 25 15:08:32 [oc_push 2762] push_cache_object(532) failed to push object No object found
Jun 25 15:08:32 [oc_push 2762] do_push_object(897) PANIC: push failed but should never fail
Jun 25 15:08:32 [oc_push 2762] crash_handler(181) sheep exits unexpectedly (Aborted).
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(834) sheep.c:183: crash_handler
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(848) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7fc8da35d02f]
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(848) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7fc8d9969474]
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(848) /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7fc8d996c6ef]
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(834) object_cache.c:897: do_push_object
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(834) work.c:243: worker_routine
Jun 25 15:08:32 [oc_push 2762] sd_backtrace(848) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7fc8da354b4f]
Jun 25 15:08:32 [gway 2813] wait_forward_request(211) fail a34c67000003ad, No object found
Jun 25 15:08:32 [oc_push 2763] push_cache_object(532) failed to push object No object found
Jun 25 15:08:32 [oc_push 2763] do_push_object(897) PANIC: push failed but should never fail
Jun 25 15:08:32 [main] crash_handler(487) sheep pid 2659 exited unexpectedly.

I was testing with elevator=deadline on the guest.
I stopped the cluster, also changed the elevator on node id 0 to
deadline, and rebooted that node.
I restarted sheep on all hosts, then started the guest on node id 0.
I ran dd twice (512M), then the qemu process froze.
In theory, no recovery was running after the cluster had been restarted.
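
For anyone trying to reproduce this, it may help to confirm which
elevator is actually active after the reboot. On Linux the active I/O
scheduler is shown in square brackets in
/sys/block/<device>/queue/scheduler. The following is only a minimal
sketch; the helper names and the device name "sda" are assumptions for
illustration, not something taken from my setup.

```python
import re


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if there is none."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_scheduler(device):
    """Read the active scheduler for a block device such as 'sda' (assumed name)."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return active_scheduler(f.read())


if __name__ == "__main__":
    # "sda" is a placeholder; substitute the disk backing the sheep store or guest.
    try:
        print(read_scheduler("sda"))
    except OSError as err:
        print("could not read scheduler:", err)
```

Running it on both the guest and node id 0 would show whether deadline
really took effect on each side.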

PS: I noticed the 'Allowed open files 1024 too small, suggested 1024000'
warning, which I thought I had fixed.
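
One possible reason the warning is still there: resource limits are per
process and inherited from whatever starts the daemon, so a limit raised
in a login shell does not necessarily apply to sheep. On Linux the
limits of a running process can be read from /proc/<pid>/limits. Below
is a minimal sketch of such a check; the function names are mine, and
the pid has to be supplied by hand.

```python
import sys


def parse_limit(limits_text, name="Max open files"):
    """Return (soft, hard) as strings for one row of /proc/<pid>/limits, or None."""
    for line in limits_text.splitlines():
        if line.startswith(name):
            # Columns after the limit name are: soft limit, hard limit, units.
            fields = line[len(name):].split()
            return fields[0], fields[1]
    return None


def read_limit(pid, name="Max open files"):
    """Read one limit of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_limit(f.read(), name)


if __name__ == "__main__":
    # Usage: python check_limits.py <pid of the sheep process>
    if len(sys.argv) == 2:
        print("open files:", read_limit(sys.argv[1]))
        print("core size: ", read_limit(sys.argv[1], "Max core file size"))
```

If the soft value for the sheep pid still reads 1024, the change was
made somewhere the daemon's start-up path does not pick up.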