[sheepdog] zookeeper driver join panic

Liu Yuan namei.unix at gmail.com
Fri Jun 28 05:32:30 CEST 2013


This was reported to me by a user in a private message. The problem is that
after setting up 3 nodes with zookeeper, the 4th node always panics when it
tries to join. The log is:
Jun 27 22:11:22 [main] md_add_disk(161) /var/lib/sheepdog/obj, nr 1
Jun 27 22:11:22 [main] init_signal(173) register signal_handler for 12
Jun 27 22:11:22 [main] calculate_vdisks(177) /var/lib/sheepdog/obj has 128 vdisks, free space 84358889472
Jun 27 22:11:22 [main] init_disk_space(346) disk free space is 84358889472
Jun 27 22:11:22 [main] zk_init(1111) version 3.4.3, address 103.21.143.125:2181, timeout 30000
Jun 27 22:11:22 [main] zk_watcher(545) path:, type:-1
Jun 27 22:11:22 [main] zk_init_node(196) failed, path:/sheepdog, node exists
Jun 27 22:11:22 [main] zk_init_node(196) failed, path:/sheepdog/master, node exists
Jun 27 22:11:22 [main] zk_init_node(196) failed, path:/sheepdog/queue, node exists
Jun 27 22:11:22 [main] zk_init_node(196) failed, path:/sheepdog/member, node exists
Jun 27 22:11:22 [main] get_local_addr(592) found IPv4 address
Jun 27 22:11:22 [main] create_cluster(1303) zone id = 2173637991
Jun 27 22:11:22 [main] zk_node_exists(271) failed, path:/sheepdog/member/IPv4 ip:103.21.143.129 port:7000, no node
Jun 27 22:11:22 [main] zk_compete_master(729) start to compete master for the first time
Jun 27 22:11:22 [main] zk_create_seq_node(235) PANIC: failed, path:/sheepdog/master/, no children for ephemerals
Jun 27 22:11:22 [main] crash_handler(181) sheep exits unexpectedly (Aborted).
Jun 27 22:11:22 [main] sd_backtrace(834) sheep.c:183: crash_handler
Jun 27 22:11:22 [main] sd_backtrace(848) /lib64/libpthread.so.0(+0xf4ff) [0x7f35752154ff]
Jun 27 22:11:22 [main] sd_backtrace(848) /lib64/libc.so.6(gsignal+0x34) [0x7f3574a078a4]
Jun 27 22:11:22 [main] sd_backtrace(848) /lib64/libc.so.6(abort+0x174) [0x7f3574a09084]
Jun 27 22:11:22 [main] sd_backtrace(834) zookeeper.c:235: zk_create_seq_node
Jun 27 22:11:22 [main] sd_backtrace(834) zookeeper.c:793: zk_join
Jun 27 22:11:22 [main] sd_backtrace(834) group.c:1093: send_join_request
Jun 27 22:11:22 [main] sd_backtrace(834) group.c:1331: create_cluster
Jun 27 22:11:22 [main] sd_backtrace(834) sheep.c:732: main
Jun 27 22:11:22 [main] sd_backtrace(848) /lib64/libc.so.6(__libc_start_main+0xfc) [0x7f35749f3cdc]
Jun 27 22:11:22 [main] sd_backtrace(848) sheep() [0x403e28]
Jun 27 22:11:22 [main] __dump_stack_frames(744) cannot find gdb
Jun 27 22:11:22 [main] __sd_dump_variable(694) cannot find gdb
Jun 27 22:11:22 [main] crash_handler(487) sheep pid 2881 exited unexpectedly.
Jun 27 22:12:24 [main] md_add_disk(161) /var/lib/sheepdog/obj, nr 1
Jun 27 22:12:24 [main] init_signal(173) register signal_handler for 12
Jun 27 22:12:24 [main] calculate_vdisks(177) /var/lib/sheepdog/obj has 128 vdisks, free space 84358889472
Jun 27 22:12:24 [main] init_disk_space(346) disk free space is 84358889472
Jun 27 22:12:24 [main] zk_init(1111) version 3.4.3, address 103.21.143.125:2181, timeout 30000
Jun 27 22:12:24 [main] zk_watcher(545) path:, type:-1
Jun 27 22:12:24 [main] zk_init_node(196) failed, path:/sheepdog, node exists
Jun 27 22:12:24 [main] zk_init_node(196) failed, path:/sheepdog/master, node exists
Jun 27 22:12:24 [main] zk_init_node(196) failed, path:/sheepdog/queue, node exists
Jun 27 22:12:24 [main] zk_init_node(196) failed, path:/sheepdog/member, node exists
Jun 27 22:12:24 [main] get_local_addr(592) found IPv4 address
Jun 27 22:12:24 [main] create_cluster(1303) zone id = 2173637991
Jun 27 22:12:24 [main] zk_node_exists(271) failed, path:/sheepdog/member/IPv4 ip:103.21.143.129 port:7000, no node
Jun 27 22:12:24 [main] zk_compete_master(729) start to compete master for the first time
Jun 27 22:12:24 [main] zk_create_seq_node(235) PANIC: failed, path:/sheepdog/master/, no children for ephemerals
Jun 27 22:12:24 [main] crash_handler(181) sheep exits unexpectedly (Aborted).
Jun 27 22:12:24 [main] sd_backtrace(834) sheep.c:183: crash_handler
Jun 27 22:12:24 [main] sd_backtrace(848) /lib64/libpthread.so.0(+0xf4ff) [0x7f3c515614ff]
Jun 27 22:12:24 [main] sd_backtrace(848) /lib64/libc.so.6(gsignal+0x34) [0x7f3c50d538a4]
Jun 27 22:12:24 [main] sd_backtrace(848) /lib64/libc.so.6(abort+0x174) [0x7f3c50d55084]
Jun 27 22:12:24 [main] sd_backtrace(834) zookeeper.c:235: zk_create_seq_node
Jun 27 22:12:24 [main] sd_backtrace(834) zookeeper.c:793: zk_join
Jun 27 22:12:24 [main] sd_backtrace(834) group.c:1093: send_join_request
Jun 27 22:12:24 [main] sd_backtrace(834) group.c:1331: create_cluster
Jun 27 22:12:24 [main] sd_backtrace(834) sheep.c:732: main
Jun 27 22:12:24 [main] sd_backtrace(848) /lib64/libc.so.6(__libc_start_main+0xfc) [0x7f3c50d3fcdc]
Jun 27 22:12:24 [main] sd_backtrace(848) sheep() [0x403e28]
Jun 27 22:12:24 [main] __dump_stack_frames(744) cannot find gdb
Jun 27 22:12:24 [main] __sd_dump_variable(694) cannot find gdb
Jun 27 22:12:24 [main] crash_handler(487) sheep pid 3091 exited unexpectedly.
Jun 27 22:22:03 [main] md_add_disk(161) /var/lib/sheepdog/obj, nr 1
Jun 27 22:22:03 [main] init_signal(173) register signal_handler for 12
Jun 27 22:22:03 [main] calculate_vdisks(177) /var/lib/sheepdog/obj has 128 vdisks, free space 84358889472
Jun 27 22:22:03 [main] init_disk_space(346) disk free space is 84358889472
Jun 27 22:22:03 [main] zk_init(1111) version 3.4.3, address 103.21.143.129:2181, timeout 30000
Jun 27 22:22:03 [main] zk_watcher(545) path:, type:-1
Jun 27 22:22:03 [main] zk_init_node(196) failed, path:/sheepdog, node exists
Jun 27 22:22:03 [main] zk_init_node(196) failed, path:/sheepdog/master, node exists
Jun 27 22:22:03 [main] zk_init_node(196) failed, path:/sheepdog/queue, node exists
Jun 27 22:22:03 [main] zk_init_node(196) failed, path:/sheepdog/member, node exists
Jun 27 22:22:03 [main] get_local_addr(592) found IPv4 address
Jun 27 22:22:03 [main] create_cluster(1303) zone id = 2173637991
Jun 27 22:22:03 [main] zk_node_exists(271) failed, path:/sheepdog/member/IPv4 ip:103.21.143.129 port:7000, no node
Jun 27 22:22:03 [main] zk_compete_master(729) start to compete master for the first time
Jun 27 22:22:03 [main] zk_create_seq_node(235) PANIC: failed, path:/sheepdog/master/, no children for ephemerals
Jun 27 22:22:03 [main] crash_handler(181) sheep exits unexpectedly (Aborted).
Jun 27 22:22:03 [main] sd_backtrace(834) sheep.c:183: crash_handler
Jun 27 22:22:03 [main] sd_backtrace(848) /lib64/libpthread.so.0(+0xf4ff) [0x7f7fc43044ff]
Jun 27 22:22:03 [main] sd_backtrace(848) /lib64/libc.so.6(gsignal+0x34) [0x7f7fc3af68a4]
Jun 27 22:22:03 [main] sd_backtrace(848) /lib64/libc.so.6(abort+0x174) [0x7f7fc3af8084]
Jun 27 22:22:03 [main] sd_backtrace(834) zookeeper.c:235: zk_create_seq_node
Jun 27 22:22:03 [main] sd_backtrace(834) zookeeper.c:793: zk_join
Jun 27 22:22:03 [main] sd_backtrace(834) group.c:1093: send_join_request
Jun 27 22:22:03 [main] sd_backtrace(834) group.c:1331: create_cluster
Jun 27 22:22:03 [main] sd_backtrace(834) sheep.c:732: main
Jun 27 22:22:03 [main] sd_backtrace(848) /lib64/libc.so.6(__libc_start_main+0xfc) [0x7f7fc3ae2cdc]
Jun 27 22:22:03 [main] sd_backtrace(848) sheep() [0x403e28]
Jun 27 22:22:03 [main] __dump_stack_frames(744) cannot find gdb
Jun 27 22:22:03 [main] __sd_dump_variable(694) cannot find gdb
Jun 27 22:22:03 [main] crash_handler(487) sheep pid 3482 exited unexpectedly.
Jun 27 22:23:39 [main] md_add_disk(161) /var/lib/sheepdog/obj, nr 1
Jun 27 22:23:39 [main] init_signal(173) register signal_handler for 12
Jun 27 22:23:39 [main] calculate_vdisks(177) /var/lib/sheepdog/obj has 128 vdisks, free space 84358889472
Jun 27 22:23:39 [main] init_disk_space(346) disk free space is 84358889472
Jun 27 22:23:39 [main] zk_init(1111) version 3.4.3, address 103.21.143.125:2181, timeout 30000
Jun 27 22:23:39 [main] zk_watcher(545) path:, type:-1
Jun 27 22:23:39 [main] zk_init_node(196) failed, path:/sheepdog, node exists
Jun 27 22:23:39 [main] zk_init_node(196) failed, path:/sheepdog/master, node exists
Jun 27 22:23:39 [main] zk_init_node(196) failed, path:/sheepdog/queue, node exists
Jun 27 22:23:39 [main] zk_init_node(196) failed, path:/sheepdog/member, node exists
Jun 27 22:23:39 [main] get_local_addr(592) found IPv4 address
Jun 27 22:23:39 [main] create_cluster(1303) zone id = 2173637991
Jun 27 22:23:39 [main] zk_node_exists(271) failed, path:/sheepdog/member/IPv4 ip:103.21.143.129 port:7000, no node
Jun 27 22:23:39 [main] zk_compete_master(729) start to compete master for the first time
Jun 27 22:23:39 [main] zk_create_seq_node(235) PANIC: failed, path:/sheepdog/master/, no children for ephemerals
...
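
For reference, ZooKeeper returns "no children for ephemerals" when the parent
of the node being created is itself ephemeral. That would suggest that
/sheepdog/master exists as an ephemeral node on this ensemble, though I have
not confirmed that. Below is a minimal sketch of the rule as a toy in-memory
model. It is not the ZooKeeper client API and not sheepdog code; ToyZk,
try_join and the single shared sequence counter are made up for illustration.

```python
# Toy model of one ZooKeeper rule: an ephemeral znode cannot have children.
# All names here are hypothetical; this does not talk to a real ensemble.

class NoChildrenForEphemerals(Exception):
    pass


class ToyZk:
    def __init__(self):
        # path -> True if the znode is ephemeral
        self.nodes = {"/": False}
        # Simplification: one global counter instead of one per parent.
        self.seq = 0

    def create(self, path, ephemeral=False, sequence=False):
        parent, _, _name = path.rpartition("/")
        parent = parent or "/"
        if parent not in self.nodes:
            raise KeyError("no node: " + parent)
        if self.nodes[parent]:
            # The parent is ephemeral, so no child may be created under it.
            raise NoChildrenForEphemerals(path)
        if sequence:
            # Sequential nodes get a zero-padded 10-digit suffix.
            path = "%s%010d" % (path, self.seq)
            self.seq += 1
        if path in self.nodes:
            raise KeyError("node exists: " + path)
        self.nodes[path] = ephemeral
        return path


def try_join(zk):
    # Mirrors the failing step in the log: create an ephemeral sequential
    # child under /sheepdog/master/.
    try:
        return zk.create("/sheepdog/master/", ephemeral=True, sequence=True)
    except NoChildrenForEphemerals:
        return "no children for ephemerals"


# Case 1: /sheepdog/master is a persistent node, so the join succeeds.
ok = ToyZk()
ok.create("/sheepdog")
ok.create("/sheepdog/master")
good = try_join(ok)

# Case 2: /sheepdog/master was created as ephemeral (assumed), so it fails.
stale = ToyZk()
stale.create("/sheepdog")
stale.create("/sheepdog/master", ephemeral=True)
bad = try_join(stale)
```

In the second case the create is rejected no matter how often the node
retries, which would match the same PANIC showing up on every restart above.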

It seems the master election is broken. Any idea, Kai?

Thanks
Yuan


