[sheepdog-users] Crash if not all zookeeper nodes are active

Valerio Pachera sirio81 at gmail.com
Tue Oct 21 09:29:52 CEST 2014


Hi all,

I have 4 nodes (0, 1, 2, 3) running Sheepdog daemon version 0.9.0_rc0_5_g8066978.
Nodes 1, 2 and 3 also run ZooKeeper.
I stopped ZooKeeper on one of them before starting sheep on node 0.
So sheep tries to contact 3 ZooKeeper servers, but one is down:
'-c zookeeper:192.168.2.45:2181,192.168.2.46:2181,192.168.2.47:2181'
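
To confirm which of the listed ZooKeeper endpoints accept connections before starting sheep, a small check like the following can help. It is only a sketch and not part of Sheepdog: the helper names parse_zk_option and is_reachable are made up for illustration, and an open TCP port only shows that something is listening, not that the ZooKeeper ensemble has a quorum.

```python
import socket


def parse_zk_option(option):
    """Split 'zookeeper:host:port,host:port,...' into (host, port) tuples."""
    prefix = "zookeeper:"
    if not option.startswith(prefix):
        raise ValueError("not a zookeeper cluster option: %r" % option)
    endpoints = []
    for item in option[len(prefix):].split(","):
        # rpartition splits on the last ':' so the host part stays intact.
        host, _, port = item.strip().rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Passing the option string above (without the leading '-c ') to parse_zk_option and calling is_reachable on each pair shows which server is down.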

This is what I get in sheep.log on node 0:

Oct 21 09:19:06   INFO [main] md_add_disk(343) /mnt/sheep/0, vdisk nr 5, total disk 1
Oct 21 09:19:06 NOTICE [main] get_local_addr(522) found IPv4 address
Oct 21 09:19:06   INFO [main] send_join_request(991) IPv4 ip:192.168.10.4 port:7000 going to join the cluster
Oct 21 09:19:26  ERROR [main] zk_create_seq_node(251) failed, path:/sheepdog/queue/, operation timeout
Oct 21 09:19:47 NOTICE [main] nfs_init(607) nfs server service is not compiled
Oct 21 09:19:47   INFO [main] check_host_env(493) Allowed open files 1024000, suggested 6144000
Oct 21 09:19:47   INFO [main] main(944) sheepdog daemon (version 0.9.0_rc0_5_g8066978) started
Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to 192.168.10.5:3333: Connection refused
Oct 21 09:19:47  ERROR [block] sockfd_cache_get_long(348) fallback to non-io connection
Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to 192.168.10.5:7000: Connection refused
Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to 192.168.10.5:7000: Connection refused
Oct 21 09:19:47  ALERT [block] do_get_vdis(530) failed to get vdi bitmap from IPv4 ip:192.168.10.5 port:7000
Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to 192.168.10.6:3333: Connection refused
Oct 21 09:19:47  ERROR [block] sockfd_cache_get_long(348) fallback to non-io connection
Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to 192.168.10.6:7000: Connection refused
Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to 192.168.10.6:7000: Connection refused
Oct 21 09:19:47  ALERT [block] do_get_vdis(530) failed to get vdi bitmap from IPv4 ip:192.168.10.6 port:7000
Oct 21 09:19:47  EMERG [main] refcount_dec(292) Asserting `1 <= uatomic_read(&rc->val)' failed.
Oct 21 09:19:47  EMERG [main] crash_handler(268) sheep exits unexpectedly (Aborted).
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) sheep.c:270: crash_handler
Oct 21 09:19:47  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7f2cceb1502f]
Oct 21 09:19:47  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7f2cce10a474]
Oct 21 09:19:47  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7f2cce10d6ef]
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) util.h:292: refcount_dec
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) request.c:739: put_request
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) group.c:937: sd_notify_handler
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) zookeeper.c:1252: zk_event_handler
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) event.c:210: do_event_loop
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) sheep.c:949: main
Oct 21 09:19:47  EMERG [main] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfc) [0x7f2cce0f6eac]
Oct 21 09:19:47  EMERG [main] sd_backtrace(847) sheep() [0x405f18]
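
The frames that carry source information are easier to read when pulled out of the log. A minimal sketch for doing that follows; the regular expression is an assumption based only on the lines shown above, and the names FRAME, source_frames and SAMPLE are made up for illustration.

```python
import re

# Matches frames such as "sd_backtrace(833) util.h:292: refcount_dec".
# Frames that only carry a library path and address are skipped.
FRAME = re.compile(r"sd_backtrace\(\d+\)\s+(\S+\.[ch]):(\d+):\s+(\w+)")


def source_frames(log_text):
    """Return (file, line, function) for each frame that has source info."""
    # Collapse all whitespace so frames wrapped by the mail client still match.
    flat = " ".join(log_text.split())
    return [(name, int(line), func) for name, line, func in FRAME.findall(flat)]


# A few frames copied from the log above, one of them still wrapped.
SAMPLE = """\
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) util.h:292: refcount_dec
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) request.c:739: put_request
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) group.c:937: sd_notify_handler
Oct 21 09:19:47  EMERG [main] sd_backtrace(833) zookeeper.c:1252:
zk_event_handler
"""

for name, line, func in source_frames(SAMPLE):
    print("%s (%s:%d)" % (func, name, line))
```

Read from the bottom up, the frames show the assertion being hit in refcount_dec, reached from zk_event_handler through sd_notify_handler and put_request.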

Should I consider this a bug?


