[sheepdog-users] Crash if not all zookeeper nodes are active

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Wed Oct 22 03:55:34 CEST 2014


At Tue, 21 Oct 2014 09:29:52 +0200,
Valerio Pachera wrote:
> 
> Hi all,
> 
> I have 4 nodes (0,1,2,3) with a Sheepdog daemon version 0.9.0_rc0_5_g8066978.
> Nodes 1, 2, and 3 are also running zookeeper.
> I tried to stop zookeeper before starting sheep on node 0.
> So sheep tries to contact 3 zookeepers but one is down.
> '-c zookeeper:192.168.2.45:2181,192.168.2.46:2181,192.168.2.47:2181'
> 
> This is what I get in sheep.log of node 0
> 
> Oct 21 09:19:06   INFO [main] md_add_disk(343) /mnt/sheep/0, vdisk nr
> 5, total disk 1
> Oct 21 09:19:06 NOTICE [main] get_local_addr(522) found IPv4 address
> Oct 21 09:19:06   INFO [main] send_join_request(991) IPv4
> ip:192.168.10.4 port:7000 going to join the cluster
> Oct 21 09:19:26  ERROR [main] zk_create_seq_node(251) failed,
> path:/sheepdog/queue/, operation timeout
> Oct 21 09:19:47 NOTICE [main] nfs_init(607) nfs server service is not compiled
> Oct 21 09:19:47   INFO [main] check_host_env(493) Allowed open files
> 1024000, suggested 6144000
> Oct 21 09:19:47   INFO [main] main(944) sheepdog daemon (version
> 0.9.0_rc0_5_g8066978) started
> Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to
> 192.168.10.5:3333: Connection refused
> Oct 21 09:19:47  ERROR [block] sockfd_cache_get_long(348) fallback to
> non-io connection
> Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to
> 192.168.10.5:7000: Connection refused
> Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to
> 192.168.10.5:7000: Connection refused
> Oct 21 09:19:47  ALERT [block] do_get_vdis(530) failed to get vdi
> bitmap from IPv4 ip:192.168.10.5 port:7000
> Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to
> 192.168.10.6:3333: Connection refused
> Oct 21 09:19:47  ERROR [block] sockfd_cache_get_long(348) fallback to
> non-io connection
> Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to
> 192.168.10.6:7000: Connection refused
> Oct 21 09:19:47  ERROR [block] connect_to(193) failed to connect to
> 192.168.10.6:7000: Connection refused
> Oct 21 09:19:47  ALERT [block] do_get_vdis(530) failed to get vdi
> bitmap from IPv4 ip:192.168.10.6 port:7000
> Oct 21 09:19:47  EMERG [main] refcount_dec(292) Asserting `1 <=
> uatomic_read(&rc->val)' failed.
> Oct 21 09:19:47  EMERG [main] crash_handler(268) sheep exits
> unexpectedly (Aborted).
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) sheep.c:270: crash_handler
> Oct 21 09:19:47  EMERG [main] sd_backtrace(847)
> /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7f2cceb1502f]
> Oct 21 09:19:47  EMERG [main] sd_backtrace(847)
> /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7f2cce10a474]
> Oct 21 09:19:47  EMERG [main] sd_backtrace(847)
> /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7f2cce10d6ef]
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) util.h:292: refcount_dec
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) request.c:739: put_request
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) group.c:937: sd_notify_handler
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) zookeeper.c:1252:
> zk_event_handler
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) event.c:210: do_event_loop
> Oct 21 09:19:47  EMERG [main] sd_backtrace(833) sheep.c:949: main
> Oct 21 09:19:47  EMERG [main] sd_backtrace(847)
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfc)
> [0x7f2cce0f6eac]
> Oct 21 09:19:47  EMERG [main] sd_backtrace(847) sheep() [0x405f18]
> 
> Do I have to consider this as a bug?

Hmm, what do you think about this problem, Ruoyu? I'm not sure, but it
seems to be a bug. I'd like to hear your opinion.

BTW, Valerio, could you test my recent patch (sheep: exit when vdi
bitmap collection is failed) for this case? The patch will shorten the
stack trace and let sheep exit gracefully. It will not solve the
problem, but it will make troubleshooting a little easier.
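
While troubleshooting, it may also help to confirm from node 0 which of
the zookeeper servers named in the -c option actually accept TCP
connections, and whether a majority of them is still up. The following
is only a rough sketch, not part of sheepdog: the helper names
(parse_zk_option, majority, is_reachable, count_reachable) are
hypothetical, and it assumes the option has the plain
"zookeeper:host:port,host:port,..." form shown in the report. Any
"key=value" entry in the list is skipped, which is also an assumption.

```python
import socket


def parse_zk_option(option):
    """Split a 'zookeeper:host:port,...' string into (host, port) pairs."""
    prefix = "zookeeper:"
    if not option.startswith(prefix):
        raise ValueError("not a zookeeper cluster option: %r" % option)
    servers = []
    for entry in option[len(prefix):].split(","):
        entry = entry.strip()
        if not entry or "=" in entry:
            # Assumption: extra key=value settings are not server addresses.
            continue
        host, _, port = entry.rpartition(":")
        servers.append((host, int(port)))
    return servers


def majority(ensemble_size):
    """Smallest number of servers that forms a majority of the ensemble."""
    return ensemble_size // 2 + 1


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def count_reachable(option, timeout=2.0):
    """Return (reachable servers, servers needed for a majority)."""
    servers = parse_zk_option(option)
    up = sum(1 for host, port in servers if is_reachable(host, port, timeout))
    return up, majority(len(servers))
```

Calling count_reachable() with the string from the report,
'zookeeper:192.168.2.45:2181,192.168.2.46:2181,192.168.2.47:2181',
returns how many of the three servers answered and how many are needed
for a majority (two out of three). An open port only shows that
something is listening there, not that zookeeper itself is healthy.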

Thanks,
Hitoshi

> -- 
> sheepdog-users mailing lists
> sheepdog-users at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog-users


