[sheepdog-users] add new node to cluster and new node crashed ...

Jens WEBER jweber at tek2b.org
Sat Jul 28 12:40:14 CEST 2012


While testing sheepdog 0.4.0, a newly added node crashed shortly after joining the cluster.
setup: 1 gateway-only node, 3 storage nodes, 3 copies
[ ok ] sheepdog gateway-only (-d -p 7000 -v 0 -z 999 /var/lib/sheepdog/disc0) is running.
[ ok ] sheepdog for Disk A (-d -p 7001 -v 32 -z 1 /var/lib/sheepdog/disc1) is running.
[ ok ] sheepdog for Disk B (-d -p 7002 -v 32 -z 2 /var/lib/sheepdog/disc2) is running.
[ ok ] sheepdog for Disk C (-d -p 7003 -v 32 -z 3 /var/lib/sheepdog/disc3) is running.
collie cluster format -c 3
so far, so good
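
For anyone who wants to reproduce the layout, the daemon command lines above can be written down as a small Python sketch. It only builds the argument lists and does not start anything. The helper name sheep_argv and the NODES table are made up for illustration, and reading -v as the virtual-node count (0 for the gateway) and -z as the zone id is an assumption based on the options shown above and the "zone id = 6" log line.

```python
# Hypothetical helper: builds the argv for one sheep daemon using the
# same options that appear in the init-script output (-d -p -v -z DIR).
def sheep_argv(port, vnodes, zone, store_dir):
    return ["sheep", "-d",
            "-p", str(port),
            "-v", str(vnodes),   # assumed: 0 means gateway-only
            "-z", str(zone),     # assumed: zone id
            store_dir]

# Initial cluster from the setup above: (port, vnodes, zone, directory).
NODES = [
    (7000, 0, 999, "/var/lib/sheepdog/disc0"),  # gateway-only
    (7001, 32, 1, "/var/lib/sheepdog/disc1"),   # Disk A
    (7002, 32, 2, "/var/lib/sheepdog/disc2"),   # Disk B
    (7003, 32, 3, "/var/lib/sheepdog/disc3"),   # Disk C
]

def all_commands(nodes=NODES):
    # One shell-style command string per node, in start order.
    return [" ".join(sheep_argv(*node)) for node in nodes]
```

Printing all_commands() gives one line per daemon; nodes for the later tests (disc4 to disc6) can be appended to the table in the same form.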

test 1: add new node disc4
[ ok ] sheepdog for Disk D (-d -p 7004 -v 32 -z 4 /var/lib/sheepdog/disc4) is running.
ok, works

test 2: create test vdi and then add new node disc5
collie vdi create test 100M -P
[ ok ] sheepdog for Disk E (-d -p 7005 -v 32 -z 5 /var/lib/sheepdog/disc5) is running.
ok, works

test 3: shut down the cluster, start it again, then add new node disc6
[ ok ] sheepdog gateway-only (-d -p 7000 -v 0 -z 999 /var/lib/sheepdog/disc0) is running.
[ ok ] sheepdog for Disk A (-d -p 7001 -v 32 -z 1 /var/lib/sheepdog/disc1) is running.
[ ok ] sheepdog for Disk B (-d -p 7002 -v 32 -z 2 /var/lib/sheepdog/disc2) is running.
[ ok ] sheepdog for Disk C (-d -p 7003 -v 32 -z 3 /var/lib/sheepdog/disc3) is running.
[ ok ] sheepdog for Disk D (-d -p 7004 -v 32 -z 4 /var/lib/sheepdog/disc4) is running.
[ ok ] sheepdog for Disk E (-d -p 7005 -v 32 -z 5 /var/lib/sheepdog/disc5) is running.
[FAIL] sheepdog for Disk F (-d -p 7006 -v 32 -z 6 /var/lib/sheepdog/disc6) is not running ... failed!
Failed: node disc6 starts but crashes shortly afterwards. sheep.log from disc6:
Jul 28 12:14:32 [main] create_cluster(1127) use corosync cluster driver as default
Jul 28 12:14:32 [main] create_cluster(1156) zone id = 6
Jul 28 12:14:33 [main] send_join_request(992) IPv4 ip:172.30.0.80 port:7006
Jul 28 12:14:33 [main] init_signal(171) register signal_handler for 12
Jul 28 12:14:33 [main] main(367) sheepdog daemon (version 0.4.0) started
Jul 28 12:14:33 [main] cdrv_cpg_confchg(568) mem:7, joined:1, left:0
Jul 28 12:14:33 [main] cdrv_cpg_confchg(634) Not promoting because member is not in our event list.
Jul 28 12:14:33 [main] cdrv_cpg_deliver(454) 0
Jul 28 12:14:33 [main] cdrv_cpg_deliver(454) 1
Jul 28 12:14:33 [main] sd_join_handler(1021) join IPv4 ip:172.30.0.80 port:7006
Jul 28 12:14:33 [main] sd_join_handler(1023) [0] IPv4 ip:172.30.0.80 port:7000
Jul 28 12:14:33 [main] sd_join_handler(1023) [1] IPv4 ip:172.30.0.80 port:7001
Jul 28 12:14:33 [main] sd_join_handler(1023) [2] IPv4 ip:172.30.0.80 port:7002
Jul 28 12:14:33 [main] sd_join_handler(1023) [3] IPv4 ip:172.30.0.80 port:7003
Jul 28 12:14:33 [main] sd_join_handler(1023) [4] IPv4 ip:172.30.0.80 port:7004
Jul 28 12:14:33 [main] sd_join_handler(1023) [5] IPv4 ip:172.30.0.80 port:7005
Jul 28 12:14:33 [main] sd_join_handler(1023) [6] IPv4 ip:172.30.0.80 port:7006
Jul 28 12:14:33 [main] update_cluster_info(786) status = 1, epoch = 2, finished: 0
Jul 28 12:14:33 [main] crash_handler(408) sheep pid 24450 exited unexpectedly.
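
To narrow down where the daemon dies, the log can be split into fields and the entry just before crash_handler picked out. This is a minimal sketch that assumes every line has the "date [thread] function(line) message" shape seen above; the names parse_sheep_log and last_before_crash are hypothetical and not part of sheepdog.

```python
import re

# Matches lines such as:
# Jul 28 12:14:33 [main] sd_join_handler(1021) join IPv4 ip:... port:7006
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<func>\w+)\((?P<line>\d+)\) "
    r"(?P<msg>.*)$"
)

def parse_sheep_log(text):
    """Return one dict per log line that matches the assumed format."""
    entries = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m:
            entries.append(m.groupdict())
    return entries

def last_before_crash(entries):
    """Return the entry logged right before the first crash_handler line,
    or None if there is no crash line or nothing precedes it."""
    for i, entry in enumerate(entries):
        if entry["func"] == "crash_handler":
            return entries[i - 1] if i > 0 else None
    return None
```

Run against the log above, the entry before the crash is the update_cluster_info(786) line, which only shows the last message logged, not necessarily the faulting code.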

What is going wrong here?

Thanks,
Jens