[sheepdog] issue of collie after sheepdog upgrade

陈李粮 1446130 at qq.com
Wed Aug 7 05:32:54 CEST 2013


Previously we used version 0.56 for cluster testing and I/O performance testing.
Everything worked well, but we sometimes lost data (some or all VDIs) because of ZooKeeper timeouts.
We saw that a new version was available, so we decided to upgrade.
Configuring and running make install for the stable-0.6 version produced no issues or errors.
We then started the new sheepdog with the following script, using the -u option to upgrade the store:
mountdir=/cloud
zookeeper="192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181"
sheep /home/gateway -u -b 192.168.10.47 -y 192.168.10.47 -p 7000 -g 64 -z 4 -D -c zookeeper:${zookeeper}
for i in {0..6}
do
    sheep -d ${mountdir}/${i} -u -b 192.168.10.47 -y 192.168.10.47 -p $((7001 + $i)) -D -c zookeeper:${zookeeper}
done
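
For reference, the loop above gives each data directory its own port, starting one above the gateway's port 7000. A minimal dry-run sketch prints that mapping without starting any daemon; the helper name sheep_port and the variable base_port are illustrative only and are not part of sheepdog:

```shell
#!/bin/sh
# Dry run: print the data-directory-to-port mapping produced by the startup loop.
mountdir=/cloud
base_port=7001   # first data sheep; the gateway itself uses 7000

# sheep_port INDEX: port for data directory number INDEX (0..6),
# computed the same way as in the loop: 7001 + index.
sheep_port() {
    echo $((base_port + $1))
}

i=0
while [ "$i" -le 6 ]; do
    echo "${mountdir}/${i} -> port $(sheep_port "$i")"
    i=$((i + 1))
done
```

This matches the process list: /cloud/0 is served on port 7001 and /cloud/6 on port 7007.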

We skipped the MD and Shepherd support because we have not moved the data.
Every sheep process is running:
root      8139  0.0  0.1 354864 50948 ?        Sl   10:55   0:00 sheep /home/gateway -u -b 192.168.10.47 -y 192.168.10.47 -p 7000 -g 64 -z 4 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8142  0.0  0.1 353312 48712 ?        Sl   10:55   0:00 sheep -d /cloud/0 -u -b 192.168.10.47 -y 192.168.10.47 -p 7001 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8145  0.0  0.1 353312 48788 ?        Sl   10:55   0:00 sheep -d /cloud/1 -u -b 192.168.10.47 -y 192.168.10.47 -p 7002 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8148  0.0  0.1 418852 49328 ?        Sl   10:55   0:00 sheep -d /cloud/2 -u -b 192.168.10.47 -y 192.168.10.47 -p 7003 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8151  0.0  0.1 353312 48252 ?        Sl   10:55   0:00 sheep -d /cloud/3 -u -b 192.168.10.47 -y 192.168.10.47 -p 7004 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8154  0.0  0.1 353316 49660 ?        Sl   10:55   0:00 sheep -d /cloud/4 -u -b 192.168.10.47 -y 192.168.10.47 -p 7005 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8157  0.0  0.1 353316 49368 ?        Sl   10:55   0:00 sheep -d /cloud/5 -u -b 192.168.10.47 -y 192.168.10.47 -p 7006 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8160  0.0  0.1 353312 48776 ?        Sl   10:55   0:00 sheep -d /cloud/6 -u -b 192.168.10.47 -y 192.168.10.47 -p 7007 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8161  0.0  0.0  85636   632 ?        Ss   10:55   0:00 sheep /home/gateway -u -b 192.168.10.47 -y 192.168.10.47 -p 7000 -g 64 -z 4 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8163  0.0  0.0  85636   768 ?        Ss   10:55   0:00 sheep -d /cloud/0 -u -b 192.168.10.47 -y 192.168.10.47 -p 7001 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8164  0.0  0.0  85636   764 ?        Ss   10:55   0:00 sheep -d /cloud/1 -u -b 192.168.10.47 -y 192.168.10.47 -p 7002 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8165  0.0  0.0  85636   764 ?        Ss   10:55   0:00 sheep -d /cloud/2 -u -b 192.168.10.47 -y 192.168.10.47 -p 7003 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8166  0.0  0.0  85636   764 ?        Ss   10:55   0:00 sheep -d /cloud/4 -u -b 192.168.10.47 -y 192.168.10.47 -p 7005 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8167  0.0  0.0  85636   768 ?        Ss   10:55   0:00 sheep -d /cloud/3 -u -b 192.168.10.47 -y 192.168.10.47 -p 7004 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8168  0.0  0.0  85636   764 ?        Ss   10:55   0:00 sheep -d /cloud/6 -u -b 192.168.10.47 -y 192.168.10.47 -p 7007 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8169  0.0  0.0  85636   764 ?        Ss   10:55   0:00 sheep -d /cloud/5 -u -b 192.168.10.47 -y 192.168.10.47 -p 7006 -D -c zookeeper:192.168.10.45:2181 192.168.10.46:2181 192.168.10.41:2181 192.168.10.47:2181
root      8476  0.0  0.0 103252   864 pts/2    S+   11:29   0:00 grep sheep

The following is the gateway log:
Aug 07 10:55:19 [main] main(774) sheepdog daemon (version 0.6.0_22_g8032354) started
Aug 07 10:55:19 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 0
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:55:42 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:15 [main] update_cluster_info(877) status = 4, epoch = 1, finished: 1
Aug 07 10:56:16 [main] update_cluster_info(877) status = 1, epoch = 1, finished: 1


But when I run "collie node list", I get an error:
Aug 07 11:31:51 [main] connect_to(254) failed to connect to 127.0.0.1:7000: Connection refused
Failed to connect to 127.0.0.1:7000
Failed to get node list

What is wrong?
I have tried editing /etc/hosts to point localhost to 192.168.10.47, the address that sheepdog listens on.
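
One way to narrow this down is to check which address actually accepts connections on port 7000, since the daemons were started with -b 192.168.10.47 while collie tried 127.0.0.1. The sketch below is only a diagnostic under stated assumptions: it relies on bash's /dev/tcp feature and the coreutils timeout command being installed, and the helper name probe is hypothetical.

```shell
#!/bin/sh
# probe HOST PORT: report whether a TCP connection to HOST:PORT can be opened.
# Assumes bash (for /dev/tcp) and coreutils timeout are available.
probe() {
    # Inside the bash -c script, $0 is the host and $1 is the port.
    if timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed"
    fi
}

# Addresses taken from this post: loopback (what collie tried) and the -b address.
probe 127.0.0.1 7000
probe 192.168.10.47 7000
```

If only 192.168.10.47 reports open, collie would have to be pointed at that address instead of localhost (for example with its -a address option, assuming this build provides it).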