Hi Hongyi,

At Wed, 22 May 2013 12:55:36 +0800,
Hongyi Wang wrote:
>
> Hi,
>
> I just got a error message " Network error between sheep" when I ran
> "collie vdi check <volume id>."
> The #commit is: *c3727b73964c*
> I replay the error as follows:
> Make a new sheep cluster, in my case, I made 4 machines cluster:
> z0(10.0.0.10), z1(10.0.0.11), z2(10.0.0.12), z3(10.0.0.13)
> My zookeeper ran on z0.
> For z0, I start a sheep-gateway only:
> >sheep -b 0.0.0.0 -y 10.0.0.10 -c zookeeper:10.0.0.10:2181 /sheep/state -g
> For z1, z2, z3. I start both sheep-gateway and sheep-store. For example, on
> z1
> >sheep -b 0.0.0.0 -y 10.0.0.11 -p 7000 -j dir=/sheep/journal size=3000 -D
> -w size=40000 dir=/sheep/object_cache -c zookeeper:10.0.0.10:2181,timeout=30s
> /sheep/state /sheep/disk1,/sheep/disk2 -P /sheep/state/sheep.pid
> Then I ran:
> >collie cluster format -c
> >collie cluster info
> Epoch Time           Version
> 2013-05-22 07:12:41      9 [10.0.0.10:7000, 10.0.0.11:7000, 10.0.0.12:7000,
> 10.0.0.13:7000]
> It shows the cluster is initialized successfully.
>
> I uploaded a image by “qemu-img convert -O raw ...”
> when I ran:
> >collie vdi list
>   Name                                         Id  Size    Used    Shared  Creation time     VDI id  Copies  Tag
>   volume-5ed1fb6f-f64a-48ea-9249-636615098a42  0   5.0 GB  1.3 GB  0.0 MB  2013-05-22 03:37  ce945b  2
> It works great.
>
> But when I ran vdi check:
> >collie vdi check <volume_id>
>   3.5 % [====>                                    ] 180 MB / 5.0 GB
> failed to read ce945b00000001 from 10.0.0.11:7000, Network error
> between sheep
>   3.6 % [====>                                    ] 184 MB / 5.0 GB
> failed to read ce945b00000000 from 10.0.0.12:7000, Network error
> between sheep
>
> However, if I rollback to *"c2f6a6d" *commit, vdi check works correctly.
>
> I just wonder if it is a bug in the collie *"c3727b73964c"* commit (or even
> in the latest version)?

Thanks for your report. I've found a bug which causes the problem.
I'll send the fix soon.

Thanks,

Kazutaka