[sheepdog] Fwd: Network error between sheep when vdi check

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Wed May 22 07:41:41 CEST 2013


Hi Hongyi,

At Wed, 22 May 2013 12:55:36 +0800,
Hongyi Wang wrote:
> 
> Hi,
> 
> I just got the error message "Network error between sheep" when I ran
> "collie vdi check <volume id>".
> The commit is: c3727b73964c
> I reproduced the error as follows:
> Create a new sheep cluster. In my case, I made a 4-machine cluster:
> z0(10.0.0.10), z1(10.0.0.11), z2(10.0.0.12), z3(10.0.0.13)
> My zookeeper ran on z0.
> On z0, I started a sheep gateway only:
> >sheep -b 0.0.0.0 -y 10.0.0.10 -c zookeeper:10.0.0.10:2181 /sheep/state -g
> On z1, z2, and z3, I started both sheep-gateway and sheep-store. For
> example, on z1:
> >sheep -b 0.0.0.0 -y 10.0.0.11 -p 7000 -j dir=/sheep/journal size=3000 -D
> -w size=40000 dir=/sheep/object_cache -c zookeeper:10.0.0.10:2181,timeout=30s
> /sheep/state /sheep/disk1,/sheep/disk2 -P /sheep/state/sheep.pid
> Then I ran:
> >collie cluster format -c
> >collie cluster info
> Epoch Time           Version
> 2013-05-22 07:12:41      9 [10.0.0.10:7000, 10.0.0.11:7000, 10.0.0.12:7000, 10.0.0.13:7000]
> It shows the cluster is initialized successfully.
> 
> I uploaded an image with “qemu-img convert -O raw ...”.
> When I ran:
> >collie vdi list
> Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
> volume-5ed1fb6f-f64a-48ea-9249-636615098a42     0  5.0 GB  1.3 GB  0.0 MB 2013-05-22 03:37   ce945b     2
> It works great.
> 
> But when I ran vdi check:
> >collie vdi check <volume_id>
>   3.5 % [====>                                                             ] 180 MB / 5.0 GB    failed to read ce945b00000001 from 10.0.0.11:7000, Network error between sheep
>   3.6 % [====>                                                             ] 184 MB / 5.0 GB    failed to read ce945b00000000 from 10.0.0.12:7000, Network error between sheep
> 
> However, if I roll back to commit "c2f6a6d", vdi check works correctly.
> 
> I wonder whether this is a bug in collie as of commit "c3727b73964c" (or
> even in the latest version)?
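
When comparing the two commits, it may help to summarize which objects and
nodes the check fails on. The following is a minimal sketch, not part of
sheepdog or collie: it assumes the captured output has been saved unwrapped
(one message per line) and that failures look exactly like the two lines
quoted above. The names ReadFailure, FAILURE_RE, parse_failures and
failures_by_node are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, NamedTuple


class ReadFailure(NamedTuple):
    object_id: str  # object ID as printed, e.g. "ce945b00000001"
    node: str       # "host:port" of the sheep named in the message
    reason: str     # trailing error text, e.g. "Network error between sheep"


# Assumption: this pattern is derived only from the two failure lines in the
# report; other collie messages are not covered.
FAILURE_RE = re.compile(
    r"failed to read (?P<oid>[0-9a-f]+) "
    r"from (?P<node>[^\s,]+:\d+), "
    r"(?P<reason>[^\r\n]+)"
)


def parse_failures(log: str) -> List[ReadFailure]:
    """Return every 'failed to read' message found in the captured output."""
    return [
        ReadFailure(m.group("oid"), m.group("node"), m.group("reason").strip())
        for m in FAILURE_RE.finditer(log)
    ]


def failures_by_node(failures: List[ReadFailure]) -> Dict[str, int]:
    """Count failures per node to see whether one sheep or several are affected."""
    return dict(Counter(f.node for f in failures))
```

Feeding it the output above would list both objects, one per node, which
shows the failure is not limited to a single sheep.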

Thanks for your report.  I've found a bug which causes the problem.
I'll send the fix soon.

Thanks,

Kazutaka


