[sheepdog-users] [Sheepdog Announcement] erasure coding is fully functional, please try test

Liu Yuan namei.unix at gmail.com
Wed Oct 23 07:59:56 CEST 2013


On Tue, Oct 22, 2013 at 02:23:51PM -0400, Shawn Moore wrote:
> $ for i in `seq 0 5`; do sheep /tmp/store$i -n -c local -z $i -p 700$i;done
> $ dog cluster format
> $ dog vdi create -c 4:2 test 10G
> 
> In the above example you created a "zone" per node.  If you have 3
> zones with 5 nodes per zone in the past you could set copies = 3 and
> guarantee that each "zone" would have a full copy of each vdi amongst
> the 5 nodes in that zone.
> 
> With EC does that same logic still apply? 

Yes. The copy number for erasure coding is (x + y).

>
> In that would it do the EC
> inside each zone or is it now a sum of all the nodes in all zones put
> together?  In the following example could two whole zones fail and
> still allow one node to fail in the remaining zone?

Erasure coding operates at the zone level, just as replication does.

With erasure coding, we keep only one copy of the data, and that copy *must*
have at least x nodes holding it. If fewer than x nodes are left, the data is
lost.

In your scenario, you would be better off using replication instead of
erasure coding.
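
To make the comparison concrete, here is a rough Python sketch (not sheepdog
code). It assumes each of the (x + y) strips of an erasure-coded object lives
in a different zone, and each replica of a replicated object lives in a
different zone. The function names are made up for illustration.

```python
def parse_scheme(scheme: str) -> tuple[int, int]:
    """Split an 'x:y' policy string such as '2:1' into (x, y)."""
    x, y = scheme.split(":")
    return int(x), int(y)


def ec_survives(scheme: str, failed_zones: int) -> bool:
    """Erasure coding: data is readable only while at least x strips remain.

    Assumption: one strip per zone, so every failed zone costs one strip.
    """
    x, y = parse_scheme(scheme)
    strips_left = (x + y) - failed_zones
    return strips_left >= x


def replication_survives(copies: int, failed_zones: int) -> bool:
    """Replication: data is readable while at least one full copy remains.

    Assumption: one copy per zone, so every failed zone costs one copy.
    """
    return copies - failed_zones >= 1


# Three zones, two of them fail completely.
print("EC 2:1        :", ec_survives("2:1", failed_zones=2))
print("replication x3:", replication_survives(3, failed_zones=2))
```

Under these assumptions the 2:1 vdi is lost once two zones are gone, while a
3-copy vdi still has one full copy left.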

> 
> $ for i in `seq 0 5`; do sheep /tmp/store$i -n -c local -z 0 -p 700$i;done
> $ for i in `seq 6 10`; do sheep /tmp/store$i -n -c local -z 1 -p 700$i;done
> $ for i in `seq 11 15`; do sheep /tmp/store$i -n -c local -z 2 -p 700$i;done
> $ dog cluster format
> $ dog vdi create -c 2:1 test 10G

Currently we need (x + y) zones to make an erasure-coded vdi work. E.g., with a
2:1 scheme, if the number of living zones is < 3, we can't serve requests. But
in theory, we should keep operating if any one of the zones fails completely and
never comes back. That is, we should provide service with x zones alive in an
x:y scheme.
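
The two rules above (what we require today versus what should be possible in
theory) can be written down as a small Python sketch. This is only an
illustration of the rule, not sheepdog code, and the function names are
hypothetical.

```python
def can_serve_current(x: int, y: int, live_zones: int) -> bool:
    """Current behaviour: all (x + y) zones must be alive to serve requests."""
    return live_zones >= x + y


def can_serve_in_theory(x: int, y: int, live_zones: int) -> bool:
    """Theoretical limit: x live zones are enough to reconstruct the data."""
    return live_zones >= x


# 2:1 scheme with 3, 2, and 1 zones alive.
for live in (3, 2, 1):
    print(
        f"live zones={live}",
        f"current={can_serve_current(2, 1, live)}",
        f"in theory={can_serve_in_theory(2, 1, live)}",
    )
```

The gap between the two functions is the case of exactly 2 live zones in a
2:1 scheme: refused today, but serviceable in principle.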

> A long time ago there was a "--unsafe" option that allowed us to keep
> the cluster running even if two whole zones, out of three failed.
> This left the number of functional nodes less than 50%.  In all our
> latest tests, this option is gone and it appears "-m unsafe" doesn't
> emulate the same feature.  My above example couldn't occur without a
> way to bring this "unsafe" feature back which we desire and could be
> useful if parity could occur within each zone.

Currently, we are always in unsafe mode.

Thanks
Yuan



