[sheepdog] Issue with "-m unsafe", copies and zones
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Tue Oct 2 20:56:40 CEST 2012
At Tue, 2 Oct 2012 10:20:53 -0400,
Shawn Moore wrote:
>
> I have been testing the 0.5.0 release and believe I have found
> regression issues related to "-m unsafe", as well as issues caused by
> losing just one zone out of three. The last time I know this worked
> was when the option was "-H" (no halt), before it became "-m OPTION".
>
>
> I have 6 nodes (2 per zone, with 3 zones). Each zone is on its own
> switch, with the switch for zone 0 bringing them all together.
> # collie node list
> M Id Host:Port V-Nodes Zone
> - 0 172.16.1.151:7000 64 0
> - 1 172.16.1.152:7000 64 0
> - 2 172.16.1.153:7000 64 1
> - 3 172.16.1.154:7000 64 1
> - 4 172.16.1.155:7000 64 2
> - 5 172.16.1.159:7000 64 2
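> (For reference, each sheep was started along these lines; the -z flag
> sets the zone id shown above, and the store path is illustrative:
> # sheep -p 7000 -z 0 /var/lib/sheepdog
> with -z 1 and -z 2 on the zone 1 and zone 2 hosts respectively.)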
>
>
> The cluster was formatted as follows:
> # collie cluster format -b farm -c 3 -m unsafe
> # collie cluster info
> Cluster status: running
> Cluster created at Mon Oct 1 15:40:55 2012
> Epoch Time Version
> 2012-10-01 15:40:55 1 [172.16.1.151:7000, 172.16.1.152:7000,
> 172.16.1.153:7000, 172.16.1.154:7000, 172.16.1.155:7000,
> 172.16.1.159:7000]
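> (For completeness: "-m" replaced the old "-H" no-halt flag. If I
> remember the 0.5.0 usage correctly it accepts something like
> # collie cluster format -b farm -c 3 -m safe|quorum|unsafe
> but the exact set of mode names is worth checking in the manpage.)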
>
>
> I created a 40GB vdi via each node.
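> (Each vdi was created from its like-named node, roughly:
> # collie vdi create test151 40G
> and likewise on the other five hosts.)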
> # collie vdi list
> Name     Id  Size   Used   Shared  Creation time     VDI id  Copies  Tag
> test159   1  40 GB  40 GB  0.0 MB  2012-10-01 16:46  279f76       3
> test153   1  40 GB  40 GB  0.0 MB  2012-10-01 16:46  27a9a8       3
> test152   1  40 GB  40 GB  0.0 MB  2012-10-01 16:46  27ab5b       3
> test151   1  40 GB  40 GB  0.0 MB  2012-10-01 16:46  27ad0e       3
> test155   1  40 GB  40 GB  0.0 MB  2012-10-01 16:46  27b3da       3
> test154   1  40 GB  40 GB  0.0 MB  2012-10-01 16:46  27b58d       3
> # collie node info
> Id Size Used Use%
> 0 476 GB 117 GB 24%
> 1 476 GB 123 GB 25%
> 2 476 GB 136 GB 28%
> 3 476 GB 104 GB 21%
> 4 476 GB 117 GB 24%
> 5 476 GB 123 GB 25%
> Total 2.8 TB 720 GB 25%
>
>
> Then I kill the uplink interface for zone 2 from the zone 0 switch.
> This leaves zones 0/1 talking to each other and zone 2 talking only to
> itself.
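> (I did this at the switch; roughly the same split could be produced
> from the zone 2 hosts themselves with something like
> # ip link set dev eth0 down
> where eth0 stands in for the actual uplink interface.)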
> # collie cluster info
> Cluster status: running
> Cluster created at Mon Oct 1 15:40:55 2012
> Epoch Time Version
> 2012-10-02 09:04:28 3 [172.16.1.151:7000, 172.16.1.152:7000,
> 172.16.1.153:7000, 172.16.1.154:7000]
> 2012-10-02 09:04:28 2 [172.16.1.151:7000, 172.16.1.152:7000,
> 172.16.1.153:7000, 172.16.1.154:7000, 172.16.1.159:7000]
> 2012-10-01 15:40:55 1 [172.16.1.151:7000, 172.16.1.152:7000,
> 172.16.1.153:7000, 172.16.1.154:7000, 172.16.1.155:7000,
> 172.16.1.159:7000]
> # collie node info
> Id Size Used Use%
> 0 476 GB 117 GB 24%
> 1 476 GB 123 GB 25%
> 2 476 GB 136 GB 28%
> 3 476 GB 104 GB 21%
> Total 1.9 TB 480 GB 25%
> At this point, every node in zones 0/1 starts logging, once per second:
> Oct 02 09:04:28 [rw 128323] get_vdi_copy_number(82) No VDI copy entry for 0 found
> The command below hangs until killed, for every vdi:
> # collie vdi object test151
> So I try to check the vdis, and they all fail with:
> # collie vdi check test151
> [main] get_vnode_next_idx(106) PANIC: can't find next new idx
> Aborted
>
>
> When I bring the interface between zones 0/1 and 2 back up, the sheep
> processes have died, stating:
> Oct 02 09:04:28 [main] cdrv_cpg_confchg(599) PANIC: Network partition is detected
> Oct 02 09:04:28 [main] crash_handler(439) sheep pid 6780 exited unexpectedly.
> Shouldn't zone 2 have remained running due to the "-m unsafe" option?
> I understand the risks of network partitioning and want this behavior
> anyway, as I can handle it myself.
I think we should add another option to disable network partition
detection. "-m unsafe" only means that I/Os are still allowed even if
there are not enough nodes in the cluster; the risk it accepts is
reading stale data. The risk of allowing a network partition is
different: we could update the same data in both sub-clusters at the
same time.
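A sketch of what that could look like at format time (the flag below
is hypothetical; it does not exist yet):

# collie cluster format -b farm -c 3 -m unsafe --allow-partition

With such a flag set, sheep would skip the PANIC in cdrv_cpg_confchg()
and let each side of the split keep serving I/O, accepting the risk of
divergent updates described above.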
> And I can't understand why zones 0/1 were affected at all when they
> still held 2 of the 3 copies, especially with "-m unsafe".
>
>
> Let me know if you need any more information or would like me to
> re-run the test a different way.
Unfortunately, I could not reproduce the problem. Does it happen only
with network errors? What happens if you simply stop the sheep
processes in zone 2 with the kill command?
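For example (assuming one sheep process per host), on 172.16.1.155 and
172.16.1.159:

# pkill sheep

and then see whether zones 0/1 show the same symptoms.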
Thanks,
Kazutaka