[sheepdog] compatibility of dog command between new and old cluster
Ruoyu
liangry at ucweb.com
Tue Jul 15 11:31:20 CEST 2014
When I submit a read request with a new version of the dog command (ledger
objects supported) to an old-version cluster (ledger objects not supported),
the cluster is corrupted.
Error messages in sheep.log:
Jul 15 11:17:26 ERROR [gway 24285] default_read_from_path(291) failed
to read object 80e4a2b600000000,
path=/mnt/sheepdog/obj/80e4a2b600000000, offset=0, size=12587576,
result=4198976, Success
Jul 15 11:17:26 ERROR [gway 24285] err_to_sderr(114)
oid=80e4a2b600000000, Success
Jul 15 11:17:26 ERROR [gway 24285] gateway_replication_read(270) local
read 80e4a2b600000000 failed, Network error between sheep
Jul 15 11:17:26 INFO [main] md_remove_disk(349) /mnt/sheepdog/obj from
multi-disk array
Jul 15 11:17:26 ERROR [gway 24285] sheep_exec_req(1114) failed Network
error between sheep, remote address: 192.168.1.2:7000, op name: READ_PEER
As you can see, the VDI object size has changed: 12587576 bytes were expected,
but only 4198976 were actually read. As a result, sheep concluded that the disk
had an unrecoverable problem and had to be removed from the multi-disk array,
which then triggers a recovery. This behavior is not very robust.
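
To illustrate the distinction that seems to be missing, here is a minimal
sketch in C, not sheepdog's actual code. The enum and the function
classify_read() are hypothetical names invented for this example. It only
shows that a read returning fewer bytes than requested (errno reads "Success"
in the log above) could be reported separately from a real I/O failure.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical status codes; these are NOT sheepdog's real SD_RES_* values. */
enum read_status {
    READ_OK,        /* all requested bytes were read */
    READ_SHORT,     /* the call succeeded but returned fewer bytes */
    READ_IO_ERROR   /* the read call itself failed (negative result) */
};

/*
 * Classify the return value of a pread()-style call.
 * result:    value returned by the read call
 * requested: number of bytes the caller asked for
 */
static enum read_status classify_read(ssize_t result, size_t requested)
{
    if (result < 0)
        return READ_IO_ERROR;
    if ((size_t)result < requested)
        return READ_SHORT;
    return READ_OK;
}
```

With the values from the log, classify_read(4198976, 12587576) falls into the
READ_SHORT case, which a caller could handle without treating the disk as
broken.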
Maybe we need something like version control to avoid this issue. What
is your opinion?
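
As a rough sketch of what such a check could look like (an assumption for
discussion, not the existing sheepdog protocol): the struct req_hdr layout,
the proto_ver field, and check_proto_ver() below are all hypothetical. The
idea is simply that the receiving side compares a version carried in the
request with its own and rejects the request before touching any object.

```c
#include <stdint.h>

/* Hypothetical request header; NOT sheepdog's actual wire format. */
struct req_hdr {
    uint8_t proto_ver;  /* version the sender (e.g. dog) speaks */
    uint8_t opcode;     /* requested operation */
};

/* Hypothetical result of the version check. */
enum ver_check {
    VER_OK,
    VER_MISMATCH
};

/*
 * Compare the sender's version with the version the cluster runs.
 * A mismatch is reported to the caller so the request can be refused
 * instead of being executed with an unexpected object size.
 */
static enum ver_check check_proto_ver(const struct req_hdr *hdr,
                                      uint8_t cluster_ver)
{
    return hdr->proto_ver == cluster_ver ? VER_OK : VER_MISMATCH;
}
```

A request whose proto_ver differs from the cluster's would get VER_MISMATCH
back, and dog could print an error rather than trigger a disk removal.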