[sheepdog] compatibility of dog command between new and old cluster

Fri Jul 18 04:35:46 CEST 2014

Hi there,

Should we bump SD_PROTO_VER to 0x03 to avoid this? The vdi object size 
has changed since the ledger object was introduced.
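
To make the idea concrete, here is a minimal sketch of the kind of check a 
protocol version bump would enable. It is not the actual sheepdog code: the 
demo_* names, the struct layout, and the assumption that the old clusters 
speak version 0x02 are all hypothetical and only illustrate the technique.

```c
#include <stdint.h>

/* Hypothetical version numbers, for illustration only. 0x03 is the
 * value proposed in this mail; 0x02 is an assumed "old" value. */
#define DEMO_PROTO_VER_OLD 0x02
#define DEMO_PROTO_VER_NEW 0x03

/* Hypothetical result codes (not sheepdog's SD_RES_* values). */
enum demo_result {
	DEMO_RES_SUCCESS = 0,
	DEMO_RES_VER_MISMATCH = 1,
};

/* Hypothetical, simplified request header carrying the sender's version. */
struct demo_req {
	uint8_t proto_ver;
	uint8_t opcode;
};

/*
 * Reject a request whose protocol version differs from the one the
 * receiving daemon was built with, before any object is read from disk.
 */
static enum demo_result demo_check_version(const struct demo_req *req,
					   uint8_t server_ver)
{
	if (req->proto_ver != server_ver)
		return DEMO_RES_VER_MISMATCH;
	return DEMO_RES_SUCCESS;
}
```

With a check like this, a request from a new dog (version 0x03) sent to an 
old sheep would come back with a version mismatch error instead of reaching 
the read path with the wrong vdi object size.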

On 2014-07-15 17:31, Ruoyu wrote:
> When I submit a read request with a new-version dog command (ledger 
> object supported) to an old-version cluster (ledger object not 
> supported), the cluster is corrupted.
>
> Error messages in sheep.log:
>
> Jul 15 11:17:26 ERROR [gway 24285] default_read_from_path(291) failed 
> to read object 80e4a2b600000000, 
> path=/mnt/sheepdog/obj/80e4a2b600000000, offset=0, size=12587576, 
> result=4198976, Success
> Jul 15 11:17:26 ERROR [gway 24285] err_to_sderr(114) 
> oid=80e4a2b600000000, Success
> Jul 15 11:17:26 ERROR [gway 24285] gateway_replication_read(270) local 
> read 80e4a2b600000000 failed, Network error between sheep
> Jul 15 11:17:26 INFO [main] md_remove_disk(349) /mnt/sheepdog/obj from 
> multi-disk array
> Jul 15 11:17:26 ERROR [gway 24285] sheep_exec_req(1114) failed Network 
> error between sheep, remote address: 192.168.1.2:7000, op name: READ_PEER
>
> As you can see, the vdi object size has changed: the read expected 
> 12587576 but actually got 4198976. As a result, sheep concluded that 
> the disk had an unrecoverable problem and had to be removed, which in 
> turn triggered a recovery. This behavior is not very robust.
>
> Maybe we need something like version control to avoid this issue. What 
> is your opinion?