[sheepdog-users] cluster data distribution

Valerio Pachera sirio81 at gmail.com
Thu May 9 17:04:26 CEST 2013


2013/5/9 Liu Yuan <namei.unix at gmail.com>:
> Since we don't change weight by plugging/unplugging disks, there is no
> way to rebalance data. If we allow weight-change for plug/unplug, we
> have to pay price: plug/unplug one disk will trigger the whole cluster
> recovery.

Looking at it from the other side:
my node uses two disks: 2T + 500G.
What happens if I unplug the 2T disk now? It contains lots of data.
That data can't be redistributed to the node's remaining disk,
so it has to trigger a cluster recovery.
When I plug the disk back in, this node is not going to be used much.

root@sheepdog001:~# collie node md info
Id      Size    Use     Path
 0      1.2 TB  593 GB  /mnt/ST2000DM001-1CH164_W1E2N5G6/obj
 1      212 GB  253 GB  /mnt/wd_WMAYP0904279
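
To make the arithmetic explicit, here is a rough Python sketch. It assumes
the nominal capacities (2000 GB and 500 GB) and takes the "Use" column above
as the data stored on each disk; the function and the dictionary keys are
made up for illustration and are not part of sheepdog or collie.

```python
def can_absorb_locally(disks, unplugged):
    """Return True if the remaining disks have room for the unplugged disk's data."""
    data_to_move = disks[unplugged]["used_gb"]
    # Free space on every disk except the one being removed.
    free_elsewhere = sum(
        d["capacity_gb"] - d["used_gb"]
        for name, d in disks.items()
        if name != unplugged
    )
    return data_to_move <= free_elsewhere


# Assumed figures: nominal disk sizes, usage taken from the md info output.
disks = {
    "big": {"capacity_gb": 2000, "used_gb": 593},
    "small": {"capacity_gb": 500, "used_gb": 253},
}

print(can_absorb_locally(disks, "big"))    # False: 593 GB does not fit in 247 GB free
print(can_absorb_locally(disks, "small"))  # True: 253 GB fits in 1407 GB free
```

Under these assumptions, unplugging the big disk cannot be handled inside
the node, so the data has to be recovered from the rest of the cluster.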

I do not like the idea of running more than one sheep on a single node.
I think it's dangerous because the number of sheep on a node may be
greater than the number of copies.
If the whole host dies, several nodes die with it.
That's why I like the md approach.
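
A small Python sketch of that risk. It is a simplification: it assumes
replicas may land on any set of distinct sheep, which is not necessarily how
sheepdog places objects, and the host names and the function are
hypothetical. It only counts how many possible replica sets sit entirely on
one physical host.

```python
from itertools import combinations


def single_host_placements(sheep_hosts, copies):
    """Count replica sets whose members all live on the same host.

    sheep_hosts[i] is the physical host that sheep i runs on.
    Returns (risky_sets, total_sets).
    """
    placements = list(combinations(range(len(sheep_hosts)), copies))
    # A set is risky when all of its sheep share one host.
    risky = [p for p in placements if len({sheep_hosts[i] for i in p}) == 1]
    return len(risky), len(placements)


# Three sheep on hostA, one each on hostB and hostC, 3 copies.
multi = ["hostA", "hostA", "hostA", "hostB", "hostC"]
# One sheep per host, as with the md approach.
one_each = ["hostA", "hostB", "hostC", "hostD", "hostE"]

print(single_host_placements(multi, 3))     # (1, 10)
print(single_host_placements(one_each, 3))  # (0, 10)
```

With several sheep on one host, at least one replica set can be lost
completely when that host dies; with one sheep per host there is none.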

I think a recovery/rebalance is needed.
Maybe not automatically, but through a collie command, so we can choose to
trigger it when the cluster is less loaded.


