[sheepdog-users] cluster data distribution

Liu Yuan namei.unix at gmail.com
Thu May 9 09:54:52 CEST 2013


On 05/09/2013 03:02 PM, Valerio Pachera wrote:
> Hi, yesterday I plugged new device in my production cluster.
> Everything went fine.
> 
> Looking at node info, node id 2 is less used.
> 
> root at sheepdog004:~# collie node info
> Id      Size    Used    Use%
>  0      1.6 TB  778 GB   47%
>  1      1.6 TB  723 GB   42%
>  2      2.1 TB  175 GB    8%
> Total   5.4 TB  1.6 TB   30%
> Total virtual image size        1.2 TB
> 
> When we add devices to a cluster, I guess sheepdog is not going to
> redistribute/balance the nodes load, right?

Yes, plugging a new disk doesn't trigger a data re-balance across the
nodes; it only re-balances data between the disks within that node. This
is different from adding a new node, which does trigger a data
re-balance across the nodes.
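To illustrate the idea (this is a toy sketch, not sheepdog's actual code): objects are placed on disks via a hash ring, with each disk getting virtual nodes in proportion to its size, so a 2 TB disk receives roughly four times as many objects as a 500 GB one. The names and numbers below are made up for the example.

```python
import hashlib

def build_ring(disks, vnodes_per_tb=100):
    """disks: dict of disk name -> size in TB.
    Returns a sorted list of (hash, disk) points; bigger disks
    contribute more points, so they attract more objects."""
    ring = []
    for name, size_tb in disks.items():
        for i in range(int(size_tb * vnodes_per_tb)):
            h = int(hashlib.md5(f"{name}-{i}".encode()).hexdigest(), 16)
            ring.append((h, name))
    ring.sort()
    return ring

def place(ring, oid):
    """Hash the object id and walk clockwise to the first ring point."""
    h = int(hashlib.md5(str(oid).encode()).hexdigest(), 16)
    for point, disk in ring:
        if point >= h:
            return disk
    return ring[0][1]  # wrap around the ring

# A node with a 2 TB and a 500 GB disk: the 2 TB disk ends up with
# roughly 80% of the objects.
ring = build_ring({"2tb": 2.0, "500g": 0.5})
counts = {"2tb": 0, "500g": 0}
for oid in range(2000):
    counts[place(ring, oid)] += 1
```

When a disk is plugged, only this per-node ring changes, which is why the node-level totals you see in `collie node info` stay unbalanced.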

> Node info is showing "size", that I think should be the total size, or
> is it the free space? All 3 nodes are using 500G + 2T devices.
> 

Total size

> Before adding devices cluster was
> id0 1 device 2T
> id1 1 device 2T
> id2 1 device 500G
> 
> Now
> id0 1 device 2T + 1 device 500G
> id1 1 device 2T + 1 device 500G
> id2 1 device 2T + 1 device 500G
> 
> root at sheepdog001:~# df -h
> /dev/sdc1                                               1,9T  545G
> 1,3T  30% /mnt/ST2000DM001-1CH164_W1E2N5G6
> /dev/sdb1                                               466G  234G
> 233G  51% /mnt/wd_WMAYP0904279
> 
> root at sheepdog002:~# df -h
> /dev/sdc1                                               1,9T  548G
> 1,3T  30% /mnt/ST2000DM001-1CH164_W1E2N5GM
> /dev/sdb1                                               466G  176G
> 291G  38% /mnt/wd_WMAYP1690412
> 
> root at sheepdog004:~# df -h
> /dev/sdc1                 466G   40G    427G   9% /mnt/wd_WCAYUEP99298
> /dev/sdd1                 1,9T  154G    1,7T   9% /mnt/wd_WCAWZ1588874
> 
> 
> root at sheepdog004:~# collie node md info --all
> Id      Size    Use     Path
> Node 0:
>  0      1.3 TB  544 GB  /mnt/ST2000DM001-1CH164_W1E2N5G6/obj
>  1      232 GB  233 GB  /mnt/wd_WMAYP0904279

Looks like this node reports being full even though it still has free
space, because disk 1 doesn't have enough space to hold more objects.
If a newly added object is hashed to that disk, sheep will get a
disk-full error. From this analysis, it is a bug in sheep: the right
behavior would be to unplug disk 1 automatically. I'll write a patch
for it later.
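The proposed fix amounts to removing the full disk from the placement function, which re-maps only the objects that lived on it. A toy sketch of that property (using rendezvous hashing for brevity; sheepdog's real placement differs, and the disk names are invented):

```python
import hashlib

def disk_for(oid, disks):
    """Rendezvous (highest-random-weight) hashing: each object goes to
    the disk with the highest per-(object, disk) hash score."""
    def score(disk):
        return int(hashlib.md5(f"{oid}-{disk}".encode()).hexdigest(), 16)
    return max(disks, key=score)

disks = ["sdb", "sdc", "sdd"]
before = {oid: disk_for(oid, disks) for oid in range(1000)}

# "Unplug" the full disk sdb: only the objects that were on sdb move;
# everything on the surviving disks stays put.
survivors = [d for d in disks if d != "sdb"]
after = {oid: disk_for(oid, survivors) for oid in range(1000)}

moved = [oid for oid in before if before[oid] != after[oid]]
assert all(before[oid] == "sdb" for oid in moved)
```

So auto-unplugging the full disk redistributes its objects to the remaining local disks without disturbing objects already placed elsewhere.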

Thanks,
Yuan
