[sheepdog-users] cluster-full due to different size devices

Liu Yuan namei.unix at gmail.com
Wed Jun 19 04:02:39 CEST 2013


On 06/19/2013 06:10 AM, Valerio Pachera wrote:
> It's happening again:
> The node sheepdog002 is filling up its smaller device (500G, 87%,
> /mnt/wd_WMAYP1690412).
> The same is not happening on node sheepdog004 (500G), nor on sheepdog003 (217G).
> 
> Note: I killed sheepdog002 and inserted it back into the cluster right away.
> This triggered the cluster recovery.
> When I noticed sheepdog002 was filling up its smaller disk, I tried to
> call cluster reweight, but it didn't help.
> 
> 
>  parallel-ssh  -i -h etc/pssh.conf 'df -h | grep mnt'
> [1] 23:50:37 [SUCCESS] sheepdog004
> /dev/mapper/vg01-bkp      296G  267G     14G  96% /mnt/backup
> /dev/sdc1                 466G  232G    234G  50% /mnt/wd_WCAYUEP99298
> /dev/sdd1                 1,9T  834G    1,1T  45% /mnt/wd_WCAWZ1588874
> [2] 23:50:37 [SUCCESS] sheepdog002
> /dev/mapper/vg00-dati 213G  144G     59G  72% /mnt/dati
> /dev/sdb1                   466G  403G     64G  87% /mnt/wd_WMAYP1690412
> /dev/sdc1                   1,9T  762G    1,1T  41% /mnt/ST2000DM001-1CH164_W1E2N5GM
> [3] 23:50:37 [SUCCESS] sheepdog003
> /dev/sda3       217G  146G     71G  68% /mnt/sheep/dsk01
> /dev/sdb1       2,8T  1,1T    1,7T  40% /mnt/sheep/dsk02
> /dev/sdc1       2,8T  1,5T    1,4T  52% /mnt/cubonas
> [4] 23:50:37 [SUCCESS] sheepdog001
> /dev/mapper/vg00-dati     192G  170G     12G  94% /mnt/dati
> /dev/sdc1            1,9T  1,1T    768G  59% /mnt/ST2000DM001-1CH164_W1E2N5G6
> 
> Now I'm going to unplug  /mnt/wd_WMAYP1690412 but I fear other "small"
> devices are going to fill up.
> 
> I can't follow the recovery because it's pretty late here now : /
> 

Could you post the output of 'md info --all' so we can see whether some
md devices are really full?

I can't easily reproduce this problem on my laptop. If there is a problem,
it lies in our hash function. Kazutaka, can you look into this on your
test cluster? If sheep really does distribute objects badly, that is a
rather fatal problem. A cluster-full condition caused by a single full
disk makes the whole cluster unusable, and I think that is unacceptable.
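
To make the expectation concrete, here is a minimal sketch of what a
size-weighted placement should produce. It assumes consistent hashing
with a number of virtual disks proportional to each disk's capacity; it
is not the sheep code. SHA-1, the function names, the "name#i" key
scheme and the rounded sizes (loosely taken from the sheepdog002 df
output) are all assumptions made for illustration.

```python
import bisect
import hashlib


def h64(key):
    """64-bit hash of a string (SHA-1 is an arbitrary choice for this sketch)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def build_ring(disks, vdisks_per_gb=1):
    """Build a sorted ring of (hash point, disk name).

    Each disk gets a number of virtual disks proportional to its size,
    so larger disks own a proportionally larger part of the hash space.
    """
    ring = []
    for name, size_gb in disks.items():
        for i in range(size_gb * vdisks_per_gb):
            ring.append((h64(f"{name}#{i}"), name))
    ring.sort()
    return ring


def locate(ring, points, oid):
    """Return the disk owning the first virtual disk clockwise from the object hash."""
    idx = bisect.bisect_right(points, h64(f"obj-{oid}")) % len(ring)
    return ring[idx][1]


def distribution(disks, n_objects=100_000):
    """Place n_objects hypothetical objects and count how many land on each disk."""
    ring = build_ring(disks)
    points = [p for p, _ in ring]
    counts = dict.fromkeys(disks, 0)
    for oid in range(n_objects):
        counts[locate(ring, points, oid)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sizes in GB, rounded from the df output quoted above.
    disks = {"dati": 213, "wd": 466, "st2000": 1900}
    counts = distribution(disks)
    total_size = sum(disks.values())
    total_objs = sum(counts.values())
    for name, size in disks.items():
        print(f"{name}: capacity share {size / total_size:.1%}, "
              f"object share {counts[name] / total_objs:.1%}")
```

Under these assumptions each disk's share of objects should track its
share of capacity, so the disks of one node would fill to roughly the
same percentage. A disk at 87% next to one at 41% is not what such a
scheme is expected to produce.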

Thanks,
Yuan


