At Wed, 7 Aug 2013 15:07:14 +0800,
Liu Yuan wrote:
>
> On Wed, Aug 07, 2013 at 08:23:49AM +0200, Valerio Pachera wrote:
> > I unplugged the full disk;
> > waited for the recovery to be done;
> > (manually removed data in obj dir, just to be sure)
> > plugged it back and waited for the recovery:
> >
> > # collie node md info
> > Id  Size    Used    Avail   Use%  Path
> >  0  2.7 TB  1.8 TB  950 GB  66%   /mnt/sheep/dsk02
> >  1  169 GB  76 GB   93 GB   44%   /mnt/sheep/dsk01/obj
> >
> > Now it's fine :-)
> >
> > I have to do the same on another node:
> >
> > Node 0:
> >  0  166 GB  154 GB  12 GB   92%   /mnt/sheep/dsk01/obj
> >  1  465 GB  318 GB  147 GB  68%   /mnt/sheep/dsk02
> >  2  1.8 TB  1.1 TB  716 GB  61%   /mnt/sheep/dsk03
> >
> >
> > In these two cases (node 2 and node 0), it's ok to unplug the full
> > disk because there is enough space on the second (and third) disk to
> > rebuild data.
> > If there was not enough space, it was probably going to fail the rebuild.
> >
> > I know in theory it should not happen to have a disk used more than
> > another but it actually does.
> > May it be possible to implement something like
> > collie node md reweight
> > ?
>
> Well, I have no idea why unplug/plug effectively rebalance the data. Before we
> do anything further, we need to find out why it works. Any idea kazum?

Valerio, can you run the following commands on Node 0 and send us the output?

 $ attr -g md.size /mnt/sheep/dsk01/obj | hd
 $ attr -g md.size /mnt/sheep/dsk02 | hd
 $ attr -g md.size /mnt/sheep/dsk03 | hd

Thanks,

Kazutaka
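
P.S. To put a number on the imbalance Valerio reports for Node 0, here is a rough sketch. It assumes that data is meant to be spread across the disks in proportion to their size; that is an assumption made for illustration, not something established in this thread. The helper name `proportional_targets` is hypothetical, and the figures are copied from the `collie node md info` output quoted above, with TB converted as 1 TB = 1024 GB.

```python
# Figures copied from the Node 0 `collie node md info` output in this thread.
# Each entry is (path, size in GB, used in GB).
# Assumption: TB values are converted with 1 TB = 1024 GB.
DISKS = [
    ("/mnt/sheep/dsk01/obj", 166.0, 154.0),
    ("/mnt/sheep/dsk02", 465.0, 318.0),
    ("/mnt/sheep/dsk03", 1.8 * 1024, 1.1 * 1024),
]


def proportional_targets(disks):
    """Return (path, used, target) tuples.

    `target` is the space each disk would hold if the node's total used
    space were split in proportion to disk size (an assumed policy,
    hypothetical helper for illustration only).
    """
    total_size = sum(size for _, size, _ in disks)
    total_used = sum(used for _, _, used in disks)
    return [
        (path, used, total_used * size / total_size)
        for path, size, used in disks
    ]


if __name__ == "__main__":
    for path, used, target in proportional_targets(DISKS):
        print(
            f"{path}: used {used:.0f} GB, "
            f"proportional share {target:.0f} GB, "
            f"difference {used - target:+.0f} GB"
        )
```

Under that assumption, a positive difference marks a disk holding more than its share. The md.size values from the attr commands above are still needed to see what size each disk was actually registered with.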