[sheepdog-users] monitor cluster to avoid corruption
Liu Yuan
namei.unix at gmail.com
Sat Dec 15 11:44:46 CET 2012
On 12/14/2012 10:39 PM, Valerio Pachera wrote:
> *If a guest happens to write up to the end of the cluster's space, the disk gets corrupted*
> collie vdi check test
> Failed to read, No object found
>
> I've been testing with only 1 vdi and 1 guest.
> If we have more disks, they might get corrupted as well.
>
> Correct me if I'm wrong, but the only thing that can be done is to
> delete the vdi.
>
This should be fixed by the QEMU patch I mentioned.
> To monitor when the cluster is getting full, we have
> collie node info
>
> It's pretty easy if we have nodes all with the same amount of space:
> we just have to look at the 'Total' percentage or at any of the disks.
> It gets more difficult when we have different node sizes.
>
> Here is an example, after writing 512M (formatted with 2 copies)
> ---
> collie node info
> Id Size Used Use%
> 0 982 MB 196 MB 19%
> 1 982 MB 160 MB 16%
> 2 982 MB 204 MB 20%
> 3 10.0 GB 528 MB 5%
> Total 13 GB 1.1 GB 8%
> Total virtual image size 10 GB
> ---
>
> And here is the same cluster after writing data until all the
> available space was filled up, right to the end.
> ---
> collie node info
> Id Size Used Use%
> 0 982 MB 980 MB 99%
> 1 982 MB 796 MB 81%
> 2 982 MB 952 MB 96%
> 3 10.0 GB 2.5 GB 25%
> Total 13 GB 5.2 GB 40%
> Total virtual image size 10 GB
> ---
>
> *Obviously we can't look at the 'Total' percentage to understand when
> the cluster is getting full.*
With nodes of different sizes, sheep just tries its best to balance the
data over all nodes. Sheepdog internally uses a hash function to place
objects onto nodes, so uneven distribution is essentially a hash
collision problem. We make use of virtual nodes to mitigate this, and it
works well with multiple images (the more, the better). But for a single
VM, I think it falls short of that goal. Then again, single-VM usage
isn't practical.
Please test a more practical case, e.g. running dozens of VMs.
If the data still isn't balanced well, then we need to fix sheep.
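If it helps to picture it, here is a tiny, generic consistent-hashing
sketch in Python. It is illustrative only, not Sheepdog's actual
placement code, and the node and object names are made up: each physical
node is hashed onto the ring many times (the "virtual nodes"), so with
many objects the per-node counts even out.

```python
import hashlib
from bisect import bisect

def h(key):
    # Hash a string to a large integer position on the ring.
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes_per_node=64):
    # Each physical node gets many points ("virtual nodes") on the ring.
    return sorted((h("%s-%d" % (n, i)), n)
                  for n in nodes for i in range(vnodes_per_node))

def lookup(ring, obj_id):
    # An object goes to the node owning the next point clockwise.
    points = [p for p, _ in ring]
    return ring[bisect(points, h(obj_id)) % len(ring)][1]

ring = build_ring(["node0", "node1", "node2", "node3"])
counts = {}
for i in range(10000):            # many objects, as with many images
    node = lookup(ring, "obj-%d" % i)
    counts[node] = counts.get(node, 0) + 1
print(counts)                     # roughly even with many objects
```

With only a handful of objects (a single small image), the counts stay
lumpy, which is why the per-node usage you saw is so uneven.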
> Think of a different scenario with several different node sizes (1T,
> 500G, 2T, 750G....).
> I challenge you to work out the total amount of available space and,
> more importantly, the free space percentage of the cluster.
>
> Would it be possible to print the 'Total relative available space' and
> the corresponding percentage?
> The current 'Total' is just the sum of the devices.
> If not, could you please tell me how to calculate it?
>
> *Is it going to be possible to avoid disk corruption?*
>
I think in this case, admins should parse the usage percentage of every
node and use the greatest one as the out-of-space indicator.
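As a rough sketch of that idea (Python; the column layout is assumed
from the 'collie node info' output you pasted, and the 85% threshold is
just an example, not a recommended value):

```python
# Take the highest per-node Use% as the "cluster is almost full" signal,
# rather than the Total line, since node sizes can differ widely.
import re
import subprocess

THRESHOLD = 85  # percent; pick whatever safety margin suits your cluster

out = subprocess.run(["collie", "node", "info"],
                     capture_output=True, text=True, check=True).stdout

worst = 0
for line in out.splitlines():
    # Per-node rows start with a numeric Id and end in "NN%";
    # the "Total" and header rows do not match.
    m = re.match(r"\s*\d+\s+.*?(\d+)%\s*$", line)
    if m:
        worst = max(worst, int(m.group(1)))

if worst >= THRESHOLD:
    print("WARNING: a node is %d%% full -- stop writing or add space" % worst)
```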
Thanks,
Yuan