[sheepdog-users] monitor cluster to avoid corruption

Tue Dec 18 19:38:29 CET 2012

At Tue, 18 Dec 2012 18:02:15 +0800,
Liu Yuan wrote:
> 
> On 12/18/2012 05:41 PM, Valerio Pachera wrote:
> > collie vdi list
> >   Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
> >   test         1   10 GB  1.5 GB  0.0 MB 2012-12-18 10:29   7c2b25     2
> > 
> > collie node info
> > Id      Size    Used    Use%
> >  0      982 MB  892 MB   90%
> >  1      982 MB  672 MB   68%
> >  2      10.0 GB 1.5 GB   15%
> > Total   12 GB   3.0 GB   25%
> > Total virtual image size        10 GB
> > 
> > 
> > collie node list
> > M   Id   Host:Port         V-Nodes       Zone
> > -    0   192.168.2.41:7000      16  688040128
> > -    1   192.168.2.42:7000      16  704817344
> > -    2   192.168.2.43:7000      161  721594560
> 
> I think this is not easy to solve with a small change. To make it
> concrete, let's start with the current example. Sheepdog is taught to
> distribute the *whole* image 'test' evenly across the weighted nodes. So
> if we fill 'test' to its full 10 GB, all nodes will hold their correct
> share of the data. The problem is that this seldom happens in the real
> world, and until the image is full, its data will never be evenly
> distributed.
> 
> As a quick thought, I think the right fix is probably to allocate
> virtual nodes *dynamically* based on the *actual data written*, instead
> of statically based on the node space.

I'm against introducing a more complex strategy for allocating objects.
The correct approach to getting a better balance is to allocate more
virtual nodes.

The ideal allocation here is something like:

 Id      Size    Used    Use%
  0      982 MB  750 MB   76%
  1      982 MB  750 MB   76%
  2      10.0 GB 1.5 GB   15%
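
The numbers above can be reproduced with a rough calculation.  This is
only a sketch, not sheep code: ideal_usage() is a made-up helper, and I
assume 1.5 GB of data is about 1500 MB, that Copies is 2, and that a
node never holds more than one copy of the same object.  It ignores the
case where a node runs out of space.

```python
def ideal_usage(capacities_mb, data_mb, copies):
    """Split `copies` replicas of `data_mb` across nodes in proportion to
    capacity, never giving one node more than a single full copy.
    Hypothetical helper for illustration; disk-full cases are ignored."""
    usage = {node: 0.0 for node in capacities_mb}
    remaining = float(data_mb * copies)   # total replica data to place
    free = dict(capacities_mb)            # nodes that are not capped yet
    while free and remaining > 1e-9:
        total = sum(free.values())
        # Nodes whose proportional share would exceed one full copy.
        capped = [n for n, cap in free.items()
                  if remaining * cap / total > data_mb]
        if not capped:
            for n, cap in free.items():
                usage[n] = remaining * cap / total
            break
        for n in capped:
            usage[n] = float(data_mb)
            remaining -= data_mb
            del free[n]
    return usage


capacities = {0: 982, 1: 982, 2: 10240}   # MB, from 'collie node info'
for node, used in ideal_usage(capacities, 1500, 2).items():
    print(node, round(used), "MB", round(100 * used / capacities[node]), "%")
```

Disk 2 is capped at one full copy (1.5 GB), and the second copy is
split evenly between disks 0 and 1.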

What I think is the problem here is that objects are not well balanced
between disks 0 and 1.  This is because disk 2 (10.0 GB) eats up too
many virtual nodes, so the number of virtual nodes on disks 0 and 1 is
too small (the current implementation conserves the average number of
v-nodes).  If sheep could allocate more v-nodes, as follows, we would
get a better balance.

 collie node list
 M   Id   Host:Port         V-Nodes       Zone
 -    0   192.168.2.41:7000      64  688040128
 -    1   192.168.2.42:7000      64  704817344
 -    2   192.168.2.43:7000     640  721594560
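
To see the effect of the v-node count, here is a small simulation of a
consistent hash ring.  It is only a sketch under my own assumptions,
not the sheep implementation: SHA-1 stands in for the real hash
function, the object names are invented, and I assume 4 MB objects, so
1.5 GB is roughly 375 objects.  The v-node counts are taken from the two
node lists in this thread.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def _hash(key):
    # Stable 64-bit hash; a stand-in, not the hash function sheep uses.
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def build_ring(vnodes):
    """vnodes: {node_id: v-node count} -> (sorted hashes, matching owners)."""
    points = sorted((_hash(f"{node}:{i}"), node)
                    for node, count in vnodes.items() for i in range(count))
    return [h for h, _ in points], [n for _, n in points]


def place(ring, oid, copies):
    """Walk clockwise from the object's hash and pick `copies` distinct nodes."""
    hashes, owners = ring
    idx = bisect_right(hashes, _hash(oid))
    chosen = []
    for step in range(len(owners)):
        node = owners[(idx + step) % len(owners)]
        if node not in chosen:
            chosen.append(node)
            if len(chosen) == copies:
                break
    return chosen


def distribution(vnodes, num_objects=375, copies=2):
    """Count how many object replicas land on each node."""
    ring = build_ring(vnodes)
    counts = Counter()
    for i in range(num_objects):
        counts.update(place(ring, f"obj-{i}", copies))
    return counts


for vnodes in ({0: 16, 1: 16, 2: 161}, {0: 64, 1: 64, 2: 640}):
    print(vnodes, dict(distribution(vnodes)))
```

With more v-nodes I would expect the counts for nodes 0 and 1 to get
closer to each other, though the exact numbers depend on the hash.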

Note that it is inevitable that the usage rate of the 10.0 GB disk is
lower than the others.  If we want to avoid it, we need at least 10 nodes.

Thanks,

Kazutaka