[sheepdog] Users inputs for new reclaim algorithm, please
Liu Yuan
namei.unix at gmail.com
Mon Mar 17 17:02:18 CET 2014
On Tue, Mar 18, 2014 at 12:49:54AM +0900, MORITA Kazutaka wrote:
> At Mon, 17 Mar 2014 21:43:25 +0800,
> Liu Yuan wrote:
> > > >
> > > > With new algorithm,
> > > >
> > > > $ dog vdi create image
> > > > $ dog vdi snapshot image -s snap1
> > > > $ dog vdi clone -s snap1 image clone
> > > > $ dog vdi delete clone <-- surprisingly, this operation won't release
> > > > space but will instead increase space usage.
> > > >
> > > > The following is a real case: deleting a clone that uses 316MB of
> > > > space actually causes 5.2GB more space to be used.
>
> This is obviously strange. Although the algorithm creates additional
> objects to track reference counts, the objects are sparse and should
> not waste much space at all even in the worst case.
>
> > > > So if you have this usage in mind, you can expect a catastrophic problem:
> > > > - frequent release and creation of cloned instances will put much more space
> > > > pressure on you.
> > > > - when space is near the low watermark, you cannot delete clones, because
> > > > deletion will actually increase space usage and end up destroying your cluster.
> > > > You have no choice: either add more nodes, or deny creation of new clones and
> > > > never try to delete clones later.
>
> After the algorithm is implemented correctly, this looks like a corner
> case since the additional space for the new reclaim algorithm is very
> small - it should be only 8 bytes for each object IIUC. However, if
> you are still concerned about this case, we can preallocate some space
> beforehand to be used for ledger objects. For example,
>
> 1. Preallocate some small files for each device when sheep starts up.
>
> 2. When the sheepdog cluster becomes disk-full and the user requests
> object deletion, we can rename the preallocated file to a ledger
> object and continue object reclaiming.
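
The two steps above could be sketched roughly as follows. This is only an
illustration in Python, not sheepdog code: the `reserve.N` file names, the
function names and the 4096-byte reserve size are all assumptions made for
the example. The idea is that the reserve files already own their blocks, so
turning one into a ledger object needs only a rename and no new data blocks.

```python
import os
import tempfile

RESERVE_SIZE = 4096  # hypothetical size of one reserve file, in bytes


def preallocate_reserves(directory, count, size=RESERVE_SIZE):
    """Step 1: create `count` reserve files at startup.

    Real zero bytes are written and synced so that, on typical
    filesystems, the blocks are allocated rather than left sparse.
    """
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"reserve.{i}")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
            f.flush()
            os.fsync(f.fileno())
        paths.append(path)
    return paths


def claim_reserve_as_ledger(reserves, ledger_path):
    """Step 2: when the disk is full, rename a reserve file to the ledger path.

    The rename reuses the blocks the reserve file already holds.
    """
    if not reserves:
        raise RuntimeError("no preallocated reserve file left")
    src = reserves.pop()
    os.rename(src, ledger_path)
    return ledger_path


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    pool = preallocate_reserves(workdir, 2)
    ledger = claim_reserve_as_ledger(pool, os.path.join(workdir, "ledger"))
    print(ledger, os.path.getsize(ledger), len(pool))
```

How many reserve files to keep per device, and how large they should be, is
left open here, as it is in the proposal.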
>
> Either way, I think this should be future work. Sheepdog still
> has some bugs in handling the disk-full case even without object
> reclaiming.
>
> > There might be some users who need this new algorithm for their specific usage, but
> > I'd suggest that we:
> >
> > 1 make the old algorithm the default reclaim algorithm
> > 2 modularize the reclaim algorithm and add the new algorithm as an option for users
> > in this way, we can improve the new algorithm step by step and possibly
> > introduce more algorithms to meet various needs.
>
> IMHO, modularizing object reclaiming is overkill. I cannot imagine so
> many algorithms for that. Even if we keep the old algorithm, adding a
> sheep command line option to enable this experimental object
> reclaiming looks sufficient. If we come up with another one, then let's
> discuss this topic again.
Keeping the old algorithm is the bottom line for me. Either a sheep option or a static #ifdef
looks fine to me. We can later refine it to be dynamically pluggable.
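
A minimal sketch of the option-based selection, written in Python for brevity
rather than in sheep's C. The `--reclaim` flag and the function names here are
hypothetical and not existing sheep options. The old algorithm stays the
default, and the table is the only place a later algorithm would need to be
registered.

```python
import argparse


def reclaim_old(vdi):
    # Placeholder for the existing reclaim algorithm.
    return f"old reclaim for {vdi}"


def reclaim_new(vdi):
    # Placeholder for the experimental reclaim algorithm.
    return f"new reclaim for {vdi}"


# Hypothetical table mapping option values to implementations.
RECLAIM_ALGORITHMS = {
    "old": reclaim_old,
    "new": reclaim_new,
}


def select_reclaim(argv):
    """Pick the reclaim function from command line arguments (default: old)."""
    parser = argparse.ArgumentParser(prog="sheep-sketch")
    parser.add_argument(
        "--reclaim",
        choices=sorted(RECLAIM_ALGORITHMS),
        default="old",
        help="reclaim algorithm to use (hypothetical option)",
    )
    args = parser.parse_args(argv)
    return RECLAIM_ALGORITHMS[args.reclaim]


if __name__ == "__main__":
    print(select_reclaim([])("image"))
    print(select_reclaim(["--reclaim", "new"])("image"))
```

A static #ifdef would fix the same choice at build time instead of at startup.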
The new algorithm looks to me more like a partial solution than a generic one, compared
with the old algorithm. So keeping the old algorithm really makes sense, especially for those
who seldom delete snapshots.
Thanks
Yuan