[sheepdog-users] Users inputs for new reclaim algorithm, please

Tue Mar 18 14:48:31 CET 2014

At Tue, 18 Mar 2014 13:03:27 +0000,
Andrew J. Hobbs wrote:
> 
> Is there a reason deletion causes such a magnification in usage?  This 
> would force me to rethink current plans for the fall, where we intend to 
> give each student in a system administration course a clone of a VM to 
> play with (directly triggering the catastrophic situation you describe).
> 
> If that's the case, I'd prefer the current situation (dumb reclamation) 
> be retained until a better solution can be found.

The problem of large ledger objects is already fixed in the latest
snapshot-object-reclaim branch. In practice, a ledger object consumes
4KB of disk space.

I think you will create many clones from a single snapshot in your
system administration course. In such a situation, the number of ledger
objects created by CoW is equal to:
size of snapshot / 4MB
even in the worst case. As an example, assume the size of your base
snapshot image is 20GB (which would be reasonable for a simple OS
installation). In this case, the worst-case number of ledger objects is
5,000. The total disk consumption by ledger objects is 60MB
(5,000 * 4KB * 3, assuming a cluster with 3 replicas).

I think the overhead is quite reasonable. If you delete one VDI which
has 15 objects of its own (60MB / 4MB), you already recover the
investment. And that return cannot be gained under the old GC algorithm.
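
The arithmetic above can be sketched as follows. This is only a
back-of-the-envelope illustration, not code from sheepdog: the function
names are hypothetical, and decimal units are assumed so that the
results match the rounded figures in this mail.

```python
# Decimal units (assumption) so the results match the rounded figures above.
KB = 1000
MB = 1000 * KB
GB = 1000 * MB

DATA_OBJECT_SIZE = 4 * MB    # size of one data object
LEDGER_OBJECT_SIZE = 4 * KB  # disk space consumed by one ledger object


def worst_case_ledger_overhead(snapshot_size, copies):
    """Return (ledger object count, total bytes used by ledger objects)."""
    ledger_count = snapshot_size // DATA_OBJECT_SIZE
    total_bytes = ledger_count * LEDGER_OBJECT_SIZE * copies
    return ledger_count, total_bytes


def break_even_objects(overhead_bytes):
    """Number of 4MB objects whose size equals the ledger overhead."""
    return -(-overhead_bytes // DATA_OBJECT_SIZE)  # ceiling division


# 20GB base snapshot, 3 replicas, as in the example above.
count, overhead = worst_case_ledger_overhead(20 * GB, copies=3)
print(count, overhead // MB, break_even_objects(overhead))
```

For the 20GB example this gives 5,000 ledger objects, 60MB of overhead,
and a break-even point of 15 objects.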

Thanks,
Hitoshi

> 
> On 03/17/2014 04:12 AM, Liu Yuan wrote:
> > Hi all,
> >
> >     I think this is a big topic regarding the new deletion algorithm, which is
> > currently being worked on by Hitoshi.
> >
> >   The motivation is very well explained as follows:
> >
> >   $ dog vdi create image
> >   $ dog vdi write image < some_data
> >   $ dog vdi snapshot image -s snap1
> >   $ dog vdi write image < some_data
> >   $ dog vdi delete image            <- this doesn't reclaim the objects
> >                                           of the image
> >   $ dog vdi delete image -s snap1   <- this reclaims all the data objects
> >                                           of both image and image:snap1
> >
> > Simply put, we use a simple and stupid algorithm: the space is released only
> > when all the VDIs on the snapshot chain have been deleted.
> >
> > The new algorithm adds more complexity to handle this problem, but also introduces
> > a new big problem.
> >
> > With new algorithm,
> >
> >   $ dog vdi create image
> >   $ dog vdi snapshot image -s snap1
> >   $ dog vdi clone -s snap1 image clone
> >   $ dog vdi delete clone  <-- surprisingly, this operation won't release
> >                               space but will instead increase space usage.
> >
> > The following is a real case. We can see that deleting a clone, which uses 316MB
> > of space, actually causes 5.2GB more space to be used.
> >
> > yliu at ubuntu-precise:~/sheepdog$ dog/dog vdi list
> >    Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
> > c clone        0   40 GB  316 MB  1.5 GB 2014-03-17 14:35   72a1e2    2:2
> > s test         1   40 GB  1.8 GB  0.0 MB 2014-03-17 14:16   7c2b25    2:2
> >    test         0   40 GB  0.0 MB  1.8 GB 2014-03-17 14:34   7c2b26    2:2
> > yliu at ubuntu-precise:~/sheepdog$ dog/dog node info
> > Id	Size	Used	Avail	Use%
> >   0	39 GB	932 MB	38 GB	  2%
> >   1	39 GB	878 MB	38 GB	  2%
> >   2	39 GB	964 MB	38 GB	  2%
> >   3	39 GB	932 MB	38 GB	  2%
> >   4	39 GB	876 MB	38 GB	  2%
> >   5	39 GB	978 MB	38 GB	  2%
> > Total	234 GB	5.4 GB	229 GB	  2%
> >
> > Total virtual image size	80 GB
> > yliu at ubuntu-precise:~/sheepdog$ dog/dog vdi delete clone
> > yliu at ubuntu-precise:~/sheepdog$ dog/dog node info
> > Id	Size	Used	Avail	Use%
> >   0	34 GB	1.7 GB	33 GB	  4%
> >   1	34 GB	1.7 GB	33 GB	  4%
> >   2	35 GB	1.9 GB	33 GB	  5%
> >   3	35 GB	1.8 GB	33 GB	  5%
> >   4	35 GB	1.8 GB	33 GB	  5%
> >   5	35 GB	1.9 GB	33 GB	  5%
> > Total	208 GB	11 GB	197 GB	  5%
> >
> > Total virtual image size	40 GB
> > yliu at ubuntu-precise:~/sheepdog$ dog/dog vdi list
> >    Name        Id    Size    Used  Shared    Creation time   VDI id  Copies  Tag
> > s test         1   40 GB  1.8 GB  0.0 MB 2014-03-17 14:16   7c2b25    2:2
> >    test         0   40 GB  0.0 MB  1.8 GB 2014-03-17 14:34   7c2b26    2:2
> >
> > With the old algorithm, the clone's 316MB would be released without posing any problem.
> >
> > I think this is a very important issue for the following use case:
> >
> >   - suppose you are providing VM services with pre-defined images as bases
> >   - these pre-defined images are actually snapshots in sheepdog and
> >     you seldom delete them
> >   - VM instances are provided by the clone operation
> >   - since VM instances are all created on demand, they are likely to be released
> >     or recreated very often.
> >
> > So if you have this usage in mind, you should expect a catastrophic problem:
> >   - frequent release and creation of cloned instances will put much more space
> >     pressure on you.
> >   - when space is near the low watermark, you cannot delete clones, because
> >     deletion will actually increase space usage and end up destroying your cluster.
> >     You have no choice but to either add more nodes, or deny creation of new
> >     clones and never try to delete clones later.
> >
> > Any ideas?
> >
> > Or am I missing something?
> >
> > Thanks
> > Yuan
> 
> -- 
> sheepdog-users mailing lists
> sheepdog-users at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog-users