[sheepdog] [PATCH v4 3/3] sheep: look up whole range of bitmap

Hitoshi Mitake mitake.hitoshi at gmail.com
Thu Mar 12 12:26:06 CET 2015


At Thu, 12 Mar 2015 10:09:19 +0800,
Liu Yuan wrote:
> 
> On Sun, Mar 08, 2015 at 01:08:35PM +0900, Hitoshi Mitake wrote:
> > Older sheepdog had no VID recycling, so get_vdi_bitmap_range() could
> > detect the correct range of the bitmap. But newer sheepdog recycles
> > VIDs, which can produce situations like the one below:
> 
> Which commit introduced VID recycling? I think our current VID allocation
> algorithm relies heavily on the assumption of "no recycling". If that no
> longer holds, a redesign of the algorithm should be considered.
> 
> > The first state of VID bitmap:
> > 0 0 1* 1* 1 0 0 0
> > 1 is a VID bit of working VDI, 1* is a bit of snapshot. Assume the
> > above 1 and 1* are used for VDI named "A" and its snapshots.
> > 
> > Then, a user tries to create VDI "B". sd_hash_vdi() returns VID which
> > conflicts with existing bits for A.
> > 0 0 1* 1* 1 0 0 0
> >        ^
> >        |
> >        sd_hash_vdi() returns VID which conflicts with the
> >        above bit.
> > 
> > So B acquires the leftmost free bit after that position
> > 0 0 1* 1* 1 1 0 0
> >             ^
> >             |
> >             B acquires this bit.
> > 
> > Then, the user deletes A and its snapshots. All of the family members
> > are deleted. The bitmap becomes like below
> > 0 0 0 0 0 1 0 0
> >       ^
> >       |
> >       B's original VID sd_hash_vdi() calculates.
> > 
> > Now sheep fails to look up the VID of B, because the bit at the VID
> > calculated by sd_hash_vdi() is zero.
> > 
> > This is the reason for looking up the whole range of the bitmap. Of
> > course it is ugly and costly. But its cost is equal to or less than
> > that of "dog vdi list".
> 
> I think this is too costly: the more VDIs you have created, the more time
> it takes to create one more. Once more than half of the bitmap is set,
> every creation of a new VDI will take a long time to finish, because we have
> to read all the existing inodes and check their names. That is unacceptable
> and will be a big scaling problem. I guess this will affect not only VDI
> creation but also other functions, such as snapshotting and cloning, which we
> advertise as lightning fast.

Of course it is costly. But the current VID management policy can also
cause a similar situation under heavy snapshotting and cloning
(e.g. a VDI with millions of snapshots requires millions of lookups).

As I wrote in another mail, simply providing a cluster format option
for turning VID recycling on or off would be suitable.
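
To make the scenario from the patch description concrete, here is a
minimal toy model in Python. It is not sheepdog code: the 8-bit bitmap,
the helper names (alloc_vid, free_vid, lookup_probe, lookup_full_scan)
and the hash position 3 for "B" are all assumptions for illustration,
and I assume the lookup starts at the hashed position and gives up at
the first clear bit, as described above.

```python
# Toy model of the VID bitmap scenario; not sheepdog code.
# All names and the 8-bit bitmap size are hypothetical.

NR_VIDS = 8

def alloc_vid(bitmap, names, name, start):
    """Take the first free bit at or after `start`, wrapping around."""
    for i in range(NR_VIDS):
        vid = (start + i) % NR_VIDS
        if not bitmap[vid]:
            bitmap[vid] = 1
            names[vid] = name
            return vid
    raise RuntimeError("bitmap is full")

def free_vid(bitmap, names, vid):
    """Recycle a VID by clearing its bit."""
    bitmap[vid] = 0
    del names[vid]

def lookup_probe(bitmap, names, name, start):
    """Probe from `start` and stop at the first clear bit."""
    for i in range(NR_VIDS):
        vid = (start + i) % NR_VIDS
        if not bitmap[vid]:
            return None
        if names[vid] == name:
            return vid
    return None

def lookup_full_scan(bitmap, names, name):
    """Check every set bit in the whole bitmap."""
    for vid in range(NR_VIDS):
        if bitmap[vid] and names[vid] == name:
            return vid
    return None

bitmap = [0] * NR_VIDS
names = {}

# A and its snapshots occupy bits 2, 3 and 4.
for vid in (2, 3, 4):
    bitmap[vid] = 1
    names[vid] = "A"

# Assume the hash of "B" lands on bit 3, which is already taken.
HASH_B = 3
vid_b = alloc_vid(bitmap, names, "B", HASH_B)

# Delete A and its whole family; their bits are recycled.
for vid in (2, 3, 4):
    free_vid(bitmap, names, vid)

probe_result = lookup_probe(bitmap, names, "B", HASH_B)
scan_result = lookup_full_scan(bitmap, names, "B")
print(bitmap, vid_b, probe_result, scan_result)
```

In this model the probing lookup misses B once A's bits are cleared,
while the full scan still finds it at the price of visiting every set
bit. That is the trade-off under discussion here.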

Thanks,
Hitoshi

> 
> thanks,
> Yuan


