[sheepdog] [PATCH v2 1/2] sheep, dog: make recycling VID selectable
Liu Yuan
namei.unix at gmail.com
Tue Mar 17 04:22:30 CET 2015
On Tue, Mar 17, 2015 at 10:03:53AM +0800, Liu Yuan wrote:
> On Tue, Mar 17, 2015 at 04:44:46AM +0900, MORITA Kazutaka wrote:
> > At Mon, 16 Mar 2015 21:13:29 +0800,
> > Liu Yuan wrote:
> > >
> > > How about making 'dog vdi clone --no-share' the default clone operation? And
> > > we can add 'dog vdi clone --share' to keep the old behavior as an option. This
> > > way, --no-share will save us from this kind of subtle problem. And your team's
> > > problem with VDI exhaustion will be solved :).
> >
> > The --no-share option disables thin provisioning. It shouldn't be a default option,
> > IMHO.
>
> The following commit disabled VID recycling for the old algorithm.
>
> commit 21549a1bd4981fabcc09d062a647162127fe0637
> Author: Hitoshi Mitake <mitake.hitoshi at gmail.com>
> Date: Sun Jun 1 23:23:18 2014 +0900
>
> sheep: don't recycle VDI ID
>
> Recycling VDI IDs of deleted VDIs is a completely wrong idea. It
> breaks relations between inode objects and data objects. For example,
> it can cause a problem of corrupting cloned VDIs (see related
> issue). This patch forbids the recycling.
>
> Related issue:
> https://bugs.launchpad.net/sheepdog-project/+bug/1317755
>
> Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> Signed-off-by: Liu Yuan <namei.unix at gmail.com>
>
> This means we no longer have VID recycling for the old algorithm, because of this
> subtle problem. This is why I suggest setting --no-share as the default, in order to
> bring this functionality back.
>
> > >
> > > This manner is not perfect, but it will benefit us:
> > >
> > > 1. stable code base since old algorithm is long tested.
> >
> > Hitoshi's patch enables the stable algorithm by default. Isn't it enough?
>
> I'm afraid not.
>
> 1. As mentioned above, simply disabling the new algorithm won't bring back VID
> recycling. And if we do bring it back, it seems it will conflict with the new algorithm.
> 2. The new algorithm has a bug whose fix needs a hack in vdi_lookup(), which will degrade
> it a lot. I'm not sure we can hack vdi_lookup() to meet both algorithms' needs.
>
> > > 2. we won't degrade vdi_lookup
> >
> > I'm still not sure which code in vdi_lookup() is a problem. The problem
> > happens even when we disable VID garbage collection?
>
> vdi_lookup() becomes a problem if Hitoshi's patch is enabled after he fixes a
> fatal bug of new algorithm.
Hi Kazutaka,
To give you more background, the following is an excerpt from Hitoshi's bug fix patch:
******************************************************
Older sheepdog didn't have a functionality of recycling VID, so the
get_vdi_bitmap_range() can detect correct range of bitmap. But newer
sheepdog recycles VID. It can produce situations like below:
The first state of VID bitmap:
0 0 1* 1* 1 0 0 0
1 is a VID bit of working VDI, 1* is a bit of snapshot. Assume the
above 1 and 1* are used for VDI named "A" and its snapshots.
Then, a user tries to create VDI "B". sd_hash_vdi() returns VID which
conflicts with existing bits for A.
0 0 1* 1* 1 0 0 0
^
|
sd_hash_vdi() returns VID which conflicts with the
above bit.
So B acquires the leftmost free bit
0 0 1* 1* 1 1 0 0
            ^
            |
            B acquires this bit.
Then, the user deletes A and its snapshots. All of the family members
are deleted. The bitmap becomes like below
0 0 0 0 0 1 0 0
^
|
B's original VID sd_hash_vdi() calculates.
Now sheep fails to look up the VID of B, because the bit at the VID calculated by
sd_hash_vdi() is zero.
This is the reason for looking up the whole range of the bitmap. Of course
it is ugly and costly. But its cost is equal to or less than that of
"dog vdi list".
***********************************************************************
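To make the scenario in the excerpt concrete, here is a toy model in Python. It is only a sketch, not sheepdog code: the 8-bit bitmap, the function names, and the choice of bit 2 as B's hashed VID are all assumptions for illustration (the excerpt only says the hash collides with one of A's bits).

```python
NR_VDIS = 8  # toy bitmap size, assumed for illustration only


def create(bitmap, names, name, hashed_vid):
    """Claim the first free bit at or after hashed_vid (linear probing)."""
    for step in range(NR_VDIS):
        vid = (hashed_vid + step) % NR_VDIS
        if not bitmap[vid]:
            bitmap[vid] = 1
            names[vid] = name
            return vid
    raise RuntimeError("bitmap full")


def delete_and_recycle(bitmap, names, vid):
    """Recycling delete: clear the bit so the VID can be handed out again."""
    bitmap[vid] = 0
    names.pop(vid, None)


def lookup_range(bitmap, names, name, hashed_vid):
    """Walk set bits starting at hashed_vid; give up at the first clear bit."""
    for step in range(NR_VDIS):
        vid = (hashed_vid + step) % NR_VDIS
        if not bitmap[vid]:
            return None
        if names.get(vid) == name:
            return vid
    return None


def lookup_full_scan(bitmap, names, name):
    """Fallback: check every bit in the bitmap."""
    for vid in range(NR_VDIS):
        if bitmap[vid] and names.get(vid) == name:
            return vid
    return None


def demo():
    bitmap = [0] * NR_VDIS
    names = {}
    # VDI "A" and its snapshots occupy bits 2, 3 and 4.
    for vid in (2, 3, 4):
        bitmap[vid] = 1
        names[vid] = "A"
    b_hash = 2  # hypothetical hash value that collides with A's family
    b_vid = create(bitmap, names, "B", b_hash)
    before = lookup_range(bitmap, names, "B", b_hash)
    # Delete A and all of its snapshots, recycling their VIDs.
    for vid in (2, 3, 4):
        delete_and_recycle(bitmap, names, vid)
    after = lookup_range(bitmap, names, "B", b_hash)
    full = lookup_full_scan(bitmap, names, "B")
    return b_vid, before, after, full
```

In this model B lands on bit 5, the range lookup finds it while A's bits are still set, and the same lookup returns nothing once those bits are cleared. Only the full scan still finds B.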
I need to add that old sheep can reuse a VID by checking the inode's name, so VID
recycling is unnecessary. It has proved simple and reliable, at the cost of
keeping deleted inodes on the storage, but the space overhead is too small
to notice.
The new algorithm recycles VIDs, which means we need to look up the whole bitmap every
time we call vdi_lookup(), the hottest function in vdi.c.
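A rough way to see the cost difference is the toy comparison below, again in Python and again only a sketch. It assumes that a deleted inode keeps its bitmap bit and is marked by an empty name, which is my simplification of the old behavior. The bitmap size, the VID values, and the function names are made up.

```python
NR_VDIS = 1024  # toy bitmap size, assumed for illustration only


def lookup_with_tombstones(bitmap, names, name, hashed_vid):
    """Old-style lookup: probe from hashed_vid until a clear bit is hit.

    Deleted inodes keep their bit set, so the probe chain stays intact.
    Returns (vid or None, number of bits examined).
    """
    examined = 0
    for step in range(NR_VDIS):
        vid = (hashed_vid + step) % NR_VDIS
        examined += 1
        if not bitmap[vid]:
            return None, examined
        if names[vid] == name:
            return vid, examined
    return None, examined


def lookup_full_scan(bitmap, names, name):
    """Whole-bitmap lookup: examine bits from 0 until the name is found.

    Returns (vid or None, number of bits examined).
    """
    examined = 0
    for vid in range(NR_VDIS):
        examined += 1
        if bitmap[vid] and names[vid] == name:
            return vid, examined
    return None, examined


def demo():
    bitmap = [0] * NR_VDIS
    names = [""] * NR_VDIS
    # Deleted family of "A" at 700..702: bits stay set, names are cleared.
    for vid in (700, 701, 702):
        bitmap[vid] = 1
        names[vid] = ""
    # "B" hashed to 700 (hypothetical) and was pushed to 703 by the collision.
    bitmap[703] = 1
    names[703] = "B"
    ranged = lookup_with_tombstones(bitmap, names, "B", 700)
    full = lookup_full_scan(bitmap, names, "B")
    return ranged, full
```

Here the range lookup examines 4 bits, while the full scan examines 704 before it reaches B. A lookup for a name that does not exist walks all NR_VDIS bits in the full scan.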
Thanks,
Yuan