[sheepdog] [PATCH v2 1/2] sheep, dog: make recycling VID selectable

Hitoshi Mitake mitake.hitoshi at lab.ntt.co.jp
Tue Mar 17 05:33:58 CET 2015


At Tue, 17 Mar 2015 11:06:34 +0800,
Liu Yuan wrote:
> 
> On Tue, Mar 17, 2015 at 11:42:01AM +0900, Hitoshi Mitake wrote:
> > At Tue, 17 Mar 2015 10:03:53 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Tue, Mar 17, 2015 at 04:44:46AM +0900, MORITA Kazutaka wrote:
> > > > At Mon, 16 Mar 2015 21:13:29 +0800,
> > > > Liu Yuan wrote:
> > > > > 
> > > > > How about making 'dog vdi clone --no-share' the default clone operation? And
> > > > > we can add 'dog vdi clone --share' to keep the old behavior as an option. This
> > > > > way, --no-share will save us from this kind of subtle problem. And your team's
> > > > > problem with vdi exhaustion will be solved :).
> > > > 
> > > > The --no-share option disables thin provisioning.  It shouldn't be a default
> > > > option, IMHO.
> > > 
> > > Because of the following bug, vid recycling is disabled for the old algorithm.
> > > 
> > > commit 21549a1bd4981fabcc09d062a647162127fe0637
> > > Author: Hitoshi Mitake <mitake.hitoshi at gmail.com>
> > > Date:   Sun Jun 1 23:23:18 2014 +0900
> > > 
> > >     sheep: don't recycle VDI ID
> > >     
> > >     Recycling VDI IDs of deleted VDIs is a completely wrong idea. It
> > >     breaks relations between inode objects and data objects. For example,
> > >     it can cause a problem of corrupting cloned VDIs (see related
> > >     issue). This patch forbids the recycling.
> > >     
> > >     Related issue:
> > >     https://bugs.launchpad.net/sheepdog-project/+bug/1317755
> > >     
> > >     Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > >     Signed-off-by: Liu Yuan <namei.unix at gmail.com>
> > > 
> > > This means we no longer have vid recycling for the old algorithm, because of
> > > this subtle problem. This is why I suggest making --no-share the default, in
> > > order to bring this functionality back.
> > 
> > 1. clone --no-share is a much heavier operation than a whole-range lookup
> >    of the bitmap. It issues read + write requests * number of replicas for
> >    every object pointed to by a parent snapshot. It means we cannot
> >    provide fast cloning, and space consumption will increase explosively.
> > 
> > 2. The old recycling doesn't properly take snapshots into account, as I
> >    wrote in another email (and as the Launchpad issue linked above
> >    describes).
> > 
> > > 
> > > > > 
> > > > > This approach is not perfect, but it will benefit us:
> > > > > 
> > > > > 1. a stable code base, since the old algorithm is long tested.
> > > > 
> > > > Hitoshi's patch enables the stable algorithm by default.  Isn't it enough?
> > > 
> > > I'm afraid not.
> > > 
> > > 1. as mentioned above, simply disabling the new algorithm won't bring vid
> > >    recycling back. But if we bring it back, it seems it will conflict with
> > >    the new algorithm.
> > 
> > Reviving the old algorithm is completely impossible, as I described above.
> 
> Not every use case will hit the problem mentioned above. Most of the time, people
> won't destroy the whole chain and recreate it with the same name while clones
> are running. In this sense, it is an extreme corner case that only some users
> might hit.
> 
> You mentioned that NTT will take periodic snapshots and is afraid of vid
> exhaustion, which is a valid concern. But you can't recycle vids unless you use
> --no-share for clones even with your new old algorithm, right?
> 
> This means both the old and new algorithms face the same problem, no?
> 
> > > 2. the new algorithm has a bug that requires hacking vdi_lookup(), which will
> > >    degrade it a lot. I'm not sure we can hack vdi_lookup() to meet both
> > >    algorithms' needs.
> > 
> > Looking up the whole range unconditionally solves the problem. And it can
> > be disabled with the option if users don't like it.
> 
> But we can only recycle vids if the whole chain is deleted, even with your new
> algorithm, meaning that --no-share will still be needed when your new algorithm
> is in effect; otherwise, your new algorithm won't help us recycle vids, no? In
> other words, new algorithm = old algorithm.
> 
> My concern with the new algorithm is that it is very limited, if I understand
> it correctly.
> 
> The new algorithm allows recycling vids only if we delete the whole chain and
> use --no-share for clones to cut the relationship, which the old algorithm can
> also achieve without changing a single line.
> 
> *So my question is, why do we need a new one?*
> 
> Especially considering that the new algorithm will look up the whole range
> unconditionally, which will degrade the general case, even for people who
> don't need to recycle vids.
> 
> Did I misunderstand anything about your new algorithm?

Yes. The old algorithm can corrupt data, as described in
https://bugs.launchpad.net/sheepdog-project/+bug/1317755
because it doesn't take the family relations of VDIs into account. Your
latest patchset can revive that problem.

My new algorithm does take them into account, so we can recycle VIDs
safely, without the data corruption. Even though it requires --no-share
cloning, it is much better to have the new one, at least as an option.
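
As a rough illustration of the rule discussed in this thread (this is
not the actual sheep code), the Python sketch below treats a VID as
recyclable only once every VDI in its family has been deleted. The Vdi
record, its family_root field, and recyclable_vids() are hypothetical
names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Vdi:
    vid: int
    family_root: int  # hypothetical: VID of the root of the snapshot/clone family
    deleted: bool


def recyclable_vids(vdis):
    """Return VIDs that may be reused: those whose whole family is deleted."""
    # A family is "live" as long as at least one member is not deleted.
    live_families = {v.family_root for v in vdis if not v.deleted}
    return sorted(v.vid for v in vdis if v.family_root not in live_families)


vdis = [
    Vdi(vid=1, family_root=1, deleted=True),
    Vdi(vid=2, family_root=1, deleted=False),  # live clone keeps family 1 pinned
    Vdi(vid=3, family_root=3, deleted=True),
    Vdi(vid=4, family_root=3, deleted=True),
]
print(recyclable_vids(vdis))  # only the fully deleted family (VIDs 3 and 4)
```

In this sketch VID 1 is deleted but stays reserved, because VID 2 in the
same family is still alive and may still refer to its objects.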

> 
> A real new algorithm, I guess, would uproot the old algorithm completely and get
> rid of vid exhaustion without the help of --no-share.
> 

To do this, we would need at least a mechanism to force COW on VMs and
cut the dependencies between VDIs, and that is only part of the
requirements. To detect dependencies between VDIs, we would need to
check the data_vdi_id of every VDI in the family. That would require a
much more complex implementation and add runtime overhead.
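
To make the cost of that check concrete, here is a minimal Python
sketch of such a dependency scan. It is only an illustration under
assumptions: the family is modelled as a dict from VID to a list of
data_vdi_id entries, 0 is assumed to mean "object not allocated", and
dependents() is a hypothetical helper, not a sheepdog function.

```python
def dependents(vid, family):
    """Return VIDs of other family members whose data_vdi_id refers to `vid`.

    family: dict mapping VID -> list of data_vdi_id entries
            (assumed layout; 0 is assumed to mean "not allocated").
    """
    return sorted(
        other
        for other, data_vdi_id in family.items()
        if other != vid and vid in data_vdi_id
    )


family = {
    0x10: [0x10, 0x10, 0],        # parent snapshot owns objects 0 and 1
    0x11: [0x10, 0x11, 0],        # clone still shares object 0 with 0x10
    0x12: [0x12, 0x12, 0x12],     # fully copied, no shared objects
}

# 0x10 cannot be recycled while any other member still points at it.
print(dependents(0x10, family))
print(dependents(0x12, family))
```

The scan touches every data_vdi_id entry of every family member, which
is where the runtime overhead mentioned above would come from.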

Thanks,
Hitoshi
