[sheepdog] [PATCH v8 00/19] object reclaim based on generational reference counting

Mon Jun 2 06:16:19 CEST 2014

On Mon, Jun 02, 2014 at 01:08:00PM +0900, Hitoshi Mitake wrote:
> On Fri, May 23, 2014 at 2:03 PM, Liu Yuan <namei.unix at gmail.com> wrote:
> > On Thu, May 22, 2014 at 11:29:59PM +0900, Hitoshi Mitake wrote:
> >> At Thu, 22 May 2014 16:54:32 +0800,
> >> Liu Yuan wrote:
> >> >
> >> > On Fri, May 16, 2014 at 12:22:27AM +0900, Hitoshi Mitake wrote:
> >> > > The object reclaim doesn't support hypervolume yet. But hypervolume cannot
> >> > > currently be used as a virtual disk (neither qemu nor tgt supports it). And
> >> > > the removal of old vdi deletion is acceptable for hypervolume because it doesn't
> >> > > support snapshot, etc. So I think this patchset can be applied to the master
> >> > > branch.
> >> > >
> >> > > The same code is pushed to:
> >> > > https://github.com/sheepdog/sheepdog/tree/snapshot-object-reclaim
> >> > >
> >> > > There is a problem which can be caused by the discard operation. But the
> >> > > problem can be solved as a separate topic. I'll post a patchset for
> >> > > it later.
> >> > >
> >> > > The leak problem was (also) caused by bugs in QEMU's sheepdog
> >> > > driver. The fixed version of QEMU driver is here:
> >> > >  https://github.com/sheepdog/qemu/tree/inode-sync
> >> > > I'll post it to the QEMU list later.
> >> > >
> >> > > v8:
> >> > >  - let COW and snapshot be mutually exclusive
> >> > >  -- This change introduces a limitation that "dog vdi snapshot" must be
> >> > >     executed on the same node that runs QEMU. But it is temporary and
> >> > >     can be removed easily.
> >> >
> >> > What happens if I run 'dog vdi snapshot' on node B while the VM runs on node A?
> >>
> >> The snapshot is created, but the gref of the snapshot can be inconsistent
> >> (depending on timing). But the limitation can be removed easily by
> >> reviving the VDI lock operation.
> >
> > Is it possible to exclude this negative effect from this patch set? The VDI
> > lock operation isn't easy to implement, at least in the near future.
> >
> > So this opens a hole through which a casual user might destroy the
> > consistency of a vdi without noticing, right? Suppose someone runs 'dog vdi
> > snapshot' on a different node, as before; he wouldn't expect side effects.
> >
> > Even if we can't get rid of this limitation in this patchset, we should
> > programmatically warn people who try to do it.
> 
> Removing the limitation from this patchset is hard. But the current
> snapshot feature in the master branch has similar problems, because
> sheepdog doesn't provide a mechanism for serializing updates of inode
> objects.
> 
> This patchset solves the problem partially. If users execute "dog vdi
> snapshot" on a host which runs a QEMU process, updates of inode
> objects are serialized. This cannot be achieved by the original
> snapshot feature. So updating the documentation and warning
> operators about this problem would be a reasonable way.

Okay, I'll review the patch set; it seems that a lot of stuff was added after
my last review. I'll do it ASAP.

> 
> In addition, I discussed the vdi locking feature with Kazutaka-san
> and found a smart way to implement it. It can be achieved in
> the near future.

We have already added lock/unlock to the cluster driver (but corosync doesn't
support it yet). This locking mechanism might help you implement this feature.

Thanks
Yuan