[sheepdog] [PATCH v2 3/3] sheep: remove cache object for discard operation
Liu Yuan
namei.unix at gmail.com
Wed Jul 17 10:08:12 CEST 2013
On Wed, Jul 17, 2013 at 04:46:17PM +0900, Hitoshi Mitake wrote:
> At Tue, 16 Jul 2013 14:09:04 +0800,
> Liu Yuan wrote:
> >
> > On Tue, Jul 16, 2013 at 02:21:25PM +0900, Hitoshi Mitake wrote:
> > > At Tue, 16 Jul 2013 10:36:22 +0800,
> > > Liu Yuan wrote:
> > > >
> > > > On Mon, Jul 15, 2013 at 11:49:38PM +0900, Hitoshi Mitake wrote:
> > > > > At Mon, 15 Jul 2013 14:28:48 +0800,
> > > > > Liu Yuan wrote:
> > > > > >
> > > > > > Signed-off-by: Liu Yuan <namei.unix at gmail.com>
> > > > > > ---
> > > > > > sheep/object_cache.c | 28 ++++++++++++++++++++++++++++
> > > > > > sheep/sheep_priv.h | 1 +
> > > > > > sheep/store.c | 6 ++++++
> > > > > > 3 files changed, 35 insertions(+)
> > > > > >
> > > > > > diff --git a/sheep/object_cache.c b/sheep/object_cache.c
> > > > > > index e3270df..1579a30 100644
> > > > > > --- a/sheep/object_cache.c
> > > > > > +++ b/sheep/object_cache.c
> > > > > > @@ -1317,6 +1317,34 @@ out:
> > > > > > return ret;
> > > > > > }
> > > > > >
> > > > > > +int object_cache_remove(uint64_t oid)
> > > > > > +{
> > > > > > +	/* Inc the entry refcount to exclude the reclaimer */
> > > > > > +	struct object_cache_entry *entry = oid_to_entry(oid);
> > > > > > +	struct object_cache *oc;
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	if (!entry)
> > > > > > +		return SD_RES_NO_OBJ;
> > > > > > +	oc = entry->oc;
> > > > > > +
> > > > > > +	sd_dprintf("%" PRIx64, oid);
> > > > > > +	while (refcount_read(&entry->refcnt) > 1)
> > > > > > +		usleep(100000); /* Object might be in push */
> > > > >
> > > > > I believe this sort of magic-number-based coding should be avoided.
> > > >
> > > > Just #define'ing 100000 as xxx doesn't make anything better. I'm thinking
> > > > of a wait queue later to replace all the busy loops.
> > > >
> > > > > A coroutine would be suitable for writing this type of code cleanly. How
> > > > > about recovering it? I've heard that the previous sheepdog had coroutines
> > > > > but they were reverted because of poor compatibility with valgrind.
> > > > >
> > > > > But I believe coroutines are worth considering, because other code in
> > > > > sheepdog (e.g. switch_journal_file()) would be able to benefit from them.
> > > >
> > > > Coroutines are notorious for:
> > > > 1. bad code readability, very hard to maintain and extend
> > > > 2. not working with gdb, very hard to debug
> > > > 3. limited stack size
> > > >
> > > > They might look smart, but we should completely avoid this kind of trick.
> > > >
> > > > What we really need is a wait queue mechanism; then the above while loop
> > > > can simply be transformed into:
> > > >
> > > > wait_event(queue, refcount_read(&entry->refcnt) <= 1)
> > > >
> > > > wait_event(q, condition) is a much more straightforward scheduling
> > > > mechanism than a coroutine.
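> > > >
> > > > (Roughly, such a wait_event() could be built on pthread condition
> > > > variables. The sketch below is only to show the idea; none of these
> > > > names exist in the tree yet:)
> > > >
> > > > #include <pthread.h>
> > > >
> > > > struct wait_queue {
> > > > 	pthread_mutex_t lock;	/* serializes the condition check */
> > > > 	pthread_cond_t cond;
> > > > };
> > > >
> > > > /* Sleep until 'cond_expr' becomes true, instead of busy polling */
> > > > #define wait_event(wq, cond_expr)				\
> > > > 	do {							\
> > > > 		pthread_mutex_lock(&(wq)->lock);		\
> > > > 		while (!(cond_expr))				\
> > > > 			pthread_cond_wait(&(wq)->cond, &(wq)->lock); \
> > > > 		pthread_mutex_unlock(&(wq)->lock);		\
> > > > 	} while (0)
> > > >
> > > > /* Whoever drops the blocking reference calls this to wake the waiters */
> > > > static inline void wakeup(struct wait_queue *wq)
> > > > {
> > > > 	pthread_mutex_lock(&wq->lock);
> > > > 	pthread_cond_broadcast(&wq->cond);
> > > > 	pthread_mutex_unlock(&wq->lock);
> > > > }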
> > >
> > > OK, I agree with not using coroutines.
> > >
> > > BTW, it seems that object_cache_remove() contains a bug. There is a
> > > possibility that other threads can acquire the removed object cache
> > > between the end of the usleep() loop and write_lock_cache(). We need a
> > > better locking scheme for the object cache.
> > >
> > > (This sequence means a VM issuing a read or write directly after its
> > > remove request. It would be rare, but we can't make any assumptions about
> > > the behaviour of VMs.)
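> > >
> > > (To be concrete, the window I mean is here, assuming write_lock_cache()
> > > is the lock taken right after the loop in the rest of the patch:)
> > >
> > > 	while (refcount_read(&entry->refcnt) > 1)
> > > 		usleep(100000); /* Object might be in push */
> > > 	/*
> > > 	 * Window: another thread can look up the entry and take a new
> > > 	 * reference here, before the cache lock below is acquired.
> > > 	 */
> > > 	write_lock_cache(oc);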
> > >
> >
> > It seems not a bug: we don't return for the discard command yet, so the VM
> > won't proceed with subsequent IO requests, no?
>
> This assumption depends on the behavior of VMs. I agree that such subsequent
> commands would be rare. But it seems worth inserting an assert(), or a
> warning plus an early return, between usleep() and write_lock_cache(),
> because sheep itself doesn't forbid the sequence. That can help us detect
> unexpected behavior.
>
> The other parts look good to me.
We already rely on many such assumptions, e.g., that a read will only be issued
after the corresponding write request has been responded to. Anyway, issuing a
warning looks fine to me as a way to detect insane VM behavior.
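
Something along these lines after taking the cache lock should be enough. This
is only a sketch; the unlock helper and the return code are placeholders for
whatever we actually use there:

	write_lock_cache(oc);
	if (refcount_read(&entry->refcnt) > 1) {
		/* An insane VM issued I/O right after its discard request */
		sd_dprintf("oid %" PRIx64 " got referenced again, bail out", oid);
		unlock_cache(oc);
		return SD_RES_AGAIN;
	}
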
Thanks
Yuan