[Sheepdog] [PATCH 0/5] stop grabbing vnode_info references outside the main thread

Christoph Hellwig hch at infradead.org
Mon May 7 16:52:03 CEST 2012


On Mon, May 07, 2012 at 05:59:23PM +0800, Liu Yuan wrote:
> So we don't have a complete fix for this issue ( doesn't fix it for
> async flush).

It should fix async flush, just not in a nice way.

> I just wonder, does URCU really frighten us away? Yeah,
> probably RCU add some burden to the programmers, but it seems that,
> without it, we also would play dirty tricks to work around it. With this
> patch set, we also restrict programmers to be aware of "Hey, you are not
> allowed to use vnode info directly or by any helper function, you can
> only add extra parameter to point to it". isn't it another restriction?


The other easy alternative is simple refcounting for async flush and
delete - it's way simpler than RCU.
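
To make the refcounting idea concrete, here is a minimal sketch.  It is
not the actual sheep code: the struct fields, the helper names and the
use of C11 atomics are all assumptions made for illustration.  The main
thread owns the global pointer and takes a reference before handing work
to a worker; whoever drops the last reference frees the structure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for sheep's vnode_info; fields are illustrative. */
struct vnode_info {
	atomic_int refcnt;
	int nr_vnodes;
};

/* Only ever read or written from the main thread. */
static struct vnode_info *current_vnode_info;

static struct vnode_info *alloc_vnode_info(int nr_vnodes)
{
	struct vnode_info *vi = calloc(1, sizeof(*vi));

	if (!vi)
		abort();
	atomic_init(&vi->refcnt, 1);	/* reference held by the global pointer */
	vi->nr_vnodes = nr_vnodes;
	return vi;
}

/* Main thread only: take a reference to pass along with async work. */
static struct vnode_info *get_vnode_info(void)
{
	atomic_fetch_add(&current_vnode_info->refcnt, 1);
	return current_vnode_info;
}

/* Any thread: drop a reference.  Returns 1 if this call freed the object. */
static int put_vnode_info(struct vnode_info *vi)
{
	int old = atomic_fetch_sub(&vi->refcnt, 1);

	assert(old > 0);
	if (old == 1) {
		free(vi);
		return 1;
	}
	return 0;
}

/* Main thread only: install new info and drop the global reference to the old one. */
static int update_vnode_info(int nr_vnodes)
{
	struct vnode_info *old = current_vnode_info;

	current_vnode_info = alloc_vnode_info(nr_vnodes);
	return old ? put_vnode_info(old) : 0;
}
```

With this, an async flush or delete keeps using the vnode_info it was
started with even if the membership changes in the meantime, and the old
copy goes away once the last such user calls put_vnode_info().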

There are two reasons why I don't like using RCU for vnode_info, and
only one of them has anything to do with it being scary.

 1) RCU does not fit the problem.  What RCU excels at is lock-free
    lookups in data structures shared between threads.  We have a
    simple pointer dereference here, and even without my patchset we
    can never grab the first reference outside the main thread; it's
    just non-obvious without it.  In fact that is a pretty fundamental
    design point of non-blocking event-based architectures like sheep:
    anything touching central data structures is done quickly and
    without blocking in the main event loop thread, and blocking work
    is handed off to workers only to be completed by being reinjected
    into the main thread.  If this design is carefully followed there
    is almost no need for global data structures that are accessed from
    threads outside the main thread.  In fact the core sheepdog code
    (minus farm, object_cache and the accord cluster driver) never uses
    any thread synchronization outside the workqueue code, and if
    followed carefully and asserted that's a very powerful architecture.
    In fact the only smart lock-less algorithm that would fit very
    well into this architecture would be a lockless producer/consumer
    queue for the workqueues.  Given that both sides modify the queue
    that's not something RCU would help with at all.

 2) Right now we don't have the thread model in sheep spelled out,
    documented and verified using assertions.  Until that happens,
    adding additional complexity to the concurrency model, as RCU
    does, does not seem like a good idea, if not a little scary.
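
To illustrate the model from 1) and the kind of assertion meant in 2),
here is a rough sketch.  All names (init_main_thread, queue_work,
run_finished_work, flush_work and so on) are made up for illustration
and are not the actual sheep workqueue code.  It also spawns one thread
per work item for brevity; a real implementation would use a worker
pool and wake the event loop through an eventfd or pipe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_t main_thread;

/* Call once at startup, from the thread that runs the event loop. */
static void init_main_thread(void)
{
	main_thread = pthread_self();
}

static int is_main_thread(void)
{
	return pthread_equal(pthread_self(), main_thread);
}

struct work {
	void (*fn)(struct work *);	/* blocking part, runs in a worker */
	void (*done)(struct work *);	/* completion, runs in the main thread */
	struct work *next;
};

/* The only state shared between threads, and the only lock. */
static pthread_mutex_t finished_lock = PTHREAD_MUTEX_INITIALIZER;
static struct work *finished_list;

static void *worker_routine(void *arg)
{
	struct work *w = arg;

	assert(!is_main_thread());
	w->fn(w);

	/* Reinject the finished work into the main thread. */
	pthread_mutex_lock(&finished_lock);
	w->next = finished_list;
	finished_list = w;
	pthread_mutex_unlock(&finished_lock);
	return NULL;
}

/* Main thread only: hand blocking work off to a worker. */
static pthread_t queue_work(struct work *w)
{
	pthread_t tid;

	assert(is_main_thread());
	if (pthread_create(&tid, NULL, worker_routine, w) != 0)
		abort();
	return tid;
}

/* Main thread only: run completions.  Returns the number of items completed. */
static int run_finished_work(void)
{
	struct work *w, *next;
	int n = 0;

	assert(is_main_thread());
	pthread_mutex_lock(&finished_lock);
	w = finished_list;
	finished_list = NULL;
	pthread_mutex_unlock(&finished_lock);

	for (; w; w = next) {
		next = w->next;
		w->done(w);
		n++;
	}
	return n;
}

/* Example user: a hypothetical async flush. */
struct flush_work {
	struct work work;	/* first member, so the cast below is valid */
	int flushed;
	int completed;
};

static void flush_fn(struct work *w)
{
	((struct flush_work *)w)->flushed = 1;	/* stands in for blocking I/O */
}

static void flush_done(struct work *w)
{
	assert(is_main_thread());	/* safe to touch central data structures here */
	((struct flush_work *)w)->completed = 1;
}
```

The point of the assertions is that any access to central state from
the wrong thread trips immediately instead of turning into a rare race.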


