[Sheepdog] [PATCH 0/5] stop grabbing vnode_info references outside the main thread
Christoph Hellwig
hch at infradead.org
Mon May 7 16:52:03 CEST 2012
On Mon, May 07, 2012 at 05:59:23PM +0800, Liu Yuan wrote:
> So we don't have a complete fix for this issue ( doesn't fix it for
> async flush).
It should fix async flush, just not in a nice way.
> I just wonder, does URCU really frighten us away? Yeah,
> probably RCU add some burden to the programmers, but it seems that,
> without it, we also would play dirty tricks to work around it. With this
> patch set, we also restrict programmers to be aware of "Hey, you are not
> allowed to use vnode info directly or by any helper function, you can
> only add extra parameter to point to it". isn't it another restriction?
The other easy alternative is simple refcounting for async flush and
delete - it's way simpler than rcu.
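As a rough illustration of that refcounting alternative, here is a minimal sketch. It assumes every get, put and update runs in the main thread, so the counter needs no atomics or locks. The struct layout, the function names and the nr_freed counter are hypothetical and are not taken from the sheep source.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for sheep's vnode_info; the fields are illustrative. */
struct vnode_info {
	int nr_vnodes;
	int refcnt;	/* plain int: only touched from the main thread here */
};

static struct vnode_info *current_vnode_info;
static int nr_freed;	/* bookkeeping for the sketch only */

static struct vnode_info *alloc_vnode_info(int nr_vnodes)
{
	struct vnode_info *v = calloc(1, sizeof(*v));

	assert(v);
	v->nr_vnodes = nr_vnodes;
	v->refcnt = 1;	/* reference held by the global pointer */
	return v;
}

/* Main thread: take a reference before handing work to a worker. */
static struct vnode_info *get_vnode_info(void)
{
	assert(current_vnode_info);
	current_vnode_info->refcnt++;
	return current_vnode_info;
}

/* Main thread: drop a reference once the work has completed. */
static void put_vnode_info(struct vnode_info *v)
{
	assert(v->refcnt > 0);
	if (--v->refcnt == 0) {
		free(v);
		nr_freed++;
	}
}

/* Main thread: install new vnode info and drop the global's old reference. */
static void update_vnode_info(int nr_vnodes)
{
	struct vnode_info *old = current_vnode_info;

	current_vnode_info = alloc_vnode_info(nr_vnodes);
	if (old)
		put_vnode_info(old);
}
```

An async flush or delete would call get_vnode_info() when it is queued and put_vnode_info() from its completion handler, so the old copy stays valid across a membership update until the last user is done.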
There are two reasons why I don't like using RCU for vnode_info, and only
one of them has anything to do with it being scary.
1) RCU does not fit the problem. What RCU excels at is lock free
lookups in multi-threaded lookup data structures. We have a simple
pointer dereference here, and even without my patchset we can never
grab the first references outside the main thread; it's just
non-obvious without it. In fact that is a pretty fundamental
design point of non-blocking event based architectures like sheep -
anything touching central data structures is done quickly and
non-blocking in the main event loop thread, blocking work is handed
off to workers just to be completed by being reinjected into the
main thread. If this design is carefully followed there is almost
no need for global data structures that are accessed from threads
outside the main thread. In fact the core sheepdog code (minus
farm, object_cache and the accord cluster driver) never uses any
thread synchronization outside the workqueue code, and if followed
carefully and asserted that's a very powerful architecture.
In fact the only smart lock-less algorithm that would fit very
well into this architecture would be a lockless producer/consumer
queue for the workqueues. Given that both sides modify the queue
that's not something RCU would help with at all.
2) Right now we don't have the thread model in sheep spelled out,
documented and verified using assertions. Until that happens
adding additional complexities to the concurrency model like RCU
does not seem like a good idea, if not a little scary.
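To make the idea of an asserted thread model concrete, here is a small sketch using POSIX threads. All names (init_thread_model, is_main_thread, struct work, run_work) are hypothetical and not part of sheep. The pthread_join() call is only a stand-in for the event-loop notification that would reinject the completed work into the main thread; a real event loop would not block there.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_t main_thread;

/* Call once from the main thread at startup. */
static void init_thread_model(void)
{
	main_thread = pthread_self();
}

static bool is_main_thread(void)
{
	return pthread_equal(pthread_self(), main_thread) != 0;
}

/* Hypothetical work item; the fields are illustrative. */
struct work {
	int input;
	int result;
	bool done_in_main;
};

/* Worker side: may block, must not touch central data structures. */
static void *do_work(void *arg)
{
	struct work *w = arg;

	assert(!is_main_thread());
	w->result = w->input * 2;
	return NULL;
}

/* Completion side: runs in the main thread, may touch central state. */
static void work_done(struct work *w)
{
	assert(is_main_thread());
	w->done_in_main = true;
}

/* Main thread: hand the work off, then complete it back in the main thread. */
static void run_work(struct work *w)
{
	pthread_t tid;
	int rc;

	assert(is_main_thread());
	rc = pthread_create(&tid, NULL, do_work, w);
	assert(rc == 0);
	(void)rc;
	/* Stand-in for the event loop picking up the finished work. */
	pthread_join(tid, NULL);
	work_done(w);
}
```

With assertions like these at the top of every function that touches central state, a call from the wrong thread fails immediately instead of turning into a rare race.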