[Sheepdog] [PATCH] fix ORDERED work handling bug

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Mon Apr 19 07:47:26 CEST 2010


At Wed, 14 Apr 2010 13:18:54 +0900,
FUJITA Tomonori wrote:
> 
> On Wed, 14 Apr 2010 13:15:19 +0900
> FUJITA Tomonori <fujita.tomonori at lab.ntt.co.jp> wrote:
> 
> > Fix a bug where a SIMPLE work wrongly passes blocked ORDERED works.
> > 
> > 1. a SIMPLE work is on the pending_list
> > 2. when a new ORDERED work comes, it is added to the blocked_list.
> > 3. then a new SIMPLE work comes, it's wrongly added to the blocked_list (it should be delayed until the above ORDERED work finishes).
> > 
> 
> Should have been:
> 
> 3. then a new SIMPLE work comes, it's wrongly added to the
> pending_list. It will be executed wrongly before the above ORDERED.
> 
> It should be delayed until the above ORDERED work finishes.

If nodes send requests to each other just before the sheepdog node
membership changes, this patch seems to cause a problem.

Here is an example scenario:
1) There are two nodes, A and B, in the sheepdog cluster, and
a VM is running on each node.
2) Each VM sends a write request to its local collie at the same time.
3) Node A forwards the request to node B, and vice versa, at the same
time.
4) Node C joins the cluster.
5) sd_confch adds an ORDERED work to the work_queue on each node. This
work is blocked because the previous SIMPLE work (the local write, which
is still waiting for the other node) has not finished.
6) On each node, collie receives the forwarded write request, but
blocks it because the previous ORDERED work has not finished.
7) Neither node can accept any more requests.
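
The scenario above can be modelled with a small C sketch. This is not
sheepdog's actual work queue code: work_attr, queue_state, can_start,
submit and scenario_deadlocks are hypothetical names, and the rule in
can_start is only one reading of the patched behaviour described in this
thread (an ORDERED work waits for earlier SIMPLE works, and a new SIMPLE
work waits behind any blocked or running ORDERED work).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the patched dispatch rule; not sheepdog code. */
enum work_attr { WORK_SIMPLE, WORK_ORDERED };

struct queue_state {
    size_t running_simple;   /* SIMPLE works started but not finished */
    size_t blocked_simple;   /* SIMPLE works waiting to start */
    size_t blocked_ordered;  /* ORDERED works waiting to start */
    bool   ordered_running;  /* an ORDERED work is currently running */
};

/* Assumed rule: ORDERED waits for all running works; SIMPLE waits
 * behind any ORDERED work that is blocked or running. */
static bool can_start(const struct queue_state *q, enum work_attr attr)
{
    if (attr == WORK_ORDERED)
        return q->running_simple == 0 && !q->ordered_running;
    return q->blocked_ordered == 0 && !q->ordered_running;
}

/* Start the work if the rule allows it, otherwise record it as blocked.
 * Returns true when the work started. */
static bool submit(struct queue_state *q, enum work_attr attr)
{
    if (!can_start(q, attr)) {
        if (attr == WORK_ORDERED)
            q->blocked_ordered++;
        else
            q->blocked_simple++;
        return false;
    }
    if (attr == WORK_ORDERED)
        q->ordered_running = true;
    else
        q->running_simple++;
    return true;
}

/* Replays the scenario on nodes A and B. Returns true when, on both
 * nodes, the membership work and the forwarded write are both blocked. */
static bool scenario_deadlocks(void)
{
    struct queue_state a = {0}, b = {0};

    /* Local writes start; each stays running until the peer has
     * processed the forwarded request. */
    submit(&a, WORK_SIMPLE);
    submit(&b, WORK_SIMPLE);

    /* Node C joins: the membership change is queued as ORDERED. */
    bool a_confch = submit(&a, WORK_ORDERED);
    bool b_confch = submit(&b, WORK_ORDERED);

    /* The forwarded writes arrive as new SIMPLE works. */
    bool a_fwd = submit(&a, WORK_SIMPLE);
    bool b_fwd = submit(&b, WORK_SIMPLE);

    return !a_confch && !b_confch && !a_fwd && !b_fwd;
}
```

Under these assumptions scenario_deadlocks() returns true: the local
write on each node can only finish once the peer runs the forwarded
write, but the forwarded write is queued behind an ORDERED work that is
itself waiting for the local write.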

I guess that if a worker thread calls __sd_confch with the ORDERED
attribute, we must assume that collie may process SIMPLE work before the
preceding ORDERED work finishes.
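
As a rough illustration of that idea, here is a C sketch of a relaxed
rule in which a SIMPLE work only waits for an ORDERED work that is
actually running, not for one that is still blocked. This is just one
possible reading of the suggestion; work_attr, queue_state,
can_start_relaxed and relaxed_scenario_drains are hypothetical names and
do not come from the sheepdog source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a relaxed dispatch rule; not sheepdog code. */
enum work_attr { WORK_SIMPLE, WORK_ORDERED };

struct queue_state {
    size_t running_simple;   /* SIMPLE works started but not finished */
    bool   ordered_waiting;  /* an ORDERED work is queued but not started */
    bool   ordered_running;  /* an ORDERED work is currently running */
};

/* Assumed relaxed rule: ORDERED still waits for all running works, but
 * SIMPLE may pass an ORDERED work that has not started yet. */
static bool can_start_relaxed(const struct queue_state *q, enum work_attr attr)
{
    if (attr == WORK_ORDERED)
        return q->running_simple == 0 && !q->ordered_running;
    return !q->ordered_running;
}

/* Replays the two-node scenario under the relaxed rule. Returns true
 * when the forwarded writes can run and the ORDERED work can then start. */
static bool relaxed_scenario_drains(void)
{
    struct queue_state a = {0}, b = {0};

    /* A local write is in flight on each node. */
    a.running_simple = 1;
    b.running_simple = 1;

    /* The membership change arrives and has to wait on both nodes. */
    a.ordered_waiting = !can_start_relaxed(&a, WORK_ORDERED);
    b.ordered_waiting = !can_start_relaxed(&b, WORK_ORDERED);

    /* The forwarded writes arrive; under the relaxed rule they may start. */
    if (!can_start_relaxed(&a, WORK_SIMPLE) || !can_start_relaxed(&b, WORK_SIMPLE))
        return false;

    /* Each forwarded write completes, which also lets the peer's local
     * write complete, so no SIMPLE work remains running. */
    a.running_simple = 0;
    b.running_simple = 0;

    /* The waiting ORDERED work can now start on both nodes. */
    return a.ordered_waiting && b.ordered_waiting
        && can_start_relaxed(&a, WORK_ORDERED)
        && can_start_relaxed(&b, WORK_ORDERED);
}
```

In this model relaxed_scenario_drains() returns true, so the two nodes
make progress. The trade-off is that the ORDERED work no longer acts as
a strict barrier for SIMPLE works submitted after it.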

Regards,

Kazutaka Morita


