[Sheepdog] [PATCH 1/2] sheep: sheep: handle node change event first

Sun Apr 1 05:41:44 CEST 2012

At Sat, 31 Mar 2012 18:31:00 +0800,
Liu Yuan wrote:
> 
> On 03/31/2012 06:23 PM, MORITA Kazutaka wrote:
> 
> > Many bad effects.  For example, imagine that join messages are
> > processed in the different order with other nodes.
> 
> 
> Maybe not. I notice that every call to start_cpg_event_work() will drain
> the cpg queue, So this change will assure us that confchg will be
> handled for sure, despite of other requests.

No, membership change events are blocked until all outstanding I/O
requests are flushed or the previous membership change event has
finished.  So there are cases where the cpg queue is still not empty
after start_cpg_event_work() has been called.

> 
> We both do DD in guests and do a loop for creating new vdi and deleting
> that vdi during the join/leave test.
> 
> All seems good so far... look the sequence for joining 60 nodes

This is a timing problem.  Even if it does not show up in your test,
I think it would happen in other environments.

Let's take another approach.  Here is my suggestion:

 - Use different queues for I/O requests and membership events.
 - When the membership queue is empty, we can process I/O requests as
   usual.
 - When the membership queue is not empty, flush all outstanding I/Os.
   New I/O requests are blocked until the membership queue becomes
   empty.
 - SD_OP_SHUTDOWN and SD_OP_MAKE_FS should be pushed to the membership
   queue, and other operations are pushed to the I/O request queue.
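
The rules above could be sketched roughly as follows.  This is only an
illustration under assumptions: every name in it (struct dispatcher,
submit(), next_runnable(), complete(), the OP_* values) is made up for
the example and is not taken from the sheep source.  OP_SHUTDOWN and
OP_MAKE_FS merely stand in for SD_OP_SHUTDOWN and SD_OP_MAKE_FS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes; stand-ins for the real SD_OP_* values. */
enum op { OP_READ, OP_WRITE, OP_JOIN, OP_LEAVE, OP_SHUTDOWN, OP_MAKE_FS };

struct event {
	enum op op;
	int id;
	struct event *next;
};

struct fifo {
	struct event *head, *tail;
};

struct dispatcher {
	struct fifo membership_q;
	struct fifo io_q;
	int outstanding_io;      /* I/O requests started but not yet completed */
	bool membership_running; /* a membership event is being processed */
};

static bool is_membership_op(enum op op)
{
	return op == OP_JOIN || op == OP_LEAVE ||
	       op == OP_SHUTDOWN || op == OP_MAKE_FS;
}

static void fifo_push(struct fifo *q, struct event *ev)
{
	ev->next = NULL;
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
}

static struct event *fifo_pop(struct fifo *q)
{
	struct event *ev = q->head;

	if (ev) {
		q->head = ev->next;
		if (!q->head)
			q->tail = NULL;
		ev->next = NULL;
	}
	return ev;
}

/* Route an event to the membership queue or the I/O request queue. */
void submit(struct dispatcher *d, struct event *ev)
{
	fifo_push(is_membership_op(ev->op) ? &d->membership_q : &d->io_q, ev);
}

/* Return the next event that may start now, or NULL if none may. */
struct event *next_runnable(struct dispatcher *d)
{
	struct event *ev;

	if (d->membership_running)
		return NULL;	/* one membership event at a time */

	if (d->membership_q.head) {
		/* New I/O is blocked; wait until in-flight I/O has drained. */
		if (d->outstanding_io > 0)
			return NULL;
		d->membership_running = true;
		return fifo_pop(&d->membership_q);
	}

	ev = fifo_pop(&d->io_q);
	if (ev)
		d->outstanding_io++;
	return ev;
}

/* Called when an event returned by next_runnable() has finished. */
void complete(struct dispatcher *d, struct event *ev)
{
	if (is_membership_op(ev->op)) {
		d->membership_running = false;
	} else {
		assert(d->outstanding_io > 0);
		d->outstanding_io--;
	}
}
```

With this shape, a join that arrives while a write is in flight is
held back until the write completes, and a read submitted after the
join is not started until the join has finished, so every node
handles the membership change at a point where no I/O is outstanding.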

Thanks,

Kazutaka