[sheepdog] [PATCH] sheep: fix handling of too old epoch in check_request

Tue May 29 13:45:23 CEST 2012

On 05/29/2012 06:51 PM, Christoph Hellwig wrote:
> On Tue, May 29, 2012 at 06:28:56PM +0800, Liu Yuan wrote:
>>> Fix these issues by opencoding the action we want in check_request.
>>
>>
>> I think we need to revisit
>>
>>         list_for_each_entry_safe(req, n, &pending_list, request_list) {
>>                 list_del(&req->request_list);
>>
>>                 if (check_request(req) < 0)
>>                         continue;
>>                 list_add_tail(&req->request_list, &sys->request_queue);
>>         }
>>
>> which brings nested request manipulation.  It seems to me that if we can
>> remove this check_request(), the world will be less confusing.
> 
> I don't think there's a way around it - by the time we requeue requests
> the cluster state may have changed and we need to redo the checks.
> 
> In fact I suspect when resubmitting requests from the other lists we'll
> need the same as well.
> 
> The real problem was calling ->done from outside the workqueue code,
> which led to all these issues.

I can't tell yet whether skipping check_request() before a request is
resumed would cause any problem.

What about this: when a request is to be resumed, we call queue_request()
instead of process_request_event_queues(), and make queue_request() check the
request for us.
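
A rough sketch of that idea, as a self-contained model rather than the real
sheep code. Everything here is a simplified stand-in and an assumption: the
struct layouts, the fixed-size array in place of the request_queue list, the
req_state values, and the resume_pending() helper are hypothetical, and the
check only models the "epoch too old" case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REQS 16

/* Hypothetical request states, for this sketch only. */
enum req_state { REQ_PENDING, REQ_QUEUED, REQ_FAILED };

/* Simplified stand-in for sheep's struct request. */
struct request {
	uint32_t epoch;
	enum req_state state;
};

/* Simplified stand-in for the global system state. */
struct system_info {
	uint32_t epoch;
	struct request *request_queue[MAX_REQS];
	size_t nr_queued;
};

/*
 * Assumption: a request whose epoch is older than the cluster epoch is
 * failed right here; anything else passes the check.
 */
static int check_request(struct system_info *s, struct request *req)
{
	if (req->epoch < s->epoch) {
		req->state = REQ_FAILED;
		return -1;
	}
	return 0;
}

/* The single entry point: every request is checked before it is queued. */
static int queue_request(struct system_info *s, struct request *req)
{
	assert(s != NULL && req != NULL);
	assert(s->nr_queued < MAX_REQS);

	if (check_request(s, req) < 0)
		return -1;

	s->request_queue[s->nr_queued++] = req;
	req->state = REQ_QUEUED;
	return 0;
}

/*
 * Resuming pending requests goes through queue_request(), so the caller
 * does not run check_request() itself. Returns how many were queued.
 */
static size_t resume_pending(struct system_info *s,
			     struct request **pending, size_t nr)
{
	size_t i, queued = 0;

	for (i = 0; i < nr; i++)
		if (queue_request(s, pending[i]) == 0)
			queued++;
	return queued;
}
```

With this shape the resume path has no check of its own, so the nested
manipulation in the pending_list loop would go away, while the check is still
redone against the current cluster state.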

thanks,

levin