[sheepdog] [PATCH 2/2] sheep: remove timeout for socket pool

Fri Jun 8 07:26:42 CEST 2012

On 06/08/2012 12:06 AM, MORITA Kazutaka wrote:

> At Fri, 08 Jun 2012 00:01:20 +0800,
> Liu Yuan wrote:
>>
>> On 06/07/2012 11:50 PM, MORITA Kazutaka wrote:
>>
>>> The reason we use a timeout for socket connections is that, when a
>>> membership change happens, the gateway should retry I/Os with the new
>>> membership instead of sleeping for a long time in
>>> forward_read/write_obj_req with the old membership.  If send/recv/poll
>>> blocks for a long time in the gateway node, timeouts happen in the
>>> guest OSes, which is what we really want to avoid.
>>
>>
>> So there is a dilemma: if the timeout is not long enough, we will
>> cancel a valid connection whose other end is just busy.
>>
>> I am considering another approach that lets the recovery thread kill
>> those blocking connections instead of using a timeout. How about it?
> 
> Looks like a good approach to me :)
> 

I have tried the 'kill' idea, but found it more difficult than
necessary, so I switched to keepalive, which is considerably simpler.
I'll draft a new patch based on keepalive (timeout).
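
For illustration only, here is a minimal sketch of what enabling keepalive
on a pooled socket could look like, assuming Linux TCP socket options. The
helper name set_keepalive() and its parameters are hypothetical and are not
taken from the actual patch.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the sheepdog tree): enable TCP keepalive
 * on a socket so the kernel probes an idle peer.
 *
 *   idle  - seconds of idleness before the first probe is sent
 *   intvl - seconds between successive probes
 *   cnt   - unanswered probes allowed before the connection is dropped
 *
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 * TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific.
 */
static int set_keepalive(int fd, int idle, int intvl, int cnt)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
		return -1;

	return 0;
}
```

With, say, set_keepalive(fd, 1, 1, 3), a peer that has gone away should be
noticed after roughly idle + intvl * cnt seconds, at which point a blocked
recv returns an error. A peer that is merely busy still has its kernel
answering the probes, so its connection is not cancelled.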

Thanks,
Yuan