[sheepdog] [PATCH 1/2] test: add a test for sockfd keepalive

Liu Yuan namei.unix at gmail.com
Mon Sep 3 18:45:21 CEST 2012


On 09/03/2012 11:52 PM, MORITA Kazutaka wrote:
> At Mon, 03 Sep 2012 22:41:24 +0800,
> Liu Yuan wrote:
>>
>> On 09/03/2012 10:07 PM, MORITA Kazutaka wrote:
>>> It waits for a response from the unreachable node at poll() in
>>> wait_forward_request().  I'm not sure why it returns after keepalive
>>> timeout...
>>
>> I met this problem too. But it is quite rare, and I think we need to look at how poll
>> works inside the kernel to find a fix. I have tested keepalive with poll, and keepalive
>> does take effect in my tests. I guess in some corner cases of poll, keepalive doesn't take effect.
> 
> Another approach:
>  - set poll timeout, SO_SNDTIMEO, and SO_RCVTIMEO as we did before,
>    but return SD_RES_NETWORK_ERROR only if epoch is incremented after
>    timeout.
>  - call connect(2) in nonblocking mode, and wait for connect completion
>    with poll to avoid a connect timeout problem.
> 
> This may be easier than digging into kernel code.
> 
> Thanks,
> 
> Kazutaka
> 

Well, reading your email gave me a flash of a possible solution for poll() waiting on RTO
timers. I once met a silly sendmsg() behavior, fixed by e8c4069, where under some conditions the
keepalive timer didn't fire and the RTO timer fired instead. I am afraid this kind of thing can
only be found by reading the kernel code.
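
For reference, this is roughly how keepalive is enabled on a TCP socket on Linux.
It is a minimal sketch, not the sheepdog code: the helper name set_keepalive() and
its parameters are hypothetical, and the TCP_KEEP* options are Linux-specific.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of sheepdog): turn on TCP keepalive.
 *   idle_sec  - idle time before the first probe is sent
 *   intvl_sec - interval between probes
 *   probes    - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
static int set_keepalive(int fd, int idle_sec, int intvl_sec, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_sec, sizeof(idle_sec)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &intvl_sec, sizeof(intvl_sec)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

These options only bound how long an idle connection goes undetected. When data
is still unacknowledged, the retransmission timer may be the one that fires, as
in the sendmsg() case above.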

So I guess poll(), which reads from the connection, will show similar behavior. I'll try to fix it.

Thanks,
Yuan


