[sheepdog] [PATCH 1/2] test: add a test for sockfd keepalive
Liu Yuan
namei.unix at gmail.com
Mon Sep 3 18:45:21 CEST 2012
On 09/03/2012 11:52 PM, MORITA Kazutaka wrote:
> At Mon, 03 Sep 2012 22:41:24 +0800,
> Liu Yuan wrote:
>>
>> On 09/03/2012 10:07 PM, MORITA Kazutaka wrote:
>>> It waits for a response from the unreachable node at poll() in
>>> wait_forward_request(). I'm not sure why it returns after keepalive
>>> timeout...
>>
>> I have hit this problem too, but it is quite rare, and I think we need to look at how poll
>> works inside the kernel to fix it. I have tested keepalive with poll, and keepalive
>> does take effect in my tests. I guess that in some corner cases of poll, keepalive doesn't take effect.
>
> Another approach:
> - set poll timeout, SO_SNDTIMEO, and SO_RCVTIMEO as we did before,
> but return SD_RES_NETWORK_ERROR only if epoch is incremented after
> timeout.
> - call connect(2) in nonblocking mode, and wait for connect completion
> with poll to avoid a connect timeout problem.
>
> This may be easier than digging into kernel code.
>
> Thanks,
>
> Kazutaka
>
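For reference, the nonblocking connect from the second point could be sketched roughly as below. This is only an illustration under my own assumptions: connect_with_timeout() is a hypothetical helper, not existing sheepdog code, and it handles IPv4 only.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not sheepdog code): connect to ip:port and wait at
 * most timeout_ms for the connection to complete.
 * Returns a connected fd (left in nonblocking mode) or -1 with errno set.
 */
static int connect_with_timeout(const char *ip, uint16_t port, int timeout_ms)
{
	struct sockaddr_in addr;
	struct pollfd pfd;
	int fd, flags, ret, err = 0;
	socklen_t len = sizeof(err);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
		errno = EINVAL;
		return -1;
	}

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Switch to nonblocking so connect() returns immediately. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		goto fail;

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
		return fd; /* connected at once (possible on loopback) */
	if (errno != EINPROGRESS)
		goto fail;

	/* The socket becomes writable when the connect attempt finishes. */
	pfd.fd = fd;
	pfd.events = POLLOUT;
	do {
		/* Note: a retry after EINTR restarts the full timeout. */
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);
	if (ret == 0)
		errno = ETIMEDOUT;
	if (ret <= 0)
		goto fail;

	/* Writable does not mean success: fetch the real result. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		goto fail;
	if (err) {
		errno = err;
		goto fail;
	}
	return fd;

fail:
	err = errno; /* preserve errno across close() */
	close(fd);
	errno = err;
	return -1;
}
```

With this, the connect timeout is bounded by the caller rather than by the kernel's SYN retry schedule. The caller would still need to decide whether to switch the fd back to blocking mode afterwards.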
Well, reading your email gave me a flash of an idea for a possible fix for poll() waiting on the RTO
timer. I previously hit a silly sendmsg() behavior, fixed by e8c4069, where under some conditions the
keepalive timer did not fire and the RTO timer fired instead. I am afraid this kind of thing can only
be found by reading the kernel code.
So I guess poll() on a connection we are reading from will behave similarly. I'll try to fix it.
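For context on the keepalive timer discussed here, this is a minimal sketch of how keepalive is typically enabled on a Linux TCP socket. set_keepalive() and its parameters are hypothetical names for illustration, not sheepdog's actual code or settings, and the TCP_KEEP* options are Linux-specific.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not sheepdog code): enable TCP keepalive on fd.
 *   idle_s  - seconds of idleness before the first probe is sent
 *   intvl_s - seconds between unanswered probes
 *   cnt     - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
		return -1;
	return 0;
}
```

With these values an idle dead peer should be detected after roughly idle_s + intvl_s * cnt seconds. As far as I understand, that bound only holds while the connection is idle; once there is unacknowledged data in flight, the retransmission (RTO) timer governs instead, which matches the sendmsg() case above.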
Thanks,
Yuan