[sheepdog] [PATCH 1/2] test: add a test for sockfd keepalive
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Mon Sep 3 16:11:36 CEST 2012
At Mon, 03 Sep 2012 23:07:34 +0900,
MORITA Kazutaka wrote:
>
> At Mon, 03 Sep 2012 21:30:09 +0800,
> Liu Yuan wrote:
> >
> > On 09/03/2012 08:24 PM, MORITA Kazutaka wrote:
> > > No. The reason I doubt keepalive is that, when the trouble happens,
> > > the script always takes 15 minutes. My guess is that the connection is
> > > closed by another timeout, but I'm not sure. That is why I wrote 'perhaps'.
> > >
> > >> >
> > >> > I am not sure, but I think the current keepalive implementation looks okay to me; it is simple
> > >> > and efficient. I have tested it in various situations besides this script. If there is any
> > >> > problem inside the code, I'd like to fix the bug instead of running away from it completely.
> > > Okay, but in the future, removing TCP keepalive would be worth considering.
> > > Checking node availability is the job of the cluster driver.
> >
> > All of the hangs are suspected to be governed by the RTO rather than the keepalive timer. Could you
> > please tell me where the thread hangs?
>
> It waits for a response from the unreachable node at poll() in
> wait_forward_request(). I'm not sure why it returns after keepalive
s/it returns/it doesn't return/
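
For reference, a minimal sketch of the socket options involved in the
discussion above. This is not sheepdog's code: the helper name
sketch_set_keepalive() and all of its parameters are hypothetical. It
assumes Linux, where keepalive probes are sent only on an idle connection;
once unacknowledged data is outstanding, the retransmission timer (RTO)
applies instead, and with the default net.ipv4.tcp_retries2 = 15 that gives
up after roughly 15 minutes. Whether that is what happens here is not
confirmed. TCP_USER_TIMEOUT is shown only as one possible way to bound that
wait, where the headers provide it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not taken from sheepdog): enable TCP keepalive on
 * fd and optionally bound how long unacknowledged data may stay pending.
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
static int sketch_set_keepalive(int fd, int idle_s, int intvl_s, int cnt,
				unsigned int user_timeout_ms)
{
	int on = 1;

	/* Turn keepalive probing on for this socket. */
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	/* Idle seconds before the first probe is sent. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s,
		       sizeof(idle_s)) < 0)
		return -1;
	/* Seconds between successive probes. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s,
		       sizeof(intvl_s)) < 0)
		return -1;
	/* Unanswered probes before the connection is dropped. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
		return -1;
#ifdef TCP_USER_TIMEOUT
	/* Assumption: cap the time transmitted data may remain unacked. */
	if (user_timeout_ms > 0 &&
	    setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &user_timeout_ms,
		       sizeof(user_timeout_ms)) < 0)
		return -1;
#endif
	return 0;
}
```

With values such as idle_s = 1, intvl_s = 1, cnt = 3, an idle dead peer
would be detected within a few seconds, while a peer that dies with a
request in flight would still be subject to the retransmission timeout
unless user_timeout_ms is set.
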
Thanks,
Kazutaka