[Sheepdog] [RFC PATCH] sheep: add client side timeout support for socket
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Sat Nov 26 19:10:39 CET 2011
At Sat, 26 Nov 2011 19:06:18 +0800,
Liu Yuan wrote:
>
> On 11/26/2011 05:53 PM, Yibin Shen wrote:
>
> > oops, I found a regression with this patch
> >
> > Nov 26 11:19:13 store_queue_request(936) 3, 3, 412ca6000022d8 , 10
> > Nov 26 11:19:13 forward_write_obj_req(368) 412ca6000022d8
> > Nov 26 11:19:13 store_queue_request_local(843) 3, 412ca6000022d8 , 10
> > Nov 26 11:19:43 store_queue_request(967) failed: 3, 3, 412ca6000022d8 , 10, 3
> > Nov 26 11:19:43 io_op_done(147) leaving sheepdog cluster
> > Nov 26 11:19:43 sd_leave_handler(1291) network partition bug: this sheep should have exited
> > Nov 26 11:19:43 log_sigsegv(358) logger pid 9654 exiting abnormally
> >
> >
> > e.g.: if an object has 3 copies and is hashed to (local, node A, node B),
> > then in a write operation, if node A leaves the cluster, I/O towards
> > node A will time out after 30 sec.
> > But we use a strong consistency model, so the return value of
> > store_queue_request will be set to SD_RES_EIO,
> > and then the io_op_done function (sdnet.c) will call leave_cluster.
> >
> > 144 } else if (is_access_local(req->entry, req->nr_vnodes,
> > 145            ((struct sd_obj_req *)&req->rq)->oid, copies) &&
> > 146            req->rp.result == SD_RES_EIO) {
> > 147         eprintf("leaving sheepdog cluster\n");
> > 148         leave_cluster();
> >
> > IMO, maybe we should:
> > 1) split store_queue_request() into multiple works.
> > 2) replace strong consistency with eventual consistency or causal consistency.
> >
> > any comments?
> >
> > thanks
>
>
> I think it is not the time to introduce other consistency models which
> bring in much complexity.
>
> Whatever consistency model you use, you still need to handle EIO. IMO,
It is completely wrong to set SD_RES_EIO when a timeout occurs, because
that error means a disk I/O error. We must set SD_RES_NETWORK_ERROR in
this case so that the request will be retried after the epoch is updated.
But I guess it is better to enable TCP keepalive. If we use it, the
connection will be closed automatically after the timeout, so we don't
need to change the network I/O code at all.
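
For illustration, a minimal sketch of enabling keepalive on a socket could
look like the following. The helper name enable_keepalive() and its
parameters are hypothetical and not taken from the sheep source. It assumes
Linux, where TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are available.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of sheep): turn on TCP keepalive for a
 * socket and tune the probe timing. TCP_KEEPIDLE, TCP_KEEPINTVL and
 * TCP_KEEPCNT are Linux-specific socket options.
 *
 * idle_sec:  idle time before the first probe is sent
 * intvl_sec: interval between unanswered probes
 * probes:    number of unanswered probes before the connection is dropped
 *
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
int enable_keepalive(int fd, int idle_sec, int intvl_sec, int probes)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
		       &idle_sec, sizeof(idle_sec)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
		       &intvl_sec, sizeof(intvl_sec)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
		       &probes, sizeof(probes)) < 0)
		return -1;

	return 0;
}
```

With example values such as enable_keepalive(fd, 30, 5, 3), an idle
connection to a dead peer would be dropped roughly 30 + 5 * 3 = 45 seconds
after the last traffic, and the error is then reported on the socket.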
Thanks,
Kazutaka
> you could handle EIO even with the current strong model. In this case,
> A is gone, so you could
>
> 1) time out the write
> 2) wait for the cluster to recover (get a new hash)
> 3) do the write again.
>
> The newest HEAD has already removed the lines that make sheep
> panic out in the error case. So currently, EIO will leave the node as a
> gateway for VMs. This is an acceptable compromise.
>
> Thanks,
> Yuan
>