On 06/07/2012 11:50 PM, MORITA Kazutaka wrote:
> The reason we use timeout for socket connections is that, when
> membership change happens, the gateway should retry I/Os with a new
> membership instead of sleeping long time in forward_read/write_obj_req
> with an old membership. If send/recv/poll blocks for a long time in
> the gateway node, timeout happens in the guest OSes, which is what we
> really want to avoid.

So there is a dilemma: if the timeout is not long enough, we will cancel a
valid connection whose other end is just busy; if it is too long, we run
into the guest OS timeouts you describe.

I am considering another approach: let the recovery thread kill those
blocking connections instead of relying on a timeout. What do you think?

Thanks,
Yuan
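
To make the proposed approach concrete, here is a minimal sketch of the idea, written in Python rather than sheepdog's C for brevity. It is not sheepdog code: `blocked_reader`, `kill_connection` and `demo` are hypothetical names, and a local socket pair stands in for a gateway-to-node connection. It assumes Linux-like behaviour, where calling shutdown() on a socket wakes a thread blocked in recv() on that same socket.

```python
import socket
import threading


def blocked_reader(sock, result):
    """Stand-in for a gateway I/O thread stuck in a blocking recv()."""
    try:
        data = sock.recv(4096)  # blocks: the peer never sends anything
        # An empty read means the connection was shut down under us.
        result["outcome"] = "closed" if data == b"" else "data"
    except OSError as exc:
        result["outcome"] = "error: %s" % exc


def kill_connection(sock):
    """What a hypothetical recovery thread might do on a membership change."""
    # shutdown() is used instead of close(): close() from another thread
    # is not a reliable way to wake a thread blocked in recv().
    sock.shutdown(socket.SHUT_RDWR)


def demo():
    # No timeout is set on either socket; the reader blocks indefinitely
    # until someone else intervenes.
    gateway_end, peer_end = socket.socketpair()
    result = {}
    reader = threading.Thread(target=blocked_reader,
                              args=(gateway_end, result))
    reader.start()

    # Simulated membership change: the "recovery thread" (here, the main
    # thread) kills the connection the reader is blocked on.
    kill_connection(gateway_end)

    reader.join(timeout=5)
    woke_up = not reader.is_alive()
    gateway_end.close()
    peer_end.close()
    return woke_up, result.get("outcome")


if __name__ == "__main__":
    print(demo())
```

In this sketch the reader never needs a timeout: it stays blocked as long as the peer is merely busy, and only returns when the connection is explicitly killed, at which point the caller could retry the I/O against the new membership.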