At Mon, 04 Jun 2012 14:12:10 +0800,
Liu Yuan wrote:
>
> On 06/04/2012 02:04 PM, Liu Yuan wrote:
> > The current object_cache_pull() causes the bug below:
> > ...
> > do_gateway_request(288) 2, 80d6d76e00000000 , 1
> > Jun 04 10:16:37 connect_to(241) 2126, 10.232.134.3:7000
> > Jun 04 10:16:37 client_handler(747) closed connection 2116
> > Jun 04 10:16:37 destroy_client(678) connection from: 127.0.0.1:60214
> > Jun 04 10:16:37 listen_handler(797) accepted a new connection: 2116
> > Jun 04 10:16:37 client_rx_handler(586) connection from: 127.0.0.1:60216
> > Jun 04 10:16:37 queue_request(385) 2
> > Jun 04 10:16:37 do_gateway_request(288) 2, 80d6d76e00000000 , 1
> > Jun 04 10:16:37 do_gateway_request(308) failed: 2, 80d6d76e00000000 , 1, 54014b01
> > ...
> >
> > This is because we use forward_read_obj_req(), which tries to
> > multiplex a socket FD if concurrent requests access the same object
> > and are unfortunately routed to the same node.
> >
> > The object cache is under heavy pressure from concurrent requests
> > accessing the same COW object from cloned VMs, so this problem
> > emerges. It looks to me that, besides the object cache, QEMU requests
> > are also subject to this problem, because QEMU's sheepdog block layer
> > can issue multiple requests in one go.
> >
>
> The alternative fix is to write a new fd cache, which allows multiple
> FDs to the same node. This looks like a better fix that sorts out all
> the related problems.

Can you explain in more detail how the current fd cache causes the above
problem when there are concurrent accesses to the same node?

Thanks,

Kazutaka