<div>I understand this is not a failure. The problem is that large I/O triggers "poll timeout", and that hugely degrades performance. In my test, I tried to write a 10<span style="line-height: 1.5;">GB file (bs=4m), which I would expect to complete within 30 minutes even on a 100 Mbit/s network. But it had not finished after 2 hours (I actually had to kill the task). That is a bit weird.</span></div><div><span style="line-height: 1.5;"> </span></div><div><tincludetail><div style="font:Verdana normal 14px;color:#000;"><div style="FONT-SIZE: 12px;FONT-FAMILY: Arial Narrow;padding:2px 0 2px 0;">------------------ Original ------------------</div><div style="FONT-SIZE: 12px;background:#efefef;padding:8px;"><div id="menu_sender"><b>From: </b> "Liu Yuan"<namei.unix@gmail.com>;</div><div><b>Date: </b> Thu, May 23, 2013 01:28 PM</div><div><b>To: </b> "Hongyi Wang"<hongyi@zelin.io>; <wbr></div><div><b>Cc: </b> "sheepdog"<sheepdog@lists.wpkg.org>; <wbr></div><div><b>Subject: </b> Re: [sheepdog] Some thoughts based on my preliminary I/O test</div></div><div> </div>On 05/23/2013 01:17 PM, Hongyi Wang wrote:<br>> Update for the last mail:<br>> In order to reproduce the issue "wait_forward_request(167) poll timeout<br>> 1", we do our test with 100M switch. We can reproduce the issue every<br>> time with sequential write on a large file (in our case, 10GB).<br>>  We notice the number of threads reaches ~70 and the network bandwidth<br>> is saturated (~90Mbps).<br>> <br><br>No, this is not a failure. Just the nodes or network are busy. SD will<br>take care of retry of the requests after poll timeout.<br><br>Thanks,<br>Yuan<br></div></tincludetail></div>