I understand this is not a failure. But the problem is that large I/O, which causes the "poll timeout", severely degrades performance. In my test, I tried to write a 10GB file (bs=4m), which I expected to complete within 30 minutes even with 100M network bandwidth. But it did not finish within 2 hours (I actually had to kill the task). That is a bit weird.

------------------ Original ------------------
From: "Liu Yuan"<namei.unix at gmail.com>;
Date: Thu, May 23, 2013 01:28 PM
To: "Hongyi Wang"<hongyi at zelin.io>;
Cc: "sheepdog"<sheepdog at lists.wpkg.org>;
Subject: Re: [sheepdog] Some thoughts based on my preliminary I/O test

On 05/23/2013 01:17 PM, Hongyi Wang wrote:
> Update for the last mail:
> In order to reproduce the issue "wait_forward_request(167) poll timeout
> 1", we do our test with 100M switch. We can reproduce the issue every
> time with sequential write on a large file (in our case, 10GB).
> We notice the number of threads reaches ~70 and the network bandwidth
> is saturated (~90Mbps).
>

No, this is not a failure. Just the nodes or network are busy. SD will take care of retry of the requests after poll timeout.

Thanks,
Yuan
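As a rough sanity check on the 30-minute expectation for a 10GB write over a 100M link, the lower bound on transfer time can be estimated as below. This is only a sketch: it assumes 10GB means 10 GiB, that the link is the only bottleneck, and that protocol overhead is ignored. The `copies` parameter is a hypothetical stand-in for however many times the data crosses the network (for example, because of replication); the thread does not state that number.

```python
def transfer_minutes(size_gib: float, link_mbps: float, copies: int = 1) -> float:
    """Lower bound, in minutes, to send size_gib of data `copies` times
    over a link of link_mbps megabits per second (no protocol overhead)."""
    size_bits = size_gib * 1024**3 * 8          # GiB -> bits
    link_bits_per_second = link_mbps * 1_000_000  # Mbps -> bits/s
    seconds = size_bits * copies / link_bits_per_second
    return seconds / 60


if __name__ == "__main__":
    # 100 Mbps is the nominal switch speed; ~90 Mbps is the observed saturation.
    for mbps in (100, 90):
        # `copies` values are illustrative assumptions, not taken from the thread.
        for copies in (1, 2, 3):
            minutes = transfer_minutes(10, mbps, copies)
            print(f"{mbps} Mbps, {copies} copies: {minutes:.1f} min")
```

Under these assumptions a single copy at 100 Mbps takes a little over 14 minutes, so a run that is still unfinished after 2 hours is well beyond what raw bandwidth alone would explain.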