[sheepdog] Some thoughts based on my preliminary I/O test
Hongyi Wang
h at zelin.io
Wed May 22 19:26:43 CEST 2013
Hi,
I have been running some I/O tests on a 4-node cluster these days, using the fio micro-benchmark. It works well and the I/O performance is reasonably good, except in the following two scenarios:
1. Recovery
I noticed that recovery starts as soon as any node joins or leaves the cluster. I understand that triggering recovery as early as possible makes the data more reliable, since replicas are reconstructed sooner. But recovery consumes a huge amount of both disk and network bandwidth, which heavily hurts guest VM I/O.
2. Intensive I/O
A VM performing large I/O fails. In my case, I use fio to generate a big file (size=10g) and then read and write it randomly (bs=4m). The VM hangs, and on its host machine sheep.log shows:
May 23 07:22:07 [gway 28664] wait_forward_request(167) poll timeout 1
May 23 07:22:08 [gway 28666] wait_forward_request(167) poll timeout 1
May 23 07:22:09 [gway 28668] wait_forward_request(167) poll timeout 1
May 23 07:22:09 [gway 28669] wait_forward_request(167) poll timeout 1
......
But if I try a smaller file size, say 128m, no poll timeout messages are logged.
For 1, I think two better policies could be applied: a) allocate a fixed amount of bandwidth to recovery so that it cannot consume the whole disk and network; b) start recovery lazily, which prevents thrashing when nodes join or leave frequently. A rough sketch of (a) is below.
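
To make (a) concrete, here is a minimal token-bucket sketch in C. The names (recovery_throttle, RECOVERY_BYTES_PER_SEC) are hypothetical and not sheepdog's actual API; the idea is that the recovery path would call recovery_throttle(len) before copying each object, so recovery can never exceed the configured rate:

/* Token-bucket throttle for recovery traffic (illustrative only).
 * recovery_throttle() and RECOVERY_BYTES_PER_SEC are hypothetical
 * names; the recovery path would call recovery_throttle(len) before
 * reading/writing each object it reconstructs. */
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define RECOVERY_BYTES_PER_SEC (20 * 1024 * 1024)  /* e.g. cap at 20 MB/s */

static uint64_t tokens;          /* bytes we may copy right now      */
static struct timespec last;     /* last time the bucket was refilled */

static void recovery_throttle(uint64_t len)
{
        struct timespec now;

        for (;;) {
                clock_gettime(CLOCK_MONOTONIC, &now);

                /* refill tokens in proportion to the elapsed time */
                uint64_t elapsed_ns = (now.tv_sec - last.tv_sec) * 1000000000ULL
                                      + (now.tv_nsec - last.tv_nsec);
                tokens += elapsed_ns * RECOVERY_BYTES_PER_SEC / 1000000000ULL;
                if (tokens > RECOVERY_BYTES_PER_SEC)
                        tokens = RECOVERY_BYTES_PER_SEC;   /* burst cap: 1 second */
                last = now;

                if (tokens >= len) {
                        tokens -= len;
                        return;                 /* enough budget, copy the object */
                }
                usleep(10000);                  /* wait for more tokens */
        }
}

The one-second burst cap keeps recovery smooth rather than bursty; the rate itself could of course be made a cluster-wide configurable option.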
For 2, I have no idea why this happens. I wonder what causes the poll timeout.
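
For what it's worth, a "poll timeout" message usually comes from a loop of the following shape: the gateway forwards the request to the replica nodes and then poll()s for their replies, and poll() returning 0 means no peer answered within the interval, which would fit the disks/network being saturated by the 4m random I/O. This is only a sketch of the general pattern, not the actual sheep code, and the names (POLL_TIMEOUT_MS, wait_for_replies) are made up:

/* Generic poll()-with-timeout pattern behind a "poll timeout" log line.
 * Illustrative only; not copied from sheep. */
#include <poll.h>
#include <stdio.h>

#define POLL_TIMEOUT_MS 1000    /* assumed 1-second wait per iteration */

static int wait_for_replies(struct pollfd *fds, nfds_t nr_fds)
{
        for (;;) {
                int ret = poll(fds, nr_fds, POLL_TIMEOUT_MS);

                if (ret < 0)
                        return -1;              /* real error */
                if (ret == 0) {
                        /* no replica replied in time: the peers are still
                         * busy (e.g. saturated disk or network), so log
                         * the timeout and keep waiting */
                        fprintf(stderr, "poll timeout\n");
                        continue;
                }
                return ret;                     /* some fd became ready */
        }
}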
Thanks,
--Hongyi