[sheepdog] Some thoughts based on my preliminary I/O test

Liu Yuan namei.unix at gmail.com
Thu May 23 07:34:53 CEST 2013


On 05/23/2013 01:26 AM, Hongyi Wang wrote:
> 1. Recovery
> 
> I noticed recovery occurs as soon as any node joins or quits the
> cluster. I know triggering recovery as early as possible makes data
> more reliable, since replicas are reconstructed earlier. But recovery
> consumes a huge amount of both disk and network bandwidth, which
> heavily hurts guest VM I/O.
> 

This is all a question of trade-offs. If we throttle recovery I/O
(in fact, recovery I/O is already throttled to one thread per node), we
hurt data reliability and availability.

When you do a manual recovery with no VMs running, you'll probably want
a full-speed recovery to minimize the recovery time.

> 2. Intensive I/O
> 
> A VM performing heavy I/O fails. In my case, I use the fio micro
> benchmark to generate a big file (size=10g), then read and write the
> file randomly (bs=4m). The VM hangs, and on its host machine
> sheep.log shows:
> May 23 07:22:07 [gway 28664] wait_forward_request(167) poll timeout 1
> May 23 07:22:08 [gway 28666] wait_forward_request(167) poll timeout 1
> May 23 07:22:09 [gway 28668] wait_forward_request(167) poll timeout 1
> May 23 07:22:09 [gway 28669] wait_forward_request(167) poll timeout 1
> ......
> But if I try a small file size, say 128m, there is no poll timeout
> log output.
> 
> For 1, I think two better policies can be applied: a) allocating a
> fixed bandwidth for recovery so that it does not consume all the
> bandwidth; b) starting recovery lazily, which can prevent thrashing
> when nodes join or quit frequently.

Recovery doesn't consume much bandwidth because we only have one
recovery thread per node; this is already the lowest setting possible
for recovery.
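
For reference, the fixed-bandwidth policy proposed in (a) is usually
built on something like a token bucket. The following is only a minimal
Python sketch of that idea, not sheepdog code; the class name, the
parameters and the byte counts are all hypothetical.

```python
class TokenBucket:
    """Hypothetical limiter: refills `rate` bytes per second, up to `burst`."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = float(rate)      # bytes per second granted to recovery
        self.burst = float(burst)    # maximum number of bytes saved up
        self.tokens = float(burst)   # start with a full bucket
        self.last = now              # time of the last refill

    def wait_time(self, nbytes, now):
        """Charge `nbytes` and return how many seconds to sleep before sending."""
        # Refill according to the time elapsed since the last call.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        # Charge the request; a negative balance is a debt that the
        # caller pays off by sleeping.
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

A recovery thread would call wait_time() before each object transfer and
sleep for the returned time. The lower the rate, the longer the objects
stay under-replicated, which is the trade-off described above.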

When you restart a node, you are supposed to manually 'disable
recovery' first and then re-enable it after the node rejoins.

For the same reason, lazy recovery will hurt data reliability and
availability: the chance of losing objects increases.
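
To make the cost of lazy recovery concrete, here is a minimal Python
sketch of a grace-period policy, assuming recovery starts only once
membership has been stable for a while. It is not how sheepdog works;
the class and method names are hypothetical.

```python
class LazyRecovery:
    """Hypothetical policy: recover only after `grace` seconds without changes."""

    def __init__(self, grace):
        self.grace = grace        # seconds membership must stay stable
        self.last_change = None   # time of the latest join/leave, if any

    def membership_changed(self, now):
        # Every join or leave restarts the grace period.
        self.last_change = now

    def should_start(self, now):
        # Nothing to recover until a change has been seen; after that,
        # wait until the cluster has been quiet for the whole grace period.
        if self.last_change is None:
            return False
        return now - self.last_change >= self.grace

    def recovery_started(self):
        # Clear the pending change once recovery has been kicked off.
        self.last_change = None
```

Between membership_changed() and the moment should_start() becomes true,
the affected objects have fewer replicas than configured, so a second
failure inside that window is more likely to lose data.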

Thanks,
Yuan

