At Tue, 4 Feb 2014 15:16:18 +0800,
Liu Yuan wrote:
>
> On Tue, Feb 04, 2014 at 03:14:44PM +0800, Liu Yuan wrote:
> > Rationale for multi-threaded recovery:
> >
> > 1. If one node is added, we find that all the VMs on other nodes are
> >    noticeably affected until 50% of the data has been transferred to
> >    the new node.
> >
> > 2. For node failure, running VMs might not have problems, but speeding
> >    up recovery benefits VM IO, with less chance of writes being
> >    blocked, and also improves reliability.
> >
> > 3. For disk failure in a node, this is similar to adding a node. All
> >    the data on the broken disk will be recovered on the other disks in
> >    this node. Speedy recovery not only improves data reliability but
> >    also causes less write blocking on the lost data.
> >
> > Our oid scheduling algorithm is intact; we simply add multi-threading on
> > top of the current recovery algorithm with minimal changes.
> >
> > - we still have the ->oids array to denote oids to be recovered
> > - we start up 2 * nr_disks threads for recovery
> > - the tricky part is that we need to wait for all the running threads
> >   to complete before starting the next recovery event when multiple
> >   node/disk events occur
> >
> > This patch passes "./check -g md -md" on my local box
>
> Hitoshi, can this one pass your test 033?

It passed 033 in my environment.

BTW, you should describe the differences between patch versions.

Thanks,
Hitoshi