On 05/22/2012 10:51 AM, levin li wrote: > When IO request comes for an object in recovery, we put it > into the priority rbtree, during the recovery, we make sheep > recover the objects in the rbtree firstly, it's more efficient > than the scheduler in is_recovering_oid() What is the point to use rbtree instead of a list? It seems this tree doesn't have anything to do with 'priority', the objects in the tree are sorted by 'oid', then the smaller oid take precedence over bigger one, why so? This strategy seems quit irreverent to recovery performance. And the latency of recovering one object can't be estimated at all. Maybe a simple FIFO list will do a better job. > +again: > + if (rw->prior_count == 0) { > + if (rw->state == RW_INIT) > + rw->state = RW_RUN; > + else if (!rw->retry) > + rw->done++; > + } > + > + if (rw->prior_count > 0) { > + node = rb_first(&rw->prior_tree); > + entry = rb_entry(node, struct object_rb_entry, node); > + oid = entry->oid; > + rb_erase(node, &rw->prior_tree); > + free(entry); > + } else if (rw->done < rw->count) { > + oid = rw->oids[rw->done]; > + if (!oid) { > + dprintf("oid is zero\n"); > + /* object has been recovered by fast recovery, > + * move to the next. */ > + goto again; > + } This goto looks abusing to me. We'd better have a helper function that code the logic more explicit and abstract away lower implementation, such as pick_next_oid(), which tries to pick the next oid which your algorithm thinks of appropriate. |