[sheepdog] [PATCH 1/8] use rbtree to replace the priority scheduler in recovery

levin li levin108 at gmail.com
Tue May 22 12:17:13 CEST 2012


On 05/22/2012 03:01 PM, Liu Yuan wrote:
> On 05/22/2012 10:51 AM, levin li wrote:
>
>> When IO request comes for an object in recovery, we put it
>> into the priority rbtree, during the recovery, we make sheep
>> recover the objects in the rbtree firstly, it's more efficient
>> than the scheduler in is_recovering_oid()
>
>
> What is the point to use rbtree instead of a list? It seems this tree
> doesn't have anything to do with 'priority', the objects in the tree are
> sorted by 'oid', then the smaller oid take precedence over bigger one,
> why so? This strategy seems quit irreverent to recovery performance. And
> the latency of recovering one object can't be estimated at all. Maybe a
> simple FIFO list will do a better job.
>

It's indeed related with priority, when IO request comes for one object,
firstly we should find out whether this object has been marked as 
'priority', in the old code, we move it to the head of the oid list by
memmove, which I think is inefficient.

The priority I mean is that the object is prior to those not requested 
by IO req. The objects requested by IO req in the rb tree should be 
recovered before those not requested.

>> +again:
>> +	if (rw->prior_count == 0) {
>> +		if (rw->state == RW_INIT)
>> +			rw->state = RW_RUN;
>> +		else if (!rw->retry)
>> +			rw->done++;
>> +	}
>> +
>> +	if (rw->prior_count>  0) {
>> +		node = rb_first(&rw->prior_tree);
>> +		entry = rb_entry(node, struct object_rb_entry, node);
>> +		oid = entry->oid;
>> +		rb_erase(node,&rw->prior_tree);
>> +		free(entry);
>> +	} else if (rw->done<  rw->count) {
>> +		oid = rw->oids[rw->done];
>> +		if (!oid) {
>> +			dprintf("oid is zero\n");
>> +			/* object has been recovered by fast recovery,
>> +			 * move to the next. */
>> +			goto again;
>> +		}
>
> This goto looks abusing to me. We'd better have a helper function that
> code the logic more explicit and abstract away lower implementation,
> such as pick_next_oid(), which tries to pick the next oid which your
> algorithm thinks of appropriate.




More information about the sheepdog mailing list