[sheepdog] question about the epoch field in reading peer request header
nkuzbp at foxmail.com
Fri May 29 19:12:12 CEST 2015
I have a question about the epoch information in the SD_OP_READ_PEER request header. I'm not sure whether I misunderstand the code or whether it is a bug.
When we recover an erasure-coded object, we first need to read the remaining replicas to rebuild the lost replica. In read_erasure_object(), we initialize the SD_OP_READ_PEER request header with the following code:
hdr.epoch = epoch;
hdr.flags = SD_FLAG_CMD_RECOVERY;
hdr.data_length = rlen;
hdr.obj.oid = oid;
hdr.obj.tgt_epoch = tgt_epoch;
hdr.obj.ec_index = idx;
I think hdr.epoch is the current epoch of the cluster and hdr.obj.tgt_epoch is the historical epoch from which we want to read the stale replica. The target node calls peer_read_obj() to process the SD_OP_READ_PEER request. peer_read_obj() sets iocb.epoch = hdr->epoch and then passes iocb to sd_store->read(). In default_read(), we use iocb->epoch < sys_epoch() to judge whether the request is against an older epoch, in which case the replica has to be read from the stale directory. I think we use the wrong epoch here: we should use hdr.obj.tgt_epoch rather than hdr.epoch to make the judgement. Can anyone answer my question?