[sheepdog] [PATCH] recovery: fix incomplete recovery because of faulty oid scheduling
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Mon Apr 29 09:11:35 CEST 2013
At Mon, 29 Apr 2013 15:59:03 +0900,
MORITA Kazutaka wrote:
>
> At Mon, 29 Apr 2013 13:15:55 +0800,
> Liu Yuan wrote:
> >
> > On 04/29/2013 07:15 AM, MORITA Kazutaka wrote:
> > > When auto-recovery is disabled, sheep recovers the objects only in
> > > prio_oids and shouldn't reach here if the oid is in recovery, no?
> >
> > No, and this is why I saw the bug. It happens only when two reads hit the
> > same oid during recovery. The first read triggers oid scheduling
> > (prepare/finish), and then the second read triggers it again
> > (prepare/finish), which results in a faulty schedule. rw->oids ends up as
> > '7c2b2500000003,7c2b2500000003,7c2b2500000006', and 7c2b2500000007 is
> > ejected.
> >
> > The following is the error log when the bug is reproduced by tests/010.
> >
> > Apr 29 13:01:23 [main] queue_request(353) READ_OBJ, 1
> > Apr 29 13:01:23 [main] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [main] finish_schedule_oids(405) nr_recovered 0, nr_prio_oids 1, count 3 = new 3
> > Apr 29 13:01:23 [main] prepare_schedule_oid(252) 7c2b2500000003 nr_prio_oids 0
> > Apr 29 13:01:23 [main] request_in_recovery(200) 7c2b2500000003 wait on oid
> > Apr 29 13:01:23 [rw 4870] recover_object_work(205) done:0 count:3, oid:7c2b2500000003
> > Apr 29 13:01:23 [rw 4870] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [rw 4870] do_recover_object(147) try recover object 7c2b2500000003 from epoch 8
> > Apr 29 13:01:23 [rw 4870] sockfd_cache_get(387) 127.0.0.1:7007, idx 0
> > Apr 29 13:01:23 [main] client_handler(808) 1, rx 0, tx 3
> > Apr 29 13:01:23 [main] finish_rx(612) 27, 127.0.0.1:56182
> > Apr 29 13:01:23 [main] queue_request(353) READ_PEER, 1
> > Apr 29 13:01:23 [main] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [main] prepare_schedule_oid(252) 7c2b2500000003 nr_prio_oids 1
> > Apr 29 13:01:23 [io 4869] do_process_work(1359) a4, 7c2b2500000003, 8
> > Apr 29 13:01:23 [io 4869] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [io 4869] err_to_sderr(65) /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [io 4869] err_to_sderr(72) object 007c2b2500000003 not found locally
> > Apr 29 13:01:23 [io 4869] do_process_work(1366) failed: a4, 7c2b2500000003 , 8, 2
> > Apr 29 13:01:23 [main] io_op_done(67) unhandled error 2
> > Apr 29 13:01:23 [main] client_handler(808) 4, rx 0, tx 3
> > Apr 29 13:01:23 [main] finish_tx(699) connection from: 27, 127.0.0.1:56182
> > Apr 29 13:01:23 [rw 4870] sheep_exec_req(526) failed 2
> > Apr 29 13:01:23 [rw 4870] sockfd_cache_put(422) 127.0.0.1:7007 idx 0
> > Apr 29 13:01:23 [rw 4870] sockfd_cache_get(387) 127.0.0.1:7002, idx 0
> > Apr 29 13:01:23 [rw 4870] sockfd_cache_put(422) 127.0.0.1:7002 idx 0
> > Apr 29 13:01:23 [rw 4870] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [rw 4870] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [rw 4870] default_create_and_write(343) 7c2b2500000003
> > Apr 29 13:01:23 [rw 4870] recover_object_from_replica(111) recovered oid 7c2b2500000003 from 8 to epoch 8
> > Apr 29 13:01:23 [main] wakeup_requests_on_oid(255) retry 7c2b2500000003
> > Apr 29 13:01:23 [main] queue_request(353) READ_OBJ, 1
> > Apr 29 13:01:23 [main] get_object_path(351) 0, /tmp/sheepdog/7/obj
> > Apr 29 13:01:23 [main] oid_in_recovery(264) the object 7c2b2500000003 is already recoverd
> > Apr 29 13:01:23 [main] finish_schedule_oids(405) WARN: nr_recovered 1, nr_prio_oids 1, count 3 = new 4
> >
>
> Thanks, I understood what's going on from your log.
>
> The problem is that already-scheduled oids can be scheduled again. I think
> the following is a better fix because it also skips redundant
> scheduling even when auto-recovery is enabled.
>
> ---- >8 ---- >8 ---- >8 ----
> diff --git a/sheep/recovery.c b/sheep/recovery.c
> index 23babe0..e9cfc02 100644
> --- a/sheep/recovery.c
> +++ b/sheep/recovery.c
> @@ -238,11 +238,15 @@ static inline void prepare_schedule_oid(uint64_t oid)
>  			return;
>  		}
>  	/*
> -	 * When auto recovery is enabled, the oid is currently being
> -	 * recovered
> +	 * rw->oids[rw->done..rw->nr_scheduled_prio_oids - 1] are
> +	 * already scheduled ones.
>  	 */
> -	if (!sys->disable_recovery && rw->oids[rw->done] == oid)
> -		return;
> +	for (i = rw->done; i < rw->nr_scheduled_prio_oids; i++)
> +		if (rw->oids[i] == oid) {
> +			sd_dprintf("oid %" PRIx64 " is already scheduled", oid);
> +			return;
> +		}
> +
>  	rw->nr_prio_oids++;
>  	rw->prio_oids = xrealloc(rw->prio_oids,
>  				 rw->nr_prio_oids * sizeof(uint64_t));
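
For reference, the duplicate check in the diff above can be tried in
isolation with a rough standalone sketch like the one below. This is not
the sheep code: struct sketch_rw, its nr_scheduled field, and
sketch_is_scheduled() are hypothetical names made up for illustration, and
the sketch assumes that the entries from done up to (but not including)
nr_scheduled are the ones moved up by earlier prioritized requests.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for sheep's recovery work state. */
struct sketch_rw {
	uint64_t *oids;      /* objects to recover, in recovery order */
	size_t count;        /* number of entries in oids */
	size_t done;         /* entries [0, done) are already recovered */
	size_t nr_scheduled; /* entries [done, nr_scheduled) are assumed to
			      * have been scheduled by earlier requests */
};

/* Return true if oid already sits in the scheduled window. */
static bool sketch_is_scheduled(const struct sketch_rw *rw, uint64_t oid)
{
	/* Both indices must stay inside the oid array. */
	assert(rw->done <= rw->count && rw->nr_scheduled <= rw->count);

	for (size_t i = rw->done; i < rw->nr_scheduled; i++)
		if (rw->oids[i] == oid) {
			printf("oid %" PRIx64 " is already scheduled\n", oid);
			return true;
		}
	return false;
}
```

A caller would skip adding the oid to its priority list whenever
sketch_is_scheduled() returns true, so a second read on the same oid would
not insert a duplicate entry.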
Please drop my patch. It seems that we need more fixes for this problem.
I'll prepare another one.
Thanks,
Kazutaka