[sheepdog] [PATCH v3 2/2] collie: add a new option --progress to "node recovery" for showing recovery progress
Hitoshi Mitake
mitake.hitoshi at gmail.com
Fri Aug 2 15:58:43 CEST 2013
At Fri, 2 Aug 2013 16:46:18 +0800,
Liu Yuan wrote:
>
> On Fri, Aug 02, 2013 at 05:31:10PM +0900, Hitoshi Mitake wrote:
> > At Fri, 2 Aug 2013 16:27:03 +0800,
> > Liu Yuan wrote:
> > >
> > > On Fri, Aug 02, 2013 at 04:52:06PM +0900, Hitoshi Mitake wrote:
> > > > This patch adds a new option --progress (or -P) to the node recovery
> > > > subcommand. With this option, users can display the progress of the
> > > > recovery process.
> > > >
> > > > Example:
> > > > $ sudo collie node recovery --progress
> > > > 99.7 % [==============================================>] 7047 / 7068
> > > >
> > > > The denominator (7068 in the above case) is the total number of
> > > > objects that should be checked. The numerator (7047 in the above case)
> > > > is the number of objects that have already been checked or copied.
> > > >
> > > > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > > > ---
> > > > v3:
> > > > - clean coding style
> > > >
> > > > v2:
> > > > - make this feature an option of "node recovery", not a new subcommand
> > > > - clean coding style
> > > > -- renaming recovery_progress_unit() -> get_recovery_progress()
> > > >
> > > > collie/node.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> > > > 1 file changed, 89 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/collie/node.c b/collie/node.c
> > > > index 4230af5..19b5508 100644
> > > > --- a/collie/node.c
> > > > +++ b/collie/node.c
> > > > @@ -13,6 +13,7 @@
> > > >
> > > > static struct node_cmd_data {
> > > > bool all_nodes;
> > > > + bool recovery_progress;
> > > > } node_cmd_data;
> > > >
> > > > static void cal_total_vdi_size(uint32_t vid, const char *name, const char *tag,
> > > > @@ -120,10 +121,92 @@ static int node_info(int argc, char **argv)
> > > > return EXIT_SUCCESS;
> > > > }
> > > >
> > > > +static int get_recovery_state(struct recovery_state *state)
> > > > +{
> > > > + int ret;
> > > > + struct sd_req req;
> > > > +
> > > > + sd_init_req(&req, SD_OP_STAT_RECOVERY);
> > > > + req.data_length = sizeof(*state);
> > > > +
> > > > + ret = collie_exec_req(sdhost, sdport, &req, state);
> > > > + if (ret < 0) {
> > > > + fprintf(stderr, "Failed to execute request\n");
> > > > + return -1;
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > > +static int node_recovery_progress(void)
> > > > +{
> > > > + int result, prev_in_recovery = 0;
> > > > +
> > > > + /*
> > > > + * prev_in_recovery is required for expressing the state transition.
> > > > + * If the variable is 0 and the obtained state indicates no recovery,
> > > > + * the node was never in recovery in the first place.
> > > > + */
> > > > +
> > > > + /*
> > > > + * ToDos
> > > > + *
> > > > + * 1. Calculate the size of the objects actually copied.
> > > > + * Doing this requires nontrivial changes to the recovery
> > > > + * process.
> > > > + *
> > > > + * 2. Print remaining physical time.
> > > > + * Even if it is not so accurate, the information is helpful for
> > > > + * administrators.
> > > > + */
> > > > +
> > > > + do {
> > > > + struct recovery_state rstate;
> > > > +
> > > > + result = get_recovery_state(&rstate);
> > > > + if (result < 0)
> > > > + break;
> > > > +
> > > > + if (!rstate.in_recovery) {
> > > > + if (prev_in_recovery)
> > > > + /* not an immediate completion */
> > > > + show_progress(rstate.nr_total, rstate.nr_total,
> > > > + true);
> > > > +
> > > > + break;
> > > > + }
> > >
> > > I don't think we need prev_in_recovery.
> > >
> > > if (not_in_recovery) {
> > > show_progress(total, total, true);
> > > break;
> > > }
> > >
> > > The above applies to both 1) no recovery and 2) recovery finished.
> >
> > If we do so, collie prints a 100% progress bar even if sheep never did
> > any recovery in the first place. This is confusing. In addition,
> > show_progress() would divide by zero because the total is 0.
>
> So with your code, rstate.nr_total can be 0 too? I think you should store
> nr_total when you first get it.
>
> It seems that calling get_recovery_state() before the loop would be a
> better solution.
OK, let's call get_recovery_state() once before the loop.
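
A rough sketch of how that restructuring could look, not the actual
follow-up patch. It queries the state once before the loop, returns
silently when no recovery is running, and remembers nr_total for the
final 100% line. Everything outside node_recovery_progress() is a
hypothetical stand-in so the sketch compiles on its own: the struct
layout (in particular the nr_finished field), the fake_sheep_setup()
helper, the simulated get_recovery_state(), and the show_progress()
stub, whose third argument is simply ignored here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for sheepdog's struct recovery_state.
 * nr_finished is an assumed field name; the quoted hunk only shows
 * in_recovery and nr_total. */
struct recovery_state {
	bool in_recovery;
	uint64_t nr_finished;
	uint64_t nr_total;
};

/* Simulated daemon, so the sketch runs without a sheep process. */
static uint64_t fake_total, fake_done, fake_step;
/* Bookkeeping for the demonstration only. */
static int progress_calls;
static uint64_t last_done, last_total;

static void fake_sheep_setup(uint64_t total, uint64_t step)
{
	fake_total = total;
	fake_done = 0;
	fake_step = step;
	progress_calls = 0;
}

/* Stub with the same name as in the patch; reports zeros when idle. */
static int get_recovery_state(struct recovery_state *state)
{
	assert(state != NULL);
	if (fake_done >= fake_total) {
		state->in_recovery = false;
		state->nr_finished = 0;
		state->nr_total = 0;
		return 0;
	}
	state->in_recovery = true;
	state->nr_finished = fake_done;
	state->nr_total = fake_total;
	fake_done += fake_step;
	if (fake_done > fake_total)
		fake_done = fake_total;
	return 0;
}

/* Stub: refuses a zero total instead of dividing by it. */
static void show_progress(uint64_t done, uint64_t total, bool unused)
{
	(void)unused;
	if (total == 0)
		return;
	progress_calls++;
	last_done = done;
	last_total = total;
	printf("\r%5.1f %% %" PRIu64 " / %" PRIu64,
	       100.0 * (double)done / (double)total, done, total);
	fflush(stdout);
}

static int node_recovery_progress(void)
{
	struct recovery_state rstate;
	uint64_t total;

	/* Single query before the loop: covers "no recovery at all". */
	if (get_recovery_state(&rstate) < 0)
		return -1;
	if (!rstate.in_recovery)
		return 0;

	/* Keep the total; the state reported after completion may be 0. */
	total = rstate.nr_total;

	while (rstate.in_recovery) {
		show_progress(rstate.nr_finished, rstate.nr_total, true);
		/* real code would wait here before polling again */
		if (get_recovery_state(&rstate) < 0)
			return -1;
	}

	show_progress(total, total, true);
	putchar('\n');
	return 0;
}
```

With this shape prev_in_recovery is no longer needed, because the loop
is entered only when the first query already reported a recovery in
progress.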
Thanks,
Hitoshi