[Sheepdog] [PATCH] sheep: reduce snapshot COW read/write

Liu Yuan namei.unix at gmail.com
Fri Mar 23 04:16:03 CET 2012


On 03/23/2012 09:50 AM, HaiTing Yao wrote:

> 
> 
> On Thu, Mar 22, 2012 at 5:15 PM, Liu Yuan <namei.unix at gmail.com
> <mailto:namei.unix at gmail.com>> wrote:
> 
>     On 03/22/2012 04:40 PM, yaohaiting.wujue at gmail.com
>     <mailto:yaohaiting.wujue at gmail.com> wrote:
> 
>     > From: HaiTing Yao <wujue.yht at taobao.com <mailto:wujue.yht at taobao.com>>
>     >
>     > Doing snapshot COW:
>     >
>     > 1. If a new write request needs COW for a snapshot, we currently read
>     > the old object into a buffer, write the buffer to the new object, and
>     > then write the request data to the new object. The latter two writes
>     > can be merged.
>     >
>     > 2. If the new write request covers the whole object, there is no need
>     > to read the old object.
>     >
>     > After this change a bigger buffer is passed to do_write_obj when doing
>     > COW, but this adds no extra cost: COW never applies to inode objects,
>     > so the write will not go through the journal.
>     >
>     > Signed-off-by: HaiTing Yao <wujue.yht at taobao.com
>     <mailto:wujue.yht at taobao.com>>
>     > ---
>     >  sheep/store.c |   29 +++++++++++++++++------------
>     >  1 files changed, 17 insertions(+), 12 deletions(-)
>     >
>     > diff --git a/sheep/store.c b/sheep/store.c
>     > index dfec235..97f1f13 100644
>     > --- a/sheep/store.c
>     > +++ b/sheep/store.c
>     > @@ -640,6 +640,7 @@ int store_create_and_write_obj(const struct sd_req *req, struct sd_rsp *rsp, voi
>     >  {
>     >       struct sd_obj_req *hdr = (struct sd_obj_req *)req;
>     >       struct request *request = (struct request *)data;
>     > +     struct sd_obj_req cow_hdr;
>     >       int ret;
>     >       uint32_t epoch = hdr->epoch;
>     >       char *buf = NULL;
>     > @@ -664,19 +665,23 @@ int store_create_and_write_obj(const struct sd_req *req, struct sd_rsp *rsp, voi
>     >               dprintf("%" PRIu64 ", %" PRIx64 "\n", hdr->oid, hdr->cow_oid);
>     >
>     >               buf = xzalloc(SD_DATA_OBJ_SIZE);


We'd better not use xzalloc for a big allocation like this: if it fails,
it currently panics sheep. Use valloc() instead and check for failure
manually.
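
A minimal sketch of that allocation pattern, not the actual sheep code.
Assumptions: a 4 MB object size (OBJ_SIZE stands in for SD_DATA_OBJ_SIZE),
a hypothetical helper name alloc_obj_buf(), and posix_memalign() swapped in
for valloc(), which it resembles when asked for page alignment. The point
is only that the caller sees NULL and can fail the request instead of the
whole daemon going down.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumption: 4 MB data objects; the real value comes from the sheepdog headers. */
#define OBJ_SIZE ((size_t)4 << 20)

/*
 * Hypothetical helper: allocate a page-aligned, zero-filled object buffer.
 * Returns NULL on failure so the caller can return an error for this
 * request rather than aborting the process.
 */
static void *alloc_obj_buf(size_t size)
{
	void *buf = NULL;
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0)
		return NULL;
	/* posix_memalign() returns 0 on success, an error number otherwise. */
	if (posix_memalign(&buf, (size_t)page, size) != 0)
		return NULL;
	memset(buf, 0, size);	/* keep the zeroing that xzalloc provided */
	return buf;
}
```

In the COW path the caller would test the result and jump to the existing
"out" label with an out-of-memory result code when it is NULL.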

>     > -             ret = read_copy_from_cluster(request, hdr->epoch, hdr->cow_oid, buf);
>     > -             if (ret != SD_RES_SUCCESS) {
>     > -                     eprintf("failed to read cow object\n");
>     > -                     goto out;
>     > +             if ((hdr->data_length != SD_DATA_OBJ_SIZE)
>     > +                     || (hdr->offset!= 0)) {



If hdr->data_length == SD_DATA_OBJ_SIZE, hdr->offset can only be 0 (a
full-size write cannot start anywhere else inside the object), so checking
hdr->data_length != SD_DATA_OBJ_SIZE alone is good enough.
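
To illustrate why the shorter test is enough, here is a small sketch. It
assumes every request stays inside its object (offset + length <= object
size) and uses a 4 MB OBJ_SIZE as a stand-in for SD_DATA_OBJ_SIZE; both
function names are hypothetical. Under that assumption a full-size write
can only start at offset 0, so the two predicates always agree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 MB data objects, standing in for SD_DATA_OBJ_SIZE. */
#define OBJ_SIZE (UINT64_C(4) << 20)

/* The test as written in the patch: read the old object unless the
 * write covers the whole new object. */
static bool needs_old_data_full(uint64_t offset, uint64_t length)
{
	return length != OBJ_SIZE || offset != 0;
}

/* The shorter test: valid only while offset + length <= OBJ_SIZE holds,
 * because then length == OBJ_SIZE forces offset == 0. */
static bool needs_old_data_short(uint64_t length)
{
	return length != OBJ_SIZE;
}
```

If a request could ever run past the end of the object, the two would
differ and the longer form would be the safer one to keep.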

> 
> 
>     For create_and_write object, data_length of request is always
>     SD_DATA_OBJ_SIZE and offset always 0.
> 
>     Thanks,
>     Yuan
> 
>  
> Yes, usually the offset is 0 and the size is SD_DATA_OBJ_SIZE, but there is
> one exception.
> 
> For example:
> 1) Write to a VDI with ID 0x19128 from 0 to SD_DATA_OBJ_SIZE (this writes
> the whole first object).
> 2) Create a snapshot of this VDI, which gives a new VDI with ID 0x19129.
> 3) Write to the VDI again, this time from 1024 to 2048.
> 4) The data must be read from 0x19128-00000000 and then written to
> 0x19129-00000000, so COW occurs.
> 5) The hdr offset is now 1024 and the size is 1024.
>  
> Thanks
> Haiting 
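
The partial-write case in this example can be sketched in memory as
follows. This is an illustration under assumptions, not the sheep code:
OBJ_SIZE (4 MB) stands in for SD_DATA_OBJ_SIZE, cow_merge() is a
hypothetical helper, and the old object is passed in as a plain buffer
instead of being read from the cluster. It shows the idea of the patch:
fill the buffer from the old object only when the write is partial,
overlay the request data at its offset, and leave a single full-object
write to do.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 4 MB data objects, standing in for SD_DATA_OBJ_SIZE. */
#define OBJ_SIZE ((size_t)4 << 20)

/*
 * Hypothetical helper: build the complete contents of the new object in buf.
 *   old_obj - contents of the snapshot's object (OBJ_SIZE bytes); may be
 *             NULL when the write covers the whole object
 *   data    - request payload, length bytes, destined for buf + offset
 * Returns 0 on success, -1 on an out-of-range request or a missing old object.
 */
static int cow_merge(unsigned char *buf, const unsigned char *old_obj,
		     const unsigned char *data, size_t offset, size_t length)
{
	if (offset > OBJ_SIZE || length > OBJ_SIZE - offset)
		return -1;

	if (length != OBJ_SIZE) {
		/* Partial write: the rest of the object comes from the old one. */
		if (!old_obj)
			return -1;
		memcpy(buf, old_obj, OBJ_SIZE);
	}

	/* Overlay the new data; buf is now written out once, from offset 0. */
	memcpy(buf + offset, data, length);
	return 0;
}
```

For the write from 1024 to 2048 above, bytes 0..1023 and 2048 onward keep
the old object's contents and only bytes 1024..2047 carry the new data.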
> 
> 
>     > +                     ret = read_copy_from_cluster(request, hdr->epoch, hdr->cow_oid, buf);
>     > +                     if (ret != SD_RES_SUCCESS) {
>     > +                             eprintf("failed to read cow object\n");
>     > +                             goto out;
>     > +                     }
>     >               }
>     > -             iocb.buf = buf;
>     > -             iocb.length = SD_DATA_OBJ_SIZE;
>     > -             iocb.offset = 0;
>     > -             ret = sd_store->write(hdr->oid, &iocb);
>     > -             if (ret != SD_RES_SUCCESS)
>     > -                     goto out;
>     > -     }
>     > -     ret = do_write_obj(&iocb, hdr, epoch, request->data);
>     > +
>     > +             memcpy(buf + hdr->offset, request->data, hdr->data_length);
>     > +             memcpy(&cow_hdr, hdr, sizeof(cow_hdr));
>     > +             cow_hdr.offset = 0;
>     > +             cow_hdr.data_length = SD_DATA_OBJ_SIZE;
>     > +
>     > +             ret = do_write_obj(&iocb, &cow_hdr, epoch, buf);
>     > +     } else
>     > +             ret = do_write_obj(&iocb, hdr, epoch, request->data);
>     >  out:
>     >       free(buf);
>     >       sd_store->close(hdr->oid, &iocb);
> 
> 
> 




