[Sheepdog] [PATCH] sheep: reduce snapshot COW read/write
Liu Yuan
namei.unix at gmail.com
Fri Mar 23 04:16:03 CET 2012
On 03/23/2012 09:50 AM, HaiTing Yao wrote:
>
>
> On Thu, Mar 22, 2012 at 5:15 PM, Liu Yuan <namei.unix at gmail.com> wrote:
>
> On 03/22/2012 04:40 PM, yaohaiting.wujue at gmail.com wrote:
>
> > From: HaiTing Yao <wujue.yht at taobao.com>
> >
> > Doing snapshot COW:
> >
> > 1, If a new write request arrives and needs COW for a snapshot, we
> > currently read the old object into a buffer, write the buffer to the
> > new object, then write the request data to the new object. We can
> > merge the latter two write requests.
> >
> > 2, If the new write request covers the whole object, there is no need
> > to read the old object.
> >
> > After the modification, a bigger buffer is passed to do_write_obj when
> > doing COW, but this adds no extra burden. COW is never done for an inode
> > object, so it will not use the journal.
> >
> > Signed-off-by: HaiTing Yao <wujue.yht at taobao.com>
> > ---
> > sheep/store.c | 29 +++++++++++++++++------------
> > 1 files changed, 17 insertions(+), 12 deletions(-)
> >
> > diff --git a/sheep/store.c b/sheep/store.c
> > index dfec235..97f1f13 100644
> > --- a/sheep/store.c
> > +++ b/sheep/store.c
> > @@ -640,6 +640,7 @@ int store_create_and_write_obj(const struct sd_req *req, struct sd_rsp *rsp, voi
> > {
> > struct sd_obj_req *hdr = (struct sd_obj_req *)req;
> > struct request *request = (struct request *)data;
> > + struct sd_obj_req cow_hdr;
> > int ret;
> > uint32_t epoch = hdr->epoch;
> > char *buf = NULL;
> > @@ -664,19 +665,23 @@ int store_create_and_write_obj(const struct sd_req *req, struct sd_rsp *rsp, voi
> > dprintf("%" PRIu64 ", %" PRIx64 "\n", hdr->oid, hdr->cow_oid);
> >
> > buf = xzalloc(SD_DATA_OBJ_SIZE);
We'd better not use xzalloc for a big allocation like this: if it fails,
sheep currently panics. Use valloc() instead and check for failure
manually.
> > - ret = read_copy_from_cluster(request, hdr->epoch, hdr->cow_oid, buf);
> > - if (ret != SD_RES_SUCCESS) {
> > - eprintf("failed to read cow object\n");
> > - goto out;
> > + if ((hdr->data_length != SD_DATA_OBJ_SIZE)
> > + || (hdr->offset!= 0)) {
If hdr->data_length == SD_DATA_OBJ_SIZE, hdr->offset can only be 0, so
checking hdr->data_length != SD_DATA_OBJ_SIZE alone is good enough.
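
The claim that the length check alone is enough can be verified
mechanically, assuming a request never extends past the end of the object
(offset + data_length <= SD_DATA_OBJ_SIZE). That bound is an assumption of
this sketch, and the function names are hypothetical. The code compares
both forms of the condition for every in-bounds request on a small object
size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The condition as written in the patch: length and offset are both tested. */
bool needs_read_both(uint64_t offset, uint64_t length, uint64_t obj_size)
{
	return length != obj_size || offset != 0;
}

/* The simplified condition: only the length is tested. */
bool needs_read_length_only(uint64_t length, uint64_t obj_size)
{
	return length != obj_size;
}

/*
 * Return true if both conditions give the same answer for every request
 * that stays inside the object (offset + length <= obj_size).
 */
bool checks_agree(uint64_t obj_size)
{
	for (uint64_t off = 0; off <= obj_size; off++)
		for (uint64_t len = 0; off + len <= obj_size; len++)
			if (needs_read_both(off, len, obj_size) !=
			    needs_read_length_only(len, obj_size))
				return false;
	return true;
}
```

If requests could exceed the object boundary, the two conditions would no
longer be interchangeable and the extra offset test would matter.
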
>
>
> For a create_and_write request, data_length is always SD_DATA_OBJ_SIZE
> and offset is always 0.
>
> Thanks,
> Yuan
>
>
> Yes, usually the offset is 0 and the size is SD_DATA_OBJ_SIZE, but there
> is one exception.
>
> For example:
> 1) Write the vdi with ID 0x19128 from 0 to SD_DATA_OBJ_SIZE (this writes
> the whole first object).
> 2) Create a snapshot of this vdi, which yields a new vdi with ID 0x19129.
> 3) Write the vdi again, this time from 1024 to 2048.
> 4) The data must be read from 0x19128-00000000 and then written to
> 0x19129-00000000, so COW occurs.
> 5) The hdr offset is now 1024 and the size is 1024.
>
> Thanks
> Haiting
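
The merged COW path discussed in this thread can be sketched roughly as
follows. This is an illustration under stated assumptions, not the code
in sheep/store.c: cow_merge is a hypothetical helper, and old_obj stands
in for the data that read_copy_from_cluster would fetch. The old object
is copied only when the request does not cover the whole object, then the
request data is laid over it so the result can go out as a single
full-object write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Build the full contents of the new (COW) object in buf.
 *
 * old_obj: contents of the snapshot's object (stand-in for the cluster read)
 * data:    payload of the write request
 * Returns 1 if the old object had to be read, 0 if the read was skipped.
 */
int cow_merge(char *buf, size_t obj_size, const char *old_obj,
	      const char *data, size_t offset, size_t length)
{
	int did_read = 0;

	/* The request must stay inside the object. */
	assert(offset <= obj_size && length <= obj_size - offset);

	/* Partial write: the untouched ranges must come from the old object. */
	if (length != obj_size || offset != 0) {
		memcpy(buf, old_obj, obj_size);
		did_read = 1;
	}

	/* Overlay the request data; buf now holds the whole new object. */
	memcpy(buf + offset, data, length);
	return did_read;
}
```

With the request from the example above (offset 1024, size 1024) the old
object is read and only bytes 1024 to 2047 are replaced; a request that
covers the whole object skips the read entirely.
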
>
>
> > + ret = read_copy_from_cluster(request, hdr->epoch, hdr->cow_oid, buf);
> > + if (ret != SD_RES_SUCCESS) {
> > + eprintf("failed to read cow object\n");
> > + goto out;
> > + }
> > }
> > - iocb.buf = buf;
> > - iocb.length = SD_DATA_OBJ_SIZE;
> > - iocb.offset = 0;
> > - ret = sd_store->write(hdr->oid, &iocb);
> > - if (ret != SD_RES_SUCCESS)
> > - goto out;
> > - }
> > - ret = do_write_obj(&iocb, hdr, epoch, request->data);
> > +
> > + memcpy(buf + hdr->offset, request->data, hdr->data_length);
> > + memcpy(&cow_hdr, hdr, sizeof(cow_hdr));
> > + cow_hdr.offset = 0;
> > + cow_hdr.data_length = SD_DATA_OBJ_SIZE;
> > +
> > + ret = do_write_obj(&iocb, &cow_hdr, epoch, buf);
> > + } else
> > + ret = do_write_obj(&iocb, hdr, epoch, request->data);
> > out:
> > free(buf);
> > sd_store->close(hdr->oid, &iocb);
>
>
>