<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
On 05/05/2012 02:08 PM, levin wrote:
<blockquote
cite="mid:CA+vr8WbYFjLXGO=ESQpQgFVzJoiQQ3eeOj4Yu1vWzN14R05frg@mail.gmail.com"
type="cite"><br>
<br>
On Saturday, May 5, 2012, Liu Yuan wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex;
border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">On
05/04/2012 02:49 PM, levin li wrote:<br>
<br>
> Cluster recovery may cause objects migrated from one node<br>
> to another, but the object list cache doesn't change, when<br>
> deleting an object we can not find the right node in whose<br>
> cache the id stays, so we need to notify the deletion
object<br>
> list to all the node to make them delete the specified<br>
> object id from object list cache.<br>
><br>
> I reused the deletion_work_list and deletion_wqueue for the<br>
> object list cleanup work, so that we save some memory by
not<br>
> starting a new work queue.<br>
><br>
> The del_vdi is called in worker thread, but it's locked by
the<br>
> notification mechanism, other operations about
deletion_work_list<br>
> and deletion_wqueue are done in main thread, so there's no
need<br>
> to concern about the race condition.<br>
><br>
> Signed-off-by: levin li <<a moz-do-not-send="true">xingke.lwp@taobao.com</a>><br>
> ---<br>
> include/sheep.h | 1 +<br>
> sheep/group.c | 1 +<br>
> sheep/object_list_cache.c | 17 +++++++++<br>
> sheep/ops.c | 57
++++++++++++++++++++++++++++++<br>
> sheep/sheep_priv.h | 21 ++++++++++++<br>
> sheep/vdi.c | 84
++++++++++++++++++++++++++++++---------------<br>
> 6 files changed, 154 insertions(+), 27 deletions(-)<br>
><br>
> diff --git a/include/sheep.h b/include/sheep.h<br>
> index 7e287c4..e941dc1 100644<br>
> --- a/include/sheep.h<br>
> +++ b/include/sheep.h<br>
> @@ -46,6 +46,7 @@<br>
> #define SD_OP_TRACE 0x95<br>
> #define SD_OP_TRACE_CAT 0x96<br>
> #define SD_OP_STAT_RECOVERY 0x97<br>
> +#define SD_OP_NOTIFY_VDI_DEL 0x98<br>
><br>
> #define SD_FLAG_CMD_IO_LOCAL 0x0010<br>
> #define SD_FLAG_CMD_RECOVERY 0x0020<br>
> diff --git a/sheep/group.c b/sheep/group.c<br>
> index c7fd387..1f1aa8d 100644<br>
> --- a/sheep/group.c<br>
> +++ b/sheep/group.c<br>
> @@ -1314,6 +1314,7 @@ int create_cluster(int port, int64_t
zone, int nr_vnodes)<br>
> INIT_LIST_HEAD(&sys->req_wait_for_obj_list);<br>
> INIT_LIST_HEAD(&sys->consistent_obj_list);<br>
> INIT_LIST_HEAD(&sys->blocking_conn_list);<br>
> + INIT_LIST_HEAD(&sys->deletion_work_list);<br>
><br>
> INIT_LIST_HEAD(&sys->request_queue);<br>
> INIT_LIST_HEAD(&sys->event_queue);<br>
> diff --git a/sheep/object_list_cache.c
b/sheep/object_list_cache.c<br>
> index 28cdbbc..0aa93e5 100644<br>
> --- a/sheep/object_list_cache.c<br>
> +++ b/sheep/object_list_cache.c<br>
> @@ -149,3 +149,20 @@ int get_obj_list(const struct
sd_list_req *hdr, struct sd_list_rsp *rsp, void *d<br>
><br>
> return res;<br>
> }<br>
> +<br>
> +int del_vdi_from_objlist_cache(uint64_t *objs, int count)<br>
> +{<br>
> + int i;<br>
> +<br>
> + dprintf("%d\n", count);<br>
> + for (i = 0; i < count; i++) {<br>
> +
pthread_rwlock_wrlock(&obj_list_cache.lock);<br>
> + if
(!objlist_cache_rb_remove(&obj_list_cache.root, objs[i])) {<br>
> + dprintf("remove oid %" PRIx64 " from
objlist cache\n", objs[i]);<br>
> + obj_list_cache.cache_size--;<br>
> + }<br>
> +
pthread_rwlock_unlock(&obj_list_cache.lock);<br>
> + }<br>
> +<br>
> + return 0;<br>
> +}<br>
> diff --git a/sheep/ops.c b/sheep/ops.c<br>
> index b6f8eb2..5763b6f 100644<br>
> --- a/sheep/ops.c<br>
> +++ b/sheep/ops.c<br>
> @@ -550,6 +550,57 @@ static int cluster_cleanup(const
struct sd_req *req, struct sd_rsp *rsp,<br>
> return ret;<br>
> }<br>
><br>
> +static void objlist_cache_del_vdi_work(struct work *work)<br>
> +{<br>
> + struct deletion_work *dw = container_of(work, struct
deletion_work, work);<br>
> +<br>
> + del_vdi_from_obThis name looks confusing, I guess you
mean undeleted objects as in<br>
dw->deleted_objs, dw->deleted_count.<br>
<br>
</blockquote>
<div>No, here I do mean deleted objects indeed, by this buffer I
record which objects are deleted,</div>
<div>and then we can notify this list to the other nodes.</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex;
border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
> +<br>
> if (dw->delete_error) {<br>
> write_object(dw->vnodes, dw->epoch,
vid_to_vdi_oid(vdi_id),<br>
> (void *)inode, sizeof(*inode),
0, 0, nr_copies, 0);<br>
> @@ -505,8 +535,8 @@ static void delete_one_done(struct work
*work)<br>
> free(dw->buf);<br>
> free(dw);<br>
><br>
> - if (!list_empty(&deletion_work_list)) {<br>
> - dw =
list_first_entry(&deletion_work_list,<br>
> + if (!list_empty(&sys->deletion_work_list)) {<br>
> + dw =
list_first_entry(&sys->deletion_work_list,<br>
> struct deletion_work,
dw_siblings);<br>
><br>
> queue_work(sys->deletion_wqueue,
&dw->work);<br>
> @@ -647,12 +677,12 @@ int start_deletion(uint32_t vid,
uint32_t epoch)<br>
> if (dw->count == 0)<br>
> goto out;<br>
><br>
> - if (!list_empty(&deletion_work_list)) {<br>
> - list_add_tail(&dw->dw_siblings,
&deletion_work_list);<br>
> + if (!list_empty(&sys->deletion_work_list)) {<br>
> + list_add_tail(&dw->dw_siblings,
&sys->deletion_work_list);<br>
> goto out;<br>
> }<br>
><br>
> - list_add_tail(&dw->dw_siblings,
&deletion_work_list);<br>
> + list_add_tail(&dw->dw_siblings,
&sys->deletion_work_list);<br>
> queue_work(sys->deletion_wqueue,
&dw->work);<br>
<br>
<br>
it looks redundant to have deletion_work_list here to get FIFO
semantics.<br>
<br>
We only have one worker for deletion_wqueue, isn't it FIFO
already?<br>
<br>
</blockquote>
<div><br>
</div>
<div>Maybe <span style="color: rgb(34, 34, 34); font-family:
arial,sans-serif; font-size: 13px; white-space: nowrap;
background-color: rgb(255, 255, 255);">kazutaka can explain
this.</span></div>
<div><span style="color: rgb(34, 34, 34); font-family:
arial,sans-serif; font-size: 13px; white-space: nowrap;
background-color: rgb(255, 255, 255);"><br>
</span></div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex;
border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
> out:<br>
> return SD_RES_SUCCESS;<br>
<br>
<br>
This code looks really confusing, I think you need rework the
logic<br>
first, take object recovery as an example, recovery_object just
recovers<br>
the targeted object once at a time. Your delete_one I guess
serves the<br>
same purpose as its name suggests, but you actually try to
delete all<br>
the objects in one go.<br>
<br>
</blockquote>
<div><br>
</div>
<div>It's something different from recovery_object, in
recovery_object it means recovering</div>
<div>a single object, but here delete_one means deleting a single
VDI, not a single object,</div>
<div>as we see, maybe there're more than one VDI needs to be
deleted in one deletion work,</div>
<div>so what delete_one does is to delete on of the VDIs.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex;
border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
By the way, please run script/<a moz-do-not-send="true"
href="http://checkpatch.pl" target="_blank">checkpatch.pl</a>,
there are some style problems.<br>
<br>
</blockquote>
<div>OK, I will check my latest patch with this script, maybe this
version is dropped, as I just submitted</div>
<div>the 5th version yesterday.</div>
<div><br>
</div>
</blockquote>
Sorry, I was confused by gmail, this is my 5th version.<br>
<br>
thanks,<br>
<br>
levin<br>
<br>
<blockquote
cite="mid:CA+vr8WbYFjLXGO=ESQpQgFVzJoiQQ3eeOj4Yu1vWzN14R05frg@mail.gmail.com"
type="cite">
<div>thanks,</div>
<div><br>
</div>
<div>levin</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex;
border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Thanks,<br>
Yuan<br>
</blockquote>
<br>
<br>
-- <br>
<font color="#999999">levin</font>
<div><a moz-do-not-send="true" href="http://basiccoder.com"
style="background-color: rgb(255, 255, 255);" target="_blank"><font
color="#999999">basiccoder.com</font></a></div>
<br>
</blockquote>
<br>
</body>
</html>