On Sun, Apr 15, 2012 at 10:48 PM, Liu Yuan <namei.unix at gmail.com> wrote: > On 04/15/2012 09:53 PM, Yunkai Zhang wrote: > >> Currently, we use a dirty_rb tree to record which cache object >> have been updated. >> >> Two kinds of threads will operate this dirty tree concurrently: >> a. mulitple io-workers: write something to cache objects >> b. one flush-worker: flush all updates to cluster >> >> In the following scenario, update will be lost: >> >> flush-worker one io-worker >> [object_cache_push] [object_cache_rw] >> |-(1) get a cache object from dirty tree | >> |-(2) read the data file of this object | >> | modify data file of this object (3)-| >> |-(4) forward_write_obj_req() | >> | add this object to dirty tree (5)-| >> |-(6) rb_erase: remove this object from dirty tree | >> >> Note: io-worker generate *new update* in step (3), but flush-worker >> remove this cache object from dirty tree in step (6). >> >> I use two dirty trees to fix this bug and avoid heavy lock between >> flush-worker and io-wroker threads. >> >> There is only one *active* dirty tree for io-workers in any time. >> After io-worker modify something, it operate this active dirty tree. >> >> When flush-worker want to flush all updates to cluster, it: >> 1. mark another tree as *active* dirty tree. >> 2. get update info from *inactive* dirty tree. >> 3. if something wrong occur in flushing process, >> merge two dirty trees into active drity tree. >> >> Signed-off-by: Yunkai Zhang <qiushu.zyk at taobao.com> >> --- >> sheep/object_cache.c | 113 ++++++++++++++++++++++++++++++++++++++++--------- >> sheep/sheep_priv.h | 9 +++- >> 2 files changed, 99 insertions(+), 23 deletions(-) >> >> diff --git a/sheep/object_cache.c b/sheep/object_cache.c >> index df9ab49..c14ec52 100644 >> --- a/sheep/object_cache.c >> +++ b/sheep/object_cache.c >> @@ -127,9 +127,16 @@ not_found: >> cache = xzalloc(sizeof(*cache)); >> cache->vid = vid; >> create_dir_for(vid); >> - cache->dirty_rb = RB_ROOT; >> + >> + cache->dirty_trees[0] = RB_ROOT; >> + cache->dirty_trees[1] = RB_ROOT; >> + cache->active_dirty_tree = &cache->dirty_trees[0]; >> + >> + INIT_LIST_HEAD(&cache->dirty_lists[0]); >> + INIT_LIST_HEAD(&cache->dirty_lists[1]); >> + cache->active_dirty_list = &cache->dirty_lists[0]; >> + >> pthread_mutex_init(&cache->lock, NULL); >> - INIT_LIST_HEAD(&cache->dirty_list); >> hlist_add_head(&cache->hash, head); >> } else >> cache = NULL; >> @@ -138,19 +145,61 @@ out: >> return cache; >> } >> >> -static void add_to_dirty_tree_and_list(struct object_cache *oc, uint32_t idx, int create) >> +static void add_to_dirty_tree_and_list(struct object_cache *oc, uint32_t idx, >> + struct object_cache_entry *entry) >> { >> - struct object_cache_entry *entry = xzalloc(sizeof(*entry)); >> + int create = 0; >> >> - entry->idx = idx; >> - pthread_mutex_lock(&oc->lock); >> - if (!dirty_tree_insert(&oc->dirty_rb, entry)) { >> + if (!entry) { >> + create = 1; >> + entry = xzalloc(sizeof(*entry)); >> + entry->idx = idx; >> + } >> + if (!dirty_tree_insert(oc->active_dirty_tree, entry)) { >> if (create) >> entry->create = 1; >> - list_add(&entry->list, &oc->dirty_list); >> + list_add(&entry->list, oc->active_dirty_list); >> } else >> free(entry); >> - pthread_mutex_unlock(&oc->lock); >> +} > > > It is okay for me to move lock out of the function since you can use it > with merge_dirty_tree_and_list() in a clean way. > > But I think here you can not remove 'create' argument without breakage > of current logic for object handling because > > 'create' is needed to distinguish two kinds of situation: > 1) object cache layer itself handles the creation of object from Guests > for SD_OP_CREATE_AND_WRITE_OBJ even if object isn't hit in cache > 2) for regular R/W, we don't handling object creation, just pulls it > from the cluster in case of that object is not hit in cache When we pass NULL as entry's value, the new version of this function will create a new object_cache_entry just list we pass 1 as create's value in the old style. Have any questions? > > I'd suggest > - keeping of 'create' argument for this function > - would better rename the function as add_to_active_dirty_tree_and_list() > >> + >> +static void switch_dirty_tree_and_list(struct object_cache *oc, >> + struct rb_root ** inactive_dirty_tree, >> + struct list_head **inactive_dirty_list) >> +{ >> + *inactive_dirty_list = oc->active_dirty_list; >> + *inactive_dirty_tree = oc->active_dirty_tree; >> + >> + if (oc->active_dirty_tree == &oc->dirty_trees[0]) { >> + oc->active_dirty_list = &oc->dirty_lists[1]; >> + oc->active_dirty_tree = &oc->dirty_trees[1]; >> + } else { >> + oc->active_dirty_list = &oc->dirty_lists[0]; >> + oc->active_dirty_tree = &oc->dirty_trees[0]; >> + } >> +} >> + >> +static inline void del_from_dirty_tree_and_list( >> + struct object_cache_entry *entry, >> + struct rb_root *inactive_dirty_tree, >> + struct list_head *inactive_dirty_list) >> +{ >> + rb_erase(&entry->rb, inactive_dirty_tree); >> + list_del(&entry->list); >> +} >> + > > > I think this would be a general function, so we'd better simply it > prototype it as > > del_from_dirty_tree_and_list(struct object_cache_entry *entry, > struct rb_root *tree, struct list_head *list) > > Thanks, > Yuan -- Yunkai Zhang Work at Taobao |