[sheepdog] [PATCH v2 7/9] object cache: reclaim cached objects when cache reaches the max size

Thu Jul 26 02:22:16 CEST 2012

At Wed, 25 Jul 2012 20:15:23 +0800,
levin li wrote:
> 
> From: levin li <xingke.lwp at taobao.com>
> 
> This patch reclaims cached objects when the total size of cached objects
> reaches the max size specified by the user. It works as follows:
> 
> 1. Check the object tree for the object entry to determine whether the
>    cache entry exists and whether it is being reclaimed. If it is being
>    reclaimed, we make sheep ignore the cache.
> 2. In object_cache_rw() we look up the cache entry. After it passes the
>    sanity check, we increment its refcnt to tell the reclaiming worker that
>    the entry is being referenced and should not be reclaimed now.
> 3. In add_to_object_cache(), when the cached size reaches the max size,
>    we start a reclaiming thread. Only one such thread can be running at
>    a time.
> 4. In reclaim_work(), we reclaim cached objects until the cache size is
>    reduced to 80% of the max size.
> 5. In reclaim_object(), before reclaiming an object, we check whether the
>    cache is flushing; if so, we don't reclaim it. If the refcnt of the
>    object is not zero, we also don't reclaim it.
>    If the cached object is dirty, we flush it with push_cache_object() and
>    then try to remove the object.
> 
> Signed-off-by: levin li <xingke.lwp at taobao.com>
> ---
>  include/sheepdog_proto.h |    1 +
>  sheep/object_cache.c     |  463 +++++++++++++++++++++++++++++++++++++++-------
>  sheep/sheep.c            |    3 +-
>  sheep/sheep_priv.h       |    1 +
>  sheep/store.c            |    8 +
>  5 files changed, 407 insertions(+), 69 deletions(-)
> 
> diff --git a/include/sheepdog_proto.h b/include/sheepdog_proto.h
> index 45a4b81..05597fb 100644
> --- a/include/sheepdog_proto.h
> +++ b/include/sheepdog_proto.h
> @@ -68,6 +68,7 @@
>  #define SD_RES_CLUSTER_RECOVERING 0x22 /* Cluster is recovering. */
>  #define SD_RES_OBJ_RECOVERING     0x23 /* Object is recovering */
>  #define SD_RES_KILLED           0x24 /* Node is killed */
> +#define SD_RES_NO_CACHE      0x25 /* No cache object found */

This should be a sheepdog-internal error code, no?

> -static inline void
> -del_from_dirty_tree_and_list(struct object_cache_entry *entry,
> -			     struct rb_root *dirty_tree)
> -{
> -	rb_erase(&entry->dirty_node, dirty_tree);
> -	list_del(&entry->list);
> -}
> -
>  /* Caller should hold the oc->lock */
>  static inline void
>  add_to_dirty_tree_and_list(struct object_cache *oc, uint32_t idx,
> @@ -289,6 +488,9 @@ add_to_dirty_tree_and_list(struct object_cache *oc, uint32_t idx,
>  	if (!entry)
>  		panic("Can not find object entry %" PRIx32 "\n", idx);
>  
> +	if (cache_is_reclaiming())
> +		return;
> +
>  	/* If cache isn't in reclaiming, move it
>  	 * to the head of lru list */
>  	cds_list_del_rcu(&entry->lru_list);
> @@ -321,6 +523,9 @@ static void add_to_object_cache(struct object_cache *oc, uint32_t idx)
>  	entry->idx = idx;
>  	CDS_INIT_LIST_HEAD(&entry->lru_list);
>  
> +	dprintf("cache object for vdi %" PRIx32 ", idx %08" PRIx32 "added\n",
> +		oc->vid, idx);
> +
>  	pthread_rwlock_wrlock(&oc->lock);
>  	old = object_cache_insert(&oc->object_tree, entry);
>  	if (!old) {
> @@ -331,20 +536,55 @@ static void add_to_object_cache(struct object_cache *oc, uint32_t idx)
>  		entry = old;
>  	}
>  	pthread_rwlock_unlock(&oc->lock);
> +
> +	dprintf("sys_cache.cache_size %" PRIx64 ", sys->cache_size %" PRIx64 "\n",
> +		uatomic_read(&sys_cache.cache_size), sys->cache_size);
> +	if (sys->cache_size &&
> +	    uatomic_read(&sys_cache.cache_size) > sys->cache_size &&
> +	    !cache_is_reclaiming()) {
> +		struct work *work = xzalloc(sizeof(struct work));
> +		uatomic_set(&sys_cache.reclaiming, 1);

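IIUC, the cache_is_reclaiming() check and uatomic_set(&sys_cache.reclaiming, 1)
must be done atomically.  Otherwise two threads can both see the flag as
zero, and each of them will queue a reclaim work.  I think you need to use
uatomic_cmpxchg.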
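
To illustrate the idea, here is a minimal sketch of the test-and-set done as
a single compare-and-swap.  It uses C11 <stdatomic.h> rather than urcu so
that it stands alone; the names reclaiming, try_start_reclaim() and
finish_reclaim() are hypothetical and not from the patch.  With urcu the
equivalent check would be uatomic_cmpxchg(&sys_cache.reclaiming, 0, 1) == 0,
since uatomic_cmpxchg returns the previous value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for sys_cache.reclaiming (0 = idle, 1 = running). */
static atomic_int reclaiming;

/*
 * Try to become the single reclaimer.  The compare-and-swap succeeds only
 * for the one caller that flips the flag from 0 to 1, so the check and the
 * set cannot be interleaved with another thread.
 */
static bool try_start_reclaim(void)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&reclaiming, &expected, 1);
}

/* Called by the reclaim worker once it has finished. */
static void finish_reclaim(void)
{
	int was = atomic_exchange(&reclaiming, 0);

	/* Only the thread that won the race should ever get here. */
	assert(was == 1);
	(void)was;
}
```

In add_to_object_cache() the work would then be allocated and queued only
when the try succeeds, so a losing thread neither allocates nor queues
anything.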

Thanks,

Kazutaka