[sheepdog] [PATCH v3 5/6] sheep: cache vnode_info when doing recovery

Liu Yuan namei.unix at gmail.com
Tue May 20 09:24:35 CEST 2014


On Mon, May 19, 2014 at 03:11:14PM +0800, Robin Dong wrote:
> From: Robin Dong <sanbai at taobao.com>
> 
> When sheepdog does recovery on some low-performance machines, CPU usage is
> very high. After using perf tools to find the performance hot spots in the
> sheep daemon, we found that the alloc_vnode_info() function costs lots
> of CPU cycles because rollback_vnode_info() rebuilds the vnode_info
> by calling alloc_vnode_info() too frequently.
> 
> The solution is to cache the result of alloc_vnode_info() for a specific
> 'epoch' and 'nr_nodes' in the recovery context.
> 
> Signed-off-by: Robin Dong <sanbai at taobao.com>
> ---
>  sheep/group.c      | 12 ++++++++++++
>  sheep/recovery.c   | 46 ++++++++++++++++++++++++++++++++++++++++++----
>  sheep/sheep_priv.h |  4 ++++
>  3 files changed, 58 insertions(+), 4 deletions(-)
> 
> diff --git a/sheep/group.c b/sheep/group.c
> index 63e9ab9..360771d 100644
> --- a/sheep/group.c
> +++ b/sheep/group.c
> @@ -175,6 +175,18 @@ struct vnode_info *get_vnode_info_epoch(uint32_t epoch,
>  	return alloc_vnode_info(&nroot);
>  }
>  
> +int get_nodes_epoch(uint32_t epoch, struct vnode_info *cur_vinfo,
> +		    struct sd_node *nodes, int len)
> +{
> +	int nr_nodes;
> +
> +	nr_nodes = epoch_log_read(epoch, nodes, len);
> +	if (nr_nodes < 0)
> +		nr_nodes = epoch_log_read_remote(epoch, nodes, len,
> +						 NULL, cur_vinfo);
> +	return nr_nodes;
> +}
> +
>  int local_get_node_list(const struct sd_req *req, struct sd_rsp *rsp,
>  			void *data)
>  {
> diff --git a/sheep/recovery.c b/sheep/recovery.c
> index 6008a0b..3616f0a 100644
> --- a/sheep/recovery.c
> +++ b/sheep/recovery.c
> @@ -71,6 +71,10 @@ struct recovery_info {
>  
>  	struct vnode_info *old_vinfo;
>  	struct vnode_info *cur_vinfo;
> +
> +	int max_epoch;
> +	struct vnode_info **vinfo_array;
> +	struct sd_mutex vinfo_lock;
>  };
>  
>  static struct recovery_info *next_rinfo;
> @@ -97,23 +101,44 @@ static inline bool node_is_gateway_only(void)
>  	return sys->this_node.nr_vnodes == 0;
>  }
>  
> +static inline int vinfo_idx(uint32_t epoch, int nr_nodes)
> +{
> +	return epoch * SD_MAX_NODES + nr_nodes;
> +}
> +
>  static struct vnode_info *rollback_vnode_info(uint32_t *epoch,
>  					      struct vnode_info *cur)
>  {
> -	struct vnode_info *vinfo;
> +	struct recovery_info *rinfo = main_thread_get(current_rinfo);

You shouldn't call main_thread_get() in a worker thread. We have an assertion
there, since current_rinfo is not thread-safe.
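
One possible way around this, as a rough sketch only: let the main thread read
the recovery context and hand the pointer to the worker through the work item,
with the cache protected by a mutex (like the vinfo_lock the patch adds). All
names below (rinfo_sketch, work_sketch, cached_vinfo, demo_shared_cache) and
the small array bounds are hypothetical stand-ins, not sheepdog's real
structures or API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_MAX_EPOCH 8	/* hypothetical bounds, only for this sketch */
#define SK_MAX_NODES 4

/* Stand-in for struct vnode_info. */
struct vinfo_sketch {
	uint32_t epoch;
	int nr_nodes;
};

/* Stand-in for the recovery context that owns the cache. */
struct rinfo_sketch {
	pthread_mutex_t lock;
	struct vinfo_sketch *cache[SK_MAX_EPOCH * SK_MAX_NODES];
	int nr_builds;		/* counts the expensive rebuilds */
};

/* Work item: the main thread fills in rinfo before queuing the work. */
struct work_sketch {
	struct rinfo_sketch *rinfo;
	uint32_t epoch;
	int nr_nodes;
	struct vinfo_sketch *result;
};

/* Look up (epoch, nr_nodes) in the cache, building the entry on a miss. */
static struct vinfo_sketch *cached_vinfo(struct rinfo_sketch *r,
					 uint32_t epoch, int nr_nodes)
{
	struct vinfo_sketch *v;
	size_t idx;

	assert(epoch < SK_MAX_EPOCH);
	assert(nr_nodes >= 0 && nr_nodes < SK_MAX_NODES);
	idx = (size_t)epoch * SK_MAX_NODES + (size_t)nr_nodes;

	pthread_mutex_lock(&r->lock);
	v = r->cache[idx];
	if (!v) {
		/* Stands in for the costly alloc_vnode_info() call. */
		v = malloc(sizeof(*v));
		assert(v);
		v->epoch = epoch;
		v->nr_nodes = nr_nodes;
		r->cache[idx] = v;
		r->nr_builds++;
	}
	pthread_mutex_unlock(&r->lock);
	return v;
}

/* Worker: uses only what the work item carries, no main-thread state. */
static void *worker_fn(void *arg)
{
	struct work_sketch *w = arg;

	w->result = cached_vinfo(w->rinfo, w->epoch, w->nr_nodes);
	return NULL;
}

/* Runs two workers on the same key and returns the number of rebuilds. */
int demo_shared_cache(void)
{
	struct rinfo_sketch r = { .nr_builds = 0 };
	struct work_sketch w[2];
	pthread_t t[2];
	int i, builds;

	pthread_mutex_init(&r.lock, NULL);
	for (i = 0; i < 2; i++) {
		/* "Main thread" captures the context pointer here. */
		w[i] = (struct work_sketch){ .rinfo = &r, .epoch = 3,
					     .nr_nodes = 2 };
		pthread_create(&t[i], NULL, worker_fn, &w[i]);
	}
	for (i = 0; i < 2; i++)
		pthread_join(t[i], NULL);

	assert(w[0].result == w[1].result);
	builds = r.nr_builds;

	for (i = 0; i < SK_MAX_EPOCH * SK_MAX_NODES; i++)
		free(r.cache[i]);
	pthread_mutex_destroy(&r.lock);
	return builds;
}
```

Here demo_shared_cache() returns 1: both workers ask for the same
(epoch, nr_nodes) pair, so only the first one pays for the rebuild. Whether
the real work struct can carry the rinfo pointer safely depends on its
lifetime rules, which this sketch does not model.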

Thanks
Yuan


