[sheepdog] [PATCH v8 2/3] collie: let collie use sockfd cache

Hitoshi Mitake mitake.hitoshi at gmail.com
Wed Jul 31 09:48:45 CEST 2013


At Wed, 31 Jul 2013 15:46:36 +0800,
Liu Yuan wrote:
> 
> On Wed, Jul 31, 2013 at 04:41:27PM +0900, Hitoshi Mitake wrote:
> > At Wed, 31 Jul 2013 15:34:01 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Wed, Jul 31, 2013 at 04:11:47PM +0900, Hitoshi Mitake wrote:
> > > > From: Hitoshi Mitake <mitake.hitoshi at gmail.com>
> > > > 
> > > > This patch lets collie use sockfd cache. Because the sockfd caching
> > > > mechanism is now in libsheepdog, collie can enjoy it on some
> > > > subcommands which issue many requests to sheep.
> > > > 
> > > > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > > > ---
> > > > v8:
> > > >  - fix an unsuitable error message of collie
> > > > 
> > > > v5:
> > > >  - trivial cleaning. remove an old prototype of function
> > > > 
> > > > v4:
> > > >  - remove the redundant function, collie_exec_req_nid()
> > > > 
> > > >  collie/collie.c |    6 ++++++
> > > >  collie/common.c |   24 ++++++++++++++----------
> > > >  2 files changed, 20 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/collie/collie.c b/collie/collie.c
> > > > index 2f8e9d0..d6eef73 100644
> > > > --- a/collie/collie.c
> > > > +++ b/collie/collie.c
> > > > @@ -17,6 +17,7 @@
> > > >  #include "sheep.h"
> > > >  #include "collie.h"
> > > >  #include "util.h"
> > > > +#include "sockfd_cache.h"
> > > >  
> > > >  #define EPOLL_SIZE 4096
> > > >  
> > > > @@ -418,6 +419,11 @@ int main(int argc, char **argv)
> > > >  		exit(EXIT_SYSFAIL);
> > > >  	}
> > > >  
> > > > +	if (sockfd_init()) {
> > > > +		fprintf(stderr, "sockfd_init() failed\n");
> > > > +		exit(EXIT_SYSFAIL);
> > > > +	}
> > > > +
> > > >  	ret = command_fn(argc, argv);
> > > >  	if (ret == EXIT_USAGE)
> > > >  		subcommand_usage(argv[1], argv[2], EXIT_USAGE);
> > > > diff --git a/collie/common.c b/collie/common.c
> > > > index b2b9fb1..2aca7bd 100644
> > > > --- a/collie/common.c
> > > > +++ b/collie/common.c
> > > > @@ -11,6 +11,7 @@
> > > >  
> > > >  #include "collie.h"
> > > >  #include "sha1.h"
> > > > +#include "sockfd_cache.h"
> > > >  
> > > >  char *size_to_str(uint64_t _size, char *str, int str_size)
> > > >  {
> > > > @@ -169,13 +170,18 @@ out:
> > > >  	return ret;
> > > >  }
> > > >  
> > > > -int collie_exec_req(const char *host, int port, struct sd_req *hdr, void *data)
> > > > +int collie_exec_req(const char *host, int port, struct sd_req *hdr, void *buf)
> > > >  {
> > > > -	int fd, ret;
> > > > -	struct sd_rsp *rsp = (struct sd_rsp *)hdr;
> > > > +	struct node_id nid;
> > > > +	struct sockfd *sfd;
> > > > +	int ret;
> > > > +
> > > > +	memset(&nid, 0, sizeof(nid));
> > > > +	str_to_addr(host, nid.addr);
> > > > +	nid.port = port;
> > > >  
> > > > -	fd = connect_to(host, port);
> > > > -	if (fd < 0)
> > > > +	sfd = sockfd_cache_get(&nid);
> > > > +	if (!sfd)
> > > >  		return -1;
> > > >  
> > > >  	/*
> > > > @@ -183,13 +189,11 @@ int collie_exec_req(const char *host, int port, struct sd_req *hdr, void *data)
> > > >  	 * 1. We can't get the newest epoch
> > > >  	 * 2. Some operations might take unexpected long time
> > > >  	 */
> > > > -	ret = exec_req(fd, hdr, data, NULL, 0, UINT32_MAX);
> > > > -	close(fd);
> > > > +	ret = exec_req(sfd->fd, hdr, buf, NULL, 0, UINT32_MAX);
> > > >  
> > > > -	if (ret)
> > > > -		return -1;
> > > > +	sockfd_cache_put(&nid, sfd);
> > > >  
> > > > -	return rsp->result;
> > > > +	return ret;
> > > 
> > > Any reason you return 'ret' instead of old rsp->result? This kind of change is
> > > quite hard to catch, so please write it in the commit log if you change the old
> > > semantics of any function. I think we should return rsp->result to the caller.
> > 
> > Sorry for that, I had to add a description for this change.
> > 
> > But returning rsp->result is a bug. Because many callers of
> > collie_exec_req() detects an error of issuing request by checking the
> > returned value is negative or not. And the code which represents the
> > result of request cannot be negative. So I think returning the "ret"
> > is a correct behavior.
> > 
> > Should I make a new patch for this change in the next series?
> 
> Better you do. And one caller of it actually check it as ->result.
> 
> collie/node.c:          ret = collie_exec_req(host, sd_nodes[i].nid.port, &req, NULL);
> collie/node.c-          if (ret == SD_RES_NODE_IN_RECOVERY) {
> 
> So we need one patch 
> 1. teach it to return 0 for success and -1 for error.
> 2. fix node.c

OK, I'll allocate one new patch for doing this change.

Thanks,
Hitoshi



More information about the sheepdog mailing list