[sheepdog] [PATCH RFC 5/5] sheep: retry when connect() or accept() fails with EMFILE

Hitoshi Mitake mitake.hitoshi at gmail.com
Thu Jul 18 14:55:44 CEST 2013


At Thu, 18 Jul 2013 18:50:21 +0800,
Liu Yuan wrote:
> 
> On Thu, Jul 18, 2013 at 07:43:00PM +0900, Hitoshi Mitake wrote:
> > At Thu, 18 Jul 2013 17:36:11 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Thu, Jul 18, 2013 at 05:47:16PM +0900, Hitoshi Mitake wrote:
> > > > At Thu, 18 Jul 2013 13:36:30 +0800,
> > > > Liu Yuan wrote:
> > > > > 
> > > > > On Thu, Jul 18, 2013 at 02:33:20PM +0900, Hitoshi Mitake wrote:
> > > > > > At Thu, 18 Jul 2013 10:26:22 +0800,
> > > > > > Liu Yuan wrote:
> > > > > > > 
> > > > > > > On Thu, Jul 18, 2013 at 12:46:24AM +0900, Hitoshi Mitake wrote:
> > > > > > > > At Wed, 17 Jul 2013 16:56:10 +0800,
> > > > > > > > Liu Yuan wrote:
> > > > > > > > > 
> > > > > > > > > On Fri, Jul 12, 2013 at 10:54:26AM +0900, Hitoshi Mitake wrote:
> > > > > > > > > > This patch adds calling of shrink_sockfd() after connect() and
> > > > > > > > > > accept() when they return EMFILE.
> > > > > > > > > > 
> > > > > > > > > > These retries can be invoked twice at a maximum. This policy must be
> > > > > > > > > > improved in the future.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Hitoshi Mitake <mitake.hitoshi at lab.ntt.co.jp>
> > > > > > > > > > ---
> > > > > > > > > >  sheep/request.c      |    7 +++++++
> > > > > > > > > >  sheep/sockfd_cache.c |    6 ++++++
> > > > > > > > > >  2 files changed, 13 insertions(+)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/sheep/request.c b/sheep/request.c
> > > > > > > > > > index 3b43c76..8f3f840 100644
> > > > > > > > > > --- a/sheep/request.c
> > > > > > > > > > +++ b/sheep/request.c
> > > > > > > > > > @@ -828,8 +828,15 @@ static void listen_handler(int listen_fd, int events, void *data)
> > > > > > > > > >  	}
> > > > > > > > > >  
> > > > > > > > > >  	namesize = sizeof(from);
> > > > > > > > > > +
> > > > > > > > > > +	int retry = 2;
> > > > > > > > > > +retry_accept:
> > > > > > > > > >  	fd = accept(listen_fd, (struct sockaddr *)&from, &namesize);
> > > > > > > > > >  	if (fd < 0) {
> > > > > > > > > > +		if (errno == EMFILE && retry--) {
> > > > > > > > > > +			if (shrink_sockfd())
> > > > > > > > > > +				goto retry_accept;
> > > > > > > > > > +		}
> > > > > > > > > >  		sd_eprintf("failed to accept a new connection: %m");
> > > > > > > > > >  		return;
> > > > > > > > > 
> > > > > > > > > Better use xaccept() to handle retry accept internally and document why 2.
> > > > > > > > 
> > > > > > > > I don't have a reason for 2 at the moment. I'd like to profile sheep
> > > > > > > > with tools like LTTng or perf and look for a better retry count later.
> > > > > > > 
> > > > > > > No, I think we should try forever for EMFILE until success.
> > > > > > 
> > > > > > Retrying forever is too aggressive, because in theory EMFILE can also
> > > > > > be caused by too many open files. I think a threshold is required.
> > > > > > 
> > > > > 
> > > > > Why do we need a threshold? I don't see the point. Please explain your theory.
> > > > > 
> > > > 
> > > > In theory, other subsystems than sockfd can exhaust file
> > > > descriptors. But... it would be a case which we don't have to
> > > > consider, sorry.
> > > > 
> > > > How about this: basically, wrappers like xopen() retry until they
> > > > succeed. When the sockfd cache has no unused cached fds left to
> > > > close, they return EMFILE.
> > > > 
> > > 
> > > Currently only the sockfd cache needs long-lived fds (except our main
> > > loop's epoll_fd and listen_fd with qemu).
> > > 
> > > So I think we can loop until success.
> > > 
> > > sd_open() {
> > > retry:
> > > 	fd = open();
> > > 	if (fd < 0 && errno == EMFILE) {
> > > 		sockfd_shrink(); /* it either succeeds or fails to reclaim an fd */
> > > 		goto retry;
> > > 	}
> > > 	return fd;
> > > }
> > > 
> > > I think at some point sockfd_shrink() is bound to reclaim an fd. We can
> > > sleep if sockfd_shrink() fails, for better scheduling.
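
A minimal C sketch of the loop described above: retry on EMFILE, try to
reclaim a cached fd, and sleep briefly when nothing could be reclaimed.
The callback types, retry_on_emfile(), and the fake_* helpers are
hypothetical names used only for illustration; they are not sheepdog's
API, and the 1 ms back-off is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical hooks, not sheepdog's real API. */
typedef int (*fd_source_fn)(void *ctx);   /* returns an fd, or -1 with errno set */
typedef bool (*fd_reclaim_fn)(void *ctx); /* true if a cached fd was closed */

/*
 * Call get_fd() until it succeeds or fails with something other than
 * EMFILE. On EMFILE, ask reclaim() to free a descriptor; if it cannot,
 * sleep briefly so holders of cached fds get a chance to release them.
 */
static int retry_on_emfile(fd_source_fn get_fd, fd_reclaim_fn reclaim, void *ctx)
{
	assert(get_fd != NULL && reclaim != NULL);

	for (;;) {
		int fd = get_fd(ctx);

		if (fd >= 0 || errno != EMFILE)
			return fd;
		if (!reclaim(ctx))
			usleep(1000); /* arbitrary 1 ms back-off */
	}
}

/* Toy stand-ins used only to exercise the loop. */
struct fake_cache {
	int cached_fds; /* descriptors held by the pretend cache */
	int attempts;   /* how many times fake_get_fd() was called */
};

static int fake_get_fd(void *ctx)
{
	struct fake_cache *c = ctx;

	c->attempts++;
	if (c->cached_fds > 0) { /* pretend the fd table is full */
		errno = EMFILE;
		return -1;
	}
	return 42; /* pretend descriptor */
}

static bool fake_reclaim(void *ctx)
{
	struct fake_cache *c = ctx;

	if (c->cached_fds == 0)
		return false;
	c->cached_fds--;
	return true;
}
```

A real wrapper would pass a function that calls open(), connect() or
accept() as get_fd, and the sockfd cache's shrink routine as reclaim.
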
> > 
> > OK. It seems that letting sheep_put_sockfd() notify a sleeping
> > sockfd_shrink() when an fd is closed would be a good approach.
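
One possible shape for that notification, sketched with a pthread
condition variable. Everything here (struct fd_waitq and the fd_waitq_*
functions) is hypothetical and only illustrates the idea; the thread
does not say how the sleeping side would actually be woken up.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wait queue, not part of sheepdog. */
struct fd_waitq {
	pthread_mutex_t lock;
	pthread_cond_t freed;
	unsigned long nr_closed; /* total number of fds closed so far */
};

#define FD_WAITQ_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Take a snapshot of the close counter before trying to shrink. */
static unsigned long fd_waitq_snapshot(struct fd_waitq *q)
{
	unsigned long n;

	assert(q != NULL);
	pthread_mutex_lock(&q->lock);
	n = q->nr_closed;
	pthread_mutex_unlock(&q->lock);
	return n;
}

/* Called by whoever closes a cached fd (the role of sheep_put_sockfd()). */
static void fd_waitq_notify(struct fd_waitq *q)
{
	pthread_mutex_lock(&q->lock);
	q->nr_closed++;
	pthread_cond_broadcast(&q->freed);
	pthread_mutex_unlock(&q->lock);
}

/* Block until at least one fd has been closed since the snapshot `seen`. */
static void fd_waitq_wait(struct fd_waitq *q, unsigned long seen)
{
	pthread_mutex_lock(&q->lock);
	while (q->nr_closed == seen)
		pthread_cond_wait(&q->freed, &q->lock);
	pthread_mutex_unlock(&q->lock);
}
```

Comparing against a snapshot of the counter, rather than waiting
unconditionally, avoids missing a close that happens between the failed
shrink attempt and the wait.
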
> > 
> > > 
> > > > > > > 
> > > > > > > > 
> > > > > > > > We shouldn't use x prefix for retrying versions of socket producing
> > > > > > > > functions. Because many x prefixed functions are in libsheepdog and
> > > > > > > > the retrying functions cannot be moved to libsheepdog. They depend on
> > > > > > > > sockfd.
> > > > > > > 
> > > > > > > Then sd_open() is a much better name.
> > > > > > 
> > > > > > I'll move sockfd to libsheepdog. So using x prefix is okay.
> > > > > 
> > > > > Perhaps, sockfd cache can't be used by others because of IO NIC code and node
> > > > > management code.
> > > > 
> > > > It would be difficult, as you say. If it is impossible, I'll implement
> > > > a minimal sockfd caching mechanism for collie.
> > > > 
> > > > BTW, I think employing a unix domain socket between collie and sheep
> > > > would be beneficial when collie issues many requests. What do you
> > > > think?
> > > 
> > > No, collie is expected to connect to other remote sheep too. A unix domain
> > > socket won't speed up collie operations much for local connections. A sockfd
> > > cache for collie really does.
> > 
> > Of course collie talks with remote sheeps. The purpose of this
> > optimization is similar to using unix domain socket between qemu and
> > sheep.
> > 
> > The scheme I'm thinking is like this: 
> > 0. implement a new operation like SD_OP_UDS_PATH.
> > 1. collie issues this request if the IP address of the target sheep is
> >    the same as its own address.
> > 2. sheep returns the path of its unix domain socket.
> > 3. collie connects to sheep via the unix domain socket.
> > 
> > This optimization is invisible to users. They don't have to give
> > collie the path.
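
The client side of steps 1 and 3 could look roughly like the C sketch
below. is_local_sheep() and connect_uds() are hypothetical helpers, the
address comparison is a plain string compare for brevity, and the
SD_OP_UDS_PATH request of step 2 is not shown because its format is not
defined yet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Step 1: only ask for the socket path when the target sheep is local. */
static int is_local_sheep(const char *target_addr, const char *own_addr)
{
	return strcmp(target_addr, own_addr) == 0;
}

/*
 * Step 3: connect to the unix domain socket path that sheep returned in
 * step 2. Returns a connected fd, or -1 with errno set.
 */
static int connect_uds(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	assert(path != NULL);
	if (strlen(path) >= sizeof(addr.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path); /* length checked above */

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		int saved = errno; /* close() may overwrite errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

If connect_uds() fails, collie could simply fall back to the usual TCP
connection, which keeps the optimization invisible to users.
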
> 
> My concern is that the speedup is very limited because collie operation is IO
> bound, so it is not worth using a unix domain socket. But I think a sockfd
> cache will be a good speedup for operations like 'vdi check' and 'cluster snapshot'.
> 
> So I'd like you to try sockfd cache instead of unix domain for collie.

Yes, using a unix domain socket is just one of the ideas for
optimization. sockfd caching for collie has the highest priority.

Thanks,
Hitoshi
