[sheepdog] [PATCH RFC 0/5] sockfd shrinking mechanism for handling EMFILE

Hitoshi Mitake mitake.hitoshi at gmail.com
Sun Jul 14 09:41:31 CEST 2013


At Sun, 14 Jul 2013 14:42:13 +0800,
Liu Yuan wrote:
> 
> On Sun, Jul 14, 2013 at 03:30:04PM +0900, Hitoshi Mitake wrote:
> > At Sun, 14 Jul 2013 13:43:52 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Fri, Jul 12, 2013 at 10:54:21AM +0900, Hitoshi Mitake wrote:
> > > > This patchset implements a mechanism for shrinking cached fds in
> > > > sockfd subsystem for handling EMFILE gracefully. With this mechanism,
> > > > sheep can retry creating a new fd after it faces EMFILE. In this
> > > > patchset, some invocations of retrying are inserted into various
> > > > operations which can create new fds.
> > > >
> > > 
> > > I think we should try to shrink at the time sockfd_cache_put() is called too.
> > > This is indirect shrink and shrink at the time of EMFILE is direct shrink. I
> > > think indirect shrink should be implemented first. This will reduce the long
> > > connections for idle sheep.
> > 
> > Is the indirect shrinking useful? Of course, TCP ports are globally
> 
> Yes, to me. Reasons:
>  1. we don't want to take much system resources (long connection) for idle sheep
>  2. we want to maximize the performance by keeping long connections while there
>     are pending requests.

Now I can understand the importance of indirect shrinking. Thanks for
your explanation.

> 
> > shared resource on one machine so sheep shouldn't use them too
> > much. But the use of TCP ports can be limited via ulimit. So I
> > think the direct shrink is more important. EMFILE actually causes
> > problems on typical sheepdog deployments.
> >
> 
> Can ulimit change the open file number of sheep on the fly? What do you mean by
> 'typical sheepdog deployment'? At least, as far as I know, all production
> sheep clusters are deployed with a very large open files limit to mitigate
> EMFILE problems. So for these kinds of deployments, indirect shrinking would be
> very useful to reduce the system resources that are taken by long connections.
> Even if we can handle EMFILE gracefully, I don't think it is good to trigger
> direct shrink on every call of open() and hurt performance. So setting a
> relatively large open files limit will remain valid.

The performance degradation should be rare, because direct shrinking
is invoked only when sheep faces EMFILE.
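
To illustrate why the cost is paid only on the failure path, here is a
minimal sketch of the retry pattern. It is not the code from this
patchset: cache_put(), shrink_one_cached_fd(), open_retry_on_emfile()
and set_nofile_soft_limit() are hypothetical names, and the fixed-size
array is only a stand-in for the real sockfd cache.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/resource.h>
#include <unistd.h>

#define CACHE_SLOTS 8

/* Hypothetical stand-in for the sockfd cache: idle fds that may be closed. */
static int cached_fds[CACHE_SLOTS];
static int nr_cached;

/* Put an idle fd into the cache; returns false when the cache is full. */
static bool cache_put(int fd)
{
	assert(fd >= 0);
	if (nr_cached >= CACHE_SLOTS)
		return false;
	cached_fds[nr_cached++] = fd;
	return true;
}

/* Hypothetical shrinker: close one cached fd; false if nothing is cached. */
static bool shrink_one_cached_fd(void)
{
	if (nr_cached == 0)
		return false;
	close(cached_fds[--nr_cached]);
	return true;
}

/*
 * open() wrapper for direct shrinking. The shrinker runs only after
 * open() has failed with EMFILE, so the normal path costs nothing extra.
 */
static int open_retry_on_emfile(const char *path, int flags)
{
	for (;;) {
		int fd = open(path, flags);

		if (fd >= 0 || errno != EMFILE)
			return fd;
		if (!shrink_one_cached_fd()) {
			errno = EMFILE;	/* nothing left to free, give up */
			return -1;
		}
	}
}

/* Demo helper: lower the soft RLIMIT_NOFILE so EMFILE is easy to provoke. */
static int set_nofile_soft_limit(rlim_t n)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;
	rl.rlim_cur = n;
	return setrlimit(RLIMIT_NOFILE, &rl);
}
```

With one fd in the cache and the soft limit lowered to just above it, a
plain open() fails with EMFILE while open_retry_on_emfile() succeeds by
closing the cached fd first. Once the cache is empty, the wrapper
returns -1 with errno set to EMFILE, the same as a plain open().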

I agree that indirect shrinking is also important. But the current
direct shrinking implementation has no disadvantages, and
shrink_sockfd() can also be used by indirect shrinking, which will be
implemented in the future.

I believe there is no problem with implementing direct shrinking first.

Thanks,
Hitoshi


