[sheepdog] [PATCH] sheep: check resource limit at startup

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Tue Jan 29 05:13:47 CET 2013


At Tue, 29 Jan 2013 10:34:00 +0800,
Liu Yuan wrote:
> 
> On 01/29/2013 08:35 AM, MORITA Kazutaka wrote:
> > At Mon, 28 Jan 2013 14:47:41 +0800,
> > Liu Yuan wrote:
> >>
> >> From: Liu Yuan <tailai.ly at taobao.com>
> >>
> >> The sheep daemon is FD hungry and cannot survive EMFILE.
> >> 1024 is the default NOFILE limit on most distributions, which is very
> >> dangerous for running a Sheepdog cluster. Let's give a warning about this.
> >>
> >> Signed-off-by: Liu Yuan <tailai.ly at taobao.com>
> >> ---
> >>  sheep/sheep.c |   29 +++++++++++++++++++++++++++++
> >>  1 file changed, 29 insertions(+)
> >>
> >> diff --git a/sheep/sheep.c b/sheep/sheep.c
> >> index f2c7bcd..5c5bcc5 100644
> >> --- a/sheep/sheep.c
> >> +++ b/sheep/sheep.c
> >> @@ -26,6 +26,7 @@
> >>  #include <fcntl.h>
> >>  #include <errno.h>
> >>  #include <arpa/inet.h>
> >> +#include <sys/resource.h>
> >>  
> >>  #include "sheep_priv.h"
> >>  #include "trace/trace.h"
> >> @@ -385,6 +386,33 @@ static int init_work_queues(void)
> >>  	return 0;
> >>  }
> >>  
> >> +#define SD_RLIM_NOFILE 65536
> > 
> > Please explain why you think 65536 is enough.  Is it a large enough
> > number even when many VMs send IO requests at the same time?  I've hit
> > the FD limit when doing the test.
> >
> 
> This is a random value. How about 'unlimited'? Then we don't need to
> tweak it later when it is not big enough.

We cannot change the value to unlimited, can we?

  # ulimit -n 
  1024
  # ulimit -n 4096
  # ulimit -n
  4096
  # ulimit -n unlimited
  -bash: ulimit: open files: cannot modify limit: Operation not permitted
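
As a rough C counterpart to the shell session above: a process can raise its
soft NOFILE limit up to the hard limit without privileges, while on Linux
asking for more than the fs.nr_open sysctl fails with EPERM even for root.
The sketch below is not the code from the patch. The names
raise_nofile_limit and WANTED_NOFILE are made up for illustration, and
65536 is simply the SD_RLIM_NOFILE value quoted earlier.

```c
#include <sys/resource.h>

/* Hypothetical target, borrowed from SD_RLIM_NOFILE in the quoted patch. */
#define WANTED_NOFILE 65536

/*
 * Raise the soft RLIMIT_NOFILE towards `wanted`, but never above the hard
 * limit. Returns 0 on success and stores the soft limit now in effect in
 * *result; returns -1 if getrlimit() or setrlimit() fails.
 */
static int raise_nofile_limit(rlim_t wanted, rlim_t *result)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return -1;

	if (rl.rlim_cur < wanted) {
		rlim_t target = wanted;

		/* Clamp to the hard limit; going beyond it needs privileges. */
		if (rl.rlim_max != RLIM_INFINITY && target > rl.rlim_max)
			target = rl.rlim_max;

		rl.rlim_cur = target;
		if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
			return -1;
	}

	*result = rl.rlim_cur;
	return 0;
}
```

A caller could run raise_nofile_limit(WANTED_NOFILE, &got) at startup and
print a warning when got is still smaller than the value it asked for.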

> 
> 
> > BTW, is it difficult to modify sheepdog so that it doesn't use so many
> > FDs?
> > 
> 
> I think yes. Currently the biggest users of FDs are the sockfd cache and
> the pusher's push threads, which are performance critical. If we limit the
> FDs to some ceiling, then I guess we'll lose performance.

Exceeding the RLIMIT_NOFILE value leads to EIO in VMs, doesn't it?  If
so, it should definitely be avoided.  Is it also difficult to limit the
number of sockfds in use according to the current RLIMIT_NOFILE?  If we
can do that, setting RLIMIT_NOFILE to a large value doesn't affect
sheepdog's performance.
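
To make the idea concrete, here is a minimal sketch of deriving a sockfd
ceiling from the current soft limit. It is not taken from the sheepdog
source. The names sockfd_budget_from, sockfd_budget and FD_RESERVED are
hypothetical, and the reserve of 256 descriptors is an arbitrary
placeholder for whatever the daemon needs outside the socket cache.

```c
#include <sys/resource.h>

/* Hypothetical reserve for non-socket descriptors (objects, logs, epoll). */
#define FD_RESERVED 256

/*
 * Descriptors left for a socket cache once `reserved` is set aside.
 * Returns 0 when the soft limit does not even cover the reserve.
 */
static rlim_t sockfd_budget_from(rlim_t soft_limit, rlim_t reserved)
{
	return soft_limit > reserved ? soft_limit - reserved : 0;
}

/*
 * Same calculation against the soft RLIMIT_NOFILE of the running process.
 * Returns 0 if the limit cannot be read.
 */
static rlim_t sockfd_budget(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
		return 0;

	return sockfd_budget_from(rl.rlim_cur, FD_RESERVED);
}
```

With a cap like this, a larger RLIMIT_NOFILE simply means a larger cache,
and a small limit degrades performance rather than producing EMFILE.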

Thanks,

Kazutaka


