At Fri, 11 Nov 2011 04:59:11 -0500,
Christoph Hellwig wrote:
>
> On Fri, Nov 11, 2011 at 04:10:00AM -0500, Christoph Hellwig wrote:
> > On Fri, Nov 11, 2011 at 06:06:16PM +0900, MORITA Kazutaka wrote:
> > > posix_fallocate() shows very poor performance if the underlying
> > > filesystem doesn't support fallocate() (e.g. ext3).  How about using
> > > fallocate() instead of posix_fallocate(), and if it returns
> > > EOPNOTSUPP, writing SD_DATA_OBJ_SIZE bytes with one pwrite() call?
> >
> > At least for the samba use case (which is preallocating in 1MB chunks
> > and then filling it with 64k chunks) even the dumb preallocation has
> > shown benefit for ext3.  I'll try to benchmark it soon and will report
> > the results to you.
>
> Numbers on my laptop with ext3 on the second dedicated test SSD,
> averaged over three runs (recreated fs each time, restarted sheepdog),
> all using
>
>     dd if=/dev/zero of=/dev/vdc bs=67108864 count=16 oflag=direct
>
> note that this is on a fairly old kernel, and I manually had to mount
> with -o barrier=1
>
> With pwrite to the last sectors:
>
> 52.9MB/s for the initial write
> 49.0MB/s for the rewrite
>
> With fallocate:
>
> 62.7MB/s for the initial write
> 54.4MB/s for the rewrite
>
> From this it seems even the dumb fallocate is a clear win, which matches
> the Samba observations.

I've also tried Sheepdog with the fallocate patch, but it was
intolerably slow in my environment.  My environment is:

 - Linux 2.6.32
 - glibc 2.11
 - 1TB SATA disk (write-cache is enabled)
 - ext3 (barrier=1)

To test the performance of posix_fallocate() on ext3, I wrote the
following program.

==
#define _XOPEN_SOURCE 600	/* for pwrite() and posix_fallocate() */
#include <stdio.h>
#include <string.h>		/* strcmp() */
#include <unistd.h>		/* pwrite(), close() */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>

#define BUF_SIZE (4 * 1024 * 1024)

/* Fill the first BUF_SIZE bytes with zeros using a single pwrite(). */
void do_pwrite(int fd)
{
	ssize_t ret;
	static char buf[BUF_SIZE];

	ret = pwrite(fd, buf, BUF_SIZE, 0);
	assert(ret == BUF_SIZE);
}

/* Reserve the first BUF_SIZE bytes with posix_fallocate(). */
void do_fallocate(int fd)
{
	int ret;

	ret = posix_fallocate(fd, 0, BUF_SIZE);
	assert(ret == 0);
}

int main(int argc, char *argv[])
{
	int fd;

	if (argc < 3) {
		printf("usage: %s [filename] (pwrite|fallocate)\n", argv[0]);
		return 1;
	}

	/* O_SYNC: every write must reach the disk before it returns. */
	fd = open(argv[1], O_SYNC | O_RDWR | O_CREAT | O_TRUNC, 0644);
	assert(fd >= 0);

	if (strcmp(argv[2], "pwrite") == 0)
		do_pwrite(fd);
	else if (strcmp(argv[2], "fallocate") == 0)
		do_fallocate(fd);

	close(fd);

	return 0;
}
==

The result was as follows:

$ time ./a.out temp pwrite

real    0m0.244s
user    0m0.000s
sys     0m0.008s

$ time ./a.out temp fallocate

real    0m43.050s
user    0m0.000s
sys     0m0.060s

I've confirmed similar results on other machines, too.  I guess
posix_fallocate() causes a severe performance problem when writes are
slow (here the file is opened with O_SYNC), because when fallocate() is
not available it falls back to calling pwrite() once for every ext3
block.

Thanks,

Kazutaka
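
P.S. For reference, the alternative proposed at the top of this thread
(try fallocate() first, and on EOPNOTSUPP write the whole object with
one pwrite()) might look roughly like the sketch below.  The helper name
prealloc_object() is made up for illustration and is not taken from the
Sheepdog tree, and the caller would pass SD_DATA_OBJ_SIZE as the size.
It assumes Linux with glibc, and the fallback path is only safe on a
freshly created file, since it overwrites the range with zeros.

```c
#define _GNU_SOURCE		/* exposes the Linux-specific fallocate() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve `size` bytes at the start of `fd`.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int prealloc_object(int fd, off_t size)
{
	char *zeros;
	off_t done = 0;
	int saved;

	assert(size > 0);

	/* Fast path: let the filesystem allocate the blocks itself. */
	if (fallocate(fd, 0, 0, size) == 0)
		return 0;
	if (errno != EOPNOTSUPP)
		return -1;

	/* Fallback (e.g. ext3): zero-fill the range with one large write. */
	zeros = calloc(1, size);
	if (!zeros)
		return -1;

	/* Normally a single pass; loop only to handle short writes. */
	while (done < size) {
		ssize_t n = pwrite(fd, zeros + done, size - done, done);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0) {
			saved = n < 0 ? errno : EIO;
			free(zeros);
			errno = saved;
			return -1;
		}
		done += n;
	}

	free(zeros);
	return 0;
}
```

Unlike the posix_fallocate() fallback, which touches every block
separately, this issues one large write when fallocate() is not
supported, which is the case the pwrite test above measures.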