On 12/28/2011 09:19 PM, Christoph Hellwig wrote: > On Sun, Dec 25, 2011 at 11:42:49PM +0800, Liu Yuan wrote: >> From: Liu Yuan <tailai.ly at taobao.com> >> >> All the objects(snap, trunk, data) in the farm is based on the >> operations of the sha1_file. >> >> sha1_file provide us some useful features: >> >> - Regardless of object type, all objects are all in deflated with zlib, >> and have a header that not only specifies their tag, but also size >> information about the data in the object. It's worth noting that the >> SHA1 hash that is used to name the object is always the hash of this >> _compressed_ object, not the original data. > > why? Compression is supposed to reduce the storage usage for snapshot objects. And I assume hashing the compressed content is faster than the original one. If not, any benefits? > >> >> - the general consistency of an object can always be tested independently >> of the contents or the type of the object: all objects can be validated >> by verifying that >> (a) their hashes match the content of the file and >> (b) the object successfully inflates to a stream of bytes that >> forms a sequence of <sha1_file_hdr> + <binary object data> > > These comments should not only be in the commit message but also in > the source code. > I have written it in doc/farm-internals.txt, and it looks a little bit redundant to include them in the files, isn't it? >> --- /dev/null >> +++ b/sheep/farm.h > > Any reason the header is not inside the farm/ subdirectory? > >> + return farm_obj_dir; >> +} >> + >> +static void fill_sha1_path(char *pathbuf, const unsigned char *sha1) >> +{ >> + int i; >> + for (i = 0; i < SHA1_LEN; i++) { >> + static char hex[] = "0123456789abcdef"; > > I'd rather have a define for the hex chars, and defined SHA1_LEN > as strlen(HEX_CHARS) which any recent compiler should optimize to a > constant expression. > Okay. >> + objdir = get_object_directory(); >> + len = strlen(objdir); >> + >> + /* '/' + sha1(2) + '/' + sha1(38) + '\0' */ >> + if (len + 43 > PATH_MAX) >> + panic("insanely long object directory %s", objdir); > > I'd rather check the path length during startup and refuse to start > sheep instead of aborting later on. Okay. >> + memcpy(buf, objdir, len); >> + buf[len] = '/'; >> + buf[len+3] = '/'; >> + buf[len+42] = '\0'; >> + fill_sha1_path(buf + len + 1, sha1); > > Can't we use the strbuf helpers here to make the code readable and > less fragile? > >> + setxattr(name, CNAME, &count, CSIZE, 0); > > error handling? Okay. >> + void *map; >> + >> + if (fd < 0) { >> + perror(filename); >> + return NULL; >> + } >> + if (fstat(fd, &st) < 0) { >> + close(fd); >> + return NULL; >> + } >> + map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0); >> + close(fd); >> + if (-1 == (int)(long)map) > > just use a > > if (map == MAP_FAILED) > > here. > Thanks for the tip. Thanks, Yuan |