[Sheepdog] [PATCH v4 03/12] farm: add sha1_file operations
Liu Yuan
namei.unix at gmail.com
Wed Dec 28 14:50:53 CET 2011
On 12/28/2011 09:19 PM, Christoph Hellwig wrote:
> On Sun, Dec 25, 2011 at 11:42:49PM +0800, Liu Yuan wrote:
>> From: Liu Yuan <tailai.ly at taobao.com>
>>
>> All the objects(snap, trunk, data) in the farm is based on the
>> operations of the sha1_file.
>>
>> sha1_file provide us some useful features:
>>
>> - Regardless of object type, all objects are all in deflated with zlib,
>> and have a header that not only specifies their tag, but also size
>> information about the data in the object. It's worth noting that the
>> SHA1 hash that is used to name the object is always the hash of this
>> _compressed_ object, not the original data.
>
> why?
Compression is supposed to reduce the storage usage for snapshot
objects. And I assume hashing the compressed content is faster than the
original one.
If not, any benefits?
>
>>
>> - the general consistency of an object can always be tested independently
>> of the contents or the type of the object: all objects can be validated
>> by verifying that
>> (a) their hashes match the content of the file and
>> (b) the object successfully inflates to a stream of bytes that
>> forms a sequence of <sha1_file_hdr> + <binary object data>
>
> These comments should not only be in the commit message but also in
> the source code.
>
I have written it in doc/farm-internals.txt, and it looks a little bit
redundant to include them in the files, isn't it?
>> --- /dev/null
>> +++ b/sheep/farm.h
>
> Any reason the header is not inside the farm/ subdirectory?
>
>> + return farm_obj_dir;
>> +}
>> +
>> +static void fill_sha1_path(char *pathbuf, const unsigned char *sha1)
>> +{
>> + int i;
>> + for (i = 0; i < SHA1_LEN; i++) {
>> + static char hex[] = "0123456789abcdef";
>
> I'd rather have a define for the hex chars, and defined SHA1_LEN
> as strlen(HEX_CHARS) which any recent compiler should optimize to a
> constant expression.
>
Okay.
>> + objdir = get_object_directory();
>> + len = strlen(objdir);
>> +
>> + /* '/' + sha1(2) + '/' + sha1(38) + '\0' */
>> + if (len + 43 > PATH_MAX)
>> + panic("insanely long object directory %s", objdir);
>
> I'd rather check the path length during startup and refuse to start
> sheep instead of aborting later on.
Okay.
>> + memcpy(buf, objdir, len);
>> + buf[len] = '/';
>> + buf[len+3] = '/';
>> + buf[len+42] = '\0';
>> + fill_sha1_path(buf + len + 1, sha1);
>
> Can't we use the strbuf helpers here to make the code readable and
> less fragile?
>
>> + setxattr(name, CNAME, &count, CSIZE, 0);
>
> error handling?
Okay.
>> + void *map;
>> +
>> + if (fd < 0) {
>> + perror(filename);
>> + return NULL;
>> + }
>> + if (fstat(fd, &st) < 0) {
>> + close(fd);
>> + return NULL;
>> + }
>> + map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
>> + close(fd);
>> + if (-1 == (int)(long)map)
>
> just use a
>
> if (map == MAP_FAILED)
>
> here.
>
Thanks for the tip.
Thanks,
Yuan
More information about the sheepdog
mailing list