[Sheepdog] [PATCH 2/5] sheep: add some candy helpers in util.c

Liu Yuan namei.unix at gmail.com
Thu Nov 17 04:03:45 CET 2011


On 11/16/2011 03:34 PM, MORITA Kazutaka wrote:

> At Wed, 16 Nov 2011 14:59:12 +0800,
> Liu Yuan wrote:
>>
>> On 11/16/2011 02:45 PM, MORITA Kazutaka wrote:
>>
>>> At Wed, 16 Nov 2011 13:58:26 +0800,
>>> Liu Yuan wrote:
>>>>
>>>> On 11/16/2011 01:50 PM, MORITA Kazutaka wrote:
>>>>
>>>>>>>
>>>>>>> We already have a hash function, fnv_64a_buf().  How about using it
>>>>>>> instead of introducing new one?
>>>>>>>
>>>>>>
>>>>>>
>>>>>> When I need a hash function that inputs an integer and outputs an
>>>>>> integer for specified bits, I looked at the fnv_64a_buf(), it
seems to
>>>>>> me it tries to hash an string to integer. So I think these hash
>>>>>> functions do a different thing, no?
>>>>>
>>>>> We already use it to calculate a hash value from a 64 bit object id:
>>>>>
>>>>>   fnv_64a_buf(&oid, sizeof(uint64_t), FNV1A_64_INIT);
>>>>>
>>>>> I guess this generates a less random value than hash_64(), though.
>>>>
>>>>
>>>> Umm, I didn't go over usage close enough, and this interface is more
>>>> named to handle string hash, at least to me.
>>>>
>>>> So even we use it to hash integer, we need to further bit operation to
>>>> remove extra bits. hash_long(hval, bits) has a nicer and more intuitive
>>>> interface to handle integer.
>>>
>>> static inline uint64_t hash_64(uint64_t val, unsigned int bits)
>>> {
>>>         uint64_t hash = fnv_64a_buf(&val, sizeof(uint64_t),
FNV1A_64_INIT);
>>>
>>>         return hash >> (64 - bits);
>>> }
>>>
>>> Isn't this good for you?
>>>
>>
>>
>> I am not against it.
>>
>>>>
>>>> How about coexisting these two? fnv_64a_buf() is supposed to handle
>>>> string hash and hash_long() is supposed to handle integer otherwise.
>>>
>>> If you need a better randomized function for integers, I'm not against
>>> introducing a new hash.  But otherwise, FNV hash looks enough to me.
>>>
>>
>>
>> I really need an hash function that distribute OID to different hash
>> lists, this is quite performance sensitive, since farm use it to lookup
>> object hash table lists for every write operation to mark object dirty.
>>
>> mind do a test to show how well it perform on OID hash? If distributed
>> randomly enough, I am happy to use it.
>
> FNV hash is one of the famous hash functions.  Although I don't know
> its mathematical background so much, I think it would work well.  Many
> software use it.  See also http://www.isthe.com/chongo/tech/comp/fnv/
>
> Of course, testing would be appreciated. :)
>
> Thanks,
>
> Kazutaka


I wrote a test for it, looks hash_long() outperformed FNV for integer.

result:
10 16 8 11 12 16 14 10 15 8 4 6 15 9 8 18 7 11 10 14 9 15 15 7 8 6 9 5
14 17 9 11 7 11 8 13 19 10 10 8 11 6 9 16 7 9 8 11 10 12 11 8 5 11 10 8
11 6 5 8 12 7 9 15 4 8 6 7 5 13 17 7 7 4 8 5 10 16 4 10 9 9 7 8 12 3 11
9 7 6 8 12 9 10 8 10 9 8 11 6 14 8 7 9 8 13 7 10 7 12 10 10 11 12 14 8
10 16 10 11 12 12 8 8 13 9 9 10 13 9 9 13 15 8 12 14 9 11 12 18 6 8 10 9
12 12 17 7 10 10 6 11 13 14 6 9 10 7 10 12 18 14 11 11 5 9 14 13 8 9 9 7
14 10 10 8 12 9 4 11 10 6 8 8 10 6 19 12 5 6 12 7 6 13 9 5 7 8 6 6 16 9
6 8 11 5 6 14 9 8 12 8 5 6 15 8 5 10 10 9 5 11 8 6 13 8 7 9 11 11 8 16
10 9 7 10 13 6 13 7 10 12 7 14 9 13 10 6 11 13 11 11 15 10 8 13 9 8 9 19
12 10 11 9 8 8 16 10 7 17 8 9 8 16 11 13 12 12 10 6 13 5 12 14 7 11 11
12 8 14 14 11 11 9 13 7 11 12 9 6 10 14 10 13 11 6 6 11 8 8 12 6 2 6 10
11 9 15 8 9 3 9 8 7 17 7 8 7 9 6 4 17 5 10 7 10 7 3 14 6 13 8 9 6 5 14 8
12 10 8 8 6 12 6 12 9 6 10 8 10 12 10 9 12 12 8 10 9 11 9 12 13 12 7 13
11 7 12 11 7 8 8 15 10 11 12 13 9 11 12 11 13 12 19 6 6 6 13 10 11 17 7
10 9 10 11 13 17 9 9 6 9 5 15 19 9 16 7 8 7 13 18 7 10 9 7 6 13 14 11 8
8 9 8 14 9 9 9 7 8 14 14 3 5 8 10 7 10 14 3 8 7 5 6 15 16 3 9 6 6 6 11
15 5 10 8 7 7 10 10 3 11 8 7 9 9 13 7 13 6 10 9 9 11 7 14 7 9 8 8 11 7
11 8 9 13 10 13 9 13 9 8 13 14 12 9 15 7 8 11 10 10 10 15 10 9 11 15 9
12 16 8 9 10 18 7 7 11 9 17 9 17 7 8 11 4 9 14 12 9 9 15 6 12 10 15 10
10 11 6 12 10 12 7 9 12 8 14 10 10 7 12 9 7 12 8 6 5 11 10 7 19 11 7 4
11 8 5 15 8 3 8 11 6 6 16 7 8 7 11 5 4 15 7 10 11 8 6 5 15 7 7 11 10 8 5
11 8 7 11 8 6 9 11 11 9 13 9 11 7 8 13 9 12 10 11 13 6 15 10 11 10 8 10
9 9 12 15 12 9 17 7 11 13 16 12 10 13 8 7 6 15 9 8 15 6 10 8 15 9 14 15
9 13 6 11 5 14 16 8 11 10 11 8 14 18 9 10 8 13 5 10 15 6 9 8 12 9 14 9 6
7 10 9 8 12 7 3 5 11 9 7 16 6 8 4 8 8 10 18 6 7 5 8 6 6 18 6 12 7 9 8 4
15 4 11 8 8 8 6 13 8 13 10 9 9 5 12 6 13 7 7 9 8 13 8 12 7 13 12 9 12 9
11 9 10 14 9 9 10 12 8 11 13 8 9 6 12 11 9 11 18 10 8 13 12 13 10 19 6 4
9 9 12 13 15 7 13 9 7 11 13 16 8 8 8 9 10 14 19 11 12 11 5 8 15 16 11 9
7 7 8 12 13 8 11 10 6 8 11 9 7 8 9 6 14 13 7 5 11 8 5 11 13 2 7 8 7 5 18
11 4 9 9 6 5 12 12 4 12 7 6 8 13 9 2 10 10 7 8 10 9 7 13 9 9 7 13 9 8 16
8 8 9 9 12 7 13 8 7 12 8 14 11 15 8 7 11 12 10 11 14 9 6 15 10 10 10 16
10 7 10 13 9 9 16 12 8 13 10 7 9 13 9 13 10 14 12 7 14 4 11 14 11 8 8 14
6 14 13 14 11 11 13 7 12 10 12 8 10 12 9 16 9 6 6 12 8 9 12 7 3 6 9 9 8
18 10 8 3 11 7 8 17 6 6 6 10 5 4 17 6 12 6 10 7 3 15 7 10 9 9 7 3 16 7
12 9 9 8 4 10 8 10 9 8 9 6 11 11 10 11 11 11 8 8 12 10 9 11 11 12 6 12
10 10 10 12 10 8 8 13 12 11 ************************************
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 12 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 10 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 10 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 12 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 9 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 12 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 12 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 12 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 11 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 11 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9 11 9
11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12

test.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdint.h>

static inline uint64_t hash_long(uint64_t val, unsigned int bits)
{
        uint64_t hash = val;

        /*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
        uint64_t n = hash;
        n <<= 18;
        hash -= n;
        n <<= 33;
        hash -= n;
        n <<= 3;
        hash += n;
        n <<= 3;
        hash -= n;
        n <<= 4;
        hash += n;
        n <<= 2;
        hash += n;

        /* High bits are more random, so use them. */
        return hash >> (64 - bits);
}

#define FNV1A_64_INIT ((uint64_t) 0xcbf29ce484222325ULL)

/*
 * 64 bit Fowler/Noll/Vo FNV-1a hash code
 */
static inline uint64_t fnv_64a_buf(void *buf, size_t len, uint64_t hval)
{
        unsigned char *bp = (unsigned char *) buf;
        unsigned char *be = bp + len;
        while (bp < be) {
                hval ^= (uint64_t) *bp++;
                hval += (hval << 1) + (hval << 4) + (hval << 5) +
                        (hval << 7) + (hval << 8) + (hval << 40);
        }
        return hval;
}

static inline uint64_t hash_64(uint64_t val, unsigned int bits)
{
        uint64_t hash = fnv_64a_buf(&val, sizeof(uint64_t), FNV1A_64_INIT);

        return hash >> (64 - bits);
}


int bucket[1024];
int main()
{
	uint64_t oid = 0x7c2b2500000001;
	int i;

	/* simuate objects for 40G store */
	for (i = 0; i < 10000; i++) {
		int hash = hash_64(oid, 10);
		bucket[hash]++;
		oid++;
	}

	for (i = 0; i < 1024; i++) {
		printf("%d ", bucket[i]);
		bucket[i] = 0;
	}

	printf("************************************\n");
	oid = 0x7c2b2500000001;
	for (i = 0; i < 10000; i++) {
		int hash = hash_long(oid, 10);
		bucket[hash]++;
		oid++;
	}

	for (i = 0; i < 1024; i++) {
		printf("%d ", bucket[i]);
	}
	printf("\n");
	return 0;
}





More information about the sheepdog mailing list