[Sheepdog] [PATCH 2/5] sheep: add some candy helpers in util.c

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Nov 17 04:57:49 CET 2011


At Thu, 17 Nov 2011 11:03:45 +0800,
Liu Yuan wrote:
> 
> On 11/16/2011 03:34 PM, MORITA Kazutaka wrote:
> 
> > At Wed, 16 Nov 2011 14:59:12 +0800,
> > Liu Yuan wrote:
> >>
> >> On 11/16/2011 02:45 PM, MORITA Kazutaka wrote:
> >>
> >>> At Wed, 16 Nov 2011 13:58:26 +0800,
> >>> Liu Yuan wrote:
> >>>>
> >>>> On 11/16/2011 01:50 PM, MORITA Kazutaka wrote:
> >>>>
> >>>>>>>
> >>>>>>> We already have a hash function, fnv_64a_buf().  How about using it
> >>>>>>> instead of introducing new one?
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> When I need a hash function that inputs an integer and outputs an
> >>>>>> integer for specified bits, I looked at the fnv_64a_buf(), it
> seems to
> >>>>>> me it tries to hash an string to integer. So I think these hash
> >>>>>> functions do a different thing, no?
> >>>>>
> >>>>> We already use it to calculate a hash value from a 64 bit object id:
> >>>>>
> >>>>>   fnv_64a_buf(&oid, sizeof(uint64_t), FNV1A_64_INIT);
> >>>>>
> >>>>> I guess this generates a less random value than hash_64(), though.
> >>>>
> >>>>
> >>>> Umm, I didn't go over usage close enough, and this interface is more
> >>>> named to handle string hash, at least to me.
> >>>>
> >>>> So even we use it to hash integer, we need to further bit operation to
> >>>> remove extra bits. hash_long(hval, bits) has a nicer and more intuitive
> >>>> interface to handle integer.
> >>>
> >>> static inline uint64_t hash_64(uint64_t val, unsigned int bits)
> >>> {
> >>>         uint64_t hash = fnv_64a_buf(&val, sizeof(uint64_t),
> FNV1A_64_INIT);
> >>>
> >>>         return hash >> (64 - bits);
> >>> }
> >>>
> >>> Isn't this good for you?
> >>>
> >>
> >>
> >> I am not against it.
> >>
> >>>>
> >>>> How about coexisting these two? fnv_64a_buf() is supposed to handle
> >>>> string hash and hash_long() is supposed to handle integer otherwise.
> >>>
> >>> If you need a better randomized function for integers, I'm not against
> >>> introducing a new hash.  But otherwise, FNV hash looks enough to me.
> >>>
> >>
> >>
> >> I really need an hash function that distribute OID to different hash
> >> lists, this is quite performance sensitive, since farm use it to lookup
> >> object hash table lists for every write operation to mark object dirty.
> >>
> >> mind do a test to show how well it perform on OID hash? If distributed
> >> randomly enough, I am happy to use it.
> >
> > FNV hash is one of the famous hash functions.  Although I don't know
> > its mathematical background so much, I think it would work well.  Many
> > software use it.  See also http://www.isthe.com/chongo/tech/comp/fnv/
> >
> > Of course, testing would be appreciated. :)
> >
> > Thanks,
> >
> > Kazutaka
> 
> 
> I wrote a test for it, looks hash_long() outperformed FNV for integer.
> 
> result:
> 10 16 8 11 12 16 14 10 15 8 4 6 15 9 8 18 7 11 10 14 9 15 15 7 8 6 9 5
> 14 17 9 11 7 11 8 13 19 10 10 8 11 6 9 16 7 9 8 11 10 12 11 8 5 11 10 8
> 11 6 5 8 12 7 9 15 4 8 6 7 5 13 17 7 7 4 8 5 10 16 4 10 9 9 7 8 12 3 11
> 9 7 6 8 12 9 10 8 10 9 8 11 6 14 8 7 9 8 13 7 10 7 12 10 10 11 12 14 8
> 10 16 10 11 12 12 8 8 13 9 9 10 13 9 9 13 15 8 12 14 9 11 12 18 6 8 10 9
> 12 12 17 7 10 10 6 11 13 14 6 9 10 7 10 12 18 14 11 11 5 9 14 13 8 9 9 7
> 14 10 10 8 12 9 4 11 10 6 8 8 10 6 19 12 5 6 12 7 6 13 9 5 7 8 6 6 16 9
> 6 8 11 5 6 14 9 8 12 8 5 6 15 8 5 10 10 9 5 11 8 6 13 8 7 9 11 11 8 16
> 10 9 7 10 13 6 13 7 10 12 7 14 9 13 10 6 11 13 11 11 15 10 8 13 9 8 9 19
> 12 10 11 9 8 8 16 10 7 17 8 9 8 16 11 13 12 12 10 6 13 5 12 14 7 11 11
> 12 8 14 14 11 11 9 13 7 11 12 9 6 10 14 10 13 11 6 6 11 8 8 12 6 2 6 10
> 11 9 15 8 9 3 9 8 7 17 7 8 7 9 6 4 17 5 10 7 10 7 3 14 6 13 8 9 6 5 14 8
> 12 10 8 8 6 12 6 12 9 6 10 8 10 12 10 9 12 12 8 10 9 11 9 12 13 12 7 13
> 11 7 12 11 7 8 8 15 10 11 12 13 9 11 12 11 13 12 19 6 6 6 13 10 11 17 7
> 10 9 10 11 13 17 9 9 6 9 5 15 19 9 16 7 8 7 13 18 7 10 9 7 6 13 14 11 8
> 8 9 8 14 9 9 9 7 8 14 14 3 5 8 10 7 10 14 3 8 7 5 6 15 16 3 9 6 6 6 11
> 15 5 10 8 7 7 10 10 3 11 8 7 9 9 13 7 13 6 10 9 9 11 7 14 7 9 8 8 11 7
> 11 8 9 13 10 13 9 13 9 8 13 14 12 9 15 7 8 11 10 10 10 15 10 9 11 15 9
> 12 16 8 9 10 18 7 7 11 9 17 9 17 7 8 11 4 9 14 12 9 9 15 6 12 10 15 10
> 10 11 6 12 10 12 7 9 12 8 14 10 10 7 12 9 7 12 8 6 5 11 10 7 19 11 7 4
> 11 8 5 15 8 3 8 11 6 6 16 7 8 7 11 5 4 15 7 10 11 8 6 5 15 7 7 11 10 8 5
> 11 8 7 11 8 6 9 11 11 9 13 9 11 7 8 13 9 12 10 11 13 6 15 10 11 10 8 10
> 9 9 12 15 12 9 17 7 11 13 16 12 10 13 8 7 6 15 9 8 15 6 10 8 15 9 14 15
> 9 13 6 11 5 14 16 8 11 10 11 8 14 18 9 10 8 13 5 10 15 6 9 8 12 9 14 9 6
> 7 10 9 8 12 7 3 5 11 9 7 16 6 8 4 8 8 10 18 6 7 5 8 6 6 18 6 12 7 9 8 4
> 15 4 11 8 8 8 6 13 8 13 10 9 9 5 12 6 13 7 7 9 8 13 8 12 7 13 12 9 12 9
> 11 9 10 14 9 9 10 12 8 11 13 8 9 6 12 11 9 11 18 10 8 13 12 13 10 19 6 4
> 9 9 12 13 15 7 13 9 7 11 13 16 8 8 8 9 10 14 19 11 12 11 5 8 15 16 11 9
> 7 7 8 12 13 8 11 10 6 8 11 9 7 8 9 6 14 13 7 5 11 8 5 11 13 2 7 8 7 5 18
> 11 4 9 9 6 5 12 12 4 12 7 6 8 13 9 2 10 10 7 8 10 9 7 13 9 9 7 13 9 8 16
> 8 8 9 9 12 7 13 8 7 12 8 14 11 15 8 7 11 12 10 11 14 9 6 15 10 10 10 16
> 10 7 10 13 9 9 16 12 8 13 10 7 9 13 9 13 10 14 12 7 14 4 11 14 11 8 8 14
> 6 14 13 14 11 11 13 7 12 10 12 8 10 12 9 16 9 6 6 12 8 9 12 7 3 6 9 9 8
> 18 10 8 3 11 7 8 17 6 6 6 10 5 4 17 6 12 6 10 7 3 15 7 10 9 9 7 3 16 7
> 12 9 9 8 4 10 8 10 9 8 9 6 11 11 10 11 11 11 8 8 12 10 9 11 11 12 6 12
> 10 10 10 12 10 8 8 13 12 11 ************************************
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 12 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 10 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 10 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 12 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 9 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 9 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 12 8 11 8 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 12 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 12 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12 8 11 8 12 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 10 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 11 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 10 10 9 11 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
> 8 11 8 11 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
> 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12 8 12
> 8 11 8 12 8 12 8 11 8 11 9 11 9 11 9 10 9 10 10 10 10 9 10 9 11 9 11 9
> 11 8 11 8 12 8 12 8 11 8 11 8 12 8 11 8 11 8 12 8 12 8 11 8 11 8 12
> 
> test.c
> #include <stdlib.h>
> #include <stdio.h>
> #include <unistd.h>
> #include <stdint.h>
> 
> static inline uint64_t hash_long(uint64_t val, unsigned int bits)
> {
>         uint64_t hash = val;
> 
>         /*  Sigh, gcc can't optimise this alone like it does for 32 bits. */
>         uint64_t n = hash;
>         n <<= 18;
>         hash -= n;
>         n <<= 33;
>         hash -= n;
>         n <<= 3;
>         hash += n;
>         n <<= 3;
>         hash -= n;
>         n <<= 4;
>         hash += n;
>         n <<= 2;
>         hash += n;
> 
>         /* High bits are more random, so use them. */
>         return hash >> (64 - bits);
> }
> 
> #define FNV1A_64_INIT ((uint64_t) 0xcbf29ce484222325ULL)
> 
> /*
>  * 64 bit Fowler/Noll/Vo FNV-1a hash code
>  */
> static inline uint64_t fnv_64a_buf(void *buf, size_t len, uint64_t hval)
> {
>         unsigned char *bp = (unsigned char *) buf;
>         unsigned char *be = bp + len;
>         while (bp < be) {
>                 hval ^= (uint64_t) *bp++;
>                 hval += (hval << 1) + (hval << 4) + (hval << 5) +
>                         (hval << 7) + (hval << 8) + (hval << 40);
>         }
>         return hval;
> }
> 
> static inline uint64_t hash_64(uint64_t val, unsigned int bits)
> {
>         uint64_t hash = fnv_64a_buf(&val, sizeof(uint64_t), FNV1A_64_INIT);
> 
>         return hash >> (64 - bits);

If I modified this line to

   return hash & ((1 << bits) - 1);

the result had become good.  It looks like FNV generates values whose
lower bits are more random.

Anyway, I don't mind introducing a new hash function now.  FNV seems
to have some problems if the input buffer length is too short.

Thanks,

Kazutaka


> }
> 
> 
> int bucket[1024];
> int main()
> {
> 	uint64_t oid = 0x7c2b2500000001;
> 	int i;
> 
> 	/* simuate objects for 40G store */
> 	for (i = 0; i < 10000; i++) {
> 		int hash = hash_64(oid, 10);
> 		bucket[hash]++;
> 		oid++;
> 	}
> 
> 	for (i = 0; i < 1024; i++) {
> 		printf("%d ", bucket[i]);
> 		bucket[i] = 0;
> 	}
> 
> 	printf("************************************\n");
> 	oid = 0x7c2b2500000001;
> 	for (i = 0; i < 10000; i++) {
> 		int hash = hash_long(oid, 10);
> 		bucket[hash]++;
> 		oid++;
> 	}
> 
> 	for (i = 0; i < 1024; i++) {
> 		printf("%d ", bucket[i]);
> 	}
> 	printf("\n");
> 	return 0;
> }
> 
> 
> -- 
> sheepdog mailing list
> sheepdog at lists.wpkg.org
> http://lists.wpkg.org/mailman/listinfo/sheepdog



More information about the sheepdog mailing list