[sheepdog] [PATCH 0/7] modify hash calculation

MORITA Kazutaka morita.kazutaka at gmail.com
Tue Sep 3 18:41:59 CEST 2013


At Tue, 3 Sep 2013 00:12:08 +0800,
Liu Yuan wrote:
> 
> On Mon, Sep 02, 2013 at 02:36:41PM +0900, MORITA Kazutaka wrote:
> > At Mon, 2 Sep 2013 10:56:11 +0800,
> > Liu Yuan wrote:
> > > 
> > > On Fri, Aug 30, 2013 at 06:32:02PM +0900, MORITA Kazutaka wrote:
> > > > From: MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp>
> > > > 
> > > > The current hash algorithm is fast but shows poor dispersion.  This
> > > > series introduces a new hash function based on the current fnv1a
> > > > algorithm.
> > > > 
> > > > I compared performance and dispersion when generating 64 bit integer
> > > > values with:
> > > > 
> > > >  - hash_64 (a simple hash function used in Linux kernel)
> > > >  - fnv1a   (the current hash function)
> > > >  - sd_hash (the one this patchset introduces)
> > > >  - sha1
> > > > 
> > > > and sd_hash showed a good result.
> > > > 
> > > >                    hash_64    fnv1a   sd_hash     sha1
> > > > Performance (*1)     43 ms    96 ms    182 ms   2387 ms
> > > > Dispersion (*2)     9216.0   2927.7      11.4      6.57
> > > > 
> > > > (*1) The time to generate 10,000,000 hash values.
> > > > (*2) The result of chi-squared test.  It should be less than 16.9.
> > > > 
> > > > Please note that this series breaks backward compatibility, and
> > > > shouldn't be merged into the stable trees.
> > > 
> > > Does this mean that we can't upgrade an old cluster to master after this
> > > series is merged (people would have to format the cluster)?
> > 
> > What I mean is that such a fundamental change shouldn't go into the
> > stable trees.
> > 
> > I think the object reclaim feature will also be merged before the v0.8
> > release, and it will change the format too.  I'm going to write the
> > upgrade code once the v0.8 format is finalized.
> > 
> 
> Probably, we don't need upgrade code at all. We can
> 
> $ dog cluster snapshot save old_data
> ...upgrade cluster and format...
> $ dog cluster snapshot load old_data

It looks correct.  I'm not sure it's okay to assume that all users
have enough spare storage space to back up the whole sheepdog data,
though.
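
For readers following the dispersion numbers quoted above, here is a
minimal sketch of how such a chi-squared figure could be computed for
64-bit FNV-1a.  It is not the benchmark used for the series: the key
format (little-endian 64-bit counters), the bucketing by hash range,
and the choice of 10 buckets are all assumptions.  The 10 buckets are
picked only because 16.9 matches the 5% critical value of the
chi-squared distribution with 9 degrees of freedom.

```python
import struct

# Standard 64-bit FNV-1a parameters.
FNV64_OFFSET = 0xcbf29ce484222325
FNV64_PRIME = 0x100000001b3
MASK64 = (1 << 64) - 1


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV64_PRIME) & MASK64
    return h


def chi_squared(hash_fn, n_keys=100_000, n_buckets=10):
    """Chi-squared statistic of hash values spread over equal ranges.

    Assumption: keys are consecutive integers packed as 8 bytes, and
    each bucket is an equal-sized slice of the 64-bit hash space.
    """
    counts = [0] * n_buckets
    for i in range(n_keys):
        key = struct.pack("<Q", i)
        # Map the 64-bit hash onto [0, n_buckets) by range.
        counts[(hash_fn(key) * n_buckets) >> 64] += 1
    expected = n_keys / n_buckets
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    print("fnv1a chi-squared:", chi_squared(fnv1a_64))
```

A statistic well above the critical value means the hash values are
unevenly spread over the ranges.  The exact figure depends on the key
set and the number of buckets, so it is not expected to reproduce the
numbers in the table.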

Thanks,

Kazutaka


