[sheepdog-users] Performance Impact of Recovery

Thu Mar 13 06:35:40 CET 2014

On Wed, Mar 12, 2014 at 05:11:54PM +0100, richter at ecos.de wrote:
> > >
> > > So the question is how can this recovery process speed up?
> > >
> > 
> > We have a patch to speed up recovery
> > 
> > * <efbf7f0> 2014-02-06 [Liu Yuan] sheep/recovery: multi-threading recovery
> > process
> > 
> > which is merged in the master branch. I think this will speed up the
> > recovery process a lot, and the more disks you have, the better the speed-up.
> > 
> 
> I have 2 disks per node. I will give it a try.
> 
> > >
> > > From my current knowledge (which is not too deep), the only idea would
> > > be to calculate the data block hashes during storing of the data block
> > > and compare only stored hashes. Would this be possible/make sense or
> > > is there a better solution?
> > 
> > We already do it the way you suggested for the full replication scheme.
> > 
> 
> I do not use erasure coding (dog vdi list shows copies = 3), so it uses the
> full replication scheme, right?

Yes

>
> Sheep finally finished the recovery after 9h of hard work. It recovered
> about 150000 blocks. This is about 5 blocks/s, and most of the time the
> read rate was between 5 and 10 MByte/s. So it looks to me as if it is reading
> much more data than only the precomputed hashes. Do I have to use any special
> option during startup (or compiling), or do I understand things completely
> wrong?

We store the hash in an xattr of the object at the time the hash is calculated.
Hashes are calculated indirectly, i.e. only during 1. 'dog vdi check' and
2. recovery. We don't calculate hashes on normal reads/writes, for better
performance.
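
To illustrate the idea (this is only a rough sketch, not the sheep code): the
hash is computed the first time somebody asks for it and is then cached in an
extended attribute, so an object that was never checked or recovered still has
to be read in full once. The attribute name, the use of SHA-1 and the function
name below are assumptions made for the example, and the sketch does not deal
with invalidating the cached value when the object is written.

```python
import hashlib
import os

# Hypothetical attribute name, chosen for this example only.
XATTR_NAME = "user.example.sha1"


def cached_hash(path, getxattr=None, setxattr=None):
    """Return the hex digest of the file, reusing a digest cached in an xattr.

    getxattr/setxattr default to os.getxattr/os.setxattr (Linux only); they
    can be replaced, e.g. on filesystems without user xattr support.
    """
    getxattr = getxattr or os.getxattr
    setxattr = setxattr or os.setxattr

    try:
        # Fast path: a digest was stored earlier, no need to read the data.
        return getxattr(path, XATTR_NAME).decode("ascii")
    except OSError:
        pass  # nothing cached yet (or xattrs unsupported): compute it

    # Slow path: read the whole object and hash it.
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()

    try:
        setxattr(path, XATTR_NAME, digest.encode("ascii"))
    except OSError:
        pass  # caching is best-effort in this sketch

    return digest
```

With something like this, the first call on an object costs a full read and
later calls only cost an xattr lookup, which would fit a first recovery
reading much more data than the hashes alone.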

Thanks
Yuan