[sheepdog] [PATCH 1/4] sheep: don't trim before calculating sha1

Liu Yuan namei.unix at gmail.com
Thu Jul 18 08:49:54 CEST 2013


On Thu, Jul 18, 2013 at 03:39:46PM +0900, MORITA Kazutaka wrote:
> At Tue, 16 Jul 2013 17:30:18 +0800,
> Liu Yuan wrote:
> > 
> > Trimming the object before calculating the sha1 doesn't give us much benefit because
> > 1. Most objects can't be trimmed
> 
> We have a plan to include the object reclaim patchset which introduces
> many sparse objects (ledger objects, deleted vdi objects), no?
> 

Ah, I didn't think of this situation. I thought we wouldn't have many sparse
objects in the production environment.

> > 2. Require farm to trim the object again
> >    - need malloc() tmp space for the trim
> 
> We can do memmove() outside of trim_zero_blocks().  Then, we can avoid malloc()
> for the trim operation when we don't want to update the buffer.
> 
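
Just to make sure I understand the memmove()-outside idea, here is a minimal
sketch.  find_zero_range() is only a hypothetical name for the scan-only part;
the real split and signature may of course look different:

#include <stdint.h>
#include <string.h>

/* Hypothetical scan-only helper: report the non-zero range but never
 * touch the buffer (plain byte scan here; the real code would work on
 * block granularity like trim_zero_blocks() does).  The caller passes
 * offset = 0 and len = object size. */
static void find_zero_range(const unsigned char *buf,
			    uint64_t *offset, uint32_t *len)
{
	uint64_t start = 0, end = *len;

	while (start < end && buf[start] == 0)
		start++;
	while (end > start && buf[end - 1] == 0)
		end--;

	*offset += start;
	*len = end - start;
}

/*
 * The sha1 path then hashes buf + offset for len bytes directly, with no
 * malloc() and no memmove(); only a caller that actually wants the buffer
 * compacted does memmove(buf, buf + offset, len) itself.
 */
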
> > 
> > This is all about which is faster: sha1ing more bytes or trimming the object.
> > Both take cpu cycles, and neither is a big win over the other.
> 
> Please give us a benchmark result before doing this kind of change.
> On my environment (Intel Core i7-3930K CPU 3.20 GHz), there was a big
> difference.
> 
> * benchmark program
> 
> /* Benchmark: sha1 of a trimmed (all-zero) object vs. sha1 of the whole
>  * object.  Besides the standard headers below, this needs the sheep
>  * headers that provide SD_DATA_OBJ_SIZE, SHA1_DIGEST_SIZE, struct
>  * sha1_ctx, the sha1_*() helpers and trim_zero_blocks(). */
> #include <stdint.h>
> #include <stdlib.h>
> #include <string.h>
> 
> int main(int argc, char **argv)
> {
> 	static unsigned char buf[SD_DATA_OBJ_SIZE] = {};	/* all-zero object */
> 	int cnt;
> 	uint64_t offset;
> 	uint32_t len;
> 	unsigned char sha1[SHA1_DIGEST_SIZE];
> 	struct sha1_ctx c;
> 
> 	cnt = atoi(argv[1]);
> 	if (strcmp(argv[2], "trim") == 0) {
> 		/* trimmed case: nothing is left to hash for an all-zero
> 		 * object, so sha1 runs over 0 bytes */
> 		for (int i = 0; i < cnt; i++) {
> 			offset = 0;
> 			len = 0;
> 			trim_zero_blocks(buf, &offset, &len);
> 
> 			sha1_init(&c);
> 			sha1_update(&c, buf, 0);
> 			sha1_final(&c, sha1);
> 		}
> 	} else if (strcmp(argv[2], "sha1") == 0) {
> 		/* untrimmed case: hash the full SD_DATA_OBJ_SIZE buffer */
> 		for (int i = 0; i < cnt; i++) {
> 			sha1_init(&c);
> 			sha1_update(&c, buf, sizeof(buf));
> 			sha1_final(&c, sha1);
> 		}
> 	}
> 	return 0;
> }
> 
> * result
> 
> $ time ./sheep/sheep 10000 trim
> 
> real    0m0.013s
> user    0m0.012s
> sys     0m0.000s
> 
> $ time ./sheep/sheep 10000 sha1
> 
> real    1m59.807s
> user    1m59.799s
> sys     0m0.004s
> 
> 
> This means that calculating the sha1 of 10,000 objects adds 2 minutes of overhead.

Okay, your test makes the point well enough: trimming is obviously faster than
hashing. If we have many sparse objects, we can benefit from it a lot. Please
drop this patch.
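
For reference, assuming SD_DATA_OBJ_SIZE is still the usual 4 MiB here:

	10,000 objects x 4 MiB  = ~39 GiB hashed
	~39 GiB / ~120 s        = ~330 MiB/s of sha1 throughput

so the two-minute run is pure hashing cost, while the trimmed case does
essentially no hashing at all.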

I think you can apply the other 3 patches cleanly.

Thanks
Yuan


