[sheepdog-users] node recovery performance

Gerald Richter - ECOS richter at ecos.de
Tue Nov 26 12:29:34 CET 2013


Hi,

 

I have a simple test cluster with two nodes and one 26GB vdi. If I restart one node, recovery takes 7.5 minutes. There was no VM running during this time, so nothing changed inside the cluster, yet the recovering node seems to pull all the vdi's data from the other node, even though it already has all the data on its local disk.

 
It doesn't pull data from the other node if not necessary.

What it does is checksum all objects on the node you have restarted.

If any object is missing, it will copy it from the other nodes.
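
The behaviour described above can be pictured with a minimal sketch. This is not sheepdog's actual code: the function name plan_recovery, the in-memory dict of objects and the use of SHA-1 are all assumptions made purely for illustration.

```python
import hashlib


def plan_recovery(local_objects, expected_ids):
    """Hypothetical model of the recovery pass on a restarted node.

    local_objects: dict mapping object id -> bytes found on the local disk.
    expected_ids:  ids of the objects this node is supposed to hold.
    Returns (checksums, missing), where missing lists the objects that
    would have to be copied from another node.
    """
    # Every local object is read and hashed, even if nothing changed.
    # SHA-1 is an assumption here, not a statement about sheepdog.
    checksums = {
        oid: hashlib.sha1(data).hexdigest()
        for oid, data in local_objects.items()
    }
    # Only objects that are absent locally need network traffic.
    missing = sorted(set(expected_ids) - set(local_objects))
    return checksums, missing
```

In this model the hashing cost grows with the amount of local data, while network traffic grows only with the number of missing objects.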

[GR] I reran the test. You are right, there is not much network traffic. I must have made a mistake when I looked at the network traffic the first time.

This operation takes time even if it doesn't create network traffic.

 
[GR] I just made a quick test. Running sha1sum on the same installation in a qcow2 file takes the same time as the recovery. So it's really the checksumming that takes up the time.
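
A comparison like the one above could be reproduced roughly as follows. This is only a sketch: the helper name sha1_file_timed and the 4 MiB chunk size are assumptions, and it measures plain SHA-1 over a single file, not whatever sheepdog does internally.

```python
import hashlib
import time


def sha1_file_timed(path, chunk_size=4 * 1024 * 1024):
    """Hash a file with SHA-1; return (hex digest, bytes read, seconds)."""
    digest = hashlib.sha1()
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # end of file reached
                break
            digest.update(chunk)
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return digest.hexdigest(), total, elapsed
```

Dividing the bytes read by the elapsed seconds gives a throughput figure that can be set against the observed recovery time.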


 

So I expect a cluster of 2.6GB will take 750 minutes, which is about half a day.

 
Did you mean Tera?

[GR] yes, of course
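
The estimate is a straight linear extrapolation from the 26GB test. As a sketch, assuming recovery time scales linearly with the amount of data (nothing in the thread confirms this) and taking 2.6 TB as 2600 GB:

```python
def estimated_recovery_minutes(data_gb, measured_gb=26.0, measured_minutes=7.5):
    """Scale the measured recovery time linearly with the data size.

    The defaults are the figures from the test cluster in this thread;
    linear scaling is an assumption, not a measured property.
    """
    return data_gb * measured_minutes / measured_gb


minutes = estimated_recovery_minutes(2600.0)  # 2.6 TB expressed in GB
print(f"{minutes:.0f} minutes, about {minutes / 60:.1f} hours")
# prints "750 minutes, about 12.5 hours"
```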

 
 
If the second server fails during this time, data might be lost. So rebooting two servers within half a day might cause data loss... (it's the same for three or more nodes, only the timeframe changes a little).

 
If the second server powers off, nothing happens that is different from a "standard" server being powered off.

When you turned off the first server, no recovery occurred.

The second server just continued working with 1 copy per object.

[GR] The first server starts recovery. If the second server now reboots, both servers are trying to recover from each other. What happens in this case? Which data will be used (assuming that some data might have been modified)?

Regards

Gerald

 
 

