[sheepdog-users] Recovery Performance Question

Gerald Richter - ECOS richter at ecos.de
Thu Sep 12 06:51:11 CEST 2013


I have about 500GB in my test cluster. Everything works fine. Now I have killed a node and restarted it. No changes were made on any node while it was down.

As I expected, the restarted node starts a recovery. I can still access all data on the restarted node and on all other nodes, so everything is fine for all VMs.

But what I discovered is that the recovery of the 500GB took about 6 hours. I guess that if the second node fails during these 6 hours (my test cluster only has two nodes), I cannot access the data anymore until at least one node has recovered.

So my question is: what data is really moved during recovery? I understand that all objects are checked and moved from the old epoch to the new one, but since no data has changed, it should be enough to move some kind of pointer rather than actually copying the data. Judging from the time it takes, it seems more like all data is really moved/copied?
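
To put the numbers above in perspective, here is a rough back-of-the-envelope calculation of the average recovery rate. It is only a sketch: it assumes 1 GB = 10^9 bytes and that the full 500GB was processed evenly over the 6 hours, and the function name is made up for illustration.

```python
def recovery_throughput_mb_s(data_gb: float, hours: float) -> float:
    """Average throughput in MB/s, assuming decimal units (1 GB = 1e9 bytes)."""
    total_bytes = data_gb * 1e9
    seconds = hours * 3600
    return total_bytes / seconds / 1e6


rate = recovery_throughput_mb_s(500, 6)
print(f"{rate:.1f} MB/s")  # prints "23.1 MB/s"
```

An average of roughly 23 MB/s looks more like real data being read and written than like a metadata-only operation, which is why I suspect a full copy.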
