[sheepdog] read/write during recovery

Liu Yuan namei.unix at gmail.com
Thu Jul 26 10:13:42 CEST 2012


On 07/26/2012 04:06 PM, Dietmar Maurer wrote:
> But recovery and cleanup actions can take several hours, so it is quite hard to find a window
> on such system?

We are always optimizing recovery performance. For now, on a cluster of
30 nodes with dozens of TB of data, the recovery process takes less than
30 minutes. Note that recovery can be nested: a subsequent node event
supersedes the previous one. So if you have 2 nodes fail one after
another, the total time is t0 + t(r), where t0 is the window between the
two events and t(r) is the recovery time for a single node event.
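
To make the t0 + t(r) arithmetic concrete, here is a minimal sketch in
Python. It is not sheepdog code: the function name is hypothetical, and
it assumes that every node event needs the same recovery time t(r) and
that a superseding event restarts recovery from scratch.

```python
def total_recovery_time(event_times, t_r):
    """Total time spent recovering, given node event timestamps.

    Assumption (illustrative only): each event needs t_r to recover,
    and a new event arriving mid-recovery supersedes the running one.
    """
    if not event_times:
        return 0.0
    events = sorted(event_times)
    total = 0.0
    # Each superseded recovery runs only until the next event arrives,
    # or for t_r if it finished before the next event.
    for prev, nxt in zip(events, events[1:]):
        total += min(nxt - prev, t_r)
    # The last recovery is never superseded, so it runs for the full t_r.
    return total + t_r


# Two failures 10 minutes apart, 30 minutes per recovery: t0 + t(r)
print(total_recovery_time([0, 10], 30))  # 40.0
```

Under these assumptions the cost of a second failure is only the
window t0 that was already spent, not a second full t(r).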

So yes, in theory we can't guarantee that recovery time is bounded by a
short window, but when recovery is reported to take hours, I think it is
time for us to revisit the code and make it faster.

Thanks,
Yuan


