[Sheepdog] sheepdog's recovery algorithm questions
morita.kazutaka at gmail.com
Tue Mar 15 11:14:11 CET 2011
On Tue, Mar 15, 2011 at 12:09 PM, jidalyg_8711 <jidalyg_8711 at 163.com> wrote:
> 1. Sheepdog claims to be strongly consistent. I think the implementation of
> read_object(), write_object(), and remove_object() ensures that strong
> consistency. Are there any other places that ensure it? How does it affect
> the performance of Sheepdog?
The mechanisms that ensure data consistency are:
- Sheepdog stores objects in a directory named after the epoch number, and
doesn't allow clients to read objects from old epoch directories.
This prevents clients from reading stale data.
- Sheepdog allows write requests from at most one client.
This avoids write conflicts and makes object consistency easy to ensure.
Administrators need to pay attention to this when using Sheepdog.
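The epoch directory rule can be sketched roughly as follows. This is a toy in-memory model in Python, not Sheepdog's actual C code: the class EpochStore and its methods write(), read(), and bump_epoch() are hypothetical names made up for illustration, and a dict stands in for each on-disk epoch directory.

```python
class EpochStore:
    """Toy model: objects are grouped by epoch; reads see only the current epoch."""

    def __init__(self):
        self.epoch = 1
        # epoch number -> {object id -> data}; stands in for the epoch directories
        self.dirs = {1: {}}

    def write(self, oid, data):
        # Writes always land in the current epoch directory.
        self.dirs[self.epoch][oid] = data

    def read(self, oid):
        # Only the current epoch directory is consulted. A copy that exists
        # solely in an older epoch directory is never returned.
        current = self.dirs[self.epoch]
        if oid not in current:
            raise KeyError(f"object {oid:#x} is not in epoch {self.epoch}")
        return current[oid]

    def bump_epoch(self):
        # Called when cluster membership changes (hypothetical trigger).
        self.epoch += 1
        self.dirs[self.epoch] = {}
```

In this model, an object written in epoch 1 becomes unreadable after bump_epoch() until it is written (or recovered) into the epoch 2 directory, which is what keeps stale copies invisible to clients.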
> 2. About the recovery algorithm: when a new node joins or a node
> leaves, Sheepdog calls start_recovery() and recovers in the background.
> The main action of recovery is moving objects to the new node and the new
> epoch directory. If a READ_OBJECT request arrives during recovery,
> sys->epoch has already increased, but the object may still be in the old
> epoch directory, not yet moved to the new epoch. How does Sheepdog handle
> this situation? Note that sys->epoch is increased in
> update_cluster_status() before start_recovery() is called.
Before processing an object request, Sheepdog checks in
is_recoverying_oid() whether the requested object has already been
recovered to the current epoch directory. If the object has not been
recovered yet, Sheepdog recovers it first and then processes the request.
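A minimal sketch of this recover-before-serve behaviour, again as a toy Python model and not Sheepdog's real implementation. Everything here is an assumption made for illustration: the Node class, the pending set, and the method names are hypothetical, and recovery simply copies from an older local epoch directory, whereas a real cluster may have to fetch the object from another node.

```python
class Node:
    """Toy model of on-demand recovery ahead of request processing."""

    def __init__(self):
        self.epoch = 1
        self.dirs = {1: {}}   # epoch number -> {object id -> data}
        self.pending = set()  # object ids not yet recovered to the current epoch

    def start_recovery(self):
        # Membership changed: the epoch is bumped first, then every object
        # known so far is marked as needing recovery into the new epoch.
        known = set(self.dirs[self.epoch]) | self.pending
        self.epoch += 1
        self.dirs[self.epoch] = {}
        self.pending = known

    def is_recovering(self, oid):
        return oid in self.pending

    def recover_object(self, oid):
        # Copy the newest older copy into the current epoch directory.
        for e in range(self.epoch - 1, 0, -1):
            if oid in self.dirs[e]:
                self.dirs[self.epoch][oid] = self.dirs[e][oid]
                break
        self.pending.discard(oid)

    def recover_step(self):
        # Background recovery: handle one pending object per call.
        if self.pending:
            self.recover_object(min(self.pending))

    def read(self, oid):
        # Recover first if needed, then serve from the current epoch only.
        if self.is_recovering(oid):
            self.recover_object(oid)
        return self.dirs[self.epoch][oid]

    def write(self, oid, data):
        # Same check on the write path, so the pending set stays accurate.
        if self.is_recovering(oid):
            self.recover_object(oid)
        self.dirs[self.epoch][oid] = data
```

A request that arrives between start_recovery() and the background pass therefore never sees a missing object: the request path pulls the object forward itself, and the background pass later finds nothing left to do for that object.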