[sheepdog] [PATCH 08/11] Doc. "Sheepdog Basic" add chapter "fail over"
namei.unix at gmail.com
Tue Oct 22 07:59:37 CEST 2013
On Sun, Oct 20, 2013 at 10:41:01AM +0200, Valerio Pachera wrote:
> Signed-off-by: Valerio Pachera <sirio81 at gmail.com>
> doc/fail_over.rst | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
> create mode 100644 doc/fail_over.rst
> diff --git a/doc/fail_over.rst b/doc/fail_over.rst
> new file mode 100644
> index 0000000..892d79d
> --- /dev/null
> +++ b/doc/fail_over.rst
> @@ -0,0 +1,36 @@
> +Fail Over
> +Now that we can manage guests on our cluster, we want to check whether it
> +can really survive the loss of a node.
> +Start a guest on any of the nodes.
> +Find the ID of the node you wish to fail with *'dog node list'*
> +(not the node where the guest is running, of course).
> +Then kill the node:
> + # dog node kill 3
> +The guest is still running without any problems, and 'dog node list' will
> +show that one node is missing.
> +But how do we know if sheepdog is recovering the "lost" data?
> +*(At this very moment, some objects have only 1 copy instead of 2.
> +The second copy has to be rebuilt on the remaining active nodes.)*
> + # dog node recovery
> + Nodes In Recovery:
> + Id Host:Port V-Nodes Zone
> + 0 192.168.2.41:7000 50 688040128
> + 1 192.168.2.42:7000 50 704817344
> + 2 192.168.2.43:7000 92 721594560
> +Here you can see which nodes are receiving data.
> +Once recovery is done, the list will be empty.
Note that the output of 'dog node recovery' has also changed in the master
branch.
> +Do not remove other nodes from the cluster during recovery!
Actually, it is okay to remove multiple nodes from the cluster to simulate a
group failure; sheepdog can handle that case gracefully.
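If you want to script the "wait for recovery to finish" step, a minimal
sketch could look like the following. The `recovering_count` helper is
hypothetical and assumes the output format shown above (two header lines,
then one line per recovering node); as noted, the format may differ on the
master branch.

```shell
#!/bin/sh
# Hypothetical helper: count the nodes still listed by `dog node recovery`.
# Reads the command's output on stdin, skips the two header lines, and
# counts the remaining entries. `|| true` keeps the pipeline from failing
# under `set -e` when the list is empty (grep -c still prints 0).
recovering_count() {
    tail -n +3 | grep -c '[0-9]' || true
}

# Poll until the recovery list is empty (uncomment on a real cluster):
# while [ "$(dog node recovery | recovering_count)" -gt 0 ]; do
#     sleep 5
# done
# echo "recovery finished"
```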