[sheepdog] [PATCH] sheep: add a kill node operation

Dietmar Maurer dietmar at proxmox.com
Fri Jul 20 11:18:45 CEST 2012


> On 07/20/2012 04:59 PM, Dietmar Maurer wrote:
> > Re-balancing always involves massive network traffic, so IMHO it must be
> > triggered manually.
> > If someone wants to do that automatically, they can write a script.
> 
> No, most of the time we actually need automatic recovery, because Sheepdog is
> targeted at thousand-node clusters, where manual recovery would cause very
> high administration overhead. Manual recovery could be complementary to
> automatic recovery, where people can use it with caution for maintenance.
> Note that automatic recovery is crucial for data reliability: we must try our
> best to keep as many copies as specified. This means that when you run a
> manual recovery process, you risk losing data, because in that window
> you have fewer copies than expected.

I fully understand that. You have thousands of nodes and unlimited network bandwidth.

Unfortunately, corosync only supports 16 nodes (the official limit supported by Red Hat),
and most of our users will run fewer than 5 nodes and use gigabit links. So an option
to disable automatic recovery would be really helpful.

Should be just a few lines of code anyway?

- Dietmar



