> On 07/19/2012 02:14 AM, Arnold Krille wrote:
> > But you do get problems when you write to the last remaining node,
> > that node dies (non-recoverable) and you bring back the other nodes.
> > These nodes don't have a chance of knowing they have invalid data. Well
> > they can know, because they might be shut down uncleanly. But then the
> > remaining nodes know that they have invalid data, so what? You can't
> > go on with that and have to bring in the backup you don't have...
>
> This is exactly why halt behavior is the default one. Without -nohalt, we don't
> have this problem.

But that simply stops all operations, and does not tolerate failures on
small systems (2 nodes, copies=2) or (3 nodes, copies=3).

> > For data consistency it would have been better if the cluster stopped
> > writing after more than half of the copies died. And thus forced the
> > admins to fix the nodes well before that even occurs.
> >
> > Setting a copy-value of more than one probably meant something for the
> > admin regarding data-security. So it's safe to assume that he wants to
> > protect himself against the scenario of the last node dying with the
> > last consistent data on it.
> >
> > So, please give sheepdog real quorum calculation when there are more
> > than two copies wanted.
>
> Quorum will fail in the case where the majority of nodes are down at the
> same time and non-recoverable; in this case, we lose the updates.

No, because writes/updates are not allowed when you do not have quorum.

> We actually have a stronger constraint: if nr_nodes < copies, we halt
> the cluster. I think this is the safest choice.

That constraint is simply not acceptable on small systems.

- Dietmar
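
P.S. To make the difference between the two rules concrete, here is a minimal sketch in Python. It is not sheepdog code: the function names are made up for illustration, and it assumes the simple case discussed above, where each live node holds at most one copy of an object. The first function is the halt rule as quoted (nr_nodes < copies halts the cluster); the second is one possible reading of "real quorum", namely that a strict majority of the copies must be reachable before writes are accepted.

```python
def halt_rule_allows_writes(live_nodes: int, copies: int) -> bool:
    """Halt rule as quoted in the thread: the cluster halts if nr_nodes < copies."""
    return live_nodes >= copies


def quorum_allows_writes(live_nodes: int, copies: int) -> bool:
    """Hypothetical majority rule (an assumption, not sheepdog's behavior):
    writes are accepted only while a strict majority of the copies is reachable."""
    # Assumption: one copy per node, so no more than `copies` replicas count.
    reachable = min(live_nodes, copies)
    return 2 * reachable > copies


if __name__ == "__main__":
    # Walk a small cluster down from fully healthy to a single live node.
    for copies in (2, 3):
        for live in range(copies, 0, -1):
            print(
                f"copies={copies} live={live} "
                f"halt_rule={halt_rule_allows_writes(live, copies)} "
                f"quorum={quorum_allows_writes(live, copies)}"
            )
```

Under these assumptions, a 3-node cluster with copies=3 keeps accepting writes after one node fails with the majority rule, while the halt rule stops it immediately. Once only one node is left, both rules refuse writes, so the "last node dies with the only consistent data" scenario cannot arise. With copies=2 a strict majority needs both copies, so the two rules behave the same there.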