On 07/18/2012 01:45 PM, Dietmar Maurer wrote:
> I have a small cluster with only 3 nodes, and I want to store 3 copies:
>
> # cluster format --copies 3
>
> But as soon as one node dies the IO gets halted. To prevent that one can
> use:
>
> # cluster format --copies 3 --nohalt
>
> The question is why that is not the default behavior? Is that dangerous?
> If so, why?

To quote from commit 9b6102ce:

=======================================
sheep: introduce SD_STATUS_HALT

Currently, sheepdog will serve IO requests even if number of nodes is
less than 'copies'.

When the number of the nodes (or zones) is less than the copies specified
by collie-cluster-format command, the sheepdog cluster should stop serving
IO requests.

This is necessary to solve the below subtle case:

 + good nodes, - failed nodes.

 0       1       2       3
 +       -       -       +
 +  -->  -  -->  -  -->  +
 +       +       -       #  <-- permanently down.
         ^
         |
         this node has the latest data

at stage 3, we will have a cluster recovered without the data tracked at
stage 1.

When the nodes are in the SD_STATUS_HALT, the sheepdog can also serve
configuration change and do the recovery job.
==========================
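
To make the failure case in the commit message concrete, here is a small toy
model of the four stages in Python. It is only a sketch and not sheepdog code:
the node names, the version counter and the simulate() function are all
made up for illustration, and it assumes a write is acknowledged as soon as
every currently live node has stored it.

```python
def simulate(copies, nohalt):
    """Replay the four stages from the commit message on a 3-node toy cluster.

    Returns (acknowledged, recovered): the newest version the client was told
    is written, and the newest version still present after stage 3.
    """
    versions = {"a": 0, "b": 0, "c": 0}  # object version held by each node
    acknowledged = 0

    def write(alive):
        nonlocal acknowledged
        # Halt behaviour: refuse IO when fewer nodes than 'copies' are alive.
        if len(alive) < copies and not nohalt:
            return False
        acknowledged += 1
        for node in alive:
            versions[node] = acknowledged
        return True

    write({"a", "b", "c"})  # stage 0: all nodes good
    write({"c"})            # stage 1: a and b failed, only c is alive
    # stage 2: c fails as well, so nothing can be written
    # stage 3: a and b come back, c is permanently down
    recovered = max(versions[node] for node in ("a", "b"))
    return acknowledged, recovered


print(simulate(copies=3, nohalt=True))   # write from stage 1 exists only on c
print(simulate(copies=3, nohalt=False))  # stage 1 write was refused
```

With nohalt the client has been told about a version that the recovered
cluster no longer has. With the default halt behaviour the stage 1 write is
refused, so the two numbers stay equal.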