> -----Original Message-----
> From: Liu Yuan [mailto:namei.unix at gmail.com]
> Sent: Wednesday, 18 July 2012 10:01
> To: Dietmar Maurer
> Cc: sheepdog-users at lists.wpkg.org
> Subject: Re: [sheepdog-users] is --nohalt dangerous?
>
> On 07/18/2012 03:53 PM, Dietmar Maurer wrote:
> > OK, so maybe the 2 node is a special case. What about:
> >
> > if ((sys->nr_copies > 2) &&
> >     (current_vnode_info->nr_zones <= (sys->nr_copies/2)))
> >         sys_stat_set(SD_STATUS_HALT);
>
> Sheepdog provides strong consistency for objects, so I don't think we need
> a quorum-based algorithm. Even with only one copy left, the cluster still runs
> well. So I don't yet see the use case for this quorum calculation.

Does this avoid the bug pointed out in commit 9b6102ce?

> =======================================
> sheep: introduce SD_STATUS_HALT
>
> Currently, sheepdog will serve IO requests even if number of nodes
> is less than 'copies'.
>
> When the number of the nodes (or zones) is less than the copies
> specified by collie-cluster-format command, the sheepdog cluster
> should stop serving IO requests.
>
> This is necessary to solve the below subtle case:
>
> + good nodes, - failed nodes.
>
>         0       1       2       3
>         +       -       -       +
>         +  -->  -  -->  -  -->  +
>         +       +       -       #  <-- permanently down.
>                 ^
>                 |
>      this node has the latest data
>
> at stage 3, we will have a cluster recovered without the data
> tracked at stage 1.
>
> When the nodes are in the SD_STATUS_HALT, the sheepdog can also
> serve configuration change and do the recovery job.
> ==========================
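
For illustration, the two halt conditions discussed in this thread can be compared side by side. This is only a minimal sketch: it assumes plain integer arguments in place of sys->nr_copies and current_vnode_info->nr_zones, and both function names are hypothetical (they are not part of sheepdog).

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not sheepdog code): the rule described in the
 * quoted commit message. Halt whenever fewer zones are alive than the
 * configured number of copies.
 */
bool should_halt_strict(int nr_copies, int nr_zones)
{
	return nr_zones < nr_copies;
}

/*
 * Hypothetical helper (not sheepdog code): the quorum-style rule
 * proposed above. Clusters with 2 or fewer copies never halt; larger
 * ones halt once half of the copies or fewer (integer division) still
 * have a live zone.
 */
bool should_halt_quorum(int nr_copies, int nr_zones)
{
	return nr_copies > 2 && nr_zones <= nr_copies / 2;
}
```

With 3 copies, the strict rule already halts at 2 live zones, while the quorum rule keeps serving IO until only 1 zone is left. With 2 copies the quorum rule never halts. If the diagram in the commit message is read as a 3-copy cluster, stage 1 (one node up) would be halted under either rule.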