[sheepdog-users] is --nohalt dangerous?

Dietmar Maurer dietmar at proxmox.com
Wed Jul 18 08:15:12 CEST 2012


Wouldn't it be good enough if 2 out of 3 nodes are online?

> > The question is why that is not the default behavior? Is that dangerous?
> > If so, why?
> >
> 
> To quote from commit 9b6102ce:
> =======================================
>     sheep: introduce SD_STATUS_HALT
> 
>     Currently, sheepdog will serve IO requests even if number of nodes is
>     less than 'copies'.
> 
>     When the number of the nodes (or zones) is less than the copies
>     specified by collie-cluster-format command, the sheepdog cluster should
>     stop serving IO requests.
> 
>     This is necessary to solve the below subtle case:
> 
>     + good nodes, - failed nodes.
> 
>     0       1      2     3
>     +       -      -     +
>     +  -->  - -->  - --> +
>     +       +      -     # <-- permanently down.
>             ^
>             |
>     this node has the latest data
> 
>     at stage 3, we will have a cluster recovered without the data tracked
>     at stage 1.
> 
>     When the nodes are in the SD_STATUS_HALT, the sheepdog can also serve
>     configuration change and do the recovery job.
> ==========================
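
To make the quoted scenario concrete, here is a rough toy model in Python.
It is only a sketch and not sheepdog code: the node names A/B/C, the
version counter, and the functions can_serve_io() and simulate() are all
made up for illustration, and the "nohalt" argument simply means "keep
serving IO with fewer nodes than copies", which is my reading of the
commit message above.

```python
COPIES = 3  # assumed redundancy level, as if chosen at cluster format time


def can_serve_io(alive_nodes, copies, nohalt=False):
    """Halt rule from the quoted commit: refuse IO when nodes < copies."""
    return nohalt or alive_nodes >= copies


def simulate(nohalt):
    """Replay stages 0-3 of the diagram with hypothetical nodes A, B, C.

    Returns (latest_version_written, newest_version_left_at_stage_3).
    """
    write_stages = [
        {"A", "B", "C"},  # stage 0: all nodes up
        {"C"},            # stage 1: only C up
        set(),            # stage 2: every node down
    ]
    survivors = {"A", "B"}  # stage 3: A and B return, C is permanently gone

    version = {"A": 0, "B": 0, "C": 0}
    written = 0
    for up in write_stages:
        # One write is attempted per stage, if any node is up and IO is allowed.
        if up and can_serve_io(len(up), COPIES, nohalt):
            written += 1
            for node in up:
                version[node] = written

    visible = max(version[node] for node in survivors)
    return written, visible


if __name__ == "__main__":
    for flag in (False, True):
        written, visible = simulate(flag)
        print(f"nohalt={flag}: written={written} visible={visible} "
              f"lost={written > visible}")
```

With the halt rule on, the write at stage 1 is refused, so A and B still
hold the newest version when they come back. With it off, C alone accepts
a write at stage 1 and that write is gone once C is permanently down.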


