[sheepdog-users] add node back-in

Liu Yuan namei.unix at gmail.com
Mon Oct 7 08:49:31 CEST 2013


On Mon, Oct 07, 2013 at 07:48:48AM +0200, Kees Bos wrote:
> Just for the record. What is the correct procedure to start sheep again
> after a power failure, in a setup with plugged devices? I noticed that
> the devices have to be added manually, so I wonder whether I have to
> wipe the devices before adding them back in.
> 

There is no special procedure. Normally we just restart sheep with the same
command as before, whatever the reason it went down.

You don't need to wipe the devices manually because sheepdog takes care of
the old data in the directories that hold the data objects. In most cases,
these data objects are moved to a hidden stale directory at restart, for
possible use in recovery, and are purged once everything is okay.

For some unhandled cases, such as the sheep panic caused by a journal replay
error that you hit, we have no choice but to remove the journal files or add
the 'skip' parameter to '-j', which in effect forcibly discards the journal.
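
As a rough illustration of adding 'skip' to '-j', here is a small sketch that
takes the journal options from your previous command and appends 'skip'. The
comma-separated option syntax, the size value and the store path are all
assumptions for the example; please check the help output of your sheep
version before relying on it. It only prints the command so you can review it.

```shell
#!/bin/sh
# Sketch: build a restart command that skips journal replay.
# ASSUMPTION: "-j" takes comma-separated options and accepts "skip";
# confirm against the help output of your sheep version.

# Append "skip" to an existing journal option string (no-op if present).
with_skip() {
    case ",$1," in
        *,skip,*) echo "$1" ;;
        ,,)       echo "skip" ;;
        *)        echo "$1,skip" ;;
    esac
}

PREV_JOURNAL_OPTS="size=256"   # hypothetical: copy from your previous command
STORE="/var/lib/sheepdog"      # hypothetical store directory

# Print the command rather than run it, so it can be reviewed first.
echo "sheep -j $(with_skip "$PREV_JOURNAL_OPTS") $STORE"
```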

If only a few nodes in the cluster lost power, it is safe to remove (or skip)
the journal files at restart, because the node join triggers a data rebalance
(also called recovery in sheepdog terminology), which makes sure the data are
consistent on all nodes.

If unfortunately the whole cluster lost power, and some nodes cannot replay
the journal because of a possible bug, you can

- skip the journal if sheep refuses to start because of it.
- use 'dog vdi check' to verify data consistency.
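
For the second step, a minimal sketch of running the check over several VDIs
could look like the following. The VDI names are passed as arguments by you;
the DRY_RUN variable and the check_vdis helper are made up for this example,
and it assumes 'dog vdi check' exits non-zero when a check fails. By default
it only prints the commands.

```shell
#!/bin/sh
# Sketch: run "dog vdi check" on each VDI named on the command line.
# DRY_RUN and check_vdis are hypothetical names for this example.
# Set DRY_RUN=0 to execute; by default the commands are only printed.

DRY_RUN=${DRY_RUN:-1}

check_vdis() {
    failed=0
    for vdi in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "dog vdi check $vdi"
        elif ! dog vdi check "$vdi"; then
            # ASSUMPTION: a non-zero exit status means the check failed.
            echo "check failed: $vdi" >&2
            failed=1
        fi
    done
    return "$failed"
}

check_vdis "$@"
```

Save it as a script and pass the VDI names you want to verify as arguments.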

Thanks
Yuan


