[sheepdog-users] Questions on Sheepdog Operations
devoid at anl.gov
Wed Apr 9 21:55:45 CEST 2014
We are currently evaluating sheepdog in a development cluster. Here are a
few questions that we've had about sheepdog operations. Please let us know
if you have any thoughts.
1. We know that sheepdog expects consistent mount points for disks. For
example, if I start sheep with "sheep -c corosync:172.21.1.1 --pidfile
/var/run/sheepdog.pid /meta,/disk1,/disk2", I cannot shut down the daemon,
remount the disks with their positions swapped, and restart sheepdog; it
complains loudly that it can't find the right objects in the right places.
Now, if "/disk1" dies and I swap the drive, then format and mount the new
drive at "/disk1", will this cause a problem for sheep?
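To make the scenario concrete, the swap we have in mind looks roughly like
this (the device name and filesystem are placeholders; the sheep invocation
is the one from the example above):

    # /disk1's underlying drive has died and been physically replaced.
    # /dev/sdb1 and ext4 stand in for whatever device/filesystem is in use.
    mkfs.ext4 /dev/sdb1
    mount /dev/sdb1 /disk1        # mount at the SAME path as before
    # restart sheep with the unchanged mount-point list
    sheep -c corosync:172.21.1.1 --pidfile /var/run/sheepdog.pid /meta,/disk1,/disk2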
2. Extending the last example: what is the best way to replace individual
disks? We are using erasure coding, so should I use "dog node md
plug/unplug"? Or restart sheep? Or something else?
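As a concrete sketch of the plug/unplug route (we are assuming "dog node md
unplug <path>" / "dog node md plug <path>" take a disk path and work
per-disk without restarting sheep, which is part of what we are asking):

    # hot-swap a single disk without restarting the sheep daemon (our guess)
    dog node md unplug /disk1     # drop the failing disk from multi-disk mode
    # ...replace the drive, format, and remount it at /disk1...
    dog node md plug /disk1       # bring the new disk back in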
3. Is there a way to get drive-level statistics out of sheepdog? I can of
course use operating system tools for the individual devices, but are there
additional sheepdog-specific stats I should be interested in?
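For context, these are the kinds of OS-level tools we use today per backing
device; we are asking whether sheepdog adds anything beyond this:

    # example OS-level collection on one device (/dev/sdb is a placeholder)
    iostat -x 5 /dev/sdb          # extended per-device utilization/latency
    smartctl -a /dev/sdb          # SMART attributes for pre-failure detection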
4. We are running a cluster with 16:4 erasure coding and multi-disk mode.
How should we think about our failure domains? Here are a few tests that we
ran:
- Shut down a single node in the cluster. We immediately see
replication/recovery logs on all other nodes as objects are copied to meet
the redundancy requirement.
- Unmount a single disk on a single node: no log messages and no
indication of any change in redundancy state. "dog cluster check" indicates
- Unmount 4 disks across 4 nodes: no log messages and no indication
of any change in redundancy state. "No object found" errors abound.
- Unmount 20 disks across 20 nodes: no log messages and no
indication of any change in redundancy state. "No object found" errors abound.
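For reference, each unmount test was driven with plain OS commands on the
affected nodes, along the lines of (the log path assumes sheep logs into
its first store directory, which may not match your setup):

    umount /disk1                 # simulate a single-disk failure
    dog cluster check             # look for redundancy/consistency errors
    tail -f /meta/sheep.log       # watch for any recovery activity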
It appears that sheepdog only starts recovery when a node drops out of
cluster membership (zookeeper or corosync). It is concerning that there are
no active checks or logs for single-disk failures in a multi-disk setup.
Are we missing some option here, or are we misusing this feature?
5. However we structure our redundancy, we would like to be able to safely
offline disks that we identify as performing poorly or as being in SMART
pre-failure/failure mode. What procedure should we use when replacing
disks? Should we run "dog cluster check" after each disk? "cluster
rebalance"? "cluster recover"? Can we do more than one disk at a time?
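To make the question concrete, the per-disk procedure we are considering is
roughly the following, built only from the commands mentioned above; the
ordering and necessity of each step is exactly what we are unsure about:

    dog node md unplug /disk1     # take the suspect disk offline
    # ...replace/format/remount the drive at /disk1...
    dog node md plug /disk1       # re-add it
    dog cluster check             # verify objects before the next disk?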