[sheepdog-users] What is cooking in master # 2013.8.9

Fri Aug 9 14:52:05 CEST 2013

Hello sheepdog walkers,

This is a report on the current development status of the sheepdog project.

From now on, I'll split the report into two parts: user-visible and
developer-visible changes.

Here is what we are cooking right now on the development mailing list:

User-visible changes:

 1 Release Cycle 

   Our release cycle has changed: we'll now release a stable version every 3
   months, on exactly the same release day as QEMU. For example, v0.7.0 will
   be released on 8.15, and v0.8.0 is scheduled for 11.15. Besides the release
   cycle, we'll maintain a stable branch for every release version, though how
   long each stable branch will be supported is not fixed yet. The current
   stable branch is 'stable-0.6', and after v0.7.0 is released, 'stable-0.7'
   will be created. At the same time, 'stable-0.6' will still be maintained.

   We also have a new policy for versioning release candidates, see
   http://lists.wpkg.org/pipermail/sheepdog/2013-August/011435.html

 2 Cluster-wide snapshot is considered working, testing requested

   We have moved 'farm' from sheep to collie as the code base for the new
   cluster-wide snapshot. So 'plain' is now the only store backend left.

   features done: incremental backup, auto-deduplication, slicing for dedup

   In a simple test, it achieves nearly 50% deduplication for images. For more
   info, see https://github.com/collie/sheepdog/wiki/Backend-Stores%2C-Object-Cache-and-Disk-Cache#cluster-backup
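   To illustrate how slicing can give both deduplication and incremental
   backup, here is a minimal sketch. It is not the farm code in collie: the
   fixed slice size, the use of SHA-1 as the content address, and the
   SliceStore class with its methods are all assumptions made for this example.

```python
import hashlib

SLICE_SIZE = 4  # hypothetical slice size in bytes, kept tiny for the demo


class SliceStore:
    """Toy content-addressed store: identical slices are kept only once."""

    def __init__(self):
        self.slices = {}  # digest -> slice bytes

    def put(self, data):
        """Split data into fixed-size slices and return the list of digests."""
        digests = []
        for off in range(0, len(data), SLICE_SIZE):
            chunk = data[off:off + SLICE_SIZE]
            digest = hashlib.sha1(chunk).hexdigest()
            self.slices.setdefault(digest, chunk)  # store only unseen content
            digests.append(digest)
        return digests

    def get(self, digests):
        """Rebuild the original data from its list of digests."""
        return b"".join(self.slices[d] for d in digests)

    def stored_bytes(self):
        return sum(len(c) for c in self.slices.values())


store = SliceStore()
image_v1 = b"AAAABBBBAAAACCCC"  # first snapshot, one slice is repeated
image_v2 = b"AAAABBBBAAAADDDD"  # second snapshot, only the last slice changed

recipe_v1 = store.put(image_v1)
after_v1 = store.stored_bytes()
recipe_v2 = store.put(image_v2)
after_v2 = store.stored_bytes()

print(f"snapshot 1: {len(image_v1)} bytes in, {after_v1} bytes stored")
print(f"snapshot 2: {len(image_v2)} bytes in, {after_v2 - after_v1} new bytes stored")
```

   The second snapshot only adds the slice that changed, which is the sense
   in which a content-addressed store makes later backups incremental.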

 3 Better recovery status from 'collie node recovery'

   For example, the new output is:
   $ collie node recovery
      Id   Host:Port         V-Nodes       Zone       Progress
       0   10.68.13.1:7000        64   17646602       56.3%
       1   10.68.13.2:7000        64   34423818       37.2%
       2   10.68.13.3:7000        64   51201034       18.1%
       3   10.68.13.4:7000        64   67978250       68.9%
       4   10.68.13.5:7000        64   84755466        3.9%
       5   10.68.13.6:7000        64  101532682       52.6%
       6   10.68.13.7:7000        64  118309898       77.9%
       7   10.68.13.8:7000        64  135087114       38.9%
       8   10.68.13.9:7000        64  151864330        8.7%
       9   10.68.13.10:7000       64  168641546       97.1%
      10   10.68.13.11:7000       64  185418762        1.1%
      11   10.68.13.12:7000       64  202195978       25.6%

   You can also watch one node's progress with a progress bar:
   $ collie node recovery --progress
   99.7 % [==============================================>] 7047 / 7068
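   For monitoring scripts, the table can be turned into a dictionary. The
   sketch below assumes the five-column layout shown in the sample above and
   nothing more; parse_recovery is a hypothetical helper, not part of collie.

```python
SAMPLE = """\
  Id   Host:Port         V-Nodes       Zone       Progress
   0   10.68.13.1:7000        64   17646602       56.3%
   1   10.68.13.2:7000        64   34423818       37.2%
   9   10.68.13.10:7000       64  168641546       97.1%
"""


def parse_recovery(text):
    """Return {"host:port": progress percent}, skipping the header line."""
    progress = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have five columns and a numeric Id in the first one.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        progress[fields[1]] = float(fields[4].rstrip("%"))
    return progress


nodes = parse_recovery(SAMPLE)
slowest = min(nodes, key=nodes.get)
print(f"{len(nodes)} nodes recovering, slowest is {slowest} at {nodes[slowest]}%")
```

   On a live cluster the text could come from
   subprocess.run(["collie", "node", "recovery"], capture_output=True, text=True).stdout
   instead of the SAMPLE string.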

 4 OpenStack Glance support is merged upstream

   Users can now use sheepdog as the backend storage for both
   Cinder (volume service) and Glance (image service).

   We are also working on providing an 'object storage' service on top of
   sheepdog. We plan to provide an OpenStack Swift compatible API first.

 5 More detailed help messages

   Users can get elaborate help messages for specific sheep options from the
   sheep binary. For example, if you don't know how to use the cache, try

   $ sheep -w
   sheep/sheep: option requires an argument -- 'w'
Available arguments:
	size=: size of the cache in megabyes
	dir=: path to the location of the cache (default: $STORE/cache)
	directio: use directio mode for cache IO, if not specified use buffered IO

Example:
	$ sheep -w size=200000,dir=/my_ssd,directio ...
This tries to use /my_ssd as the cache storage with 200G allocted to the
cache in directio mode

   Detailed usage documentation with examples will be provided.
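   The structure of the -w argument (comma-separated key=value pairs plus a
   bare directio flag) can be sketched as follows. This is not the parser
   used by sheep; parse_cache_option and its store parameter are hypothetical,
   and the store path stands in for $STORE.

```python
def parse_cache_option(arg, store="/store"):
    """Split a 'size=N,dir=PATH,directio' string into a settings dict."""
    # Defaults follow the help text: dir falls back to $STORE/cache and
    # buffered IO is used unless directio is given.
    opts = {"size_mb": None, "dir": store + "/cache", "directio": False}
    for item in arg.split(","):
        key, sep, value = item.partition("=")
        if key == "size" and sep:
            opts["size_mb"] = int(value)  # size is given in megabytes
        elif key == "dir" and sep:
            opts["dir"] = value
        elif key == "directio" and not sep:
            opts["directio"] = True
        else:
            raise ValueError(f"unknown cache argument: {item!r}")
    return opts


print(parse_cache_option("size=200000,dir=/my_ssd,directio"))
```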

 6 Simplified restart of the cluster after a crash

   In the past, we had to start many sheep twice if the epochs of the sheep
   became inconsistent. Now it takes at most two steps to relaunch the nodes
   from a crashed state:

   A. Start sheep on each node one by one. After all the nodes are up again,
      the cluster will be working. That is all. But

   B. If some nodes are physically down and will never come back up, try
      $ collie cluster recover force
      The cluster will be working again. Done.
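   The two steps above amount to a small decision, sketched here for an
   operator script. The function, its arguments and the node names are all
   hypothetical; the only real command it mentions is the one from step B.

```python
def recovery_step(expected, running, dead):
    """Pick the next action after a crash.

    expected: set of all nodes that belong to the cluster
    running:  set of nodes where sheep has been started again
    dead:     set of nodes known to be physically down for good
    """
    missing = expected - running
    if not missing:
        return "done: all sheep are up, the cluster should be working"
    if missing <= dead:
        # Step B: every missing node is gone for good.
        return "run: collie cluster recover force"
    # Step A is not finished yet: some nodes can still be started.
    return "start sheep on: " + ", ".join(sorted(missing - dead))


cluster = {"node1", "node2", "node3"}
print(recovery_step(cluster, {"node1"}, set()))
print(recovery_step(cluster, {"node1", "node2"}, {"node3"}))
print(recovery_step(cluster, cluster, set()))
```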

 7 Query the cache information on the node

   A new command is added to get information about cache usage.
Usage:
 $ collie vdi cache info # Get the cache information of the node

Example:
yliu@ubuntu-precise:~/sheepdog$ collie/collie vdi cache info
Name    Tag     Total   Dirty   Clean
test            88 MB   68 MB   20 MB
data            88 MB   0.0 MB  88 MB

Cache size 200 MB, used 176 MB
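The output above can be parsed for scripting. The sketch below assumes the
layout of this sample only (sizes always printed in MB, the Tag column
possibly empty); parse_cache_info and the two regular expressions are
hypothetical helpers, not part of collie.

```python
import re

SAMPLE = """\
Name    Tag     Total   Dirty   Clean
test            88 MB   68 MB   20 MB
data            88 MB   0.0 MB  88 MB

Cache size 200 MB, used 176 MB
"""

# Name, optional tag, then three "<number> MB" columns.
ROW = re.compile(r"^(\S+)\s+(?:(\S+)\s+)?([\d.]+) MB\s+([\d.]+) MB\s+([\d.]+) MB$")
SUMMARY = re.compile(r"^Cache size ([\d.]+) MB, used ([\d.]+) MB$")


def parse_cache_info(text):
    """Return (list of per-VDI dicts, summary dict or None), sizes in MB."""
    rows, summary = [], None
    for line in text.splitlines():
        line = line.strip()
        m = ROW.match(line)
        if m:
            name, tag, total, dirty, clean = m.groups()
            rows.append({"name": name, "tag": tag or "", "total": float(total),
                         "dirty": float(dirty), "clean": float(clean)})
            continue
        m = SUMMARY.match(line)
        if m:
            summary = {"size": float(m.group(1)), "used": float(m.group(2))}
    return rows, summary


rows, summary = parse_cache_info(SAMPLE)
dirty = sum(r["dirty"] for r in rows)
print(f"{len(rows)} VDIs cached, {dirty} MB dirty, {summary['used']} of {summary['size']} MB used")
```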

 8 Deb package support

   Now we can build a deb or rpm package on our own with a simple command:
   $ make deb # package for Debian-based systems
   $ make rpm # package for Red Hat-based systems

===============================================================================

Developer-visible changes:
 1 Node management (group.c) is largely restructured and simplified.

 2 The trace infrastructure is refined, and loop_checker and thread_checker are
   added. With these tools, we can easily find out which function hogs the
   sheep daemon.

 3 The zookeeper driver is enhanced, especially for session timeout handling,
   and is much more stable. Users are advised to use zookeeper now.

 4 A unit test framework is added but lacks test cases. A dynamorio
   instrumentation tool is also added to tests/, and it lacks test cases too.

 5 libsheepdog has been discussed, but no code is ready yet.

*******************************************************************************
Old posts of the report:

  What is cooking in master # 2013.5.23
  http://lists.wpkg.org/pipermail/sheepdog-users/2013-May/000746.html