[sheepdog] What is cooking in master # 2014.1.22
Liu Yuan
namei.unix at gmail.com
Wed Jan 22 08:07:07 CET 2014
Hello sheepdog walkers,
This is a report on the current development status of the sheepdog project.
It is a summary of what is new in v0.8.0 compared to v0.7.0.
User-visible changes:
1 Data distribution algorithm is changed
We introduce a better hash algorithm for data distribution over all nodes and
disks. As a result, you can expect better data and request distribution over
nodes and disks. This change is NOT backward compatible, meaning that users
can't upgrade from v0.7.x to v0.8.0 directly. One way to migrate old data
to v0.8.x is to use 'dog cluster snapshot' to save the cluster data with v0.7.x
first and then load it into v0.8.0.
2 Add x86 hardware acceleration for sha1
Cluster snapshot and recovery rely on the sha1 algorithm, so with this
acceleration, users can expect slightly faster recovery/snapshotting if the
CPU supports SSSE3.
3 Basic request statistics
A new 'dog node stat' command has been added to report basic request statistics.
$ dog node stat -w
Request  Active  Total   Write   Read    Remove  Flush  All WR  All RD  WRBW   RDBW    RPS
Client   5       341362  100459  234375  0       6528   44 GB   26 GB   98 MB  0.0 MB  0
Peer     0       41128   4008    37120   0       0      8.6 GB  3.9 GB  24 MB  0.0 MB  0
Request  Active  Total   Write   Read    Remove  Flush  All WR  All RD  WRBW   RDBW    RPS
Client   0       341969  100459  234982  0       6528   44 GB   26 GB   92 MB  0.0 MB  0
Peer     0       41220   4008    37212   0       0      8.6 GB  3.9 GB  24 MB  0.0 MB  0
Client means all the requests from VMs on this node.
Peer means all the requests from other nodes.
Active: requests currently being handled
RPS: requests per second
BW: bandwidth per second
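To read the two samples above as a change over time, the counters can be subtracted row by row. The following is only a rough sketch: the helper names (parse_row, deltas) are made up for illustration and are not part of dog, and it assumes the column order shown in the sample output.

```python
# Two consecutive samples copied from the 'dog node stat -w' output above
# (header rows dropped).
SAMPLES = """\
Client 5 341362 100459 234375 0 6528 44 GB 26 GB 98 MB 0.0 MB 0
Peer 0 41128 4008 37120 0 0 8.6 GB 3.9 GB 24 MB 0.0 MB 0
Client 0 341969 100459 234982 0 6528 44 GB 26 GB 92 MB 0.0 MB 0
Peer 0 41220 4008 37212 0 0 8.6 GB 3.9 GB 24 MB 0.0 MB 0
"""

# The first six numeric columns, in the order they appear in the output.
COUNTERS = ("active", "total", "write", "read", "remove", "flush")


def parse_row(line):
    """Return (request_type, {counter: int}) for one data row."""
    fields = line.split()
    return fields[0], dict(zip(COUNTERS, map(int, fields[1:7])))


def deltas(text):
    """Difference between the last and the first sample, per request type."""
    first, last = {}, {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, row = parse_row(line)
        first.setdefault(kind, row)  # keep only the earliest sample
        last[kind] = row             # keep overwriting with the latest one
    return {
        kind: {c: last[kind][c] - first[kind][c] for c in COUNTERS if c != "active"}
        for kind in first
    }


if __name__ == "__main__":
    for kind, diff in deltas(SAMPLES).items():
        print(kind, diff)
```

In this particular sample, Total equals Write + Read + Remove + Flush on every row, and all new requests between the two samples are reads.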
4 Multi-Disk enhancement
We now support an unlimited number of disks in local nodes. Previously, we
supported at most 64 disks.
5 Maximum number of nodes in a cluster
With zookeeper, we can support 6k+ nodes right now. Previously, we supported
1024 nodes at most.
6 Erasure Code
Erasure Coding (EC) is a redundancy scheme that achieves high availability of
data with much less storage overhead compared to complete replication. EC
in sheepdog is meant to run VM guests, not only to hold cold data, because our
EC's performance is somewhat better than full replication. For more info, see
https://github.com/sheepdog/sheepdog/wiki/Erasure-Code-Support
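The storage overhead argument can be checked with simple arithmetic. The sketch below is illustrative only and assumes the usual model in which an x:y erasure-coded layout stores x data strips plus y parity strips and survives the loss of any y of them; the function names are hypothetical and not taken from sheepdog.

```python
from fractions import Fraction


def replication_overhead(copies):
    """Raw bytes stored per byte of user data with full replication."""
    return Fraction(copies)


def ec_overhead(data_strips, parity_strips):
    """Raw bytes stored per byte of user data with an x:y erasure code."""
    return Fraction(data_strips + parity_strips, data_strips)


if __name__ == "__main__":
    # Both layouts below tolerate the loss of two nodes.
    print("3 copies :", float(replication_overhead(3)), "x raw storage")
    print("4:2 EC   :", float(ec_overhead(4, 2)), "x raw storage")
```

Under these assumptions, three full copies cost 3x the raw storage while a 4:2 layout costs 1.5x for the same number of tolerated failures.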
7 Hyper Volume
We introduce hyper volumes (up to 16 PB) in this release. Currently, only
sheepfs and the http service make use of them. This means you can mount a 16 PB
volume into the local file system via sheepfs. QEMU and iSCSI don't support
them yet.
8 HTTP Simple Storage
Sheepdog HTTP simple storage provides a simple object store/retrieve service
via a RESTful API, similar to OpenStack Swift or Amazon S3. Our HTTP API only
supports GET/PUT/DELETE/POST/HEAD operations. We implement a subset of the
OpenStack Swift API for now and plan to be Amazon S3 compatible in the
future.
We have the same API to upload/download small and big objects and support
object sizes up to 16 PB.
For more info, see
https://github.com/sheepdog/sheepdog/wiki/HTTP-Simple-Storage
9 Logger Enhancement
Users can now rotate the sheep log (sheep.log) by sending SIGHUP, and
we also support logging dog operations by setting SHEEPDOG_DOG_LOG and
SHEEPDOG_DOG_LOG_PATH for dog. E.g.,
$ export SHEEPDOG_DOG_LOG
$ export SHEEPDOG_DOG_LOG_PATH=/var/log/dog.log # if not set, we'll log in syslog
# then all your dog operations are logged.
10 Strict mode added back
Previously, we supported safe/quorum/unsafe modes, but they were removed in a
later commit because in some corner cases we couldn't keep the 'safe' promise.
But sometimes we want to make sure we write the exact number of copies to
honor the promise of redundancy; this is what "strict mode" is for. It means
that once the targeted data is written, it is as redundant as promised and can
withstand random node failures.
For example, with a 4:2 policy, we need to write to at least 6 nodes, with data
strips and parity strips. In non-strict mode, a write is allowed to succeed as
long as the data is fully written, with only 4 nodes alive.
We can pass '-t|--strict' to 'dog cluster format' to enable strict mode.
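The difference between the two modes in the 4:2 example can be modelled as a simple threshold on the number of alive nodes. This is a simplified sketch of the rule as described above, not sheepdog's actual code; write_allowed and its parameters are hypothetical names.

```python
def write_allowed(alive_nodes, data_strips, parity_strips, strict):
    """Decide whether a write may succeed under an x:y policy.

    Assumption: strict mode needs every data and parity strip written
    (x + y nodes), while non-strict mode only needs the data strips (x nodes).
    """
    required = data_strips + parity_strips if strict else data_strips
    return alive_nodes >= required


if __name__ == "__main__":
    for alive in (3, 4, 5, 6):
        print(
            alive,
            "nodes alive: strict =", write_allowed(alive, 4, 2, strict=True),
            "non-strict =", write_allowed(alive, 4, 2, strict=False),
        )
```

With 5 nodes alive, for instance, this model accepts the write in non-strict mode and refuses it in strict mode.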
11 Sheepdog VM auto-reconnect is officially supported by QEMU v1.7
Because this is all about the code in the QEMU sheepdog block driver, the
auto-reconnect feature is supported by all sheepdog releases.
12 Synchronous 'dog vdi delete'
This command now waits until all the data is removed before returning.
Previously it returned immediately, even if 'dog vdi delete' failed some time
later.
*******************************************************************************
Old posts of the report:
What is cooking in master # 2013.5.23
http://lists.wpkg.org/pipermail/sheepdog-users/2013-May/000746.html
What is cooking in master # 2013.8.9
http://lists.wpkg.org/pipermail/sheepdog-users/2013-August/001142.html
Thanks
Yuan