[sheepdog] on the wire protocols: structure, versioning, etc
Christoph Hellwig
hch at infradead.org
Fri Jun 15 17:54:03 CEST 2012
I've recently been going over the on-the-wire structures a bit, and here
are a few notes and questions:
* endianness annotations
I think we should add Linux-kernel style endianness annotations
for the on-the-wire (and on-disk) structures. Not only does
this allow interoperability between hosts of different endianness,
which some might consider only a minor feature, but it also
makes it very clear in the code which structures are on the wire,
so that they are handled with care when they eventually change,
and so that special care is taken with alignment and similar
issues.
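A minimal sketch of what such annotations could look like, assuming
sparse is the checker. The names sd_bitwise, sd_force, sd_be32_t,
sd_cpu_to_be32(), sd_be32_to_cpu() and struct sd_example_hdr are made up
for illustration and are not existing sheepdog code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical annotation macros modelled on the kernel's __bitwise and
 * __force.  sparse defines __CHECKER__; a normal compiler sees nothing.
 */
#ifdef __CHECKER__
#define sd_bitwise __attribute__((bitwise))
#define sd_force   __attribute__((force))
#else
#define sd_bitwise
#define sd_force
#endif

/* A 32-bit value whose in-memory bytes are in big-endian order. */
typedef uint32_t sd_bitwise sd_be32_t;

/* Convert a host-order value to wire order on any host endianness. */
static inline sd_be32_t sd_cpu_to_be32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return (sd_force sd_be32_t)raw;
}

/* Convert a wire-order value back to host order. */
static inline uint32_t sd_be32_to_cpu(sd_be32_t v)
{
	uint8_t b[4];

	memcpy(b, &v, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Illustrative wire header: every field carries its byte order. */
struct sd_example_hdr {
	sd_be32_t opcode;
	sd_be32_t data_length;
};

/* Catch accidental padding or size changes at compile time. */
static_assert(sizeof(struct sd_example_hdr) == 8,
	      "wire header must not contain padding");
```

With this in place, sparse would warn wherever an sd_be32_t field is
mixed with a plain integer without going through one of the conversion
helpers, which is exactly the "handle with care" marker described above.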
Which brings me to the next issue:
* clearly defining the on wire protocols
Right now there are a lot of structures on the wire in places
where you don't expect them. Besides the obvious bits in
include/sheepdog_proto.h there are:
- some additional opcodes and their structures in
include/sheep.h, which also hosts some code shared between
collie and the sheep daemon
- more in sheep/sheep_priv.h, which otherwise just contains
structures private to the main module of the sheep daemon
that are not even shared with the cluster driver
- some payloads defined directly in sheep/group.c
- the cluster driver specific event types, defined directly
inside the cluster drivers
This turns into the next thing:
* splitting the different protocols
While all communication between the components of sheepdog shares
some common constants, there are at least two, if not three,
different sub-protocols:
- the main user facing protocol, spoken between the qemu
frontend (or any other plain I/O frontend) and the gateway
sheep
- the protocol between sheep daemons, including the
cluster driver level events, join/leave/notify messages,
SD_FLAG_CMD_IO_LOCAL type read/write requests, get object
list commands for recovery
- any magic admin communication between collie and sheep,
although by some argument these could be added to either
of the above ones on a case by case basis.
Identifying them as different protocols will also allow us to
version them differently. That includes basically unlimited
backwards compatibility for the frontend, while allowing us to
increment the backend protocol revision and thus either let a
sheep with the wrong version fail the join gracefully or, with
some effort, allow sheep of different versions to interoperate
(with a lot of testing overhead).
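To make the versioning idea concrete, here is a rough sketch. All names
and version numbers below (EX_FRONTEND_PROTO_MIN, EX_PEER_PROTO_VER,
ex_check_peer_version() and so on) are invented for the example; it only
shows the simple "fail the join gracefully" variant, not interoperation
between peer versions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical version numbers, one set per sub-protocol. */
#define EX_FRONTEND_PROTO_MIN 1	/* oldest frontend version still accepted */
#define EX_FRONTEND_PROTO_MAX 3	/* newest frontend version we speak */
#define EX_PEER_PROTO_VER     7	/* the only sheep-to-sheep version accepted */

static_assert(EX_FRONTEND_PROTO_MIN <= EX_FRONTEND_PROTO_MAX,
	      "frontend version range must not be empty");

enum ex_join_result {
	EX_JOIN_OK,
	EX_JOIN_VERSION_MISMATCH,
};

/* Frontend protocol: accept the whole supported range of versions. */
static inline bool ex_frontend_version_ok(uint8_t ver)
{
	return ver >= EX_FRONTEND_PROTO_MIN && ver <= EX_FRONTEND_PROTO_MAX;
}

/* Peer protocol: a joining sheep must match exactly or is refused. */
static inline enum ex_join_result ex_check_peer_version(uint8_t remote_ver)
{
	return remote_ver == EX_PEER_PROTO_VER ?
		EX_JOIN_OK : EX_JOIN_VERSION_MISMATCH;
}
```

The point of the split is visible in the two checks: the frontend side
accepts a range and never has to shrink it, while the peer side can bump
its single number whenever the backend structures change.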
If everyone agrees with these basic concepts I'd like to move forward
with:
(1) split each sub-protocol into a well-documented header
(2) add sparse annotations
(3) replace the SD_FLAG_CMD_IO_LOCAL flag with different operation
types for the sheep peer I/O. Not only does it make clear they
are part of a different protocol, but it will also allow us to
use normal ops.c-like dispatch for the gateway
(4) add separate versioning for the sheep peer protocol, and probably
for the per-cluster-driver protocols.
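For item (3), a sketch of how separate peer opcodes could feed a plain
table-driven dispatch. The opcode names and values, struct ex_op and
ex_find_op() are hypothetical and only loosely modelled on the idea of an
ops table; they do not reflect the actual contents of ops.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes: peer I/O gets its own range, not a flag. */
#define EX_OP_PEER_BASE 0x80

enum ex_opcode {
	EX_OP_READ_OBJ   = 0x01,	/* frontend -> gateway */
	EX_OP_WRITE_OBJ  = 0x02,
	EX_OP_READ_PEER  = EX_OP_PEER_BASE + 1,	/* sheep -> sheep */
	EX_OP_WRITE_PEER = EX_OP_PEER_BASE + 2,
};

static_assert(EX_OP_WRITE_OBJ < EX_OP_PEER_BASE,
	      "frontend opcodes must stay below the peer range");

enum ex_path { EX_PATH_GATEWAY, EX_PATH_PEER };

struct ex_op {
	uint8_t opcode;
	const char *name;
	enum ex_path (*process)(void);
};

/* Stand-ins for the real work: forward to peers vs. local store I/O. */
static enum ex_path ex_gateway_io(void) { return EX_PATH_GATEWAY; }
static enum ex_path ex_peer_io(void) { return EX_PATH_PEER; }

static const struct ex_op ex_ops[] = {
	{ EX_OP_READ_OBJ,   "READ_OBJ",   ex_gateway_io },
	{ EX_OP_WRITE_OBJ,  "WRITE_OBJ",  ex_gateway_io },
	{ EX_OP_READ_PEER,  "READ_PEER",  ex_peer_io },
	{ EX_OP_WRITE_PEER, "WRITE_PEER", ex_peer_io },
};

/* Look up the handler by opcode alone; no flag inspection needed. */
static inline const struct ex_op *ex_find_op(uint8_t opcode)
{
	size_t i;

	for (i = 0; i < sizeof(ex_ops) / sizeof(ex_ops[0]); i++)
		if (ex_ops[i].opcode == opcode)
			return &ex_ops[i];
	return NULL;	/* unknown opcode */
}
```

Because the opcode alone selects the handler, the gateway path and the
peer path no longer need to share one handler that branches on a flag.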