[sheepdog] on the wire protocols: structure, versioning, etc
MORITA Kazutaka
morita.kazutaka at lab.ntt.co.jp
Mon Jun 18 23:49:12 CEST 2012
At Sat, 16 Jun 2012 00:26:13 +0800,
Liu Yuan wrote:
>
> On 06/15/2012 11:54 PM, Christoph Hellwig wrote:
> > I've recently been going over the on-the-wire structures a bit, and here are a
> > few notes and questions:
> >
> > * endianness annotations
> >
> > I think we should add Linux-kernel style endianness annotations
> > for the on the wire (and on disk) structures. Not only does
> > this allow for inter-operability of different endianness hosts
> > which some might only consider a minor feature, but it also
> > makes it very clear in the code which are the on wire
> > structures, so that they are handled with care for eventual
> > changes, and special care is taken of alignment and similar
> > issues.
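
To make the annotation idea concrete, here is a minimal sketch of what sparse-checked big-endian types could look like. All names (sd_bitwise, sd_force, sd_be32, sd_ex_wire_hdr, sd_ex_hdr_init) are hypothetical and not taken from the sheepdog tree, and the header layout is invented purely for illustration. The attributes only take effect when sparse runs (it defines __CHECKER__); a normal compiler sees plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro names; the attributes are only visible to sparse. */
#ifdef __CHECKER__
#define sd_bitwise __attribute__((bitwise))
#define sd_force   __attribute__((force))
#else
#define sd_bitwise
#define sd_force
#endif

/* A 32-bit value stored in big-endian (network) byte order. */
typedef uint32_t sd_bitwise sd_be32;

/* Convert host order to big-endian; byte-wise, so host endianness is irrelevant. */
static inline sd_be32 sd_cpu_to_be32(uint32_t v)
{
	union { uint8_t b[4]; uint32_t raw; } u = {
		.b = { (uint8_t)(v >> 24), (uint8_t)(v >> 16),
		       (uint8_t)(v >> 8), (uint8_t)v }
	};
	return (sd_force sd_be32)u.raw;
}

/* Convert big-endian back to host order. */
static inline uint32_t sd_be32_to_cpu(sd_be32 v)
{
	union { uint8_t b[4]; uint32_t raw; } u;

	u.raw = (sd_force uint32_t)v;
	return ((uint32_t)u.b[0] << 24) | ((uint32_t)u.b[1] << 16) |
	       ((uint32_t)u.b[2] << 8) | (uint32_t)u.b[3];
}

/* Invented example header: the annotated type marks it as a wire structure. */
struct sd_ex_wire_hdr {
	uint8_t proto_ver;
	uint8_t opcode;
	uint8_t reserved[2];	/* explicit padding, no compiler-inserted holes */
	sd_be32 data_length;
};

/* Catch accidental layout changes at compile time. */
_Static_assert(sizeof(struct sd_ex_wire_hdr) == 8,
	       "unexpected padding in wire header");

/* Fill a header, converting at the boundary between host and wire values. */
static inline void sd_ex_hdr_init(struct sd_ex_wire_hdr *hdr,
				  uint8_t opcode, uint32_t len)
{
	assert(hdr);
	hdr->proto_ver = 1;	/* arbitrary example value */
	hdr->opcode = opcode;
	hdr->reserved[0] = 0;
	hdr->reserved[1] = 0;
	hdr->data_length = sd_cpu_to_be32(len);
}
```

With types like these, sparse should warn when a host-order integer is assigned to data_length without going through the conversion helper, which is the "handled with care" property described above.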
> >
> > Which brings me to the next issue:
> >
> > * clearly defining the on wire protocols
> >
> > Right now there are a lot of structures on the wire in places
> > where you don't expect it. Besides the obvious bits in
> > include/sheepdog_proto.h there are some additional opcodes and
> > their structures in include/sheep.h which also hosts some
> > shared code between collie and the sheep daemon, in
> > sheep/sheep_priv.h which otherwise just includes structures
> > private to the main module of the sheep daemon and not even
> > shared with the cluster driver, some payloads directly defined
> > in sheep/group.c, and the cluster driver specific event types
> > directly inside the cluster drivers.
> >
> > This turns into the next thing:
> >
> > * splitting the different protocols
> >
> > While all communication between the components of sheepdog share
> > some common constants there are at least two, if not three
> > different sub protocols:
> >
> > - the main user facing protocol, spoken between the qemu
> > frontend (or any other plain I/O frontend) and the gateway
> > sheep
> > - the protocol between sheep daemons, including the
> > cluster driver level events, join/leave/notify messages,
> > SD_FLAG_CMD_IO_LOCAL type read/write requests, get object
> > list commands for recovery
> > - any magic admin communication between collie and sheep,
> > although by some argument these could be added to either
> > of the above ones on a case by case basis.
> >
> > Identifying them as different protocols will also allow us to
> > version them differently, including basically unlimited
> > backwards compatibility for the frontend, while allowing us to
> > increment the backend protocol revisions and thus either letting
> > sheep with the wrong version fail the join gracefully, or with
> > some effort allowing sheep to interoperate with different
> > versions (with a lot of testing overhead).
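
The "fail the join gracefully" case could be sketched roughly as follows. This is only an illustration under assumptions: the version numbers, the ex_join_msg layout, and ex_check_join() are all hypothetical, and a real join message would carry far more than two version bytes. The point is that the peer protocol version is checked on its own, independently of the frontend protocol version.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range of peer protocol revisions this sheep can speak. */
enum {
	EX_PEER_PROTO_MIN = 2,
	EX_PEER_PROTO_CUR = 3,
};

enum ex_join_result {
	EX_JOIN_OK,
	EX_JOIN_VERSION_MISMATCH,
};

/* Invented join payload: each sub-protocol carries its own version. */
struct ex_join_msg {
	uint8_t client_proto_ver;	/* frontend protocol, not checked here */
	uint8_t peer_proto_ver;		/* sheep-to-sheep protocol */
};

/* Accept a joining sheep only if its peer protocol revision is supported. */
static enum ex_join_result ex_check_join(const struct ex_join_msg *msg)
{
	assert(msg);
	if (msg->peer_proto_ver < EX_PEER_PROTO_MIN ||
	    msg->peer_proto_ver > EX_PEER_PROTO_CUR)
		return EX_JOIN_VERSION_MISMATCH;
	return EX_JOIN_OK;
}
```

Setting EX_PEER_PROTO_MIN equal to EX_PEER_PROTO_CUR gives the strict variant, where any mismatch fails the join; widening the range is the "interoperate with different versions" variant and is where the testing overhead comes from.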
> >
> > If everyone agrees with these basic concepts I'd like to move forward
> > with:
> >
> > (1) split each sub-protocol into a well-documented header
> > (2) add sparse annotations
> > (3) replace the SD_FLAG_CMD_IO_LOCAL flag with different operation
> > types for the sheep peer I/O. Not only does it make clear they
> > are part of a different protocol, but it will also allow us to
> > use normal ops.c-like dispatch for the gateway
> > (4) add separate versioning for the sheep peer protocol, and probably
> > the per-cluster driver protocols.
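
For item (3), a rough sketch of how distinct peer opcodes could feed a table-driven dispatch instead of a flag test. The opcode values, the ex_op table, and the handler names are all hypothetical; this is not the actual ops.c structure, just the general shape of such a lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode values: peer I/O gets its own opcode, not a flag. */
enum {
	EX_OP_READ_OBJ  = 0x01,	/* frontend -> gateway sheep */
	EX_OP_READ_PEER = 0x81,	/* gateway sheep -> peer sheep */
};

/* Which code path a request ends up on. */
enum ex_path {
	EX_PATH_GATEWAY,	/* forward to the sheep holding the object */
	EX_PATH_LOCAL,		/* serve from the local store */
};

struct ex_op {
	uint8_t opcode;
	const char *name;
	enum ex_path (*handler)(void);
};

static enum ex_path ex_gateway_read(void) { return EX_PATH_GATEWAY; }
static enum ex_path ex_peer_read(void)    { return EX_PATH_LOCAL; }

/* One table entry per opcode; the protocol a request belongs to is explicit. */
static const struct ex_op ex_ops[] = {
	{ EX_OP_READ_OBJ,  "READ_OBJ",  ex_gateway_read },
	{ EX_OP_READ_PEER, "READ_PEER", ex_peer_read },
};

/* Return the table entry for an opcode, or NULL if it is unknown. */
static const struct ex_op *ex_lookup_op(uint8_t opcode)
{
	for (size_t i = 0; i < sizeof(ex_ops) / sizeof(ex_ops[0]); i++)
		if (ex_ops[i].opcode == opcode)
			return &ex_ops[i];
	return NULL;
}

/* Dispatch by opcode alone; returns 0 on success, -1 for an unknown opcode. */
static int ex_dispatch(uint8_t opcode, enum ex_path *path)
{
	const struct ex_op *op = ex_lookup_op(opcode);

	assert(path);
	if (!op)
		return -1;
	*path = op->handler();
	return 0;
}
```

Compared with testing SD_FLAG_CMD_IO_LOCAL inside a shared handler, the choice of code path here falls out of the opcode lookup itself.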
>
> Looks good to me, and I look forward to this change.
Looks good to me too. :)
Thanks,
Kazutaka