[sheepdog] on the wire protocols: structure, versioning, etc

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Mon Jun 18 23:49:12 CEST 2012


At Sat, 16 Jun 2012 00:26:13 +0800,
Liu Yuan wrote:
> 
> On 06/15/2012 11:54 PM, Christoph Hellwig wrote:
> > I've recently been going over the on the wire structures a bit, and here's a
> > few notes and questions:
> > 
> > * endianness annotations
> > 
> > 	I think we should add Linux-kernel style endianness annotations
> > 	for the on the wire (and on disk) structures.  Not only does
> > 	this allow interoperability between hosts of different
> > 	endianness, which some might consider only a minor feature, but
> > 	it also makes it very clear in the code which structures are on
> > 	the wire, so that they are handled with care when they
> > 	eventually change, and so that alignment and similar issues get
> > 	special attention.
> > 
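To make the annotation idea concrete, here is a rough sketch of what
sparse-checkable big-endian types could look like.  All names in it
(SD_BITWISE, SD_FORCE, sd_be32, sd_cpu_to_be32, sd_be32_to_cpu and
struct ex_wire_hdr) are made up for illustration and are not taken from
the sheepdog tree; only the general bitwise/force attribute pattern is
borrowed from the Linux kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * When run under sparse (__CHECKER__ defined) the type is marked
 * bitwise, so mixing it with plain integers is reported.  For a normal
 * compiler both macros expand to nothing.  Hypothetical macro names.
 */
#ifdef __CHECKER__
#define SD_BITWISE	__attribute__((bitwise))
#define SD_FORCE	__attribute__((force))
#else
#define SD_BITWISE
#define SD_FORCE
#endif

typedef uint32_t SD_BITWISE sd_be32;	/* 32-bit value, big endian */

/* Convert a host-order value to its big-endian wire representation. */
static inline sd_be32 sd_cpu_to_be32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return (SD_FORCE sd_be32)raw;
}

/* Convert a big-endian wire value back to host order. */
static inline uint32_t sd_be32_to_cpu(sd_be32 v)
{
	uint32_t raw = (SD_FORCE uint32_t)v;
	uint8_t b[4];

	memcpy(b, &raw, sizeof(b));
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Hypothetical wire header; NOT the real sheepdog request layout. */
struct ex_wire_hdr {
	sd_be32 epoch;
	sd_be32 data_length;
};
```

With something like this a wire field can only be read through
sd_be32_to_cpu(), e.g. sd_be32_to_cpu(hdr->data_length), and sparse
would complain about a direct "len = hdr->data_length".
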
> > 	Which brings me to the next issue:
> > 
> > * clearly defining the on wire protocols
> > 
> > 	Right now there are a lot of structures on the wire in places
> > 	where you don't expect them.  Besides the obvious bits in
> > 	include/sheepdog_proto.h there are:
> > 
> > 	    - additional opcodes and their structures in
> > 	      include/sheep.h, which also hosts some code shared
> > 	      between collie and the sheep daemon
> > 	    - more in sheep/sheep_priv.h, which otherwise just includes
> > 	      structures private to the main module of the sheep daemon
> > 	      and not even shared with the cluster driver
> > 	    - some payloads defined directly in sheep/group.c
> > 	    - the cluster driver specific event types, defined directly
> > 	      inside the cluster drivers
> > 
> > 	This turns into the next thing:
> > 
> > * splitting the different protocols
> > 
> > 	While all communication between the components of sheepdog shares
> > 	some common constants, there are at least two, if not three,
> > 	different sub-protocols:
> > 
> > 	    - the main user facing protocol, spoken between the qemu
> > 	      frontend (or any other plain I/O frontend) and the gateway
> > 	      sheep
> > 	    - the protocol between sheep daemons, including the
> > 	      cluster driver level events, join/leave/notify messages,
> > 	      SD_FLAG_CMD_IO_LOCAL type read/write requests, get object
> > 	      list commands for recovery
> > 	    - any magic admin communication between collie and sheep,
> > 	      although by some argument these could be added to either
> > 	      of the above ones on a case by case basis.
> > 	
> > 	Identifying them as different protocols will also allow us to
> > 	version them differently.  The frontend protocol can keep
> > 	basically unlimited backwards compatibility, while the backend
> > 	protocol revision can be incremented freely: a sheep with the
> > 	wrong version either fails the join gracefully or, with some
> > 	effort (and a lot of testing overhead), interoperates with
> > 	sheep running different versions.
> > 
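A minimal sketch of how the two version checks could differ, assuming
the frontend side accepts a range of versions while a joining sheep must
match exactly.  The constants, their values and the function names
below are invented for the example and do not exist in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical version numbers, chosen only for illustration. */
#define EX_CLIENT_PROTO_VER_MIN	1	/* oldest frontend still accepted */
#define EX_CLIENT_PROTO_VER_MAX	2	/* newest frontend we understand */
#define EX_PEER_PROTO_VER	3	/* sheep-to-sheep protocol revision */

enum ex_result {
	EX_OK = 0,
	EX_VER_MISMATCH = 1,
};

/* Frontend requests: any version inside the supported range is fine. */
static inline enum ex_result ex_check_client_ver(uint8_t ver)
{
	if (ver >= EX_CLIENT_PROTO_VER_MIN && ver <= EX_CLIENT_PROTO_VER_MAX)
		return EX_OK;
	return EX_VER_MISMATCH;
}

/*
 * Join requests: the peer protocol must match exactly, otherwise the
 * join is refused with an error code instead of the node being
 * admitted with a message format the cluster cannot parse.
 */
static inline enum ex_result ex_check_join_ver(uint8_t ver)
{
	return ver == EX_PEER_PROTO_VER ? EX_OK : EX_VER_MISMATCH;
}
```

Supporting mixed-version clusters would mean turning the exact match in
ex_check_join_ver() into a range check as well, which is where the
testing overhead mentioned above comes from.
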
> > If everyone agrees with these basic concepts I'd like to move forward
> > with:
> > 
> >  (1) split each sub-protocol into a well-documented header
> >  (2) add sparse annotations
> >  (3) replace the SD_FLAG_CMD_IO_LOCAL flag with different operation
> >      types for the sheep peer I/O.  Not only does this make clear
> >      that they are part of a different protocol, but it will also
> >      allow using normal ops.c-like dispatch for the gateway
> >  (4) add separate versioning for the sheep peer protocol, and probably
> >      the per-cluster driver protocols.
> 
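For (3), something along these lines is what I would imagine: the
gateway and peer variants get their own opcodes and are found through
one lookup table instead of testing a flag inside a shared handler.
This is only a sketch; the opcode names and values, struct ex_op and
ex_get_op() are hypothetical and not the existing ops.c interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; the values are arbitrary for this example. */
enum {
	EX_OP_READ_OBJ   = 0x01,	/* frontend -> gateway sheep */
	EX_OP_WRITE_OBJ  = 0x02,
	EX_OP_READ_PEER  = 0x81,	/* gateway sheep -> peer sheep */
	EX_OP_WRITE_PEER = 0x82,
};

enum ex_proto {
	EX_PROTO_GATEWAY,	/* request has to be forwarded to peers */
	EX_PROTO_PEER,		/* request is served from the local store */
};

struct ex_op {
	uint8_t opcode;
	const char *name;
	enum ex_proto proto;
};

static const struct ex_op ex_ops[] = {
	{ EX_OP_READ_OBJ,   "READ_OBJ",   EX_PROTO_GATEWAY },
	{ EX_OP_WRITE_OBJ,  "WRITE_OBJ",  EX_PROTO_GATEWAY },
	{ EX_OP_READ_PEER,  "READ_PEER",  EX_PROTO_PEER },
	{ EX_OP_WRITE_PEER, "WRITE_PEER", EX_PROTO_PEER },
};

/* Look up an operation by opcode; returns NULL for unknown opcodes. */
static inline const struct ex_op *ex_get_op(uint8_t opcode)
{
	size_t i;

	for (i = 0; i < sizeof(ex_ops) / sizeof(ex_ops[0]); i++)
		if (ex_ops[i].opcode == opcode)
			return &ex_ops[i];
	return NULL;
}
```

The request path then only needs ex_get_op(hdr.opcode) and a look at
op->proto, and an unknown opcode is rejected in one place.
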
> Looks good to me, and I look forward to this change.

Looks good to me too. :)

Thanks,

Kazutaka
