At Thu, 13 Oct 2011 13:02:48 +0100,
Chris Webb wrote:
>
> MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:
>
> > Yes, as long as setattr -x is run on the same machine.  Note that
> > Sheepdog object storage doesn't allow concurrent accesses from
> > multiple machines.
>
> Hi Kazutaka. For this to apply to setattr -x makes the exclusiveness of the
> operation much less useful: if it's only exclusive on a single machine, one
> could equivalently just use fcntl() on a lock file which is cheaper and more
> convenient!

Hmm, yes, you are right.

>
> I think the semantics for setattr -x were intended to allow it to be used to
> implement the kind of exclusive locking that Sheepdog requires elsewhere
> throughout the system to work correctly: claim the lock exclusively with the
> same convention for the lockfile everywhere, and you know you can safely
> access the vdi without causing divergence. In the absence of this, automated
> users of sheepdog would need to implement a separate global locking
> mechanism (on top of corosync, say) to be able to use sheepdog safely.
>
> If setattr -x works atomically on a single node and only breaks down when
> there are multiple nodes trying to setattr -x, could one easily fix
> this by always forwarding setattr from the local sheep to the (guaranteed
> unique) group leader sheep rather than just executing it locally like a
> normal vdi write?

Sheepdog uses a corosync multicast for all global atomic operations,
so I think the correct way is to implement an SD_OP_ATOMIC_WRITE_OBJ
operation on top of the multicast.  But this limits the size of a vdi
attribute to the maximum multicast message size (a few hundred KB?).
Would that be okay for you?

Thanks,

Kazutaka
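
For reference, the single-machine alternative Chris mentions (fcntl() on a lock file) could look roughly like the sketch below. It is only an illustration: the helper names try_lock/unlock and the lock file path are hypothetical, not part of Sheepdog. It also shares the limitation discussed above, since the lock is only exclusive among processes on one machine.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive, non-blocking fcntl() lock on `path`.

    Returns the open file descriptor on success, or None if another
    process on this machine already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # lockf() is implemented with fcntl() record locks.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # EAGAIN/EACCES: the lock is held by someone else.
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A caller would wrap access to the vdi between try_lock() and unlock(), using the same lock file path by convention. Across several machines this gives no protection, which is why a cluster-wide atomic operation is still needed.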