[Sheepdog] Some setattr/getattr strangeness

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Thu Oct 13 16:45:49 CEST 2011


At Thu, 13 Oct 2011 15:05:20 +0100,
Chris Webb wrote:
> 
> MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:
> 
> > At Thu, 13 Oct 2011 22:00:05 +0900,
> > MORITA Kazutaka wrote:
> > > 
> > > At Thu, 13 Oct 2011 13:35:06 +0100,
> > > Chris Webb wrote:
> > > > 
> > > > MORITA Kazutaka <morita.kazutaka at lab.ntt.co.jp> writes:
> > > > 
> > > > > Sheepdog uses a corosync multicast for all global atomic operations,
> > > > > so I think the correct way is to implement a SD_OP_ATOMIC_WRITE_OBJ
> > > > > operation with the multicast.
> > > > >
> > > > > But this limits the size of a vdi attribute to the maximum multicast
> > > > > size (a few hundred KB?).  Is it okay for you?
> > > > 
> > > > Hi. That definitely wouldn't cause me any problems, as I'm only using them
> > > > for locks (twenty bytes identifying uniquely what has claimed the vdi) and
> > > > very simple textual properties (like a drive name). To be honest, I had
> > > > assumed that they were intended for very small amounts of metadata like this
> > > > rather than for bulk data storage, for which we have vdis themselves, and
> > > > didn't realise they'd hold such large chunks of data successfully.
> > > 
> > > Okay, I'll create a patch to support atomic I/Os.
> > 
> > After taking a close look at the code again, I found that setattr is already
> > an atomic operation.  Sheepdog uses a corosync multicast to allocate a
> > vdi attr object id, so setattr -x works correctly even if multiple
> > hosts send the requests at the same time.
> 
> Hi Kazutaka. Just double-checking, but is there a race here where the id is
> allocated but the key isn't written yet, i.e. a getattr on another host
> could see a value for the attribute but that value is an empty string
> because the new object hasn't been written?

Yes, there is...  I believe this is the reason you got the empty
attribute.  I'll fix it.
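
To make the window concrete, here is a minimal Python sketch of the race
as described in this thread. It is not Sheepdog code: AttrStore and its
methods are hypothetical names, and the two-step "allocate id, then write
object" split is modelled only from the description above.

```python
# Illustrative model only; AttrStore and its methods are hypothetical
# names, not Sheepdog's actual data structures or API.

class AttrStore:
    def __init__(self):
        self.ids = {}      # attribute key -> allocated object id
        self.objects = {}  # object id -> stored value
        self.next_id = 1

    def allocate_id(self, key):
        """Step 1: reserve an object id for the key (the atomic part)."""
        if key in self.ids:
            return None    # key already exists, so an exclusive create fails
        self.ids[key] = self.next_id
        self.next_id += 1
        return self.ids[key]

    def write_value(self, oid, value):
        """Step 2: write the attribute object, some time after step 1."""
        self.objects[oid] = value

    def get_attr(self, key):
        """Reader: look up the id, then read the object behind it."""
        oid = self.ids.get(key)
        if oid is None:
            return None    # attribute does not exist
        # The id exists, but the object may not have been written yet.
        return self.objects.get(oid, "")


store = AttrStore()
oid = store.allocate_id("lock")           # writer, step 1
seen_in_window = store.get_attr("lock")   # reader on another host
store.write_value(oid, "host-a:1234")     # writer, step 2
seen_after = store.get_attr("lock")

print(repr(seen_in_window))  # ''
print(repr(seen_after))      # 'host-a:1234'
```

A reader that arrives between the two steps sees the attribute as present
but empty, which matches the symptom reported above.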

Should setattr without '-x' also work atomically?  For example,
multiple hosts may send setattr requests against the same existing
attribute at the same time.  Currently, this can cause replication
inconsistency.


Thanks,

Kazutaka


