[Sheepdog] Some setattr/getattr strangeness

Chris Webb chris at arachsys.com
Mon Oct 10 13:30:56 CEST 2011


Hi. We've finished porting our infrastructure management system to live
entirely on top of Sheepdog, and have begun doing some testing as a result.
We use setattr -x to implement locking in the way we've previously
discussed, and I've noticed a few consistency problems.

Here's a first, simple example, which turned up when I was trying to
reproduce some of the rarer odd behaviours:

  0026# collie vdi create foo 1G
  0026# collie vdi setattr -x foo foo <<< "bar"
  0026# collie vdi delete foo
  0026# collie vdi create foo 1G
  0026# collie vdi setattr -x foo foo <<< "bar"
  the attribute already exists, foo
  0026# collie vdi getattr foo foo
  bar

Looks like attributes don't get cleaned up when a vdi is deleted, and a
new vdi with the same name will end up with the same vdi id and hence 'pick
up' the stray attributes.

However, I'm seeing some strange behaviours even when we're using UUID VDI
names, so there's no risk of one ever being reused.

I arranged for the lowest level of our management system to log all collie
invocations to a file to capture what's going on. There are no qemu-img
operations or qemu vms running at the same time as these commands, and I
started with a completely clean cluster of three nodes on the same box,
empty directories, and cluster format --copies=1.
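
For anyone who wants to capture a similar log, here is a minimal sketch of a
logging wrapper, assuming Python 3.7+ and a collie binary on the PATH. The
names format_entry and logged_collie and the collie.log path are hypothetical,
and the entry layout only approximates the trace format.

```python
import os
import subprocess


def format_entry(pid, args, stdin_data, stdout_data, exit_code):
    """Render one log entry; stdin/stdout lines appear only if stdin was supplied."""
    lines = ["[%d] collie %s" % (pid, " ".join(args))]
    if stdin_data is not None:
        lines.append("stdin: " + stdin_data)
        lines.append("stdout: " + stdout_data)
    lines.append("Exit code: %d" % exit_code)
    return "\n".join(lines) + "\n\n"


def logged_collie(args, stdin_data=None, log_path="collie.log"):
    """Run collie, append the invocation to log_path, return (exit_code, stdout)."""
    proc = subprocess.run(["collie"] + list(args), input=stdin_data,
                          capture_output=True, text=True)
    with open(log_path, "a") as log:
        log.write(format_entry(os.getpid(), args, stdin_data,
                               proc.stdout, proc.returncode))
    return proc.returncode, proc.stdout
```

Passing stdin_data="" for getattr calls produces the empty "stdin: " and
"stdout: " lines seen in the trace.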

Here's an example trace:

[4581] collie vdi create 13121389-6673-4fe1-b30a-6608b9623bbf 539545600
Exit code: 0

[4581] collie vdi setattr -x 13121389-6673-4fe1-b30a-6608b9623bbf lock
stdin: 002689c3-aeab-433d-bafc-acfb95dafe7c:4581:1318241623
stdout: 
Exit code: 0

[4581] collie vdi setattr 13121389-6673-4fe1-b30a-6608b9623bbf properties
stdin: email test at test
name debian
user 00000000-0000-0000-0000-000000000000
stdout: 
Exit code: 0

[4581] collie vdi getattr 13121389-6673-4fe1-b30a-6608b9623bbf lock
stdin: 
stdout: 
Exit code: 0

So, the 'lock' attribute is found here (else exit code would be EMISSING),
but has an empty value instead of the expected value
002689c3-aeab-433d-bafc-acfb95dafe7c:4581:1318241623. I'm guessing that
what's really going on here is that the setattr operation happens
asynchronously and hasn't finished by the time the command exits and we do
the getattr?
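
One way to test that guess would be to poll getattr after the setattr and see
whether the value eventually appears. A minimal sketch, assuming Python 3.7+
and collie on the PATH; run_collie and wait_for_attr are hypothetical helper
names, and the comparison is exact, so any trailing newline in the stored
value has to be included in the expected string.

```python
import subprocess
import time


def run_collie(args, stdin_data=""):
    """Run collie with the given arguments; return (exit_code, stdout)."""
    proc = subprocess.run(["collie"] + list(args), input=stdin_data,
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout


def wait_for_attr(vdi, key, expected, run=run_collie, attempts=10, delay=0.1):
    """Poll 'vdi getattr' until it returns expected.

    Returns the 1-based attempt number that succeeded, or None if the value
    never showed up within the given number of attempts.
    """
    for attempt in range(1, attempts + 1):
        code, out = run(["vdi", "getattr", vdi, key])
        if code == 0 and out == expected:
            return attempt
        time.sleep(delay)
    return None
```

If this returns a number greater than 1, the value arrived late, which would
be consistent with an asynchronous write. If it returns None, the value was
lost rather than delayed.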

Best wishes,

Chris.


