Hi.

We've finished porting our infrastructure management system to live entirely on top of Sheepdog, and have begun doing some testing as a result. We use setattr -x to implement locking in the way we've previously discussed, and I've noticed a few consistency problems.

Here's a first, simple example, which turned up when I was trying to reproduce some of the rarer odd behaviours:

  0026# collie vdi create foo 1G
  0026# collie vdi setattr -x foo foo <<< "bar"
  0026# collie vdi delete foo
  0026# collie vdi create foo 1G
  0026# collie vdi setattr -x foo foo <<< "bar"
  the attribute already exists, foo
  0026# collie vdi getattr foo foo
  bar

Looks like attributes don't get cleaned away when a vdi is deleted, and a new vdi with the same name will end up with the same vdi id and hence 'pick up' the stray attributes.

However, I'm seeing some strange behaviours even when we're using UUID VDI names, so there's no risk of one ever being reused. I arranged for the lowest level of our management system to log all collie invocations to a file to capture what's going on. There are no qemu-img operations or qemu vms running at the same time as these commands, and I started with a completely clean cluster of three nodes on the same box, empty directories, and cluster format --copies=1.

Here's an example trace:

  [4581] collie vdi create 13121389-6673-4fe1-b30a-6608b9623bbf 539545600
  Exit code: 0
  [4581] collie vdi setattr -x 13121389-6673-4fe1-b30a-6608b9623bbf lock
  stdin: 002689c3-aeab-433d-bafc-acfb95dafe7c:4581:1318241623
  stdout:
  Exit code: 0
  [4581] collie vdi setattr 13121389-6673-4fe1-b30a-6608b9623bbf properties
  stdin: email test at test name debian user 00000000-0000-0000-0000-000000000000
  stdout:
  Exit code: 0
  [4581] collie vdi getattr 13121389-6673-4fe1-b30a-6608b9623bbf lock
  stdin:
  stdout:
  Exit code: 0

So, the 'lock' attribute is found here (else the exit code would be EMISSING), but it has an empty value instead of the expected value 002689c3-aeab-433d-bafc-acfb95dafe7c:4581:1318241623.
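
For anyone wanting to reproduce this, the check that fails in the trace can be sketched roughly as below. This is only an illustration in Python, not the code from our management system: the helper names run_collie and set_and_verify and the run parameter are hypothetical, and the only collie invocations used are the two that appear in the trace (vdi setattr -x with the value on stdin, and vdi getattr).

```python
import subprocess


def run_collie(args, stdin_text=""):
    """Run `collie <args>` with stdin_text on stdin; return (exit_code, stdout)."""
    proc = subprocess.run(
        ["collie", *args],
        input=stdin_text,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout


def set_and_verify(vdi, key, value, run=run_collie):
    """Exclusively set an attribute, then read it back and compare.

    `run` is injectable (a hypothetical convenience) so the logic can be
    exercised without a Sheepdog cluster.
    """
    # Same form as in the trace: collie vdi setattr -x <vdi> <key>, value on stdin.
    code, _ = run(["vdi", "setattr", "-x", vdi, key], value)
    if code != 0:
        return "setattr-failed"

    # Immediately read the attribute back: collie vdi getattr <vdi> <key>.
    code, out = run(["vdi", "getattr", vdi, key])
    if code != 0:
        return "getattr-failed"

    # Ignore trailing newlines when comparing what was written and read.
    return "ok" if out.strip() == value.strip() else "mismatch"
```

With the behaviour shown in the trace (getattr exits 0 but prints nothing), set_and_verify would return "mismatch" rather than "ok".
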

I'm guessing that what's really going on here is that the setattr operation happens asynchronously and hasn't finished by the time the command exits and we do the getattr?

Best wishes,
Chris.