At Thu, 7 Oct 2010 06:28:55 +0200, Floris Bos wrote:
>
> Hi,
>
> On Thursday, October 07, 2010 05:13:40 am you wrote:
> > > Basically what I need is a new read-only snapshot for use by my client,
> > > and no changes to the current VDI.
> > > After all, the current vdi may be in use by qemu, and qemu is totally
> > > unaware of the snapshot I'm taking with my external program.
> > >
> > > So the original VDI ID must stay writable, as there is no way to signal
> > > qemu that it should start using another id.
> >
> > On second thought, we cannot avoid updating a vdi id when its snapshot
> > is created. This is because a sheepdog client does copy-on-write based
> > on its vdi id.
> >
> > So we need to use the savevm command from the qemu monitor to take a
> > snapshot of the running VM. Currently, if you want to create a
> > snapshot from an external program, you need to get a lock on the vdi
> > to avoid corrupting running VMs, and if running VMs exist, you need to
> > give up taking the snapshot...
> >
> > In the future, I think we should implement a mechanism to notify the
> > running client that an external program has created a snapshot.
> >
> > For example, if write accesses to snapshot objects return something
> > like SD_RES_READONLY_OBJ, we can tell the client that it should update
> > the vdi id.
>
> So the VDI ID decides whether writes are done in-place or COW is used.
> Does this also mean that after taking a snapshot, all updates are done
> using COW, even if the only snapshot that existed is deleted later?

Yes, exactly.

> In the typical use case of making a backup, the snapshot only exists for
> a couple of minutes:
>
> 1) a temporary read-only snapshot is made
> 2) rsync (or another legacy program) reads all the data from the snapshot
>    and sends it to the external backup server
> 3) the temporary snapshot is deleted again
>
> If qemu continues to use COW for updates afterwards, I assume this affects
> performance, as a 4 MB object has to be read, updated, and written again,
> even if only a 512-byte sector is changed?
>
> Ideally there should be a way to signal the client to only use COW
> temporarily (while any snapshots exist), and signal it again that it can
> resume updating in-place after there are no longer any snapshots.

I think this kind of feature would be useful in practice.

If a sheep daemon could tell the virtual machine to use the previous vdi
id, we could achieve this feature easily, I think. In this case, write
accesses to the objects that were already updated during the rsync are
done in a copy-on-write way again, and other accesses are done in-place.

> Asynchronous notification might be relatively complicated to implement in
> the qemu block driver, though.
> I wonder if it might be more practical to move some of the low-level
> stuff that is currently in the qemu client itself into sheep, and let
> sheep offer a simplified protocol to the client that does not think in
> low-level details like which object the data should be written to, but
> just specifies an "offset" and "data length" to write to in the image.
> That way sheep could decide whether or not COW should be used, and also
> manage other low-level details (like updating inode metadata) instead
> of the client.

Yes, that might be more practical, but it makes the sheep daemon more
complicated. The current sheepdog implementation provides something like
a simple object storage to the qemu block driver, but I think that
managing which objects should be updated in a COW way is out of the
scope of the object storage.
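To make the vdi id point above concrete, the copy-on-write decision in the
qemu block driver today is roughly the following. This is a simplified
sketch, not the exact code; the inode layout and names are approximations:

/*
 * Simplified sketch of how the block driver decides between in-place
 * writes and copy-on-write.  The inode layout here is only an
 * approximation: data_vdi_id[idx] records which vdi created data
 * object 'idx'.
 */

#include <stdint.h>

#define MAX_DATA_OBJS (1U << 20)          /* illustrative limit */

struct sd_inode {
    uint32_t vdi_id;                      /* id of the vdi we have open */
    uint32_t data_vdi_id[MAX_DATA_OBJS];  /* creator vdi of each object */
};

/*
 * An object can be updated in place only if it was created by the vdi
 * we are currently writing to.  After a snapshot the working vdi gets a
 * new id, so this check fails for old objects and the driver falls back
 * to copy-on-write.
 */
static inline int is_data_obj_writable(const struct sd_inode *inode,
                                       unsigned int idx)
{
    return inode->data_vdi_id[idx] == inode->vdi_id;
}

Once the vdi id has changed, every object touched for the first time goes
through the 4 MB read-modify-write you describe, which is why deleting the
snapshot alone does not bring back in-place updates.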
I guess the notification feature is not so complicated. All the qemu
block driver has to do is update its vdi object when it receives
SD_RES_READONLY in the response to a write operation.
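For what it is worth, here is a rough, untested sketch of the error path
I have in mind on the driver side. SD_RES_READONLY (its value here is
made up), struct sd_client, reload_inode() and write_data_obj() are
placeholders rather than existing interfaces:

#include <stdint.h>

struct sd_client;                         /* driver state, placeholder */

enum {
    SD_RES_SUCCESS  = 0,
    SD_RES_READONLY = 0x100,              /* placeholder value */
};

int reload_inode(struct sd_client *s);                        /* placeholder */
int write_data_obj(struct sd_client *s, uint64_t idx,
                   int create_cow);                           /* placeholder */

/*
 * If a write fails because an external snapshot made the target object
 * read-only, re-read the inode object (which now carries the new vdi
 * id) and retry the write; this time it is done as copy-on-write.
 */
static int handle_write_result(struct sd_client *s, int result, uint64_t idx)
{
    if (result != SD_RES_READONLY)
        return result;

    if (reload_inode(s) < 0)
        return -1;

    return write_data_obj(s, idx, 1 /* copy-on-write */);
}

Switching back to in-place updates once the last snapshot is deleted would
need a similar notification in the other direction, as you suggested.

Thanks,

Kazutaka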