[sheepdog-users] Lock TTL/Timeout

Thu Jul 9 16:41:56 CEST 2015

At Wed, 08 Jul 2015 07:45:55 +0200,
Fabian Zimmermann wrote:
> 
> Hi,
> 
> On 06.07.2015 at 14:46, Hitoshi Mitake wrote:
> > Thanks a lot!
> 
> No, I have to thank you! :D
> 
> > BTW, what do you think about this idea: when a sheep process dies,
> > release all VDI locks acquired by qemu processes running on the same
> > host as the sheep process? It can be implemented easily.
> 
> If I remember correctly, qemu only allows one sheepdog-node as target.
> So if this node dies, the qemu-processes are unable to write to
> disk/cluster and therefore it should be safe to start a new process
> handling the vm/vdi on another host/node.
> 
> There is just one problem: What happens if the sheepdog-process gets
> restarted while the old qemu-processes are still running?
> 
> As far as I understand, a running process will *not* check whether the
> lock is still valid, will it? This would lead to a corrupt filesystem,
> because two qemu-processes would be writing to the same vdi.

Thanks, that is a very good point. As you say, the locking mechanism does not work well with the reconnect feature.
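
To make the failure sequence concrete, here is a minimal sketch in Python. It is a toy model, not sheepdog or qemu code: LockTable, release_node, is_owner and writers_after_restart are hypothetical names invented for illustration, and the check on reconnect is an assumption about what a fix could look like, not an existing feature.

```python
# Toy model of the race described above.
# None of these names exist in sheepdog or qemu; they are illustrative only.

class LockTable:
    """Cluster-wide VDI lock table: vdi name -> (node, client) holding it."""

    def __init__(self):
        self.owners = {}

    def lock(self, vdi, node, client):
        if vdi in self.owners:
            return False  # somebody else already holds the lock
        self.owners[vdi] = (node, client)
        return True

    def release_node(self, node):
        # Proposed behaviour: the sheep on `node` died, so drop every
        # lock that was acquired through that node.
        self.owners = {v: o for v, o in self.owners.items() if o[0] != node}

    def is_owner(self, vdi, node, client):
        return self.owners.get(vdi) == (node, client)


def writers_after_restart(check_lock_on_reconnect):
    """Return the clients that end up writing to the VDI 'vm1'."""
    table = LockTable()
    assert table.lock("vm1", "node-a", "qemu-old")

    table.release_node("node-a")                    # sheep on node-a dies
    assert table.lock("vm1", "node-b", "qemu-new")  # VM is started elsewhere

    writers = ["qemu-new"]
    # The sheep on node-a is restarted and qemu-old reconnects to it.
    if not check_lock_on_reconnect or table.is_owner("vm1", "node-a", "qemu-old"):
        writers.append("qemu-old")
    return writers


print(writers_after_restart(check_lock_on_reconnect=False))
print(writers_after_restart(check_lock_on_reconnect=True))
```

Without the check, both qemu-old and qemu-new end up in the list of writers, which is the corruption case. With the check, only qemu-new remains.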

> 
> So, we have to ensure that at least one of the following happens:
>
> * If the sheepdog-node dies, fence the host system.
> * If the sheepdog-node dies, the qemu processes should terminate themselves.

The reconnect feature is provided for rolling updates of the sheep daemon. But to handle the case above, reconnection should be optional. BTW, what does the word "fence" mean in this context?
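
One way to picture an optional reconnect is as a per-client policy. The sketch below is only an illustration under assumptions: OnDisconnect, on_connection_lost and auto_release_is_safe are hypothetical names, and no such option exists in qemu or sheepdog as far as this thread states.

```python
from enum import Enum


class OnDisconnect(Enum):
    """Hypothetical client policy for a lost sheep connection."""
    RECONNECT = "reconnect"  # keep retrying; suits a rolling update of sheep
    TERMINATE = "terminate"  # stop the client; its lock may already be gone


def on_connection_lost(policy):
    """What the client does when its sheep connection drops."""
    return "retry" if policy is OnDisconnect.RECONNECT else "terminate"


def auto_release_is_safe(policy):
    # Releasing locks when sheep dies is only safe if the old client
    # can never come back and write again without re-checking its lock.
    return policy is OnDisconnect.TERMINATE


for policy in OnDisconnect:
    print(policy.value, on_connection_lost(policy), auto_release_is_safe(policy))
```

The point of the sketch is that the two features are coupled: automatic lock release fits the TERMINATE policy, while RECONNECT would additionally need a lock check after reconnecting.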

Thanks,
Hitoshi

> 
> But generally I think an automatic release seems like a good idea. On
> the other hand, maybe it is better placed in the fencing mechanism of an
> HA system until qemu terminates or checks the lock itself, isn't it?
> 
> Thanks,
> 
>  Fabian
>