[Sheepdog] handling network disconnections?
Tomasz Chmielewski
mangoo at wpkg.org
Thu Sep 9 12:27:11 CEST 2010
On 09.09.2010 12:00, MORITA Kazutaka wrote:
> If the vm cannot access to the gateway, all the I/O requests to the
> sheepdog volumes results in EIO, and probably lots of IO errors in
> dmesg.
Is it "fixable"?
For example, with iSCSI, I can set the initiator to keep trying to
reconnect to the target for quite a long time (the default is 2 minutes
for open-iscsi, but we can set it to hours, or even days).
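For reference, the open-iscsi setting I mean is the session replacement
timeout in /etc/iscsi/iscsid.conf. The value is in seconds; 86400 (one
day) below is only an illustrative choice, not a recommendation:

```
# /etc/iscsi/iscsid.conf
# Seconds to wait for the session to be re-established before failing
# SCSI commands back to the application (default: 120).
node.session.timeo.replacement_timeout = 86400
```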
This way, the whole setup is much more resilient to expected and
unexpected connectivity problems:
- set long timeouts for iSCSI on the initiator,
- connectivity between the target and the initiator is interrupted (e.g.
a 5-minute maintenance window to replace cabling and switches turned out
to be a 2-hour one, as additional problems were identified),
- on the guest, all processes that want to write, or to read data that
is not already cached/buffered, will be in "uninterruptible sleep", but
other than that, the guest system keeps working correctly,
- when connectivity is back, the initiator will reconnect, and the guest
will resume reading/writing and function correctly.
Of course, this meant the guest was not usable for the 2 hours during
which the initiator could not reach the target; on the other hand, no
guest restart was needed and no filesystem corruption happened.
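To make the behaviour I am asking about concrete, here is a rough Python
sketch of "block and retry until a timeout, then fail with EIO". It is
not Sheepdog or QEMU code; submit_with_reconnect, do_io and the
parameters are hypothetical names I made up for illustration, and
do_io is assumed to raise ConnectionError while the storage is
unreachable:

```python
import errno
import time


def submit_with_reconnect(do_io, timeout_s, retry_interval_s=1.0,
                          clock=time.monotonic, sleep=time.sleep):
    """Retry do_io() while the connection is down, up to timeout_s seconds.

    Hypothetical helper: the caller simply blocks while the backend is
    unreachable and only sees EIO once the timeout has expired.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return do_io()
        except ConnectionError:
            if clock() >= deadline:
                # Give up: report the failure the way a block device would.
                raise OSError(errno.EIO,
                              "storage unreachable, timeout expired")
            # Still within the timeout: wait, then try to reconnect.
            sleep(retry_interval_s)
```

With a long timeout_s, a 2-hour outage only delays the request; with a
short one, the caller gets EIO, which is what happens today.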
Is it achievable (long term perhaps) with Sheepdog?
--
Tomasz Chmielewski
http://wpkg.org