[stgt] Maintaining IO QoS

Sat Dec 29 18:16:06 CET 2012

Greetings,

I have built an IB-interconnected, SSD-backed centralized storage
array running tgtd built from git master. The array and all hypervisor
hosts run custom 3.5.4 kernels, and all IB- and OVS-related components
are built from master in their respective repos. The KVM hypervisor
hosts connect over iSER to LVM LUNs served from this SSD array. The KVM
guests then use the iSER-connected LUNs as their disks, attaching to
them with virtio.

First, the performance is absolutely fantastic and everything has been
rock solid too. Thank you all for all of your hard work on this stuff!

My question involves determining the best practice for delivering
predictable and controllable IO QoS to the guests. I do not want a few
IO hogs to slow everyone else down.

Here are my thoughts on this at the moment:

* Cap IO on hosts.
On the hypervisor hosts, use cgroups to limit max IO per guest. I can
easily set some arbitrary cap here, but it misses the point of maximizing
resource usage and leveraging over-subscription.

* Control IOPS per LUN on the array
If I could understand how tgtd partitions/allocates resources per LUN, I
could put each LUN in a cgroup on the array box as well, and control it
from the array side (maybe this would be the only side I would need to
control it from?).

* Hybrid approach
Some combination of array monitoring, guest IO monitoring and dynamic
cgroup IO throttling on the hosts.
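
To make the hybrid idea a bit more concrete, here is a rough sketch of
what I have in mind. It assumes cgroup v1 blkio throttling on the hosts
(the blkio.throttle.read_iops_device and blkio.throttle.write_iops_device
files, which take "major:minor value"). The function names, the cgroup
directory passed in, and the array-wide IOPS budget are all made up for
illustration; the split is a plain max-min fair share, not anything tgtd
provides.

```python
import os


def max_min_caps(budget, demand):
    """Max-min fair split of an array-wide IOPS budget.

    budget: total IOPS the array should serve (an assumed, measured number).
    demand: {guest_name: observed IOPS}. Returns {guest_name: IOPS cap}.
    """
    caps = {}
    remaining = budget
    # Satisfy the smallest consumers first; whatever they leave unused
    # raises the fair share of the heavier ones.
    pending = sorted(demand.items(), key=lambda kv: kv[1])
    while pending:
        fair = remaining // len(pending)
        name, want = pending[0]
        if want <= fair:
            caps[name] = want
            remaining -= want
            pending.pop(0)
        else:
            # Everyone left wants more than the fair share: cap them all at it.
            for name, _ in pending:
                caps[name] = fair
            break
    return caps


def write_iops_cap(cgroup_dir, major, minor, iops):
    """Write a read and a write IOPS limit for one block device (cgroup v1).

    cgroup_dir is a hypothetical per-guest blkio cgroup directory.
    """
    rule = f"{major}:{minor} {iops}\n"
    for knob in ("blkio.throttle.read_iops_device",
                 "blkio.throttle.write_iops_device"):
        with open(os.path.join(cgroup_dir, knob), "w") as f:
            f.write(rule)


if __name__ == "__main__":
    print(max_min_caps(1000, {"guest-a": 100, "guest-b": 400, "guest-c": 900}))
```

With a budget of 1000 IOPS and guests wanting 100, 400 and 900, the two
light guests keep what they ask for and the hog is held to 500. Note that
read and write limits are separate knobs, so writing the same number to
both allows up to twice that in mixed IO, and a real version would want a
per-guest floor so an idle guest can still ramp up.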

How are people addressing this issue today? Any thoughts, ideas or
pointers to existing work addressing it (regardless of status) would be
very much appreciated.

Regards,
Christopher Barry
RJMetrics.com
