On Wed, Jul 04, 2012 at 09:08:49AM +0200, Stefan Priebe - Profihost AG wrote:
> >exact git revision of sheepdog
>
> 7c62b6e935b1943c57139a00d1b7d322c8a9c521

Before going further, could you try 29bfdbd6a95fdf8d827e177046dbab12ee342611
or earlier? I suspect the "short threads" hurt small-I/O IOPS performance
badly, but so far I haven't got around to actually verifying that assumption,
and given how little CPU time the gateway and storage nodes spend, that might
not be worth it.

> >In general I doubt anyone has optimized sheepdog for iops and low
> >latency at this moment as other things have kept people busy. There's
> >some relatively low hanging fruit like avoiding additional copies
> >in the gateway, but your numbers still sound very low.
> >
> >Can you also do a perf record -g on both a storage node and the
> >kvm box to see if there's anything interesting on them?
>
> Option -g is not known by my perf command?

perf record -g records the call graph, and it has been there for a long
time. Did you try the line above, or something like perf -g record?

> is a perf record sleep 10 enough? Should i upload then the data file
> somewhere?

It would be nice to get a slightly longer run.

> Snapshot of perf top from KVM host:
> 14.96%  [kernel]     [k] _raw_spin_lock

With the call chains we could see which spinlock we are hammering here.

>  8.13%  kvm          [.] 0x00000000001d8084

Also, if kvm/qemu is self-built, can you build it with -g to get debug
info? If not, see if there is a qemu-dbg/kvm-dbg or similar package for
your distribution.

>  4.08%  [kernel]     [k] get_pid_task
>  3.91%  [kvm]        [k] kvm_vcpu_yield_to
>  3.83%  [kernel]     [k] yield_to
>  2.62%  [kernel]     [k] __copy_user_nocache
>  2.53%  [kvm]        [k] vcpu_enter_guest
>  1.95%  [kvm_intel]  [k] vmx_vcpu_run

These suggest we spend a fair amount of time in kvm code, probably
unrelated to the actual storage transport.

> Snapshot of perf top from sheep node (not acting as the gateway /
> target for kvm):
>  2,78%  libgcc_s.so.1         [.] 0x000000000000e72b
>  2,21%  [kernel]              [k] __schedule
>  2,14%  [kernel]              [k] ahci_port_intr
>  2,08%  [kernel]              [k] _raw_spin_lock
>  1,77%  [kernel]              [k] _raw_spin_lock_irqsave
>  1,15%  [kernel]              [k] ahci_interrupt
>  1,11%  [kernel]              [k] ahci_scr_read
>  0,94%  [kernel]              [k] kmem_cache_alloc
>  0,90%  [kernel]              [k] _raw_spin_lock_irq
>  0,81%  [kernel]              [k] menu_select
>  0,76%  [kernel]              [k] _raw_spin_unlock_irqrestore
>  0,73%  libpthread-2.11.3.so  [.] pthread_mutex_lock
>  0,70%  libc-2.11.3.so        [.] vfprintf

The sheep node is mostly not spending any noticeable CPU time, which is
quite interesting.

>
> Stefan
---end quoted text---
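
As a rough aid for reading snapshots like the two quoted above, here is a minimal Python sketch that sums the perf top overhead per object (kernel, module, or library). It is only an illustration: the names `overhead_by_object`, `LINE_RE` and `KVM_HOST` are made up for this example, and the parser assumes the simple "percent, object, [type], symbol" layout seen in this thread, with either "." or "," as the decimal separator.

```python
import re
from collections import defaultdict

# Assumed line layout (taken from the snapshots in this thread):
#   "14.96% [kernel] [k] _raw_spin_lock"
# The decimal separator may be "." or ",", depending on the locale.
LINE_RE = re.compile(r"^\s*(\d+[.,]\d+)%\s+(\S+)\s+\[.\]")


def overhead_by_object(snapshot):
    """Sum the overhead percentages of a perf top snapshot per object."""
    totals = defaultdict(float)
    for line in snapshot.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip headers, blank lines, anything unrecognised
        percent, obj = match.groups()
        totals[obj] += float(percent.replace(",", "."))
    return dict(totals)


# The KVM host snapshot quoted in this mail.
KVM_HOST = """\
14.96% [kernel] [k] _raw_spin_lock
 8.13% kvm [.] 0x00000000001d8084
 4.08% [kernel] [k] get_pid_task
 3.91% [kvm] [k] kvm_vcpu_yield_to
 3.83% [kernel] [k] yield_to
 2.62% [kernel] [k] __copy_user_nocache
 2.53% [kvm] [k] vcpu_enter_guest
 1.95% [kvm_intel] [k] vmx_vcpu_run
"""

if __name__ == "__main__":
    totals = overhead_by_object(KVM_HOST)
    # Print the objects from most to least expensive.
    for obj, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{pct:6.2f}%  {obj}")
```

Note that perf top attributes samples to the object that contains the symbol, so kernel functions called on behalf of kvm (yield_to, for instance) still show up under [kernel]. Only the call chains from perf record -g can tell who the callers are.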