[Sheepdog] [PATCH v4] Add a ftrace-like trace utility

MORITA Kazutaka morita.kazutaka at lab.ntt.co.jp
Tue Apr 10 18:32:37 CEST 2012


On Tue, Apr 10, 2012 at 4:01 PM, Liu Yuan <namei.unix at gmail.com> wrote:
> v4:
>  - change trace interface
>   new interface:
>      start trace: collie debug trace start
>      stop  trace: collie debug trace stop
>      cat  output: collie debug trace cat
>
>  - address Kazum's comments, thanks!
>   - fix() trace_lookup_ip() race.
>   - disable trace as default
>   - refactor trace.h, now trace code is totally removed from binary when disabled.
>   - fix make_text_writable() bug
>   - use SIGUSR2 instead of SIGUSR1
>
> v3:
>  - fix trace_lookup_ip()
>  - fix main thread ring buffer init
>  - fix pop_return_trace()
>
> v2:
>  - use ring buffer for trace output
>  - add a cat operation for collie
>
> This patch set aims to provide a ftrace-like trace utility for performance tuning
> of online system in production environment. This means virtually no overhead when
> disabled, rather low even enabled.
>
> Currently 'graph' tracer is implemented, accounting CPU usage at funciton granularity.
>
> The patch set achieves this by dynamically patching the binary code at runtime, making
> use of GCC -pg instrumentation features, which instrument a 5-bytes opcode in every function.
>
> When the trace is
>  turned off, the opcode is patched as NOP opcode.
>  turned on, the opcode is a call opcode to our trace functions and we can hook callbacks both for
> function entry and exit. This is how the patch atually calculate the time spent by function.
>
> The underlying infrastructure is quite modular, I hope we can implement other tracers that considered useful
> later.
>
> I use the below script[*] to get the following output:
> ---------------------------------------------------------
> tailai.ly at taobao:~/sheepdog$ ./test3.sh "-b farm"
> using backend farm store
> Formatting 'sheepdog:test', fmt=raw size=1073741824
> wrote 10485760/10485760 bytes at offset 0
> 10 MiB, 1 ops; 0.8899 sec (11.237 MiB/sec and 1.1237 ops/sec)
> read 10485760/10485760 bytes at offset 0
> 10 MiB, 1 ops; 0.5130 sec (19.492 MiB/sec and 1.9492 ops/sec)
>
> Begin tracing
>
> wrote 10485760/10485760 bytes at offset 10485760
> 10 MiB, 1 ops; 0.8662 sec (11.544 MiB/sec and 1.1544 ops/sec)
> read 10485760/10485760 bytes at offset 10485760
> 10 MiB, 1 ops; 0.5425 sec (18.431 MiB/sec and 1.8431 ops/sec)
>
> Cat the trace buffer
>
>   Time(us)  |  Function Graph
> -------------------------------
>             |  do_io_request() {
>             |     forward_write_obj_req() {
>             |        addr_to_str() {
>       2.132 |        }
>             |        get_sheep_fd() {
>      29.960 |        }
>             |        send_req() {
>    1538.482 |        }
>             |        addr_to_str() {
>       5.644 |        }
>             |        addr_to_str() {
>       0.598 |        }
>             |        get_sheep_fd() {
>      25.426 |        }
>             |        send_req() {
>    4184.318 |        }
>             |        do_local_io() {
>             |           do_process_work() {
>             |              store_write_obj() {
>             |                 farm_open() {
>             |                    strbuf_add() {
>             |                       strbuf_grow() {
>             |                          xrealloc() {
>       0.786 |                          }
>       1.625 |                       }
>       2.555 |                    }
>             |                    strbuf_addf() {
>             |                       strbuf_grow() {
>             |                          xrealloc() {
>       1.636 |                          }
>       1.894 |                       }
>       4.325 |                    }
>             |                    strbuf_release() {
>       0.98  |                    }
>      31.288 |                 }
>             |                 do_write_obj() {
>             |                    farm_write() {
>             |                       xpwrite() {
>  234185.848 |                       }
>             |                       trunk_update_entry() {
>             |                          lookup_trunk_entry() {
>       0.691 |                          }
>       2.784 |                       }
>  234194.492 |                    }
>  234195.533 |                 }
>             |                 farm_close() {
>       6.697 |                 }
>  234237.173 |              }
>  234238.461 |           }
>  234260.257 |        }
>             |        do_read() {
>       8.18  |        }
>             |        do_read() {
>       9.308 |        }
>  406753.259 |     }
>  406774.40  |  }
> ...
>
> [*]
> #!/bin/bash
>
> pkill -9 sheep
> rm store/* -rf
> for i in 0 1 2; do sheep/sheep -d /home/tailai.ly/sheepdog/store/$i -z $i -p 700$i;sleep 1;done
> collie/collie cluster format $1
> qemu-img create -f raw sheepdog:test 1G
> qemu-io -c "write -P 0x1 0 10M" sheepdog:test
> qemu-io -c "read -P 0x1 0 10M" sheepdog:test
> echo -e "\nBegin tracing\n"
> collie/collie debug trace start
> qemu-io -c "write -P 0x1 10M 10M" sheepdog:test
> qemu-io -c "read -P 0x1 10M 10M" sheepdog:test
> collie/collie debug trace stop
> echo -e "\nCat the trace buffer\n"
> collie/collie debug trace cat > t

Applied after fixing coding style errors, thanks.

Kazutaka



More information about the sheepdog mailing list