[Sheepdog] [PATCH v4] Add a ftrace-like trace utility

Liu Yuan namei.unix at gmail.com
Tue Apr 10 09:01:55 CEST 2012


v4:
 - change trace interface
   new interface:
      start trace: collie debug trace start
      stop  trace: collie debug trace stop
      cat  output: collie debug trace cat

 - address Kazum's comments, thanks!
   - fix() trace_lookup_ip() race.
   - disable trace as default
   - refactor trace.h, now trace code is totally removed from binary when disabled.
   - fix make_text_writable() bug
   - use SIGUSR2 instead of SIGUSR1

v3:
 - fix trace_lookup_ip()
 - fix main thread ring buffer init
 - fix pop_return_trace()

v2:
 - use ring buffer for trace output
 - add a cat operation for collie

This patch set aims to provide a ftrace-like trace utility for performance tuning 
of online system in production environment. This means virtually no overhead when
disabled, rather low even enabled.

Currently 'graph' tracer is implemented, accounting CPU usage at funciton granularity.

The patch set achieves this by dynamically patching the binary code at runtime, making
use of GCC -pg instrumentation features, which instrument a 5-bytes opcode in every function.

When the trace is
 turned off, the opcode is patched as NOP opcode.
 turned on, the opcode is a call opcode to our trace functions and we can hook callbacks both for
function entry and exit. This is how the patch atually calculate the time spent by function.

The underlying infrastructure is quite modular, I hope we can implement other tracers that considered useful
later.

I use the below script[*] to get the following output:
---------------------------------------------------------
tailai.ly at taobao:~/sheepdog$ ./test3.sh "-b farm"
using backend farm store
Formatting 'sheepdog:test', fmt=raw size=1073741824 
wrote 10485760/10485760 bytes at offset 0
10 MiB, 1 ops; 0.8899 sec (11.237 MiB/sec and 1.1237 ops/sec)
read 10485760/10485760 bytes at offset 0
10 MiB, 1 ops; 0.5130 sec (19.492 MiB/sec and 1.9492 ops/sec)

Begin tracing

wrote 10485760/10485760 bytes at offset 10485760
10 MiB, 1 ops; 0.8662 sec (11.544 MiB/sec and 1.1544 ops/sec)
read 10485760/10485760 bytes at offset 10485760
10 MiB, 1 ops; 0.5425 sec (18.431 MiB/sec and 1.8431 ops/sec)

Cat the trace buffer

   Time(us)  |  Function Graph
-------------------------------
             |  do_io_request() {
             |     forward_write_obj_req() {
             |        addr_to_str() {
       2.132 |        }
             |        get_sheep_fd() {
      29.960 |        }
             |        send_req() {
    1538.482 |        }
             |        addr_to_str() {
       5.644 |        }
             |        addr_to_str() {
       0.598 |        }
             |        get_sheep_fd() {
      25.426 |        }
             |        send_req() {
    4184.318 |        }
             |        do_local_io() {
             |           do_process_work() {
             |              store_write_obj() {
             |                 farm_open() {
             |                    strbuf_add() {
             |                       strbuf_grow() {
             |                          xrealloc() {
       0.786 |                          }
       1.625 |                       }
       2.555 |                    }
             |                    strbuf_addf() {
             |                       strbuf_grow() {
             |                          xrealloc() {
       1.636 |                          }
       1.894 |                       }
       4.325 |                    }
             |                    strbuf_release() {
       0.98  |                    }
      31.288 |                 }
             |                 do_write_obj() {
             |                    farm_write() {
             |                       xpwrite() {
  234185.848 |                       }
             |                       trunk_update_entry() {
             |                          lookup_trunk_entry() {
       0.691 |                          }
       2.784 |                       }
  234194.492 |                    }
  234195.533 |                 }
             |                 farm_close() {
       6.697 |                 }
  234237.173 |              }
  234238.461 |           }
  234260.257 |        }
             |        do_read() {
       8.18  |        }
             |        do_read() {
       9.308 |        }
  406753.259 |     }
  406774.40  |  }
...

[*]
#!/bin/bash

pkill -9 sheep
rm store/* -rf
for i in 0 1 2; do sheep/sheep -d /home/tailai.ly/sheepdog/store/$i -z $i -p 700$i;sleep 1;done
collie/collie cluster format $1
qemu-img create -f raw sheepdog:test 1G
qemu-io -c "write -P 0x1 0 10M" sheepdog:test
qemu-io -c "read -P 0x1 0 10M" sheepdog:test
echo -e "\nBegin tracing\n"
collie/collie debug trace start
qemu-io -c "write -P 0x1 10M 10M" sheepdog:test
qemu-io -c "read -P 0x1 10M 10M" sheepdog:test
collie/collie debug trace stop
echo -e "\nCat the trace buffer\n"
collie/collie debug trace cat > t



More information about the sheepdog mailing list