[Sheepdog] [RFC PATCH] Add a ftrace-like trace utility
Liu Yuan
namei.unix at gmail.com
Tue Jan 17 12:59:51 CET 2012
This patch set aims to provide a ftrace-like trace utility for performance tuning
of online system in production environment. This means virtually no overhead when
disabled.
The patch set achieves this by dynamically patching the binary code at runtime, making
use of GCC -pg instrumentation features, which instrument a 5-bytes opcode in every function.
When the trace is
turned off, the opcode is patched as NOP opcode.
turned on, the opcode is a call opcode to our trace functions and we can hook callbacks both for
function entry and exit. This is how the patch atually calculate the time spent by function.
This is an early draft set, thought the trace infrastrure is almost done. The work left is most related
to the trace buffer implementation. Currently in the patch set, a dynamical buffer is drafted but not used
yet. So the output is rely on the original dprintf, which is slow and doing sync logging. A final trace
output version should be per-thread buffer and will be manipulated by collie command.
Further more, the trace infrasture is quite modular, we can hook different callbacks for different purposes.
So now the output is like:
...
Jan 17 19:49:52 default_trace_graph_return(47) strbuf_grow: 8259 (ns) < how much nanosecond used in the function
Jan 17 19:49:52 default_trace_graph_return(47) strbuf_addf: 15663 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) strbuf_release: 106 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) farm_open: 61610 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) xpread: 1647242 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) farm_read: 1704604 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) farm_close: 8024 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) store_read_obj: 1802385 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) do_process_work: 1827672 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) do_local_io: 1845099 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) forward_read_obj_req: 1860139 (ns)
Jan 17 19:49:52 default_trace_graph_return(47) get_sd_op: 141 (ns)
...
and the supposed one should be more humand readable like following:
(take the example from early version of Linux kernel's ftrace output.)
...
sys_read() {
fget_light() {
} 526 <---- nanosecond
vfs_read() {
rw_verify_area() {
security_file_permission() {
cap_file_permission() {
} 519
} 1564
} 2640
do_sync_read() {
pipe_read() {
__might_sleep() {
} 511
pipe_wait() {
prepare_to_wait() {
} 760
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 504
} 1587
...
Liu Yuan (11):
trace: driver trace to work
trace: low level functions in assembly
trace: add low level ip2function support
logger: teach logger funcs to notrace
sheep: export some structures for trace use
trace: low-level trace infrastructure proper
trace: add return hooker
trace: add a new interface for trace utility
trace: add a dynamic buffer
sheep: revert coroutine for fixing group_handler() race
sheep: fix group_handler face
collie/Makefile.am | 6 +
collie/collie.c | 5 +
collie/collie.h | 1 +
collie/debug.c | 67 +++++++++++++
configure.ac | 30 ++++++-
include/sheep.h | 1 +
include/util.h | 47 +++++++++
lib/Makefile.am | 2 +-
lib/dbuffer.c | 44 +++++++++
lib/logger.c | 22 ++--
sheep/Makefile.am | 6 +-
sheep/group.c | 40 +++-----
sheep/ops.c | 19 ++++
sheep/sheep.c | 8 ++
sheep/strbuf.h | 20 ----
sheep/trace/graph.c | 85 ++++++++++++++++
sheep/trace/mcount.S | 84 ++++++++++++++++
sheep/trace/stabs.c | 198 +++++++++++++++++++++++++++++++++++++
sheep/trace/trace.c | 263 ++++++++++++++++++++++++++++++++++++++++++++++++++
sheep/trace/trace.h | 76 +++++++++++++++
sheep/trace/trace.ld | 18 ++++
sheep/work.c | 46 ++++------
sheep/work.h | 32 ++++++
23 files changed, 1031 insertions(+), 89 deletions(-)
More information about the sheepdog
mailing list