==================================
Using the Linux Kernel Tracepoints
==================================

:Author: Mathieu Desnoyers


This document introduces Linux Kernel Tracepoints and their use. It
provides examples of how to insert tracepoints in the kernel and
connect probe functions to them, and provides some examples of probe
functions.


Purpose of tracepoints
----------------------
A tracepoint placed in code provides a hook to call a function (probe)
that you can provide at runtime. A tracepoint can be "on" (a probe is
connected to it) or "off" (no probe is attached). When a tracepoint is
"off" it has no effect, except for adding a tiny time penalty
(checking a condition for a branch) and space penalty (adding a few
bytes for the function call at the end of the instrumented function
and adding a data structure in a separate section). When a tracepoint
is "on", the function you provide is called each time the tracepoint
is executed, in the execution context of the caller. When the function
provided ends its execution, it returns to the caller (continuing from
the tracepoint site).

You can put tracepoints at important locations in the code. They are
lightweight hooks that can pass an arbitrary number of parameters,
whose prototypes are described in a tracepoint declaration placed in a
header file.

They can be used for tracing and performance accounting.


Usage
-----
Two elements are required for tracepoints:

- A tracepoint definition, placed in a header file.
- The tracepoint statement, in C code.

In order to use tracepoints, you should include linux/tracepoint.h.

In include/trace/events/subsys.h::

    #undef TRACE_SYSTEM
    #define TRACE_SYSTEM subsys

    #if !defined(_TRACE_SUBSYS_H) || defined(TRACE_HEADER_MULTI_READ)
    #define _TRACE_SUBSYS_H

    #include <linux/tracepoint.h>

    DECLARE_TRACE(subsys_eventname,
        TP_PROTO(int firstarg, struct task_struct *p),
        TP_ARGS(firstarg, p));

    #endif /* _TRACE_SUBSYS_H */

    /* This part must be outside protection */
    #include <trace/define_trace.h>

In subsys/file.c (where the tracing statement must be added)::

    #include <trace/events/subsys.h>

    #define CREATE_TRACE_POINTS
    DEFINE_TRACE(subsys_eventname);

    void somefct(void)
    {
        ...
        trace_subsys_eventname(arg, task);
        ...
    }

Where:
  - subsys_eventname is an identifier unique to your event

    - subsys is the name of your subsystem.
    - eventname is the name of the event to trace.

  - `TP_PROTO(int firstarg, struct task_struct *p)` is the prototype of the
    function called by this tracepoint.

  - `TP_ARGS(firstarg, p)` are the parameter names, same as found in the
    prototype.

  - if you use the header in multiple source files, `#define CREATE_TRACE_POINTS`
    should appear only in one source file.

Connecting a function (probe) to a tracepoint is done by providing a
probe (function to call) for the specific tracepoint through
register_trace_subsys_eventname(). Removing a probe is done through
unregister_trace_subsys_eventname().

tracepoint_synchronize_unregister() must be called before the end of
the module exit function to make sure there is no caller left using
the probe. This, and the fact that preemption is disabled around the
probe call, make sure that probe removal and module unload are safe.

The tracepoint mechanism supports inserting multiple instances of the
same tracepoint, but a single definition must be made of a given
tracepoint name over all the kernel to make sure no type conflict will
occur. Name mangling of the tracepoints is done using the prototypes
to make sure typing is correct. Verification of probe type correctness
is done at the registration site by the compiler. Tracepoints can be
put in inline functions, inlined static functions, and unrolled loops
as well as regular functions.

The naming scheme "subsys_event" is suggested here as a convention
intended to limit collisions. Tracepoint names are global to the
kernel: they are considered as being the same whether they are in the
core kernel image or in modules.

If the tracepoint has to be used in kernel modules, an
EXPORT_TRACEPOINT_SYMBOL_GPL() or EXPORT_TRACEPOINT_SYMBOL() can be
used to export the defined tracepoints.

If you need to do a bit of work for a tracepoint parameter, and
that work is only used for the tracepoint, that work can be encapsulated
within an if statement with the following::

    if (trace_foo_bar_enabled()) {
        int i;
        int tot = 0;

        for (i = 0; i < count; i++)
            tot += calculate_nuggets();

        trace_foo_bar(tot);
    }

All trace_<tracepoint>() calls have a matching trace_<tracepoint>_enabled()
function defined that returns true if the tracepoint is enabled and
false otherwise. The trace_<tracepoint>() should always be within the
block of the if (trace_<tracepoint>_enabled()) to prevent races between
the tracepoint being enabled and the check being seen.

The advantage of using trace_<tracepoint>_enabled() is that it uses
the static_key of the tracepoint to allow the if statement to be implemented
with jump labels and avoid conditional branches.

.. note:: The convenience macro TRACE_EVENT provides an alternative way to
      define tracepoints. Check http://lwn.net/Articles/379903,
      http://lwn.net/Articles/381064 and http://lwn.net/Articles/383362
      for a series of articles with more details.

If you need to call a tracepoint from a header file, it is not
recommended to call one directly or to use the trace_<tracepoint>_enabled()
function call. Tracepoints in header files can have side effects if the
header is included from a file that has CREATE_TRACE_POINTS set, and
trace_<tracepoint>() is not a small inline, so it can bloat the kernel
if used by other inlined functions. Instead, include tracepoint-defs.h
and use tracepoint_enabled().

In a C file::

    void do_trace_foo_bar_wrapper(args)
    {
        trace_foo_bar(args);
    }

In the header file::

    DECLARE_TRACEPOINT(foo_bar);

    static inline void some_inline_function()
    {
        [..]
        if (tracepoint_enabled(foo_bar))
            do_trace_foo_bar_wrapper(args);
        [..]
    }
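
The "on"/"off" behaviour described under "Purpose of tracepoints" and the
enabled-check pattern described under "Usage" can be modelled in plain
userspace C. The sketch below is an illustration only and is not how the
kernel implements tracepoints: every ``toy_`` name is hypothetical, the model
holds a single probe (the kernel allows several), it uses an ordinary
conditional branch instead of a static_key, and it does no synchronization.
calculate_nuggets() is borrowed from the example above and given a dummy body.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for one tracepoint holding one probe. */
struct toy_tracepoint {
    void (*probe)(void *data, int value);  /* NULL means "off" */
    void *data;                            /* handed back to the probe */
};

static struct toy_tracepoint toy_foo_bar;  /* zero-initialized: starts "off" */

/* Connect a probe; returns -1 if one is already attached, 0 otherwise. */
static int toy_register(struct toy_tracepoint *tp,
                        void (*probe)(void *, int), void *data)
{
    if (tp->probe)
        return -1;
    tp->data = data;
    tp->probe = probe;
    return 0;
}

/* Disconnect the probe, turning the tracepoint "off" again. */
static void toy_unregister(struct toy_tracepoint *tp)
{
    tp->probe = NULL;
    tp->data = NULL;
}

/* Models trace_<tracepoint>_enabled(): true while a probe is attached. */
static int toy_enabled(const struct toy_tracepoint *tp)
{
    return tp->probe != NULL;
}

/* Models trace_<tracepoint>(): calls the probe in the caller's context. */
static void toy_trace(struct toy_tracepoint *tp, int value)
{
    if (tp->probe)
        tp->probe(tp->data, value);
}

/* Dummy "expensive" work that is only useful when tracing is on. */
static int nuggets_calls;

static int calculate_nuggets(void)
{
    nuggets_calls++;
    return 3;
}

/* Instrumented function: the extra work is skipped while the hook is off. */
static void instrumented(int count)
{
    if (toy_enabled(&toy_foo_bar)) {
        int i;
        int tot = 0;

        for (i = 0; i < count; i++)
            tot += calculate_nuggets();

        toy_trace(&toy_foo_bar, tot);
    }
}

/* Example probe: accumulates every traced value into *data. */
static void sum_probe(void *data, int value)
{
    *(int *)data += value;
}
```

Calling instrumented() before toy_register() does no extra work at all. After
registering sum_probe with a pointer to an int, each call adds the computed
total to that int, and after toy_unregister() the function is silent again.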