=======================
Kernel Probes (Kprobes)
=======================

:Author: Jim Keniston <jkenisto@us.ibm.com>
:Author: Prasanna S Panchamukhi <prasanna.panchamukhi@gmail.com>
:Author: Masami Hiramatsu <mhiramat@redhat.com>

.. CONTENTS

  1. Concepts: Kprobes, and Return Probes
  2. Architectures Supported
  3. Configuring Kprobes
  4. API Reference
  5. Kprobes Features and Limitations
  6. Probe Overhead
  7. TODO
  8. Kprobes Example
  9. Kretprobes Example
  10. Deprecated Features
  Appendix A: The kprobes debugfs interface
  Appendix B: The kprobes sysctl interface
  Appendix C: References

Concepts: Kprobes and Return Probes
=========================================

Kprobes enables you to dynamically break into any kernel routine and
collect debugging and performance information non-disruptively. You
can trap at almost any kernel code address [1]_, specifying a handler
routine to be invoked when the breakpoint is hit.

.. [1] Some parts of the kernel code cannot be trapped; see
       :ref:`kprobes_blacklist`.

There are currently two types of probes: kprobes and kretprobes
(also called return probes). A kprobe can be inserted on virtually
any instruction in the kernel.
A return probe fires when a specified
function returns.

In the typical case, Kprobes-based instrumentation is packaged as
a kernel module. The module's init function installs ("registers")
one or more probes, and the exit function unregisters them. A
registration function such as register_kprobe() specifies where
the probe is to be inserted and what handler is to be called when
the probe is hit.

There are also ``register_/unregister_*probes()`` functions for batch
registration/unregistration of a group of ``*probes``. These functions
can speed up the unregistration process when you have to unregister
a lot of probes at once.

The next four subsections explain how the different types of
probes work and how jump optimization works. They explain certain
things that you'll need to know in order to make the best use of
Kprobes -- e.g., the difference between a pre_handler and
a post_handler, and how to use the maxactive and nmissed fields of
a kretprobe. But if you're in a hurry to start using Kprobes, you
can skip ahead to :ref:`kprobes_archs_supported`.

How Does a Kprobe Work?
-----------------------

When a kprobe is registered, Kprobes makes a copy of the probed
instruction and replaces the first byte(s) of the probed instruction
with a breakpoint instruction (e.g., int3 on i386 and x86_64).

When a CPU hits the breakpoint instruction, a trap occurs, the CPU's
registers are saved, and control passes to Kprobes via the
notifier_call_chain mechanism. Kprobes executes the "pre_handler"
associated with the kprobe, passing the handler the addresses of the
kprobe struct and the saved registers.

Next, Kprobes single-steps its copy of the probed instruction.
(It would be simpler to single-step the actual instruction in place,
but then Kprobes would have to temporarily remove the breakpoint
instruction. This would open a small time window when another CPU
could sail right past the probepoint.)

After the instruction is single-stepped, Kprobes executes the
"post_handler," if any, that is associated with the kprobe.
Execution then continues with the instruction following the probepoint.

Changing Execution Path
-----------------------

Since a kprobe handler runs inside live kernel code, it can change the
register set, including the instruction pointer. This operation requires
maximum care, such as keeping the stack frame intact and recovering the
execution path. Because it operates on a running kernel and needs deep
knowledge of computer architecture and concurrent computing, you can
easily shoot yourself in the foot.

If you change the instruction pointer (and set up other related
registers) in pre_handler, you must return !0 (a nonzero value) so that
Kprobes stops single-stepping and just returns to the given address.
In that case the post_handler is no longer called.

Note that this operation may be harder on some architectures which use
a TOC (Table of Contents) for function calls, since you have to set up a
new TOC for your function in your module, and recover the old one after
returning from it.

Return Probes
-------------

How Does a Return Probe Work?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When you call register_kretprobe(), Kprobes establishes a kprobe at
the entry to the function. When the probed function is called and this
probe is hit, Kprobes saves a copy of the return address, and replaces
the return address with the address of a "trampoline." The trampoline
is an arbitrary piece of code -- typically just a nop instruction.
At boot time, Kprobes registers a kprobe at the trampoline.

When the probed function executes its return instruction, control
passes to the trampoline and that probe is hit. Kprobes' trampoline
handler calls the user-specified return handler associated with the
kretprobe, then sets the saved instruction pointer to the saved return
address, and that's where execution resumes upon return from the trap.

While the probed function is executing, its return address is
stored in an object of type kretprobe_instance. Before calling
register_kretprobe(), the user sets the maxactive field of the
kretprobe struct to specify how many instances of the specified
function can be probed simultaneously. register_kretprobe()
pre-allocates the indicated number of kretprobe_instance objects.

For example, if the function is non-recursive and is called with a
spinlock held, maxactive = 1 should be enough. If the function is
non-recursive and can never relinquish the CPU (e.g., via a semaphore
or preemption), NR_CPUS should be enough. If maxactive <= 0, it is
set to a default value. If CONFIG_PREEMPT is enabled, the default
is max(10, 2*NR_CPUS). Otherwise, the default is NR_CPUS.

It's not a disaster if you set maxactive too low; you'll just miss
some probes. In the kretprobe struct, the nmissed field is set to
zero when the return probe is registered, and is incremented every
time the probed function is entered but there is no kretprobe_instance
object available for establishing the return probe.

Kretprobe entry-handler
^^^^^^^^^^^^^^^^^^^^^^^

Kretprobes also provides an optional user-specified handler which runs
on function entry. This handler is specified by setting the entry_handler
field of the kretprobe struct.
Whenever the kprobe placed by kretprobe at the
function entry is hit, the user-defined entry_handler, if any, is invoked.
If the entry_handler returns 0 (success) then a corresponding return handler
is guaranteed to be called upon function return. If the entry_handler
returns a non-zero error then Kprobes leaves the return address as is, and
the kretprobe has no further effect for that particular function instance.

Multiple entry and return handler invocations are matched using the unique
kretprobe_instance object associated with them. Additionally, a user
may also specify per return-instance private data to be part of each
kretprobe_instance object. This is especially useful when sharing private
data between corresponding user entry and return handlers. The size of each
private data object can be specified at kretprobe registration time by
setting the data_size field of the kretprobe struct. This data can be
accessed through the data field of each kretprobe_instance object.

If the probed function is entered but there is no kretprobe_instance
object available, then in addition to incrementing the nmissed count,
the user entry_handler invocation is also skipped.

.. _kprobes_jump_optimization:

How Does Jump Optimization Work?
--------------------------------

If your kernel is built with CONFIG_OPTPROBES=y (currently this flag
is automatically set to 'y' on x86/x86-64 non-preemptive kernels) and
the "debug.kprobes_optimization" kernel parameter is set to 1 (see
sysctl(8)), Kprobes tries to reduce probe-hit overhead by using a jump
instruction instead of a breakpoint instruction at each probepoint.

Init a Kprobe
^^^^^^^^^^^^^

When a probe is registered, before attempting this optimization,
Kprobes inserts an ordinary, breakpoint-based kprobe at the specified
address. So, even if it's not possible to optimize this particular
probepoint, there'll be a probe there.

Safety Check
^^^^^^^^^^^^

Before optimizing a probe, Kprobes performs the following safety checks:

- Kprobes verifies that the region that will be replaced by the jump
  instruction (the "optimized region") lies entirely within one function.
  (A jump instruction is multiple bytes, and so may overlay multiple
  instructions.)

- Kprobes analyzes the entire function and verifies that there is no
  jump into the optimized region. Specifically:

  - the function contains no indirect jump;
  - the function contains no instruction that causes an exception (since
    the fixup code triggered by the exception could jump back into the
    optimized region -- Kprobes checks the exception tables to verify this);
  - there is no near jump to the optimized region (other than to the first
    byte).

- For each instruction in the optimized region, Kprobes verifies that
  the instruction can be executed out of line.

Preparing Detour Buffer
^^^^^^^^^^^^^^^^^^^^^^^

Next, Kprobes prepares a "detour" buffer, which contains the following
instruction sequence:

- code to push the CPU's registers (emulating a breakpoint trap)
- a call to the trampoline code which calls the user's probe handlers
- code to restore registers
- the instructions from the optimized region
- a jump back to the original execution path

Pre-optimization
^^^^^^^^^^^^^^^^

After preparing the detour buffer, Kprobes verifies that none of the
following situations exist:

- The probe has a post_handler.
- Other instructions in the optimized region are probed.
- The probe is disabled.

In any of the above cases, Kprobes won't start optimizing the probe.
Since these are temporary situations, Kprobes tries to start
optimizing it again if the situation changes.

If the kprobe can be optimized, Kprobes enqueues the kprobe to an
optimizing list, and kicks the kprobe-optimizer workqueue to optimize
it. If the to-be-optimized probepoint is hit before being optimized,
Kprobes returns control to the original instruction path by setting
the CPU's instruction pointer to the copied code in the detour buffer
-- thus at least avoiding the single-step.

Optimization
^^^^^^^^^^^^

The Kprobe-optimizer doesn't insert the jump instruction immediately;
rather, it calls synchronize_rcu() for safety first, because it's
possible for a CPU to be interrupted in the middle of executing the
optimized region [3]_. As you know, synchronize_rcu() can ensure
that all interruptions that were active when synchronize_rcu()
was called are done, but only if CONFIG_PREEMPT=n. So, this version
of kprobe optimization supports only kernels with CONFIG_PREEMPT=n [4]_.

After that, the Kprobe-optimizer calls stop_machine() to replace
the optimized region with a jump instruction to the detour buffer,
using text_poke_smp().

Unoptimization
^^^^^^^^^^^^^^

When an optimized kprobe is unregistered, disabled, or blocked by
another kprobe, it will be unoptimized.
If this happens before
the optimization is complete, the kprobe is just dequeued from the
optimizing list. If the optimization has been done, the jump is
replaced with the original code (except for an int3 breakpoint in
the first byte) by using text_poke_smp().

.. [3] Please imagine that the 2nd instruction is interrupted and then
   the optimizer replaces the 2nd instruction with the jump *address*
   while the interrupt handler is running. When the interrupt
   returns to the original address, there is no valid instruction,
   and it causes an unexpected result.

.. [4] This optimization-safety checking may be replaced with the
   stop-machine method that ksplice uses for supporting a CONFIG_PREEMPT=y
   kernel.

NOTE for geeks:
The jump optimization changes the kprobe's pre_handler behavior.
Without optimization, the pre_handler can change the kernel's execution
path by changing regs->ip and returning 1. However, when the probe
is optimized, that modification is ignored. Thus, if you want to
tweak the kernel's execution path, you need to suppress optimization,
using one of the following techniques:

- Specify an empty function for the kprobe's post_handler.

or

- Execute 'sysctl -w debug.kprobes_optimization=0'

.. _kprobes_blacklist:

Blacklist
---------

Kprobes can probe most of the kernel except itself. This means
that there are some functions which kprobes cannot probe. Probing
(trapping) such functions can cause a recursive trap (e.g. double
fault) or the nested probe handler may never be called.
Kprobes manages such functions as a blacklist.
If you want to add a function to the blacklist, you just need
to (1) include linux/kprobes.h and (2) use the NOKPROBE_SYMBOL() macro
to specify a blacklisted function.
Kprobes checks the given probe address against the blacklist and
rejects the registration if the given address is in the blacklist.

.. _kprobes_archs_supported:

Architectures Supported
=======================

Kprobes and return probes are implemented on the following
architectures:

- i386 (Supports jump optimization)
- x86_64 (AMD-64, EM64T) (Supports jump optimization)
- ppc64
- ia64 (Does not support probes on instruction slot1.)
- sparc64 (Return probes not yet implemented.)
- arm
- ppc
- mips
- s390
- parisc

Configuring Kprobes
===================

When configuring the kernel using make menuconfig/xconfig/oldconfig,
ensure that CONFIG_KPROBES is set to "y".
Under "General setup", look
for "Kprobes".

So that you can load and unload Kprobes-based instrumentation modules,
make sure "Loadable module support" (CONFIG_MODULES) and "Module
unloading" (CONFIG_MODULE_UNLOAD) are set to "y".

Also make sure that CONFIG_KALLSYMS and perhaps even CONFIG_KALLSYMS_ALL
are set to "y", since kallsyms_lookup_name() is used by the in-kernel
kprobe address resolution code.

If you need to insert a probe in the middle of a function, you may find
it useful to "Compile the kernel with debug info" (CONFIG_DEBUG_INFO),
so you can use "objdump -d -l vmlinux" to see the source-to-object
code mapping.

API Reference
=============

The Kprobes API includes a "register" function and an "unregister"
function for each type of probe. The API also includes "register_*probes"
and "unregister_*probes" functions for (un)registering arrays of probes.
Here are terse, mini-man-page specifications for these functions and
the associated probe handlers that you'll write. See the files in the
samples/kprobes/ sub-directory for examples.

register_kprobe
---------------

::

    #include <linux/kprobes.h>
    int register_kprobe(struct kprobe *kp);

Sets a breakpoint at the address kp->addr.
When the breakpoint is
hit, Kprobes calls kp->pre_handler. After the probed instruction
is single-stepped, Kprobes calls kp->post_handler. If a fault
occurs during execution of kp->pre_handler or kp->post_handler,
or during single-stepping of the probed instruction, Kprobes calls
kp->fault_handler. Any or all handlers can be NULL. If kp->flags
is set to KPROBE_FLAG_DISABLED, that kp will be registered but disabled,
so its handlers aren't hit until enable_kprobe(kp) is called.

.. note::

   1. With the introduction of the "symbol_name" field to struct kprobe,
      the probepoint address resolution will now be taken care of by the kernel.
      The following will now work::

          kp.symbol_name = "symbol_name";

      (64-bit powerpc intricacies such as function descriptors are handled
      transparently)

   2. Use the "offset" field of struct kprobe if the offset into the symbol
      to install a probepoint is known. This field is used to calculate the
      probepoint.

   3. Specify either the kprobe "symbol_name" OR the "addr". If both are
      specified, kprobe registration will fail with -EINVAL.

   4. With CISC architectures (such as i386 and x86_64), the kprobes code
      does not validate if the kprobe.addr is at an instruction boundary.
      Use "offset" with caution.

register_kprobe() returns 0 on success, or a negative errno otherwise.

User's pre-handler (kp->pre_handler)::

    #include <linux/kprobes.h>
    #include <linux/ptrace.h>
    int pre_handler(struct kprobe *p, struct pt_regs *regs);

Called with p pointing to the kprobe associated with the breakpoint,
and regs pointing to the struct containing the registers saved when
the breakpoint was hit. Return 0 here unless you're a Kprobes geek.

User's post-handler (kp->post_handler)::

    #include <linux/kprobes.h>
    #include <linux/ptrace.h>
    void post_handler(struct kprobe *p, struct pt_regs *regs,
                      unsigned long flags);

p and regs are as described for the pre_handler. flags always seems
to be zero.

User's fault-handler (kp->fault_handler)::

    #include <linux/kprobes.h>
    #include <linux/ptrace.h>
    int fault_handler(struct kprobe *p, struct pt_regs *regs, int trapnr);

p and regs are as described for the pre_handler. trapnr is the
architecture-specific trap number associated with the fault (e.g.,
on i386, 13 for a general protection fault or 14 for a page fault).
Returns 1 if it successfully handled the exception.
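
As a rough illustration of how register_kprobe() and the three handler
prototypes fit together, here is a minimal module sketch. It is not the
code from samples/kprobes/; the probed symbol ``kernel_clone`` and all
function names are assumptions chosen for the example, and the sketch
assumes a kernel that still provides the fault_handler field described
above::

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/kprobes.h>

    /* Assumption: "kernel_clone" exists and is probeable on this kernel. */
    static struct kprobe kp = {
        .symbol_name = "kernel_clone",
    };

    static int handler_pre(struct kprobe *p, struct pt_regs *regs)
    {
        pr_info("pre_handler: probe at %pS\n", p->addr);
        return 0;    /* 0: let Kprobes single-step the probed instruction */
    }

    static void handler_post(struct kprobe *p, struct pt_regs *regs,
                             unsigned long flags)
    {
        pr_info("post_handler: probe at %pS, flags = 0x%lx\n",
                p->addr, flags);
    }

    static int handler_fault(struct kprobe *p, struct pt_regs *regs,
                             int trapnr)
    {
        pr_info("fault_handler: probe at %pS, trap #%d\n", p->addr, trapnr);
        return 0;    /* 0: not handled here, leave the fault to the kernel */
    }

    static int __init kprobe_sketch_init(void)
    {
        int ret;

        kp.pre_handler = handler_pre;
        kp.post_handler = handler_post;
        kp.fault_handler = handler_fault;

        ret = register_kprobe(&kp);
        if (ret < 0) {
            pr_err("register_kprobe failed, returned %d\n", ret);
            return ret;
        }
        pr_info("planted kprobe at %pS\n", kp.addr);
        return 0;
    }

    static void __exit kprobe_sketch_exit(void)
    {
        unregister_kprobe(&kp);
    }

    module_init(kprobe_sketch_init);
    module_exit(kprobe_sketch_exit);
    MODULE_LICENSE("GPL");

Only symbol_name is set, so the kernel resolves the address; setting both
symbol_name and addr would make registration fail with -EINVAL. Because
this sketch installs a post_handler, the probe is not eligible for jump
optimization.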

register_kretprobe
------------------

::

    #include <linux/kprobes.h>
    int register_kretprobe(struct kretprobe *rp);

Establishes a return probe for the function whose address is
rp->kp.addr. When that function returns, Kprobes calls rp->handler.
You must set rp->maxactive appropriately before you call
register_kretprobe(); see "How Does a Return Probe Work?" for details.

register_kretprobe() returns 0 on success, or a negative errno
otherwise.

User's return-probe handler (rp->handler)::

    #include <linux/kprobes.h>
    #include <linux/ptrace.h>
    int kretprobe_handler(struct kretprobe_instance *ri,
                          struct pt_regs *regs);

regs is as described for kprobe.pre_handler. ri points to the
kretprobe_instance object, of which the following fields may be
of interest:

- ret_addr: the return address
- rp: points to the corresponding kretprobe object
- task: points to the corresponding task struct
- data: points to per return-instance private data; see "Kretprobe
  entry-handler" for details.

The regs_return_value(regs) macro provides a simple abstraction to
extract the return value from the appropriate register as defined by
the architecture's ABI.
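
The following sketch shows one way the pieces above might be combined:
an entry_handler stores a timestamp in the per-instance data area, and
the return handler reads it back together with regs_return_value().
The probed symbol ``kernel_clone``, the ``struct probe_data`` layout,
the function names, and the maxactive value of 20 are all illustrative
assumptions, not values prescribed by Kprobes::

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/kprobes.h>
    #include <linux/ktime.h>

    /* Hypothetical private data, stored in each kretprobe_instance. */
    struct probe_data {
        ktime_t entry_stamp;
    };

    static int entry_handler(struct kretprobe_instance *ri,
                             struct pt_regs *regs)
    {
        struct probe_data *data = (struct probe_data *)ri->data;

        data->entry_stamp = ktime_get();
        return 0;    /* 0: arm the return handler for this call */
    }

    static int ret_handler(struct kretprobe_instance *ri,
                           struct pt_regs *regs)
    {
        struct probe_data *data = (struct probe_data *)ri->data;
        unsigned long retval = regs_return_value(regs);
        s64 delta_ns = ktime_to_ns(ktime_sub(ktime_get(),
                                             data->entry_stamp));

        pr_info("returned %lu after %lld ns\n",
                retval, (long long)delta_ns);
        return 0;
    }

    static struct kretprobe rp = {
        .kp.symbol_name = "kernel_clone",    /* assumption */
        .handler        = ret_handler,
        .entry_handler  = entry_handler,
        .data_size      = sizeof(struct probe_data),
        .maxactive      = 20,                /* illustrative value */
    };

    static int __init kretprobe_sketch_init(void)
    {
        int ret = register_kretprobe(&rp);

        if (ret < 0) {
            pr_err("register_kretprobe failed, returned %d\n", ret);
            return ret;
        }
        return 0;
    }

    static void __exit kretprobe_sketch_exit(void)
    {
        unregister_kretprobe(&rp);
        /* nmissed counts calls skipped because no instance was free. */
        pr_info("missed probing %d instances\n", rp.nmissed);
    }

    module_init(kretprobe_sketch_init);
    module_exit(kretprobe_sketch_exit);
    MODULE_LICENSE("GPL");

If nmissed is frequently nonzero at unload time, maxactive was probably
too low for the probed function's concurrency.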

The handler's return value is currently ignored.

unregister_*probe
------------------

::

    #include <linux/kprobes.h>
    void unregister_kprobe(struct kprobe *kp);
    void unregister_kretprobe(struct kretprobe *rp);

Removes the specified probe. The unregister function can be called
at any time after the probe has been registered.

.. note::

   If the functions find an incorrect probe (e.g., an unregistered probe),
   they clear the addr field of the probe.

register_*probes
----------------

::

    #include <linux/kprobes.h>
    int register_kprobes(struct kprobe **kps, int num);
    int register_kretprobes(struct kretprobe **rps, int num);

Registers each of the num probes in the specified array. If any
error occurs during registration, all probes in the array, up to
the bad probe, are safely unregistered before the register_*probes
function returns.

- kps/rps: an array of pointers to ``*probe`` data structures
- num: the number of the array entries.

.. note::

   You have to allocate (or define) an array of pointers and set all
   of the array entries before using these functions.
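
A short sketch of batch registration, assuming two probeable symbols
(``do_sys_open`` and ``kernel_clone`` are placeholders chosen for the
example, as are the handler and variable names). The array of pointers
is fully populated before register_kprobes() is called, as the note
above requires::

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/kprobes.h>

    static int on_hit(struct kprobe *p, struct pt_regs *regs)
    {
        /* Rate-limited so a hot probepoint does not flood the log. */
        pr_info_ratelimited("hit %s\n", p->symbol_name);
        return 0;
    }

    /* Assumption: both symbols exist and are probeable on this kernel. */
    static struct kprobe kp_open = {
        .symbol_name = "do_sys_open",
        .pre_handler = on_hit,
    };

    static struct kprobe kp_clone = {
        .symbol_name = "kernel_clone",
        .pre_handler = on_hit,
    };

    /* Array of pointers, filled in before the batch call. */
    static struct kprobe *kps[] = { &kp_open, &kp_clone };

    static int __init batch_sketch_init(void)
    {
        int ret = register_kprobes(kps, ARRAY_SIZE(kps));

        if (ret < 0)
            pr_err("register_kprobes failed, returned %d\n", ret);
        return ret;
    }

    static void __exit batch_sketch_exit(void)
    {
        unregister_kprobes(kps, ARRAY_SIZE(kps));
    }

    module_init(batch_sketch_init);
    module_exit(batch_sketch_exit);
    MODULE_LICENSE("GPL");

If the second registration fails, the first probe has already been
unregistered by the time register_kprobes() returns, so the init
function has nothing left to clean up.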

unregister_*probes
------------------

::

    #include <linux/kprobes.h>
    void unregister_kprobes(struct kprobe **kps, int num);
    void unregister_kretprobes(struct kretprobe **rps, int num);

Removes each of the num probes in the specified array at once.

.. note::

   If the functions find some incorrect probes (e.g., unregistered
   probes) in the specified array, they clear the addr field of those
   incorrect probes. However, other probes in the array are
   unregistered correctly.

disable_*probe
--------------

::

    #include <linux/kprobes.h>
    int disable_kprobe(struct kprobe *kp);
    int disable_kretprobe(struct kretprobe *rp);

Temporarily disables the specified ``*probe``. You can enable it again by
using ``enable_*probe()``. The probe you specify must already be registered.

enable_*probe
-------------

::

    #include <linux/kprobes.h>
    int enable_kprobe(struct kprobe *kp);
    int enable_kretprobe(struct kretprobe *rp);

Enables a ``*probe`` which has been disabled by ``disable_*probe()``. The
probe you specify must already be registered.
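
As a sketch of how these calls might be used together, the fragment
below registers a probe in the disabled state with KPROBE_FLAG_DISABLED
and toggles it later. The symbol ``kernel_clone`` and the helper
``probe_set_active()`` are hypothetical names introduced for the
example::

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/kprobes.h>

    static int on_hit(struct kprobe *p, struct pt_regs *regs)
    {
        pr_info_ratelimited("probe at %pS hit\n", p->addr);
        return 0;
    }

    static struct kprobe kp = {
        .symbol_name = "kernel_clone",        /* assumption */
        .pre_handler = on_hit,
        .flags       = KPROBE_FLAG_DISABLED,  /* registered, not armed */
    };

    /* Hypothetical helper: toggle the probe without unregistering it. */
    static int probe_set_active(bool on)
    {
        return on ? enable_kprobe(&kp) : disable_kprobe(&kp);
    }

    static int __init toggle_sketch_init(void)
    {
        int ret = register_kprobe(&kp);

        if (ret < 0)
            return ret;

        ret = probe_set_active(true);
        if (ret < 0) {
            unregister_kprobe(&kp);
            return ret;
        }
        return 0;
    }

    static void __exit toggle_sketch_exit(void)
    {
        unregister_kprobe(&kp);
    }

    module_init(toggle_sketch_init);
    module_exit(toggle_sketch_exit);
    MODULE_LICENSE("GPL");

Both enable_kprobe() and disable_kprobe() return an int, so a caller can
detect the case where the probe was never registered.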

Kprobes Features and Limitations
================================

Kprobes allows multiple probes at the same address. Also,
a probepoint for which there is a post_handler cannot be optimized.
So if you install a kprobe with a post_handler, at an optimized
probepoint, the probepoint will be unoptimized automatically.

In general, you can install a probe anywhere in the kernel.
In particular, you can probe interrupt handlers. Known exceptions
are discussed in this section.

The ``register_*probe`` functions will return -EINVAL if you attempt
to install a probe in the code that implements Kprobes (mostly
kernel/kprobes.c and ``arch/*/kernel/kprobes.c``, but also functions such
as do_page_fault and notifier_call_chain).

If you install a probe in an inline-able function, Kprobes makes
no attempt to chase down all inline instances of the function and
install probes there. gcc may inline a function without being asked,
so keep this in mind if you're not seeing the probe hits you expect.

A probe handler can modify the environment of the probed function
-- e.g., by modifying kernel data structures, or by modifying the
contents of the pt_regs struct (which are restored to the registers
upon return from the breakpoint). So Kprobes can be used, for example,
to install a bug fix or to inject faults for testing.
Kprobes, of
course, has no way to distinguish the deliberately injected faults
from the accidental ones. Don't drink and probe.

Kprobes makes no attempt to prevent probe handlers from stepping on
each other -- e.g., probing printk() and then calling printk() from a
probe handler. If a probe handler hits a probe, that second probe's
handlers won't be run in that instance, and the kprobe.nmissed member
of the second probe will be incremented.

As of Linux v2.6.15-rc1, multiple handlers (or multiple instances of
the same handler) may run concurrently on different CPUs.

Kprobes does not use mutexes or allocate memory except during
registration and unregistration.

Probe handlers are run with preemption disabled or interrupts disabled,
depending on the architecture and optimization state (e.g.,
kretprobe handlers and optimized kprobe handlers run without interrupts
disabled on x86/x86-64). In any case, your handler should not yield
the CPU (e.g., by attempting to acquire a semaphore, or waiting for I/O).

Since a return probe is implemented by replacing the return
address with the trampoline's address, stack backtraces and calls
to __builtin_return_address() will typically yield the trampoline's
address instead of the real return address for kretprobed functions.
(As far as we can tell, __builtin_return_address() is used only
for instrumentation and error reporting.)
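
The handler constraints described in this section (concurrent execution
on several CPUs, no yielding of the CPU, and missed hits being counted
in kprobe.nmissed) can be sketched as follows. This is only an
illustrative sketch under stated assumptions: the demo_* names are
hypothetical, and vfs_read is just an example target that may not suit
every kernel::

	#include <linux/atomic.h>
	#include <linux/kernel.h>
	#include <linux/module.h>
	#include <linux/kprobes.h>

	static atomic_long_t demo_hits = ATOMIC_LONG_INIT(0);

	/*
	 * May run on several CPUs at once, so the counter is atomic.
	 * Nothing here sleeps, takes a mutex or allocates memory.
	 */
	static int demo_pre(struct kprobe *p, struct pt_regs *regs)
	{
		atomic_long_inc(&demo_hits);
		return 0;
	}

	static struct kprobe demo_kp = {
		.symbol_name	= "vfs_read",	/* example target (assumption) */
		.pre_handler	= demo_pre,
	};

	static int __init demo_init(void)
	{
		return register_kprobe(&demo_kp);
	}

	static void __exit demo_exit(void)
	{
		unregister_kprobe(&demo_kp);

		/* nmissed counts hits whose handlers were skipped. */
		pr_info("demo: %ld hits, %lu missed\n",
			atomic_long_read(&demo_hits), demo_kp.nmissed);
	}

	module_init(demo_init);
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");

Reporting is deferred to module exit so that the handler itself never
calls printk().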

If the number of times a function is called does not match the number
of times it returns, registering a return probe on that function may
produce undesirable results. In such a case, a line::

	kretprobe BUG!: Processing kretprobe d000000000041aa8 @ c00000000004f48c

gets printed. With this information, one will be able to correlate the
exact instance of the kretprobe that caused the problem. We have the
do_exit() case covered. do_execve() and do_fork() are not an issue.
We're unaware of other specific cases where this could be a problem.

If, upon entry to or exit from a function, the CPU is running on
a stack other than that of the current task, registering a return
probe on that function may produce undesirable results. For this
reason, Kprobes doesn't support return probes (or kprobes)
on the x86_64 version of __switch_to(); the registration functions
return -EINVAL.

On x86/x86-64, since the Jump Optimization of Kprobes modifies
instructions widely, there are some limitations to optimization. To
explain it, we introduce some terminology. Imagine a 3-instruction
sequence consisting of two 2-byte instructions and one 3-byte
instruction.

::

                IA
                |
        [-2][-1][0][1][2][3][4][5][6][7]
                [ins1][ins2][  ins3 ]
                [<-     DCR       ->]
                   [<- JTPR ->]

        ins1: 1st Instruction
        ins2: 2nd Instruction
        ins3: 3rd Instruction
        IA:   Insertion Address
        JTPR: Jump Target Prohibition Region
        DCR:  Detoured Code Region

The instructions in DCR are copied to the out-of-line buffer
of the kprobe, because the bytes in DCR are replaced by
a 5-byte jump instruction. So there are several limitations.

a) The instructions in DCR must be relocatable.
b) The instructions in DCR must not include a call instruction.
c) JTPR must not be targeted by any jump or call instruction.
d) DCR must not straddle the border between functions.

Anyway, these limitations are checked by the in-kernel instruction
decoder, so you don't need to worry about that.

Probe Overhead
==============

On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
microseconds to process. Specifically, a benchmark that hits the same
probepoint repeatedly, firing a simple handler each time, reports 1-2
million hits per second, depending on the architecture. A return-probe
hit typically takes 50-75% longer than a kprobe hit.
When you have a return probe set on a function, adding a kprobe at
the entry to that function adds essentially no overhead.

Here are sample overhead figures (in usec) for different architectures::

  k = kprobe; r = return probe; kr = kprobe + return probe
  on same function

  i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips
  k = 0.57 usec; r = 0.92; kr = 0.99

  x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips
  k = 0.49 usec; r = 0.80; kr = 0.82

  ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
  k = 0.77 usec; r = 1.26; kr = 1.45

Optimized Probe Overhead
------------------------

Typically, an optimized kprobe hit takes 0.07 to 0.1 microseconds to
process. Here are sample overhead figures (in usec) for x86 architectures::

  k = unoptimized kprobe, b = boosted (single-step skipped), o = optimized kprobe,
  r = unoptimized kretprobe, rb = boosted kretprobe, ro = optimized kretprobe.

  i386: Intel(R) Xeon(R) E5410, 2.33GHz, 4656.90 bogomips
  k = 0.80 usec; b = 0.33; o = 0.05; r = 1.10; rb = 0.61; ro = 0.33

  x86-64: Intel(R) Xeon(R) E5410, 2.33GHz, 4656.90 bogomips
  k = 0.99 usec; b = 0.43; o = 0.06; r = 1.24; rb = 0.68; ro = 0.30

TODO
====

a. SystemTap (http://sourceware.org/systemtap): Provides a simplified
   programming interface for probe-based instrumentation. Try it out.
b. Kernel return probes for sparc64.
c. Support for other architectures.
d. User-space probes.
e. Watchpoint probes (which fire on data references).

Kprobes Example
===============

See samples/kprobes/kprobe_example.c

Kretprobes Example
==================

See samples/kprobes/kretprobe_example.c

Deprecated Features
===================

Jprobes is now a deprecated feature. People who are depending on it should
migrate to other tracing features or use older kernels. Please consider
migrating your tool to one of the following options:

- Use trace-event to trace target function with arguments.

  trace-event is a low-overhead (and almost no visible overhead if it
  is off) statically defined event interface. You can define new events
  and trace them via ftrace or any other tracing tools.

  See the following URLs:

    - https://lwn.net/Articles/379903/
    - https://lwn.net/Articles/381064/
    - https://lwn.net/Articles/383362/

- Use ftrace dynamic events (kprobe event) with perf-probe.

  If you build your kernel with debug info (CONFIG_DEBUG_INFO=y), you can
  find which register/stack is assigned to which local variable or argument
  by using perf-probe and set up a new event to trace it.

  See the following documents:

  - Documentation/trace/kprobetrace.rst
  - Documentation/trace/events.rst
  - tools/perf/Documentation/perf-probe.txt


The kprobes debugfs interface
=============================


With recent kernels (> 2.6.20) the list of registered kprobes is visible
under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at /sys/kernel/debug).

/sys/kernel/debug/kprobes/list: Lists all registered probes on the system::

	c015d71a  k  vfs_read+0x0
	c03dedc5  r  tcp_v4_rcv+0x0

The first column provides the kernel address where the probe is inserted.
The second column identifies the type of probe (k - kprobe and r - kretprobe)
while the third column specifies the symbol+offset of the probe.
If the probed function belongs to a module, the module name is also
specified. Following columns show probe status. If the probe is on
a virtual address that is no longer valid (module init sections, module
virtual addresses that correspond to modules that've been unloaded),
such probes are marked with [GONE].
If the probe is temporarily disabled,
such probes are marked with [DISABLED]. If the probe is optimized, it is
marked with [OPTIMIZED]. If the probe is ftrace-based, it is marked with
[FTRACE].

/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.

Provides a knob to globally and forcibly turn registered kprobes ON or OFF.
By default, all kprobes are enabled. By echoing "0" to this file, all
registered probes will be disarmed, until a "1" is echoed to this
file. Note that this knob just disarms and arms all kprobes and doesn't
change each probe's disabling state. This means that disabled kprobes (marked
[DISABLED]) will not be enabled if you turn ON all kprobes by this knob.


The kprobes sysctl interface
============================

/proc/sys/debug/kprobes-optimization: Turn kprobes optimization ON/OFF.

When CONFIG_OPTPROBES=y, this sysctl interface appears and it provides
a knob to globally and forcibly turn jump optimization (see section
:ref:`kprobes_jump_optimization`) ON or OFF. By default, jump optimization
is allowed (ON). If you echo "0" to this file or set
"debug.kprobes-optimization" to 0 via sysctl, all optimized probes will be
unoptimized, and any new probes registered after that will not be optimized.

Note that this knob *changes* the optimized state.
This means that optimized
probes (marked [OPTIMIZED]) will be unoptimized ([OPTIMIZED] tag will be
removed). If the knob is turned on, they will be optimized again.

References
==========

For additional information on Kprobes, refer to the following URLs:

- https://www.ibm.com/developerworks/library/l-kprobes/index.html
- https://www.kernel.org/doc/ols/2006/ols2006v2-pages-109-124.pdf