===================================
In-kernel memory-mapped I/O tracing
===================================


Home page and links to optional user space tools:

    https://nouveau.freedesktop.org/wiki/MmioTrace

MMIO tracing was originally developed by Intel around 2003 for their Fault
Injection Test Harness. In Dec 2006 - Jan 2007, using the code from Intel,
Jeff Muizelaar created a tool for tracing MMIO accesses with the Nouveau
project in mind. Since then many people have contributed.

Mmiotrace was built for reverse engineering any memory-mapped IO device with
the Nouveau project as the first real user. Only x86 and x86_64 architectures
are supported.

Out-of-tree mmiotrace was originally modified for mainline inclusion and
the ftrace framework by Pekka Paalanen <pq@iki.fi>.


Preparation
-----------

The mmiotrace feature is compiled in with the CONFIG_MMIOTRACE option. Tracing
is disabled by default, so it is safe to have this set to yes. SMP systems are
supported, but tracing is unreliable and may miss events if more than one CPU
is on-line. Therefore mmiotrace takes all but one CPU off-line during run-time
activation. You can re-enable CPUs by hand, but be warned: there is no way to
automatically detect if you are losing events due to CPUs racing.
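
Whether a given kernel was built with the option can be checked from its
config file. The following is a minimal sketch, not part of mmiotrace itself:
the helper name ``mmiotrace_configured`` is hypothetical, and the default
path ``/boot/config-$(uname -r)`` is an assumption that holds on many, but not
all, distributions.

```shell
#!/bin/sh
# Succeed if the given kernel config file enables mmiotrace.
# Assumption: the config is available as plain text. The default path below
# is distribution-dependent; pass another path as the first argument.
mmiotrace_configured() {
    config_file="${1:-/boot/config-$(uname -r)}"
    # A built-in option appears as exactly "CONFIG_MMIOTRACE=y".
    grep -q '^CONFIG_MMIOTRACE=y$' "$config_file"
}
```

For example, ``mmiotrace_configured && echo "mmiotrace available"``. While
tracing, ``cat /sys/devices/system/cpu/online`` shows which CPUs are currently
on-line.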


Usage Quick Reference
---------------------
::

    $ mount -t debugfs debugfs /sys/kernel/debug
    $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
    $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
    Start X or whatever.
    $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
    $ echo nop > /sys/kernel/debug/tracing/current_tracer
    Check for lost events.


Usage
-----

Make sure debugfs is mounted to /sys/kernel/debug.
If not (requires root privileges)::

    $ mount -t debugfs debugfs /sys/kernel/debug

Check that the driver you are about to trace is not loaded.

Activate mmiotrace (requires root privileges)::

    $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer

Start storing the trace::

    $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &

The 'cat' process should stay running (sleeping) in the background.

Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
accesses to areas that are ioremapped while mmiotrace is active.
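
Because only mappings created while mmiotrace is active are traced, it can
help to verify both preconditions just before loading the driver. The sketch
below is illustrative only: the function name ``ready_to_trace`` is
hypothetical, and it assumes the driver is a loadable module, since built-in
drivers do not appear in /proc/modules.

```shell
#!/bin/sh
# Succeed only if mmiotrace is the current tracer and the named driver module
# is not loaded yet, so that its ioremap calls will still be recorded.
# Arguments: driver name, tracing directory (optional), modules file (optional).
ready_to_trace() {
    driver="$1"
    tracing_dir="${2:-/sys/kernel/debug/tracing}"
    modules_file="${3:-/proc/modules}"

    # current_tracer holds the name of the active tracer.
    [ "$(cat "$tracing_dir/current_tracer")" = "mmiotrace" ] || return 1

    # /proc/modules lists one module per line, with the module name first.
    if awk -v m="$driver" '$1 == m { found = 1 } END { exit !found }' \
        "$modules_file"; then
        return 1
    fi
    return 0
}
```

For example, ``ready_to_trace nouveau && modprobe nouveau`` loads the driver
only when both checks pass (requires root privileges).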

During tracing you can place comments (markers) into the trace with::

    $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker

This makes it easier to see which part of the (huge) trace corresponds to
which action. It is recommended to place descriptive markers about what you
do.

Shut down mmiotrace (requires root privileges)::

    $ echo nop > /sys/kernel/debug/tracing/current_tracer

The 'cat' process exits. If it does not, kill it by issuing the 'fg' command
and pressing ctrl+c.

Check that mmiotrace did not lose events due to a buffer filling up. Either::

    $ grep -i lost mydump.txt

which tells you exactly how many events were lost, or use::

    $ dmesg

to view your kernel log and look for the "mmiotrace has lost events" warning.
If events were lost, the trace is incomplete. You should enlarge the buffers
and try again. To enlarge the buffers, first check how large they currently
are::

    $ cat /sys/kernel/debug/tracing/buffer_size_kb

This gives you a number. Approximately double this number and write it back,
for instance::

    $ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb

Then start again from the top.

If you are doing a trace for a driver project, e.g.
Nouveau, you should also
do the following before sending your results::

    $ lspci -vvv > lspci.txt
    $ dmesg > dmesg.txt
    $ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt

and then send the .tar.gz file. The trace compresses considerably. Replace
"pciid" and "nick" with the PCI ID or model name of your piece of hardware
under investigation and your nickname.


How Mmiotrace Works
-------------------

Access to hardware IO-memory is gained by mapping addresses from the PCI bus
by calling one of the ioremap_*() functions. Mmiotrace is hooked into the
__ioremap() function and gets called whenever a mapping is created. Mapping is
an event that is recorded into the trace log. Note that ISA range mappings
are not caught, since the mapping always exists and is returned directly.

MMIO accesses are recorded via page faults. Just before __ioremap() returns,
the mapped pages are marked as not present. Any access to the pages causes a
fault. The page fault handler calls mmiotrace to handle the fault. Mmiotrace
marks the page present, sets the TF flag to achieve single stepping and exits
the fault handler. The instruction that faulted is executed and the debug trap
is entered. Here mmiotrace again marks the page as not present. The
instruction is decoded to get the type of operation (read/write), data width
and the value read or written. These are stored to the trace log.

Setting the page present in the page fault handler has a race condition on SMP
machines. During the single stepping other CPUs may run freely on that page
and events can be missed without notice. Re-enabling other CPUs during
tracing is discouraged.


Trace Log Format
----------------

The raw log is text and easily filtered with e.g. grep and awk. One record is
one line in the log. A record starts with a keyword, followed by
keyword-dependent arguments. Arguments are separated by a space, or continue
until the end of line. The format for version 20070824 is as follows::

    Explanation       Keyword   Space-separated arguments
    ---------------------------------------------------------------------------

    read event        R         width, timestamp, map id, physical, value, PC, PID
    write event       W         width, timestamp, map id, physical, value, PC, PID
    ioremap event     MAP       timestamp, map id, physical, virtual, length, PC, PID
    iounmap event     UNMAP     timestamp, map id, PC, PID
    marker            MARK      timestamp, text
    version           VERSION   the string "20070824"
    info for reader   LSPCI     one line from lspci -v
    PCI address map   PCIDEV    space-separated /proc/bus/pci/devices data
    unk. opcode       UNKNOWN   timestamp, map id, physical, data, PC, PID

Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual
is a kernel virtual address.
Width is the data width in bytes and value is the
data value. Map id is an arbitrary id number identifying the mapping that was
used in an operation. PC is the program counter and PID is the process id. PC
is zero if it is not recorded. PID is always zero as tracing MMIO accesses
originating in user space memory is not yet supported.

For instance, the following awk filter will pass all 32-bit writes that target
physical addresses in the half-open range [0xfb73ce40, 0xfb800000). Note that
strtonum() is a GNU awk extension::

    $ awk '/^W 4 / { adr=strtonum($5); if (adr >= 0xfb73ce40 &&
    adr < 0xfb800000) print; }' mydump.txt


Tools for Developers
--------------------

The user space tools include utilities for:

  - replacing numeric addresses and values with hardware register names
  - replaying MMIO logs, i.e., re-executing the recorded writes
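
Independently of those tools, the one-record-per-line format described under
Trace Log Format makes quick sanity checks easy. As a rough sketch (the
function name ``summarize_trace`` is hypothetical and not one of the user
space tools), the following counts the records per keyword in a dump, using
only portable awk features:

```shell
#!/bin/sh
# Print the number of records per keyword in an mmiotrace dump file.
# Argument: path to the dump, e.g. mydump.txt.
summarize_trace() {
    awk '
        # The keyword is the first space-separated field of each record.
        $1 ~ /^(R|W|MAP|UNMAP|MARK|UNKNOWN)$/ { count[$1]++ }
        END {
            # Fixed output order; keywords that never occur are printed as 0.
            n = split("R W MAP UNMAP MARK UNKNOWN", keys, " ")
            for (i = 1; i <= n; i++)
                printf "%s %d\n", keys[i], count[keys[i]]
        }
    ' "$1"
}
```

Running ``summarize_trace mydump.txt`` prints one line per keyword. A trace
with MAP records but no R or W records, for example, suggests the driver
mapped its registers but was never exercised.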