===================================
In-kernel memory-mapped I/O tracing
===================================


Home page and links to optional user space tools:

	https://nouveau.freedesktop.org/wiki/MmioTrace

MMIO tracing was originally developed by Intel around 2003 for their Fault
Injection Test Harness. In Dec 2006 - Jan 2007, using the code from Intel,
Jeff Muizelaar created a tool for tracing MMIO accesses with the Nouveau
project in mind. Since then many people have contributed.

Mmiotrace was built for reverse engineering any memory-mapped IO device, with
the Nouveau project as the first real user. Only the x86 and x86_64
architectures are supported.

The out-of-tree mmiotrace was originally modified for mainline inclusion and
the ftrace framework by Pekka Paalanen <pq@iki.fi>.


Preparation
-----------

The mmiotrace feature is compiled in by the CONFIG_MMIOTRACE option. Tracing
is disabled by default, so it is safe to have this set to yes. SMP systems are
supported, but tracing is unreliable and may miss events if more than one CPU
is on-line. Therefore mmiotrace takes all but one CPU off-line during run-time
activation. You can re-enable CPUs by hand, but be warned: there is no way to
automatically detect whether you are losing events due to CPUs racing.
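
To see how many CPUs are on-line while the tracer is active, the CPU list
exported in sysfs can be inspected. The following is only a minimal sketch and
not part of mmiotrace: the helper name ``count_cpus`` is made up for
illustration, and it assumes the usual list format of
/sys/devices/system/cpu/online, such as ``0-3,5``.

```shell
# count_cpus LIST: print how many CPUs a list such as "0-3,5" names.
# Hypothetical helper, not part of mmiotrace.
count_cpus() {
	echo "$1" | tr ',' '\n' | awk -F- '
		NF == 1 { n += 1 }              # single CPU, e.g. "5"
		NF == 2 { n += $2 - $1 + 1 }    # range, e.g. "0-3"
		END { print n + 0 }'
}

# Typical use while mmiotrace is active (a result of 1 is expected):
#   count_cpus "$(cat /sys/devices/system/cpu/online)"
```

A result larger than 1 during tracing means CPUs were re-enabled and events
may be lost silently.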


Usage Quick Reference
---------------------
::

	$ mount -t debugfs debugfs /sys/kernel/debug
	$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
	$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
	Start X or whatever.
	$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
	$ echo nop > /sys/kernel/debug/tracing/current_tracer
	Check for lost events.


Usage
-----

Make sure debugfs is mounted at /sys/kernel/debug.
If it is not, mount it (requires root privileges)::

	$ mount -t debugfs debugfs /sys/kernel/debug

Check that the driver you are about to trace is not loaded.

Activate mmiotrace (requires root privileges)::

	$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer

Start storing the trace::

	$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &

The 'cat' process should stay running (sleeping) in the background.

Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
accesses to areas that are ioremapped while mmiotrace is active.

During tracing you can place comments (markers) into the trace with::

	$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker

This makes it easier to see which part of the (huge) trace corresponds to
which action. It is recommended to place descriptive markers about what you
do.
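
When many markers are placed, a small shell function can save typing. This is
a minimal sketch: the function name ``mmio_mark`` and the ``TRACE_MARKER``
override variable are invented here for illustration; only the trace_marker
path comes from the text above.

```shell
# mmio_mark TEXT...: write one marker line into the trace.
# Hypothetical helper. TRACE_MARKER is an illustrative override; by default
# the trace_marker file named above is used (writing to it requires root).
mmio_mark() {
	printf '%s\n' "$*" > "${TRACE_MARKER:-/sys/kernel/debug/tracing/trace_marker}"
}

# Example:
#   mmio_mark "X is up"
```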

Shut down mmiotrace (requires root privileges)::

	$ echo nop > /sys/kernel/debug/tracing/current_tracer

The 'cat' process exits. If it does not, kill it by issuing the 'fg' command
and pressing ctrl+c.

Check that mmiotrace did not lose events due to a buffer filling up. Either::

	$ grep -i lost mydump.txt

which tells you exactly how many events were lost, or use::

	$ dmesg

to view your kernel log and look for the "mmiotrace has lost events" warning.
If events were lost, the trace is incomplete. You should enlarge the buffers
and try again. To enlarge the buffers, first check how large the current
buffers are::

	$ cat /sys/kernel/debug/tracing/buffer_size_kb

This gives you a number. Approximately double this number and write it back,
for instance::

	$ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb

Then start again from the top.
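
The check and the doubling step can be scripted. The sketch below makes two
assumptions that should be verified against your own system: that lost-event
lines in the dump look like ``MARK 0.000000 Lost 42 events.``, and that the
value read from buffer_size_kb starts with an integer. The function names
``lost_events`` and ``doubled_kb`` are made up for this example.

```shell
# lost_events FILE: sum the counts from lines such as
#   MARK 0.000000 Lost 42 events.
# ASSUMPTION: this line shape is illustrative; adjust the pattern to what
# "grep -i lost" shows in your own dump.
lost_events() {
	awk '$1 == "MARK" && $3 == "Lost" { sum += $4 } END { print sum + 0 }' "$1"
}

# doubled_kb SIZE: print twice the leading integer of SIZE.
doubled_kb() {
	echo $(( ${1%%[!0-9]*} * 2 ))
}

# Example (as root), if lost_events reports a non-zero count:
#   cur=$(cat /sys/kernel/debug/tracing/buffer_size_kb)
#   doubled_kb "$cur" > /sys/kernel/debug/tracing/buffer_size_kb
```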

If you are doing a trace for a driver project, e.g. Nouveau, you should also
do the following before sending your results::

	$ lspci -vvv > lspci.txt
	$ dmesg > dmesg.txt
	$ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt

and then send the .tar.gz file. The trace compresses considerably. Replace
"pciid" and "nick" with the PCI ID or model name of the hardware under
investigation and your nickname.


How Mmiotrace Works
-------------------

Access to hardware IO-memory is gained by mapping addresses from the PCI bus
by calling one of the ioremap_*() functions. Mmiotrace is hooked into the
__ioremap() function and gets called whenever a mapping is created. Mapping is
an event that is recorded into the trace log. Note that ISA range mappings
are not caught, since the mapping always exists and is returned directly.

MMIO accesses are recorded via page faults. Just before __ioremap() returns,
the mapped pages are marked as not present. Any access to the pages causes a
fault. The page fault handler calls mmiotrace to handle the fault. Mmiotrace
marks the page present, sets the TF flag to achieve single stepping and exits
the fault handler. The instruction that faulted is executed and the debug trap
is entered. Here mmiotrace again marks the page as not present. The
instruction is decoded to get the type of operation (read/write), data width
and the value read or written. These are stored to the trace log.

Setting the page present in the page fault handler has a race condition on SMP
machines. During the single stepping other CPUs may run freely on that page
and events can be missed without notice. Re-enabling other CPUs during
tracing is discouraged.


Trace Log Format
----------------

The raw log is text and easily filtered with e.g. grep and awk. One record is
one line in the log. A record starts with a keyword, followed by
keyword-dependent arguments. Arguments are separated by a space, or continue
until the end of line. The format for version 20070824 is as follows:

=============== ======= =====================================================
Explanation     Keyword Space-separated arguments
=============== ======= =====================================================
read event      R       width, timestamp, map id, physical, value, PC, PID
write event     W       width, timestamp, map id, physical, value, PC, PID
ioremap event   MAP     timestamp, map id, physical, virtual, length, PC, PID
iounmap event   UNMAP   timestamp, map id, PC, PID
marker          MARK    timestamp, text
version         VERSION the string "20070824"
info for reader LSPCI   one line from lspci -v
PCI address map PCIDEV  space-separated /proc/bus/pci/devices data
unknown opcode  UNKNOWN timestamp, map id, physical, data, PC, PID
=============== ======= =====================================================

Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual
is a kernel virtual address. Width is the data width in bytes and value is the
data value. Map id is an arbitrary id number identifying the mapping that was
used in an operation. PC is the program counter and PID is the process id. PC
is zero if it is not recorded. PID is always zero as tracing MMIO accesses
originating in user space memory is not yet supported.
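
Because each record is one line with the keyword first, a dump can be
summarized with very little awk. The two helpers below are a sketch only:
their names are invented for this example, and they rely on nothing beyond
the field order listed above.

```shell
# record_summary FILE: count the records of each keyword in a trace dump.
# Hypothetical helper; relies only on "one record per line, keyword first".
record_summary() {
	awk 'NF { count[$1]++ } END { for (k in count) print k, count[k] }' "$1" | sort
}

# rw_fields FILE: print "keyword map-id physical value" for each R/W record.
# Field positions: $1 keyword, $2 width, $3 timestamp, $4 map id,
# $5 physical, $6 value, $7 PC, $8 PID.
rw_fields() {
	awk '$1 == "R" || $1 == "W" { print $1, $4, $5, $6 }' "$1"
}

# Example:
#   record_summary mydump.txt
#   rw_fields mydump.txt | less
```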

For instance, the following awk filter (GNU awk, since it uses strtonum() and
hexadecimal constants) will pass all 32-bit writes that target physical
addresses in the range [0xfb73ce40, 0xfb800000)::

	$ awk '/^W 4 / { adr=strtonum($5); if (adr >= 0xfb73ce40 &&
	adr < 0xfb800000) print; }' mydump.txt
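
Where GNU awk is not available, the same kind of filter can be written with a
hand-rolled hex conversion. The following is a sketch under that assumption;
the function names ``writes_in_range`` and ``hex2num`` are made up for this
example, and the range is half-open like in the filter above.

```shell
# writes_in_range FILE LO HI [WIDTH]: print W records of the given width
# (default 4 bytes) whose physical address lies in [LO, HI).
# Hypothetical helper; LO and HI are hex strings such as 0xfb73ce40.
writes_in_range() {
	awk -v lo="$2" -v hi="$3" -v width="${4:-4}" '
	# Convert a hex string with optional 0x prefix to a number.
	function hex2num(s,    i, n) {
		s = tolower(s)
		sub(/^0x/, "", s)
		n = 0
		for (i = 1; i <= length(s); i++)
			n = n * 16 + index("0123456789abcdef", substr(s, i, 1)) - 1
		return n
	}
	BEGIN { lo = hex2num(lo); hi = hex2num(hi) }
	$1 == "W" && $2 == width {
		adr = hex2num($5)
		if (adr >= lo && adr < hi)
			print
	}' "$1"
}

# Example:
#   writes_in_range mydump.txt 0xfb73ce40 0xfb800000
```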


Tools for Developers
--------------------

The user space tools include utilities for:

  - replacing numeric addresses and values with hardware register names
  - replaying MMIO logs, i.e., re-executing the recorded writes
185