xref: /OK3568_Linux_fs/u-boot/doc/README.trace (revision 4882a59341e53eb6f0b4789bf948001014eff981)
1*4882a593Smuzhiyun#
2*4882a593Smuzhiyun# Copyright (c) 2013 The Chromium OS Authors.
3*4882a593Smuzhiyun#
4*4882a593Smuzhiyun# SPDX-License-Identifier:	GPL-2.0+
5*4882a593Smuzhiyun#
6*4882a593Smuzhiyun
7*4882a593SmuzhiyunTracing in U-Boot
8*4882a593Smuzhiyun=================
9*4882a593Smuzhiyun
10*4882a593SmuzhiyunU-Boot supports a simple tracing feature which allows a record of excecution
11*4882a593Smuzhiyunto be collected and sent to a host machine for analysis. At present the
12*4882a593Smuzhiyunmain use for this is to profile boot time.
13*4882a593Smuzhiyun
14*4882a593Smuzhiyun
15*4882a593SmuzhiyunOverview
16*4882a593Smuzhiyun--------
17*4882a593Smuzhiyun
18*4882a593SmuzhiyunThe trace feature uses GCC's instrument-functions feature to trace all
19*4882a593Smuzhiyunfunction entry/exit points. These are then recorded in a memory buffer.
20*4882a593SmuzhiyunThe memory buffer can be saved to the host over a network link using
21*4882a593Smuzhiyuntftpput or by writing to an attached memory device such as MMC.
22*4882a593Smuzhiyun
23*4882a593SmuzhiyunOn the host, the file is first converted with a tool called 'proftool',
24*4882a593Smuzhiyunwhich extracts useful information from it. The resulting trace output
25*4882a593Smuzhiyunresembles that emitted by Linux's ftrace feature, so can be visually
26*4882a593Smuzhiyundisplayed by pytimechart.
27*4882a593Smuzhiyun
28*4882a593Smuzhiyun
29*4882a593SmuzhiyunQuick-start using Sandbox
30*4882a593Smuzhiyun-------------------------
31*4882a593Smuzhiyun
32*4882a593SmuzhiyunSandbox is a build of U-Boot that can run under Linux so it is a convenient
33*4882a593Smuzhiyunway of trying out tracing before you use it on your actual board. To do
34*4882a593Smuzhiyunthis, follow these steps:
35*4882a593Smuzhiyun
36*4882a593SmuzhiyunAdd the following to include/configs/sandbox.h (if not already there)
37*4882a593Smuzhiyun
38*4882a593Smuzhiyun#define CONFIG_TRACE
39*4882a593Smuzhiyun#define CONFIG_CMD_TRACE
40*4882a593Smuzhiyun#define CONFIG_TRACE_BUFFER_SIZE		(16 << 20)
41*4882a593Smuzhiyun#define CONFIG_TRACE_EARLY_SIZE		(8 << 20)
42*4882a593Smuzhiyun#define CONFIG_TRACE_EARLY
43*4882a593Smuzhiyun#define CONFIG_TRACE_EARLY_ADDR		0x00100000
44*4882a593Smuzhiyun
45*4882a593SmuzhiyunBuild sandbox U-Boot with tracing enabled:
46*4882a593Smuzhiyun
47*4882a593Smuzhiyun$ make FTRACE=1 O=sandbox sandbox_config
48*4882a593Smuzhiyun$ make FTRACE=1 O=sandbox
49*4882a593Smuzhiyun
50*4882a593SmuzhiyunRun sandbox, wait for a bit of trace information to appear, and then capture
51*4882a593Smuzhiyuna trace:
52*4882a593Smuzhiyun
53*4882a593Smuzhiyun$ ./sandbox/u-boot
54*4882a593Smuzhiyun
55*4882a593Smuzhiyun
56*4882a593SmuzhiyunU-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24)
57*4882a593Smuzhiyun
58*4882a593SmuzhiyunDRAM:  128 MiB
59*4882a593Smuzhiyuntrace: enabled
60*4882a593SmuzhiyunUsing default environment
61*4882a593Smuzhiyun
62*4882a593SmuzhiyunIn:    serial
63*4882a593SmuzhiyunOut:   serial
64*4882a593SmuzhiyunErr:   serial
65*4882a593Smuzhiyun=>trace stats
66*4882a593Smuzhiyun	671,406 function sites
67*4882a593Smuzhiyun	 69,712 function calls
68*4882a593Smuzhiyun	      0 untracked function calls
69*4882a593Smuzhiyun	 73,373 traced function calls
70*4882a593Smuzhiyun	     16 maximum observed call depth
71*4882a593Smuzhiyun	     15 call depth limit
72*4882a593Smuzhiyun	 66,491 calls not traced due to depth
73*4882a593Smuzhiyun=>trace stats
74*4882a593Smuzhiyun	671,406 function sites
75*4882a593Smuzhiyun      1,279,450 function calls
76*4882a593Smuzhiyun	      0 untracked function calls
77*4882a593Smuzhiyun	950,490 traced function calls (333217 dropped due to overflow)
78*4882a593Smuzhiyun	     16 maximum observed call depth
79*4882a593Smuzhiyun	     15 call depth limit
80*4882a593Smuzhiyun      1,275,767 calls not traced due to depth
81*4882a593Smuzhiyun=>trace calls 0 e00000
82*4882a593SmuzhiyunCall list dumped to 00000000, size 0xae0a40
83*4882a593Smuzhiyun=>print
84*4882a593Smuzhiyunbaudrate=115200
85*4882a593Smuzhiyunprofbase=0
86*4882a593Smuzhiyunprofoffset=ae0a40
87*4882a593Smuzhiyunprofsize=e00000
88*4882a593Smuzhiyunstderr=serial
89*4882a593Smuzhiyunstdin=serial
90*4882a593Smuzhiyunstdout=serial
91*4882a593Smuzhiyun
92*4882a593SmuzhiyunEnvironment size: 117/8188 bytes
93*4882a593Smuzhiyun=>sb save host 0 trace 0 ${profoffset}
94*4882a593Smuzhiyun11405888 bytes written in 10 ms (1.1 GiB/s)
95*4882a593Smuzhiyun=>reset
96*4882a593Smuzhiyun
97*4882a593Smuzhiyun
98*4882a593SmuzhiyunThen run proftool to convert the trace information to ftrace format.
99*4882a593Smuzhiyun
100*4882a593Smuzhiyun$ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt
101*4882a593Smuzhiyun
102*4882a593SmuzhiyunFinally run pytimechart to display it:
103*4882a593Smuzhiyun
104*4882a593Smuzhiyun$ pytimechart trace.txt
105*4882a593Smuzhiyun
106*4882a593SmuzhiyunUsing this tool you can zoom and pan across the trace, with the function
107*4882a593Smuzhiyuncalls on the left and little marks representing the start and end of each
108*4882a593Smuzhiyunfunction.
109*4882a593Smuzhiyun
110*4882a593Smuzhiyun
111*4882a593SmuzhiyunCONFIG Options
112*4882a593Smuzhiyun--------------
113*4882a593Smuzhiyun
114*4882a593Smuzhiyun- CONFIG_TRACE
115*4882a593Smuzhiyun		Enables the trace feature in U-Boot.
116*4882a593Smuzhiyun
117*4882a593Smuzhiyun- CONFIG_CMD_TRACE
118*4882a593Smuzhiyun		Enables the trace command.
119*4882a593Smuzhiyun
120*4882a593Smuzhiyun- CONFIG_TRACE_BUFFER_SIZE
121*4882a593Smuzhiyun		Size of trace buffer to allocate for U-Boot. This buffer is
122*4882a593Smuzhiyun		used after relocation, as a place to put function tracing
123*4882a593Smuzhiyun		information. The address of the buffer is determined by
124*4882a593Smuzhiyun		the relocation code.
125*4882a593Smuzhiyun
126*4882a593Smuzhiyun- CONFIG_TRACE_EARLY
127*4882a593Smuzhiyun		Define this to start tracing early, before relocation.
128*4882a593Smuzhiyun
129*4882a593Smuzhiyun- CONFIG_TRACE_EARLY_SIZE
130*4882a593Smuzhiyun		Size of 'early' trace buffer. Before U-Boot has relocated
131*4882a593Smuzhiyun		it doesn't have a proper trace buffer. On many boards
132*4882a593Smuzhiyun		you can define an area of memory to use for the trace
133*4882a593Smuzhiyun		buffer until the 'real' trace buffer is available after
134*4882a593Smuzhiyun		relocation. The contents of this buffer are then copied to
135*4882a593Smuzhiyun		the real buffer.
136*4882a593Smuzhiyun
137*4882a593Smuzhiyun- CONFIG_TRACE_EARLY_ADDR
138*4882a593Smuzhiyun		Address of early trace buffer
139*4882a593Smuzhiyun
140*4882a593Smuzhiyun
141*4882a593SmuzhiyunBuilding U-Boot with Tracing Enabled
142*4882a593Smuzhiyun------------------------------------
143*4882a593Smuzhiyun
144*4882a593SmuzhiyunPass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code.
145*4882a593SmuzhiyunThis is kept as a separate option so that it is easy to enable/disable
146*4882a593Smuzhiyuninstrumenting from the command line instead of having to change board
147*4882a593Smuzhiyunconfig files.
148*4882a593Smuzhiyun
149*4882a593Smuzhiyun
150*4882a593SmuzhiyunCollecting Trace Data
151*4882a593Smuzhiyun---------------------
152*4882a593Smuzhiyun
153*4882a593SmuzhiyunWhen you run U-Boot on your board it will collect trace data up to the
154*4882a593Smuzhiyunlimit of the trace buffer size you have specified. Once that is exhausted
155*4882a593Smuzhiyunno more data will be collected.
156*4882a593Smuzhiyun
157*4882a593SmuzhiyunCollecting trace data has an affect on execution time/performance. You
158*4882a593Smuzhiyunwill notice this particularly with trvial functions - the overhead of
159*4882a593Smuzhiyunrecording their execution may even exceed their normal execution time.
160*4882a593SmuzhiyunIn practice this doesn't matter much so long as you are aware of the
161*4882a593Smuzhiyuneffect. Once you have done your optimisations, turn off tracing before
162*4882a593Smuzhiyundoing end-to-end timing.
163*4882a593Smuzhiyun
164*4882a593SmuzhiyunThe best time to start tracing is right at the beginning of U-Boot. The
165*4882a593Smuzhiyunbest time to stop tracing is right at the end. In practice it is hard
166*4882a593Smuzhiyunto achieve these ideals.
167*4882a593Smuzhiyun
168*4882a593SmuzhiyunThis implementation enables tracing early in board_init_f(). This means
169*4882a593Smuzhiyunthat it captures most of the board init process, missing only the
170*4882a593Smuzhiyunearly architecture-specific init. However, it also misses the entire
171*4882a593SmuzhiyunSPL stage if there is one.
172*4882a593Smuzhiyun
173*4882a593SmuzhiyunU-Boot typically ends with a 'bootm' command which loads and runs an
174*4882a593SmuzhiyunOS. There is useful trace data in the execution of that bootm
175*4882a593Smuzhiyuncommand. Therefore this implementation provides a way to collect trace
176*4882a593Smuzhiyundata after bootm has finished processing, but just before it jumps to
177*4882a593Smuzhiyunthe OS. In practical terms, U-Boot runs the 'fakegocmd' environment
178*4882a593Smuzhiyunvariable at this point. This variable should have a short script which
179*4882a593Smuzhiyuncollects the trace data and writes it somewhere.
180*4882a593Smuzhiyun
181*4882a593SmuzhiyunTrace data collection relies on a microsecond timer, accesed through
182*4882a593Smuzhiyuntimer_get_us(). So the first think you should do is make sure that
183*4882a593Smuzhiyunthis produces sensible results for your board. Suitable sources for
184*4882a593Smuzhiyunthis timer include high resolution timers, PWMs or profile timers if
185*4882a593Smuzhiyunavailable. Most modern SOCs have a suitable timer for this. Make sure
186*4882a593Smuzhiyunthat you mark this timer (and anything it calls) with
187*4882a593Smuzhiyun__attribute__((no_instrument_function)) so that the trace library can
188*4882a593Smuzhiyunuse it without causing an infinite loop.
189*4882a593Smuzhiyun
190*4882a593Smuzhiyun
191*4882a593SmuzhiyunCommands
192*4882a593Smuzhiyun--------
193*4882a593Smuzhiyun
194*4882a593SmuzhiyunThe trace command has variable sub-commands:
195*4882a593Smuzhiyun
196*4882a593Smuzhiyun- stats
197*4882a593Smuzhiyun		Display tracing statistics
198*4882a593Smuzhiyun
199*4882a593Smuzhiyun- pause
200*4882a593Smuzhiyun		Pause tracing
201*4882a593Smuzhiyun
202*4882a593Smuzhiyun- resume
203*4882a593Smuzhiyun		Resume tracing
204*4882a593Smuzhiyun
205*4882a593Smuzhiyun- funclist [<addr> <size>]
206*4882a593Smuzhiyun		Dump a list of functions into the buffer
207*4882a593Smuzhiyun
208*4882a593Smuzhiyun- calls  [<addr> <size>]
209*4882a593Smuzhiyun		Dump function call trace into buffer
210*4882a593Smuzhiyun
211*4882a593SmuzhiyunIf the address and size are not given, these are obtained from environment
212*4882a593Smuzhiyunvariables (see below). In any case the environment variables are updated
213*4882a593Smuzhiyunafter the command runs.
214*4882a593Smuzhiyun
215*4882a593Smuzhiyun
216*4882a593SmuzhiyunEnvironment Variables
217*4882a593Smuzhiyun---------------------
218*4882a593Smuzhiyun
219*4882a593SmuzhiyunThe following are used:
220*4882a593Smuzhiyun
221*4882a593Smuzhiyun- profbase
222*4882a593Smuzhiyun		Base address of trace output buffer
223*4882a593Smuzhiyun
224*4882a593Smuzhiyun- profoffset
225*4882a593Smuzhiyun		Offset of first unwritten byte in trace output buffer
226*4882a593Smuzhiyun
227*4882a593Smuzhiyun- profsize
228*4882a593Smuzhiyun		Size of trace output buffer
229*4882a593Smuzhiyun
230*4882a593SmuzhiyunAll of these are set by the 'trace calls' command.
231*4882a593Smuzhiyun
232*4882a593SmuzhiyunThese variables keep track of the amount of data written to the trace
233*4882a593Smuzhiyunoutput buffer by the 'trace' command. The trace commands which write data
234*4882a593Smuzhiyunto the output buffer can use these to specify the buffer to write to, and
235*4882a593Smuzhiyunupdate profoffset each time. This allows successive commands to append data
236*4882a593Smuzhiyunto the same buffer, for example:
237*4882a593Smuzhiyun
238*4882a593Smuzhiyun	trace funclist 10000 e00000
239*4882a593Smuzhiyun	trace calls
240*4882a593Smuzhiyun
241*4882a593Smuzhiyun(the latter command appends more data to the buffer).
242*4882a593Smuzhiyun
243*4882a593Smuzhiyun
244*4882a593Smuzhiyun- fakegocmd
245*4882a593Smuzhiyun		Specifies commands to run just before booting the OS. This
246*4882a593Smuzhiyun		is a useful time to write the trace data to the host for
247*4882a593Smuzhiyun		processing.
248*4882a593Smuzhiyun
249*4882a593Smuzhiyun
250*4882a593SmuzhiyunWriting Out Trace Data
251*4882a593Smuzhiyun----------------------
252*4882a593Smuzhiyun
253*4882a593SmuzhiyunOnce the trace data is in an output buffer in memory there are various ways
254*4882a593Smuzhiyunto transmit it to the host. Notably you can use tftput to send the data
255*4882a593Smuzhiyunover a network link:
256*4882a593Smuzhiyun
257*4882a593Smuzhiyunfakegocmd=trace pause; usb start; set autoload n; bootp;
258*4882a593Smuzhiyun	trace calls 10000000 1000000;
259*4882a593Smuzhiyun	tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls
260*4882a593Smuzhiyun
261*4882a593SmuzhiyunThis starts up USB (to talk to an attached USB Ethernet dongle), writes
262*4882a593Smuzhiyuna trace log to address 10000000 and sends it to a host machine using
263*4882a593SmuzhiyunTFTP. After this, U-Boot will boot the OS normally, albeit a little
264*4882a593Smuzhiyunlater.
265*4882a593Smuzhiyun
266*4882a593Smuzhiyun
267*4882a593SmuzhiyunConverting Trace Output Data
268*4882a593Smuzhiyun----------------------------
269*4882a593Smuzhiyun
270*4882a593SmuzhiyunThe trace output data is kept in a binary format which is not documented
271*4882a593Smuzhiyunhere. To convert it into something useful, you can use proftool.
272*4882a593Smuzhiyun
273*4882a593SmuzhiyunThis tool must be given the U-Boot map file and the trace data received
274*4882a593Smuzhiyunfrom running that U-Boot. It produces a text output file.
275*4882a593Smuzhiyun
276*4882a593SmuzhiyunOptions
277*4882a593Smuzhiyun	-m <map_file>
278*4882a593Smuzhiyun		Specify U-Boot map file
279*4882a593Smuzhiyun
280*4882a593Smuzhiyun	-p <trace_file>
281*4882a593Smuzhiyun		Specifiy profile/trace file
282*4882a593Smuzhiyun
283*4882a593SmuzhiyunCommands:
284*4882a593Smuzhiyun
285*4882a593Smuzhiyun- dump-ftrace
286*4882a593Smuzhiyun	Write a text dump of the file in Linux ftrace format to stdout
287*4882a593Smuzhiyun
288*4882a593Smuzhiyun
289*4882a593SmuzhiyunViewing the Trace Data
290*4882a593Smuzhiyun----------------------
291*4882a593Smuzhiyun
292*4882a593SmuzhiyunYou can use pytimechart for this (sudo apt-get pytimechart might work on
293*4882a593Smuzhiyunyour Debian-style machine, and use your favourite search engine to obtain
294*4882a593Smuzhiyundocumentation). It expects the file to have a .txt extension. The program
295*4882a593Smuzhiyunhas terse user interface but is very convenient for viewing U-Boot
296*4882a593Smuzhiyunprofile information.
297*4882a593Smuzhiyun
298*4882a593Smuzhiyun
299*4882a593SmuzhiyunWorkflow Suggestions
300*4882a593Smuzhiyun--------------------
301*4882a593Smuzhiyun
302*4882a593SmuzhiyunThe following suggestions may be helpful if you are trying to reduce boot
303*4882a593Smuzhiyuntime:
304*4882a593Smuzhiyun
305*4882a593Smuzhiyun1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get
306*4882a593Smuzhiyunyou are helpful overall snapshot of the boot time.
307*4882a593Smuzhiyun
308*4882a593Smuzhiyun2. Build U-Boot with tracing and run it. Note the difference in boot time
309*4882a593Smuzhiyun(it is common for tracing to add 10% to the time)
310*4882a593Smuzhiyun
311*4882a593Smuzhiyun3. Collect the trace information as descibed above. Use this to find where
312*4882a593Smuzhiyunall the time is being spent.
313*4882a593Smuzhiyun
314*4882a593Smuzhiyun4. Take a look at that code and see if you can optimise it. Perhaps it is
315*4882a593Smuzhiyunpossible to speed up the initialisation of a device, or remove an unused
316*4882a593Smuzhiyunfeature.
317*4882a593Smuzhiyun
318*4882a593Smuzhiyun5. Rebuild, run and collect again. Compare your results.
319*4882a593Smuzhiyun
320*4882a593Smuzhiyun6. Keep going until you run out of steam, or your boot is fast enough.
321*4882a593Smuzhiyun
322*4882a593Smuzhiyun
323*4882a593SmuzhiyunConfiguring Trace
324*4882a593Smuzhiyun-----------------
325*4882a593Smuzhiyun
326*4882a593SmuzhiyunThere are a few parameters in the code that you may want to consider.
327*4882a593SmuzhiyunThere is a function call depth limit (set to 15 by default). When the
328*4882a593Smuzhiyunstack depth goes above this then no tracing information is recorded.
329*4882a593SmuzhiyunThe maximum depth reached is recorded and displayed by the 'trace stats'
330*4882a593Smuzhiyuncommand.
331*4882a593Smuzhiyun
332*4882a593Smuzhiyun
333*4882a593SmuzhiyunFuture Work
334*4882a593Smuzhiyun-----------
335*4882a593Smuzhiyun
336*4882a593SmuzhiyunTracing could be a little tidier in some areas, for example providing
337*4882a593Smuzhiyunrun-time configuration options for trace.
338*4882a593Smuzhiyun
339*4882a593SmuzhiyunSome other features that might be useful:
340*4882a593Smuzhiyun
341*4882a593Smuzhiyun- Trace filter to select which functions are recorded
342*4882a593Smuzhiyun- Sample-based profiling using a timer interrupt
343*4882a593Smuzhiyun- Better control over trace depth
344*4882a593Smuzhiyun- Compression of trace information
345*4882a593Smuzhiyun
346*4882a593Smuzhiyun
347*4882a593SmuzhiyunSimon Glass <sjg@chromium.org>
348*4882a593SmuzhiyunApril 2013
349