Lines Matching +full:trace +full:- +full:buffer +full:- +full:extension

1 perf-intel-pt(1)
5 ----
6 perf-intel-pt - Support for Intel Processor Trace within perf tools
9 --------
11 'perf record' -e intel_pt//
14 -----------
16 Intel Processor Trace (Intel PT) is an extension of Intel Architecture that
19 Technical details are documented in the Intel 64 and IA-32 Architectures
20 Software Developer Manuals, Chapter 36 Intel Processor Trace.
23 processors that are based on the Intel micro-architecture code name Broadwell.
25 Trace data is collected by 'perf record' and stored within the perf.data file.
28 Trace data must be 'decoded' which involves walking the object code and matching
29 the trace data packets. For example a TNT packet only tells whether a
33 Decoding is done on-the-fly. The decoder outputs samples in the same format as
43 builds, however the executed images are needed - which makes use in JIT-compiled
44 environments, or with self-modified code, a challenge. Also symbols need to be
47 A limitation of Intel PT is that it produces huge amounts of trace data
51 vary depending on the use-case and architecture.
55 ----------
61 Data is captured with 'perf record' e.g. to trace 'ls' userspace-only:
63 perf record -e intel_pt//u ls
69 Tracing kernel space as well presents a problem, namely kernel self-modifying
73 --kcore is used, but access to /proc/kcore is restricted e.g.
75 sudo perf record -o pt_ls --kcore -e intel_pt// -- ls
82 sudo perf report -i pt_ls
84 Because samples are synthesized after-the-fact, the sampling period can be
87 sudo perf report -i pt_ls --itrace=i1usge
89 See the sections below for more information about the --itrace option.
103 perf record -e intel_pt//u ls
104 perf script --itrace=ibxwpe
109 perf script --itrace=ibxwpe -F+flags
112 system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
117 perf script --insn-trace --xed
120 Dumping all instructions in a long trace can be fairly slow. It is usually better
123 perf script --call-trace
127 perf script --call-ret-trace
132 perf script --time starttime,stoptime --insn-trace --xed
134 While examining the trace it's also useful to filter on specific CPUs using
135 the -C option
137 perf script --time starttime,stoptime --insn-trace --xed -C 1
144 perf script --itrace=be -F+ipc
146 There are two ways that instructions-per-cycle (IPC) can be calculated depending
151 used - refer to the 'mtc' config term. When MTC is used, however, the values
166 Another note, in the case of "branches" events, non-taken branches are not
168 TNT packet that starts with a non-taken branch. To see every possible IPC
169 value, "instructions" events can be used e.g. --itrace=i0ns
173 Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
174 and to script exported-sql-viewer.py for an example of using the database.
176 There is also script intel-pt-events.py which provides an example of how to
184 by inability to access the executed image, self-modified or JIT-ed code, or the
185 inability to match side-band information (such as context switches and mmaps)
189 resulting in data lost because the buffer was full. See 'Buffer handling' below
194 -----------
203 -e intel_pt//
207 -e intel_pt/tsc,noretcomp=0/
211 -e intel_pt/tsc=1,noretcomp=0/
213 Note there are now new config terms - see section 'config terms' further below.
220 $ grep -H . /sys/bus/event_source/devices/intel_pt/format/*
222 /sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22
224 /sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17
226 /sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27
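The format entries listed above give the bit positions that each config term occupies in the event's config value. As a rough illustration only (this is not perf's own code), the Python sketch below parses an entry such as config:14-17 and shifts a term's value into place. The helper names parse_format and encode_term are hypothetical, and the sketch assumes a single contiguous bit range per term.

```python
def parse_format(spec):
    """Parse a sysfs format string such as 'config:14-17' or 'config:10'.

    Returns a tuple (field, low_bit, high_bit). Assumes one contiguous
    bit range, which is all this sketch handles.
    """
    field, _, bits = spec.partition(":")
    low_text, sep, high_text = bits.partition("-")
    low = int(low_text)
    high = int(high_text) if sep else low  # a single bit has no '-'
    return field, low, high


def encode_term(spec, value):
    """Shift 'value' into the bit range described by 'spec'."""
    _, low, high = parse_format(spec)
    width = high - low + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value} does not fit in {width} bit(s)")
    return value << low


if __name__ == "__main__":
    # mtc_period occupies config bits 14-17 in the listing above
    print(hex(encode_term("config:14-17", 3)))
```

A single-bit entry is handled the same way, so a term located at bit 10 with value 1 would contribute 0x400 to the config value.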
231 -e intel_pt/noretcomp=0/
235 -e intel_pt/tsc=1,noretcomp=0/
239 -e intel_pt/tsc=0/
243 -e intel_pt/config=0x400/
258 perf_event_attr is displayed if the -vv option is used e.g.
260 ------------------------------------------------------------
274 ------------------------------------------------------------
275 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
276 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
277 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
278 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
279 ------------------------------------------------------------
285 The June 2015 version of Intel 64 and IA-32 Architectures Software Developer
286 Manuals, Chapter 36 Intel Processor Trace, defined new Intel PT features.
292 without timing information, for example a per-thread context
323 trace bytes between PSB packets as:
332 $ perf record -e intel_pt/psb_period=15/u uname
333 Invalid psb_period for intel_pt. Valid values are: 0-5
339 If decoding is expected to be reliable and the buffer is large
359 The frequency of MTC packets can also be specified - see
362 mtc_period Specifies how frequently MTC packets are produced - see mtc
374 CTC-frequency / (2 ^ value)
376 e.g. value 3 means one eighth of CTC-frequency
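The relationship above is simple enough to check by hand, but a small Python sketch may make it concrete. The function name mtc_frequency and the 24 MHz CTC frequency are assumptions chosen for illustration; the real CTC frequency depends on the hardware.

```python
def mtc_frequency(ctc_hz, value):
    """Return the MTC packet frequency: CTC-frequency / (2 ^ value)."""
    if value < 0:
        raise ValueError("mtc_period value must be non-negative")
    return ctc_hz / (2 ** value)


if __name__ == "__main__":
    # Hypothetical 24 MHz CTC; value 3 gives one eighth of that frequency
    print(mtc_frequency(24_000_000, 3))
```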
384 $ perf record -e intel_pt/mtc_period=15/u uname
405 a threshold - see cyc_thresh below.
407 cyc_thresh Specifies how frequently CYC packets are produced - see cyc
421 2 ^ (value - 1)
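Taking the expression above as the minimum number of CPU cycles that must pass before another CYC packet can be sent, a minimal Python sketch is shown below. The function name cyc_threshold_cycles is hypothetical, and the sketch deliberately applies the formula only for values of 1 or more, since the expression as written does not describe value 0.

```python
def cyc_threshold_cycles(value):
    """Return 2 ^ (value - 1) for a cyc_thresh value of 1 or more."""
    if value < 1:
        # Assumption: value 0 is outside what this formula describes
        raise ValueError("this sketch only covers cyc_thresh values >= 1")
    return 2 ** (value - 1)


if __name__ == "__main__":
    for v in (1, 4, 12):
        print(v, cyc_threshold_cycles(v))
```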
430 $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
431 Invalid cyc_thresh for intel_pt. Valid values are: 0-12
435 pt Specifies pass-through which enables the 'branch' config term.
462 changes to the CPU C-state.
477 --aux-sample
481 --aux-sample=8192
485 -e intel_pt//u
488 following will create Intel PT samples on the branch-misses event, note the
491 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
493 An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to
496 perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u
500 perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}'
504 …perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u --
518 difficult to know how big the event might be without the trace sample attached,
525 The difference between full trace and snapshot from the kernel's perspective is
526 that in full trace we don't overwrite trace data that the user hasn't collected
528 the trace run and overwrite older data in the buffer so that whenever something
534 -S
538 -S0x100000
546 The snapshot size is displayed if the option -vv is used e.g.
554 Intel PT buffer size is specified by an addition to the -m option e.g.
556 -m,16
558 selects a buffer size of 16 pages i.e. 64KiB.
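The page-to-size conversion can be sketched as follows. The 4096-byte page size is an assumption (it is the size implied by 16 pages being 64KiB), and the function name aux_buffer_bytes is hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, consistent with 16 pages = 64KiB


def aux_buffer_bytes(pages, page_size=PAGE_SIZE):
    """Convert a page count, as given after the comma in -m, to bytes."""
    if pages <= 0:
        raise ValueError("page count must be positive")
    return pages * page_size


if __name__ == "__main__":
    size = aux_buffer_bytes(16)
    print(size, "bytes =", size // 1024, "KiB")
```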
560 Note that the existing functionality of -m is unchanged. The auxtrace mmap size
574 In full-trace mode, powers of two are allowed for buffer size, with a minimum
578 The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.
588 full-trace mode
592 Full-trace mode traces continuously e.g.
594 perf record -e intel_pt//u uname
598 perf record --aux-sample -e intel_pt//u -e branch-misses:u
603 perf record -v -e intel_pt//u -S ./loopy 1000000000 &
605 kill -USR2 11435
609 Note that "Recording AUX area tracing snapshot" is displayed because the -v
619 $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 &
621 $ ps -e | grep perf
623 $ kill -USR2 15244
624 bash: kill: (15244) - Operation not permitted
631 Buffer handling
634 There may be buffer limitations (i.e. single ToPa entry) which means that actual
635 buffer sizes are limited to powers of 2 up to 4MiB (MAX_ORDER). In order to
639 a) the interrupt may not be handled in time so that the current buffer
640 becomes full and some trace data is lost.
644 If trace data is lost, the driver sets 'truncated' in the PERF_RECORD_AUX event
647 In full-trace mode, the driver waits for data to be copied out before allowing
648 the (logical) buffer to wrap-around. If data is not copied out quickly enough,
651 that happens, perf tools always re-enable the intel_pt event after copying out
658 By default "perf record" post-processes the event stream to find all build ids
666 perf buildid-list
670 perf buildid-list --with-hits
678 collection of side-band information. In order to prevent that, a dummy
681 there is complete side-band information to allow the decoding of subsequent
704 "per thread" mode is selected by -t or by --per-thread (with -p or -u or just a
706 "per cpu" is selected by -C or -a.
710 In per-thread mode an exact list of threads is traced. There is no inheritance.
711 Each thread has its own event buffer.
713 In per-cpu mode all processes (or processes from the selected cgroup i.e. -G
714 option, or processes selected with -p or -u) are traced. Each cpu has its own
715 buffer. Inheritance is allowed.
717 In workload-only mode, the workload is traced but with per-cpu buffers.
718 Inheritance is allowed. Note that you can now trace a workload in per-thread
719 mode by using the --per-thread option.
722 Privileged vs non-privileged users
725 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users
726 have memory limits imposed upon them. That affects what buffer sizes they can
741 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users are
742 not permitted to use tracepoints which means there is insufficient side-band
743 information to decode Intel PT in per-cpu mode, and potentially workload-only
746 Note also that to use tracepoints, read-access to debugfs is required. So if
747 debugfs is not mounted or the user does not have read-access, it will again not
748 be possible to decode Intel PT in per-cpu mode.
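Whether these restrictions apply can be checked before recording. The Python sketch below relies only on the -1 threshold stated above; the function names are hypothetical, and it does not model any other perf_event_paranoid level or the debugfs requirement.

```python
PARANOID_PATH = "/proc/sys/kernel/perf_event_paranoid"


def paranoid_is_unrestricted(text):
    """Return True if the file contents hold -1, the value at which the
    restrictions on unprivileged users described above do not apply."""
    return int(text.strip()) == -1


def read_paranoid_text(path=PARANOID_PATH):
    """Read the raw contents of the perf_event_paranoid file."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    try:
        unrestricted = paranoid_is_unrestricted(read_paranoid_text())
        print("unprivileged use unrestricted:", unrestricted)
    except OSError as err:
        print("could not read", PARANOID_PATH, "-", err)
```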
754 The sched_switch tracepoint is used to provide side-band data for Intel PT
761 $ perf record -vv -e intel_pt//u uname
762 ------------------------------------------------------------
776 ------------------------------------------------------------
777 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
778 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
779 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
780 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
781 ------------------------------------------------------------
792 ------------------------------------------------------------
793 sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
794 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
795 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
796 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
797 ------------------------------------------------------------
816 ------------------------------------------------------------
817 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
818 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
819 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
820 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
823 perf event ring buffer mmapped per cpu
830 and only in per-cpu mode.
834 cannot be matched against the Intel PT trace.
838 -----------
840 By default, perf script will decode trace data found in the perf.data file.
841 This can be further controlled by the --itrace option.
844 New --itrace option
849 --itrace
853 --itrace=cepwx
873 "Instructions" events look like they were recorded by "perf record -e
876 "Branches" events look like they were recorded by "perf record -e branches". "c"
894 "Power" events correspond to power event packets and CBR (core-to-bus ratio)
898 C-state changes, whereas CBR is indicative of CPU frequency. perf script
902 pwre: hw: 0 cstate: 2 sub-cstate: 0
906 "cbr" includes the frequency and the percentage of maximum non-turbo
908 "pwre" shows C-state transitions (to a C-state deeper than C0) and
913 For more details refer to the Intel 64 and IA-32 Architectures Software
916 Error events show where the decoder lost the trace. Error events
919 will or will not be reported. Each flag must be preceded by either '+' or '-'.
921 -o Suppress overflow errors
922 -l Suppress trace data lost errors
925 --itrace=e-o-l
931 must be preceded by either '+' or '-'. The flags supported by Intel PT are:
932 -a Suppress logging of perf events
939 --itrace=i10us
942 microseconds of trace. Alternatives to "us" are "ms" (milliseconds),
957 'instructions' (i.e. --itrace=i1i).
962 --itrace=ig32
963 --itrace=xg32
968 --itrace=il10
969 --itrace=xl10
976 instead of synthesized events. For example, to record branch-misses events for
977 'ls' and then add a call chain derived from the Intel PT trace:
979 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
980 perf report --itrace=Ge
988 into the event buffer in one go. That reduces interrupts, but can give very
989 late timestamps. Because the Intel PT trace is synchronized by timestamps,
990 the PEBS events do not match the trace. Currently, Large PEBS is used only in
992 - hardware supports it
993 - PEBS is used
994 - event period is specified, instead of frequency
995 - the sample type is limited to the following flags:
1004 cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
1005 --count option, or 'period' config term.
1007 To disable trace decoding entirely, use the option --no-itrace.
1012 --itrace=i0nss1000000
1016 The q option changes the way the trace is decoded. The decoding is much faster
1023 ranges that could then be decoded fully using the --time option.
1027 - direct calls and jmps
1028 - conditional branches
1029 - non-branch instructions
1033 - asynchronous branches such as interrupts
1034 - indirect branches
1035 - function return target address *if* the noretcomp config term (refer
1037 - start of (control-flow) tracing
1038 - end of (control-flow) tracing, if it is not out of context
1039 - power events, ptwrite, transaction start and abort
1040 - instruction pointer associated with PSB packets
1045 Repeating the q option (double-q i.e. qq) results in even faster decoding and even
1048 PSBEND). Note PSB packets occur regularly in the trace based on the psb_period
1054 - everything except instruction pointer associated with PSB packets
1058 - instruction pointer associated with PSB packets
1064 perf script has an option (-D) to "dump" the events i.e. display the binary
1067 When -D is used, Intel PT packets are displayed. The packet decoder does not
1068 pay attention to PSB packets, but just decodes the bytes - so the packets seen
1070 One example of that would be when the buffer-switching interrupt has been too
1071 slow, and the buffer has been filled completely. In that case, the last packet
1072 in the buffer might be truncated and immediately followed by a PSB as the trace
1073 continues in the next buffer.
1075 To disable the display of Intel PT packets, combine the -D option with
1076 --no-itrace.
1080 -----------
1082 By default, perf report will decode trace data found in the perf.data file.
1083 This can be further controlled by the --itrace option, exactly as for
1084 perf script, except that the default is --itrace=igxe.
1088 -----------
1090 perf inject also accepts the --itrace option in which case tracing data is
1093 perf inject --itrace -i perf.data -o perf.data.new
1100 $ gcc-5 -O3 sort.c -o sort_optimized
1106 [intel-pt]
1107 mispred-all = on
1109 $ perf record -e intel_pt//u ./sort 3000
1114 $ perf inject -i perf.data -o inj --itrace=i100usle --strip
1115 $ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
1116 $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
1126 -----------------
1128 Some hardware can redirect PEBS records to the Intel PT trace.
1129 Recording is selected by using the aux-output config term e.g.
1131 perf record -c 10000 -e '{intel_pt/branch=0/,cycles/aux-output/ppp}' uname
1135 To display PEBS events from the Intel PT trace, use the itrace 'o' option e.g.
1137 perf script --itrace=oe
1140 ---
1142 include::build-xed.txt[]
1145 --------
1147 linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1],
1148 linkperf:perf-inject[1]