perf-list(1)
============

NAME
----
perf-list - List all symbolic event types

SYNOPSIS
--------
[verse]
'perf list' [--no-desc] [--long-desc]
            [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION
-----------
This command displays the symbolic event types which can be selected in the
various perf commands with the -e option.

OPTIONS
-------
-d::
--desc::
Print extra event descriptions. (default)

--no-desc::
Don't print descriptions.

-v::
--long-desc::
Print longer event descriptions.

--debug::
Enable debugging output.

--details::
Print how named events are resolved internally into perf events, and also
any extra expressions computed by perf stat.

--deprecated::
Print deprecated events. By default the deprecated events are hidden.

[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------

An event can optionally be qualified by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

 u - user-space counting
 k - kernel counting
 h - hypervisor counting
 I - non idle counting
 G - guest counting (in KVM guests)
 H - host counting (not in KVM guests)
 p - precise level
 P - use maximum detected precise level
 S - read sample value (PERF_SAMPLE_READ)
 D - pin the event to the PMU
 W - group is weak and will fall back to non-group if not schedulable
 e - group or event is exclusive and does not share the PMU

The 'p' modifier can be used for specifying how precise the instruction
address should be. The 'p' modifier can be specified multiple times:

 0 - SAMPLE_IP can have arbitrary skid
 1 - SAMPLE_IP must have constant skid
 2 - SAMPLE_IP requested to have 0 skid
 3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
     sample shadowing effects.

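The modifier syntax above can be illustrated with a small Python sketch. It
is not perf's own parser: the helper name `split_event` is hypothetical, and
the rule used to tell a modifier string apart from a tracepoint name such as
`sched:sched_switch` is a simplifying assumption. It only shows that the
precise level equals the number of 'p' characters in the modifier string.

```python
# Hypothetical helper for illustration only; not perf's parser.
MODIFIERS = set("ukhIGHpPSDWe")  # the modifier letters listed above


def split_event(spec):
    """Split 'event:modifiers' into (event, modifiers, precise_level)."""
    event, sep, mods = spec.rpartition(":")
    if not sep or not mods or not set(mods) <= MODIFIERS:
        # No colon, or the text after the last colon is not made up of
        # modifier letters (assumed here to be a tracepoint name).
        return spec, "", 0
    # Each 'p' raises the requested precise level by one.
    return event, mods, mods.count("p")
```

For example, `split_event("cpu-cycles:pp")` yields the event name
`cpu-cycles`, the modifier string `pp` and a precise level of 2.
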
On Intel systems precise event sampling is implemented with PEBS,
which supports up to precise level 2, and precise level 3 for
some special cases.

On AMD systems it is implemented using IBS (up to precise level 2).
The precise modifier works with event types 0x76 (cpu-cycles, CPU
clocks not halted) and 0xC1 (micro-ops retired). Both events map to
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set accordingly (see AMD64 Architecture Programmer’s
Manual Volume 2: System Programming, 13.3 Instruction-Based
Sampling). Examples of using IBS:

 perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
 perf record -a -e r076:p ...          # same as -e cpu-cycles:p
 perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

RAW HARDWARE EVENT DESCRIPTOR
-----------------------------
Even when an event is not available in a symbolic form within perf right now,
it can be encoded in a processor-specific way.

For instance, on x86 CPUs NNN represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures
Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1
Layout of IA32_PERFEVTSELx MSRs) or AMD's PerfEvtSeln (see [AMD64
Architecture Programmer’s Manual Volume 2: System Programming], Page 344,
Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host only
and OS/user mode flags must be set up using <<EVENT_MODIFIERS, EVENT
MODIFIERS>>.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

  Event  Umask  Event Mask
  Num.   Value  Mnemonic    Description                        Comment

  A8H      01H  LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                            delivered by loop stream detector  invert to count
                                                               cycles

raw encoding of 0x1A8 can be used:

 perf stat -e r1a8 -a sleep 1
 perf record -e r1a8 ...

It's also possible to use pmu syntax:

 perf record -e r1a8 -a sleep 1
 perf record -e cpu/r1a8/ ...
 perf record -e cpu/r0x1a8/ ...

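How the raw value 0x1A8 follows from the table can be sketched in Python.
The bit positions below follow the IA32_PERFEVTSELx layout referenced above
(event select in bits 0-7, umask in bits 8-15, edge in bit 18, inv in bit 23,
cmask in bits 24-31). The function name `raw_event` is hypothetical, and the
sketch assumes an 8-bit event select, so it does not cover AMD's extended
event-select bits.

```python
# Illustrative sketch; raw_event is a hypothetical helper, not part of perf.
def raw_event(event, umask, edge=False, inv=False, cmask=0):
    """Return the rNNN descriptor for the given x86 bit fields."""
    for name, value in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    config = event | (umask << 8) | (cmask << 24)
    if edge:
        config |= 1 << 18  # edge detect
    if inv:
        config |= 1 << 23  # invert the counter-mask comparison
    return f"r{config:x}"
```

With the LSD.UOPS row above, `raw_event(0xA8, 0x01)` gives `r1a8`, and the
cycle-counting variant from the Comment column,
`raw_event(0xA8, 0x01, inv=True, cmask=1)`, gives `r18001a8`.
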
Refer to the processor-specific documentation for these
details. Some of it is referenced in the SEE ALSO section below.

ARBITRARY PMUS
--------------

perf also supports an extended syntax for specifying raw parameters
to PMUs. Using this typically requires looking up the specific event
in the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with

  ls /sys/devices/*/format

For example, the raw "LSD.UOPS" core PMU event above could
be specified as

  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

or using extended name syntax

  perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

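The same information that `ls /sys/devices/*/format` shows can be collected
programmatically. The following Python sketch walks the format directories
and records each parameter together with the bit range stored in its file
(for example `config:0-7`). The function name `pmu_formats` and its `root`
parameter are hypothetical; the parameter exists only so the sketch can be
pointed at a different directory tree.

```python
import os


# Illustrative sketch; pmu_formats is a hypothetical helper, not part of perf.
def pmu_formats(root="/sys/devices"):
    """Map each PMU under root to its {parameter: bit-range} table."""
    formats = {}
    for pmu in sorted(os.listdir(root)):
        fmt_dir = os.path.join(root, pmu, "format")
        if not os.path.isdir(fmt_dir):
            continue  # not a PMU, or a PMU without raw parameters
        fields = {}
        for field in sorted(os.listdir(fmt_dir)):
            with open(os.path.join(fmt_dir, field)) as f:
                fields[field] = f.read().strip()
        formats[pmu] = fields
    return formats
```

Calling `pmu_formats()` on a live system returns one entry per PMU that
exposes a format directory; the parameter names in each entry are the ones
accepted inside the `pmu/param=value,.../` syntax.
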
PER SOCKET PMUS
---------------

Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted globally
with perf stat -a. They can be bound to one logical CPU, but will measure
all the CPUs in the same socket.

This example measures memory bandwidth every second
on the first memory controller on socket 0 of an Intel Xeon system:

  perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU.  Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together. To simplify creation of multiple events,
prefix and glob matching is supported in the PMU name, and the prefix
'uncore_' is also ignored when performing the match. So the command above
can be expanded to all memory controllers by using either of these syntaxes:

  perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
  perf stat -C 0 -a -e '*imc*/cas_count_read/,*imc*/cas_count_write/' -I 1000 ...

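The matching rule described above (prefix or glob match on the PMU name, with
a leading 'uncore_' ignored) can be approximated in Python as follows. This
is a simplified sketch of the rule as stated in the text, not perf's actual
matching code; the function name `match_pmus` and the sample PMU names are
hypothetical.

```python
from fnmatch import fnmatchcase


# Illustrative sketch; match_pmus is a hypothetical helper, not part of perf.
def match_pmus(pattern, pmu_names):
    """Return the PMU names selected by a prefix or glob pattern."""
    matched = []
    for name in pmu_names:
        # Try the full name and the name with 'uncore_' stripped.
        candidates = [name]
        if name.startswith("uncore_"):
            candidates.append(name[len("uncore_"):])
        if any(c.startswith(pattern) or fnmatchcase(c, pattern)
               for c in candidates):
            matched.append(name)
    return matched
```

Given the names `cpu`, `uncore_imc_0`, `uncore_imc_1` and `power`, both
`imc` and `*imc*` select the two memory controller PMUs.
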
This example measures the combined core power every second:

  perf stat -I 1000 -e power/energy-cores/  -a

ACCESS RESTRICTIONS
-------------------

For non-root users generally only context-switched PMU events are available.
This is normally only the events in the cpu PMU, the predefined events
like cycles and instructions and some software events.

Other PMUs and global measurements are normally root only.
Some event qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid
sysctl to -1, which allows non-root users to use these events.

For accessing tracepoint events perf needs to have read access to
/sys/kernel/debug/tracing, even when perf_event_paranoid is in a relaxed
setting.

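Whether the override mentioned above is in effect can be checked by reading
the sysctl through procfs. This Python sketch assumes the usual mapping of
kernel.perf_event_paranoid to /proc/sys/kernel/perf_event_paranoid; the
function names are hypothetical, and only the value -1 named in the text is
treated as unrestricted.

```python
# Illustrative sketch; both helpers are hypothetical, not part of perf.
def perf_paranoid_level(path="/proc/sys/kernel/perf_event_paranoid"):
    """Read the current perf_event_paranoid setting as an integer."""
    with open(path) as f:
        return int(f.read().strip())


def non_root_unrestricted(level):
    """True when the setting is -1, the value the text describes."""
    return level == -1
```

On a live system, `non_root_unrestricted(perf_paranoid_level())` reports
whether non-root users can use the otherwise root-only events.
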
TRACING
-------

Some PMUs control advanced hardware tracing capabilities, such as Intel PT,
which allow low overhead execution tracing.  These are described in a separate
intel-pt.txt document.

PARAMETERIZED EVENTS
--------------------

Some PMU events listed by 'perf list' will be displayed with '?' in them. For
example:

  hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for '?' must
also be supplied. For example:

  perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

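Substituting the '?' placeholders can be scripted. The Python sketch below
takes an event string as printed by 'perf list' and fills in each named
parameter as a hexadecimal value; the helper name `fill_params` is
hypothetical, and rejecting a string that still contains a '?' is a design
choice of this sketch.

```python
import re


# Illustrative sketch; fill_params is a hypothetical helper, not part of perf.
def fill_params(event, **values):
    """Replace 'name=?' placeholders in a parameterized event string."""
    out = event
    for name, value in values.items():
        # Match the whole parameter name, not the tail of a longer one.
        pattern = rf"(?<!\w){re.escape(name)}=\?"
        out, count = re.subn(pattern, f"{name}={value:#x}", out)
        if count == 0:
            raise KeyError(f"no '{name}=?' placeholder in {event!r}")
    if "=?" in out:
        raise ValueError(f"parameter without a value in {out!r}")
    return out
```

For the event shown above,
`fill_params("hv_gpci/dtbp_ptitc,phys_processor_idx=?/", phys_processor_idx=2)`
produces the string passed to -e in the perf stat example.
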
EVENT QUALIFIERS:

It is also possible to add extra qualifiers to an event:

percore:

Sums up the event counts for all hardware threads in a core, e.g.:

  perf stat -e cpu/event=0,umask=0x3,percore=1/

EVENT GROUPS
------------

Perf supports time based multiplexing of events, when the number of events
active exceeds the number of hardware performance counters. Multiplexing
can cause measurement errors when the workload changes its execution
profile.

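As background to the multiplexing error mentioned above: a multiplexed event
is only counted for part of the time it is enabled, and the final count is
extrapolated from that part. The Python sketch below shows the extrapolation
under the assumption that the raw count is scaled by the ratio of enabled
time to running time; the function and parameter names are hypothetical and
not taken from the text.

```python
# Illustrative sketch; scale_count is a hypothetical helper, not part of perf.
def scale_count(raw_count, time_enabled, time_running):
    """Extrapolate a multiplexed counter value to the full enabled time."""
    if time_running == 0:
        return None  # the event was never scheduled; nothing to extrapolate
    # Assumes the event rate while not running equals the rate while running.
    return raw_count * time_enabled / time_running
```

The assumption in the last comment is where the error comes from: if the
workload behaves differently while the event is not scheduled, the
extrapolated value is off.
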
When metrics are computed using formulas from event counts, it is useful to
ensure some events are always measured together as a group to minimize
multiplexing errors. Event groups can be specified using { }.

  perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A group
cannot contain more events than available counters.
For example, Intel Core CPUs typically have four generic performance counters
for the core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can schedule, and may not support multiple instances in a single group.
When too many events are specified in the group some of them will not
be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by default.
The NMI watchdog can be disabled as root with

  echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

LEADER SAMPLING
---------------

perf also supports group leader sampling using the :S specifier.

  perf record -e '{cycles,instructions}:S' ...
  perf report --group

Normally all events in an event group sample, but with :S only
the first event (the leader) samples, and it only reads the values of the
other events in the group.

However, in the case of AUX area events (e.g. Intel PT or CoreSight), the AUX
area event must be the leader, so then the second event samples, not the first.

OPTIONS
-------

Without options all known events will be listed.

To limit the list use:

. 'hw' or 'hardware' to list hardware events such as cache-misses, etc.

. 'sw' or 'software' to list software events such as context switches, etc.

. 'cache' or 'hwcache' to list hardware cache events such as L1-dcache-loads, etc.

. 'tracepoint' to list all tracepoint events, alternatively use
  'subsys_glob:event_glob' to filter by tracepoint subsystems such as sched,
  block, etc.

. 'pmu' to print the kernel supplied PMU events.

. 'sdt' to list all Statically Defined Tracepoint events.

. 'metric' to list metrics.

. 'metricgroup' to list metricgroups with metrics.

. If none of the above is matched, it will apply the supplied glob to all
  events, printing the ones that match.

. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for the
types specified.

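The last two rules in the list above (glob match first, substring search as a
last resort) can be sketched in Python. This illustrates the order of the two
steps only; it is not perf's implementation, the function name
`filter_events` is hypothetical, and case-sensitive matching is an assumption
of the sketch.

```python
from fnmatch import fnmatchcase


# Illustrative sketch; filter_events is a hypothetical helper, not part of perf.
def filter_events(names, pattern):
    """Select event names by glob, falling back to a substring search."""
    globbed = [n for n in names if fnmatchcase(n, pattern)]
    if globbed:
        return globbed
    # Last resort: plain substring search over all event names.
    return [n for n in names if pattern in n]
```

With the names `cache-misses`, `cache-references` and `L1-dcache-loads`, the
glob `cache-*` selects the first two, while `dcache` matches no name as a
glob and falls through to the substring search, which selects
`L1-dcache-loads`.
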
Raw format support:

. '--raw-dump', shows the raw-dump of all the events.
. '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of
  a certain kind of event.

SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-top[1],
linkperf:perf-record[1],
http://www.intel.com/sdm/[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf[AMD64 Architecture Programmer’s Manual Volume 2: System Programming]