.. SPDX-License-Identifier: GPL-2.0

============================================================
Intel(R) Speed Select Technology User Guide
============================================================

The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
collection of features that give more granular control over CPU performance.
With Intel(R) SST, one server can be configured for power and performance for a
variety of diverse workload requirements.

Refer to the links below for an overview of the technology:

- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf

These capabilities are further enhanced in some of the newer generations of
server platforms where these features can be enumerated and controlled
dynamically without pre-configuring via BIOS setup options. This dynamic
configuration is done via mailbox commands to the hardware. One way to enumerate
and configure these features is by using the Intel Speed Select utility.

This document explains how to use the Intel Speed Select tool to enumerate and
control Intel(R) SST features. This document gives example commands and explains
how these commands change the power and performance profile of the system under
test. Using this tool as an example, customers can replicate the messaging
implemented in the tool in their production software.

intel-speed-select configuration tool
======================================

The "intel-speed-select" tool may be available as a package in most Linux
distributions. If not, it can be built by downloading the Linux kernel tree
from kernel.org. Once downloaded, the tool can be built without building the
full kernel.

From the kernel tree, run the following commands::

 # cd tools/power/x86/intel-speed-select/
 # make
 # make install

Getting Help
------------

To get help with the tool, execute the command below::

 # intel-speed-select --help

The top-level help describes arguments and features. Notice that there is a
multi-level help structure in the tool. For example, to get help for the
feature "perf-profile"::

 # intel-speed-select perf-profile --help

To get help on a command, another level of help is provided. For example, for
the "info" command::

 # intel-speed-select perf-profile info --help

Summary of platform capability
------------------------------

To check the current platform and driver capabilities, execute::

 # intel-speed-select --info

For example, on a test system::

 # intel-speed-select --info
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Platform: API version : 1
 Platform: Driver version : 1
 Platform: mbox supported : 1
 Platform: mmio supported : 1
 Intel(R) SST-PP (feature perf-profile) is supported
 TDP level change control is unlocked, max level: 4
 Intel(R) SST-TF (feature turbo-freq) is supported
 Intel(R) SST-BF (feature base-freq) is not supported
 Intel(R) SST-CP (feature core-power) is supported

Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
------------------------------------------------------------------------

This feature allows configuration of a server dynamically based on workload
performance requirements. This helps users during deployment as they do not have
to choose a specific server configuration statically. The Intel(R) Speed Select
Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system. Each profile
defines the set of CPUs that must be online, with the rest offline, to sustain a
guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the user can
expect the base frequency to change dynamically. This feature is called
"perf-profile" when using the Intel Speed Select tool.

Number of performance levels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There can be multiple performance profiles on a system. To get the number of
profiles, execute the command below::

 # intel-speed-select perf-profile get-config-levels
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
        get-config-levels:4
 package-1
  die-0
    cpu-14
        get-config-levels:4

On this system under test, there are 4 performance profiles in addition to the
base performance profile (which is performance level 0).

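Software that consumes this output usually needs the value reported for each
package/die/cpu block. The following is a minimal Python sketch, assuming the
indented "package-N / die-N / cpu-N / key:value" layout shown in the sample
output above. The function name ``parse_sst_output`` is hypothetical and is
not part of the tool, and the layout may differ between tool versions.

.. code-block:: python

   import re

   # Header lines of the package/die/cpu hierarchy, as in the sample output.
   _LEVEL = re.compile(r"^(package|die|cpu)-(\d+)$")


   def parse_sst_output(text):
       """Map (package, die, cpu) to a dict of the "key:value" fields below it."""
       result = {}
       package = die = cpu = None
       for raw in text.splitlines():
           line = raw.strip()
           match = _LEVEL.match(line)
           if match:
               kind, number = match.group(1), int(match.group(2))
               if kind == "package":
                   package, die, cpu = number, None, None
               elif kind == "die":
                   die, cpu = number, None
               else:
                   cpu = number
               continue
           if cpu is None or ":" not in line:
               continue  # banner lines and group names such as "core-power"
           key, value = line.split(":", 1)
           result.setdefault((package, die, cpu), {})[key.strip()] = value.strip()
       return result


   SAMPLE = """\
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
     die-0
       cpu-0
           get-config-levels:4
    package-1
     die-0
       cpu-14
           get-config-levels:4
   """

   levels = parse_sst_output(SAMPLE)
   print(levels)

Values are kept as strings, so a caller would convert them as needed, for
example ``int(levels[(0, 0, 0)]["get-config-levels"])``.
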
Lock/Unlock status
~~~~~~~~~~~~~~~~~~

Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. There may be a BIOS setup option to unlock them; otherwise,
check with your system vendor.

To check if the system is locked, execute the following command::

 # intel-speed-select perf-profile get-lock-status
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
        get-lock-status:0
 package-1
  die-0
    cpu-14
        get-lock-status:0

In this case, lock status is 0, which means that the system is unlocked.

Properties of a performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the properties of a specific performance level (level 0 in the example
below), execute the command below::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        cpu-count:28
        enable-cpu-mask:000003ff,f0003fff
        enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
        thermal-design-power-ratio:26
        base-frequency(MHz):2600
        speed-select-turbo-freq:disabled
        speed-select-base-freq:disabled
        ...
        ...

Here the -l option is used to specify a performance level.

If the -l option is omitted, this command prints information about all
the performance levels. The above command prints the properties of
performance level 0.

For this performance profile, at most the CPUs displayed by
"enable-cpu-mask/enable-cpu-list" can be online. When that condition is met,
the base frequency of 2600 MHz can be maintained. To understand more, execute
"intel-speed-select perf-profile info" for performance level 4::

 # intel-speed-select perf-profile info -l 4
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-4
        cpu-count:28
        enable-cpu-mask:000000fa,f0000faf
        enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
        thermal-design-power-ratio:28
        base-frequency(MHz):2800
        speed-select-turbo-freq:disabled
        speed-select-base-freq:unsupported
        ...
        ...

There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
the user keeps only these CPUs online and the rest offline, then the base
frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.

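The "enable-cpu-mask" and "enable-cpu-list" fields describe the same set of
CPUs in two notations. The following Python sketch expands the two masks shown
above. It assumes that each comma-separated group is one 32-bit word written as
eight hexadecimal digits, with the rightmost word covering CPUs 0-31. The
helper name ``cpus_from_mask`` is hypothetical.

.. code-block:: python

   def cpus_from_mask(mask):
       """Expand a comma-separated hex mask such as "000003ff,f0003fff"."""
       # Joining the 8-digit words gives one hex number: the rightmost word
       # holds CPUs 0-31, the word to its left holds CPUs 32-63, and so on.
       bits = int(mask.replace(",", ""), 16)
       return [cpu for cpu in range(bits.bit_length()) if (bits >> cpu) & 1]


   level0 = cpus_from_mask("000003ff,f0003fff")
   level4 = cpus_from_mask("000000fa,f0000faf")
   print(level0)  # CPUs 0-13 and 28-41
   print(level4)  # 0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39

Both results agree with the "enable-cpu-list" lines printed by the tool for
level 0 and level 4.
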
Get current performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the current performance level, execute::

 # intel-speed-select perf-profile get-config-current-level
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
        get-config-current_level:0

First verify that the base_frequency displayed by the cpufreq sysfs is correct::

 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
 2600000

This matches the base-frequency (MHz) field value displayed from the
"perf-profile info" command for performance level 0 (the cpufreq frequency is
in kHz).

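The unit difference is easy to overlook when this comparison is scripted. A
minimal Python sketch of the check, using the two values shown above; the
function name ``sysfs_matches_profile`` is hypothetical:

.. code-block:: python

   def sysfs_matches_profile(sysfs_khz, profile_mhz):
       """cpufreq sysfs reports kHz; "perf-profile info" reports MHz."""
       return int(sysfs_khz) == int(profile_mhz) * 1000


   # Values taken from the sysfs read and the level 0 "base-frequency(MHz)" field.
   print(sysfs_matches_profile("2600000", "2600"))
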
To check if the average frequency is equal to the base frequency for a 100% busy
workload, disable turbo::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Then run a busy workload on all CPUs, for example::

 # stress -c 64

To verify the base frequency, run turbostat::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1

  Package  Core  CPU  Bzy_MHz
           -     -    2600
  0        0     0    2600
  0        1     1    2600
  0        2     2    2600
  0        3     3    2600
  0        4     4    2600
  .        .     .    .

Changing performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To change the performance level to 4, execute::

 # intel-speed-select -d perf-profile set-config-level -l 4 -o
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile
        set_tdp_level:success

In the command above, "-o" is optional. If it is specified, then it will also
offline CPUs which are not present in the enable-cpu-mask for this performance
level.

Now if the base_frequency is checked::

 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
 2800000

This shows that the base frequency has increased from 2600 MHz at performance
level 0 to 2800 MHz at performance level 4. As a result, any workload that can
use fewer CPUs can see a boost of 200 MHz compared to performance level 0.

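The set of CPUs that has to go offline for a level follows from the
"enable-cpu-list" values. The Python sketch below computes it for the move from
level 0 to level 4 on package 0. It assumes that exactly the level 0 CPUs are
online beforehand; it only illustrates the set difference and is not a
description of how the tool implements "-o". The helper names are hypothetical.

.. code-block:: python

   def parse_list(text):
       """Turn an "enable-cpu-list" value into a list of CPU numbers."""
       return [int(cpu) for cpu in text.split(",")]


   def cpus_to_offline(online, enable_list):
       """CPUs that are online now but absent from the target level's list."""
       return sorted(set(online) - set(enable_list))


   level0 = parse_list("0,1,2,3,4,5,6,7,8,9,10,11,12,13,"
                       "28,29,30,31,32,33,34,35,36,37,38,39,40,41")
   level4 = parse_list("0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39")
   print(cpus_to_offline(level0, level4))  # [4, 6, 12, 13, 32, 34, 40, 41]

Without "-o", these are the CPUs the user would have to take offline by other
means to meet the online/offline requirement of level 4.
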
Check presence of other Intel(R) SST features
---------------------------------------------

Each of the performance profiles also specifies whether there is support for
the other two Intel(R) SST features (Intel(R) Speed Select Technology - Base
Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo
Frequency (Intel(R) SST-TF)).

For example, from the output of "perf-profile info" above, for level 0 and level
4:

For level 0::

       speed-select-turbo-freq:disabled
       speed-select-base-freq:disabled

For level 4::

       speed-select-turbo-freq:disabled
       speed-select-base-freq:unsupported

Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
changed from "disabled" to "unsupported" compared to performance level 0.

This means that at performance level 4, the "speed-select-base-freq" feature is
not supported. However, at performance level 0, this feature is "supported", but
currently "disabled", meaning the user has not activated this feature. Whereas
"speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both performance
levels, but currently not activated by the user.

The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
is supported on a platform.

Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP)
----------------------------------------------------------------

Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP) is an interface
that allows users to define per core priority. This defines a mechanism to
distribute power among cores when there is a power constrained scenario. This
defines a class of service (CLOS) configuration.

The user can configure up to 4 class of service configurations. Each CLOS group
configuration allows the definition of parameters that affect how the frequency
can be limited and how power is distributed. Each CPU core can be tied to a
class of service and hence an associated priority. The granularity is at the
core level, not at the per-CPU level.

Enable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To use the CLOS based prioritization feature, firmware must be informed to
enable it and use a priority type. There is a default per-platform priority
type, which can be changed with an optional command line parameter.

To check the options for enabling, execute::

 # intel-speed-select core-power enable --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Enable core-power for a package/die
        Clos Enable: Specify priority type with [--priority|-p]
                 0: Proportional, 1: Ordered

There are two priority types:

- Ordered

Priority for ordered throttling is defined based on the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).

Priority order is:
CLOS0 > CLOS1 > CLOS2 > CLOS3.

- Proportional

When proportional priority is used, there is an additional parameter called
frequency_weight, which can be specified per CLOS group. The goal of
proportional priority is to provide each core with the requested minimum, then
distribute all remaining (excess/deficit) budgets in proportion to a defined
weight. This proportional priority can be configured using the "core-power
config" command.

To enable with the platform default priority type, execute::

 # intel-speed-select core-power enable
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      core-power
        enable:success
 package-1
  die-0
    cpu-6
      core-power
        enable:success

The scope of this enable is per package, or per die when a package contains
multiple dies. To check if CLOS is enabled and get the priority type, the
"core-power info" command can be used. For example, to check the status of the
core-power feature on CPU 0, execute::

 # intel-speed-select -c 0 core-power info
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      core-power
        support-status:supported
        enable-status:enabled
        clos-enable-status:enabled
        priority-type:proportional
 package-1
  die-0
    cpu-24
      core-power
        support-status:supported
        enable-status:enabled
        clos-enable-status:enabled
        priority-type:proportional

Configuring CLOS groups
~~~~~~~~~~~~~~~~~~~~~~~

Each CLOS group has its own attributes including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config"
command. Defaults are used for any parameter the user does not set, except the
clos id, which is mandatory. To check core-power config options, execute::

 # intel-speed-select core-power config --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Set core-power configuration for one of the four clos ids
        Specify targeted clos id with [--clos|-c]
        Specify clos Proportional Priority [--weight|-w]
        Specify clos min in MHz with [--min|-n]
        Specify clos max in MHz with [--max|-m]

For example::

 # intel-speed-select core-power config -c 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 clos epp is not specified, default: 0
 clos frequency weight is not specified, default: 0
 clos min is not specified, default: 0 MHz
 clos max is not specified, default: 25500 MHz
 clos desired is not specified, default: 0
 package-0
  die-0
    cpu-0
      core-power
        config:success
 package-1
  die-0
    cpu-6
      core-power
        config:success

The user has the option to change defaults. For example, the user can change
"min" and set it to the base frequency to always get the guaranteed base
frequency.

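Production software that replicates these commands typically assembles the
argument list itself. The Python sketch below builds one from the options
listed in the "core-power config --help" output above and does not execute
anything. The helper name ``core_power_config_argv`` is hypothetical, and the
2600 MHz value is the level 0 base frequency of the system under test.

.. code-block:: python

   def core_power_config_argv(clos, weight=None, min_mhz=None, max_mhz=None):
       """Build the argument list for "intel-speed-select core-power config"."""
       # The clos id is mandatory; every other option falls back to its default.
       argv = ["intel-speed-select", "core-power", "config", "-c", str(clos)]
       for flag, value in (("-w", weight), ("-n", min_mhz), ("-m", max_mhz)):
           if value is not None:
               argv += [flag, str(value)]
       return argv


   # Set "min" of CLOS 0 to the base frequency of the system under test.
   print(" ".join(core_power_config_argv(0, min_mhz=2600)))

The resulting list can be passed to a process-launching API such as Python's
``subprocess.run`` when run with sufficient privileges.
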
Get the current CLOS configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To check the current configuration, "core-power get-config" can be used. For
example, to get the configuration of CLOS 0::

 # intel-speed-select core-power get-config -c 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      core-power
        clos:0
        epp:0
        clos-proportional-priority:0
        clos-min:0 MHz
        clos-max:Max Turbo frequency
        clos-desired:0 MHz
 package-1
  die-0
    cpu-24
      core-power
        clos:0
        epp:0
        clos-proportional-priority:0
        clos-min:0 MHz
        clos-max:Max Turbo frequency
        clos-desired:0 MHz

Associating a CPU with a CLOS group
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To associate a CPU with a CLOS group, the "core-power assoc" command can be
used::

 # intel-speed-select core-power assoc --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Associate a clos id to a CPU
        Specify targeted clos id with [--clos|-c]


For example, to associate CPU 10 with CLOS group 3, execute::

 # intel-speed-select -c 10 core-power assoc -c 3
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-10
      core-power
        assoc:success

Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing Linux "cpufreq" subsystem scaling
frequency limits.

To check the existing association for a CPU, the "core-power get-assoc" command
can be used. For example, to get the association of CPU 10, execute::

 # intel-speed-select -c 10 core-power get-assoc
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-1
  die-0
    cpu-10
      get-assoc
        clos:3

This shows that CPU 10 is part of CLOS group 3.

Disable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To disable, execute::

 # intel-speed-select core-power disable

Some features like Intel(R) SST-TF can only be enabled when CLOS based
prioritization is enabled. For this reason, disabling CLOS based prioritization
while Intel(R) SST-TF is enabled can cause Intel(R) SST-TF to fail, so the
"disable" command displays an error if Intel(R) SST-TF is already enabled. To
disable CLOS based prioritization, the Intel(R) SST-TF feature must be disabled
first.

Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
-------------------------------------------------------------------

The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
the user control base frequency. If some critical workload threads demand
constant high guaranteed performance, then this feature can be used to execute
the thread at a higher base frequency on specific sets of CPUs (high priority
CPUs) at the cost of a lower base frequency on the other CPUs (low priority
CPUs). This feature does not require taking the low priority CPUs offline.

The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports Intel(R)
SST-BF. Consequently, first select the desired performance level to enable this
feature.

In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but currently disabled. For example, for level 0::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...

        speed-select-base-freq:disabled
        ...

Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, execute some workload and measure its performance to get a
baseline to compare against.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.

559*4882a593SmuzhiyunMeasure baseline performance for comparison
560*4882a593Smuzhiyun~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
561*4882a593Smuzhiyun
562*4882a593SmuzhiyunTo compare, pick a multi-threaded workload where each thread can be scheduled on
563*4882a593Smuzhiyunseparate CPUs. "Hackbench pipe" test is a good example on how to improve
564*4882a593Smuzhiyunperformance using Intel(R) SST-BF.
565*4882a593Smuzhiyun
566*4882a593SmuzhiyunBelow, the workload is measuring average scheduler wakeup latency, so a lower
567*4882a593Smuzhiyunnumber means better performance::
568*4882a593Smuzhiyun
569*4882a593Smuzhiyun # taskset -c 3,4 perf bench -r 100 sched pipe
570*4882a593Smuzhiyun # Running 'sched/pipe' benchmark:
571*4882a593Smuzhiyun # Executed 1000000 pipe operations between two processes
572*4882a593Smuzhiyun     Total time: 6.102 [sec]
573*4882a593Smuzhiyun       6.102445 usecs/op
574*4882a593Smuzhiyun         163868 ops/sec
575*4882a593Smuzhiyun
576*4882a593SmuzhiyunWhile running the above test, if we take turbostat output, it will show us that
577*4882a593Smuzhiyun2 of the CPUs are busy and reaching max. frequency (which would be the base
578*4882a593Smuzhiyunfrequency as the turbo is disabled). The turbostat output::
579*4882a593Smuzhiyun
580*4882a593Smuzhiyun #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
581*4882a593Smuzhiyun Package	Core	CPU	Bzy_MHz
582*4882a593Smuzhiyun 0		0	0	1000
583*4882a593Smuzhiyun 0		1	1	1005
584*4882a593Smuzhiyun 0		2	2	1000
585*4882a593Smuzhiyun 0		3	3	2600
586*4882a593Smuzhiyun 0		4	4	2600
587*4882a593Smuzhiyun 0		5	5	1000
588*4882a593Smuzhiyun 0		6	6	1000
589*4882a593Smuzhiyun 0		7	7	1005
590*4882a593Smuzhiyun 0		8	8	1005
591*4882a593Smuzhiyun 0		9	9	1000
592*4882a593Smuzhiyun 0		10	10	1000
593*4882a593Smuzhiyun 0		11	11	995
594*4882a593Smuzhiyun 0		12	12	1000
595*4882a593Smuzhiyun 0		13	13	1000
596*4882a593Smuzhiyun
597*4882a593SmuzhiyunFrom the above turbostat output, both CPU 3 and 4 are very busy and reaching
598*4882a593Smuzhiyunfull guaranteed frequency of 2600 MHz.
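
Picking the busy CPUs out of a longer turbostat table by eye is error prone. The
following is a minimal sketch of a filter, assuming the four-column layout
(Package, Core, CPU, Bzy_MHz) shown above and an available awk. The
cpus_at_or_above name is hypothetical, and the sample rows are copied from the
output above:

```shell
# cpus_at_or_above MHZ: read turbostat rows on stdin and print the CPU
# number of every row whose Bzy_MHz column is at least MHZ.
# Header and summary rows are skipped because their first field is not a number.
cpus_at_or_above() {
    awk -v min="$1" '$1 ~ /^[0-9]+$/ && $4 >= min { print $3 }'
}

# Sample rows taken from the turbostat output above.
printf '%s\n' \
    'Package Core CPU Bzy_MHz' \
    '0 2 2 1000' \
    '0 3 3 2600' \
    '0 4 4 2600' \
    '0 5 5 1000' | cpus_at_or_above 2600
```

With the sample rows, only CPU 3 and CPU 4 pass the 2600 MHz threshold.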

Intel(R) SST-BF Capabilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get capabilities of Intel(R) SST-BF for the current performance level 0,
execute::

 # intel-speed-select base-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-base-freq
        high-priority-base-frequency(MHz):3000
        high-priority-cpu-mask:00000216,00002160
        high-priority-cpu-list:5,6,8,13,33,34,36,41
        low-priority-base-frequency(MHz):2400
        tjunction-temperature(C):125
        thermal-design-power(W):205

The above capabilities show that some CPUs on this system can offer a base
frequency of 3000 MHz, compared to the standard base frequency of 2600 MHz at
this performance level. These CPUs are fixed, and they are presented via
high-priority-cpu-list/high-priority-cpu-mask. However, if the Intel(R) SST-BF
feature is selected, the low priority CPUs (which are not in
high-priority-cpu-list) can only offer up to 2400 MHz. As a result, if this
clipping of low priority CPUs is acceptable, the user can enable the Intel(R)
SST-BF feature. It suits the above "sched pipe" workload in particular: since
only two CPUs are used, they can be scheduled on high priority CPUs and get a
boost of 400 MHz.
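
The high-priority-cpu-mask and high-priority-cpu-list fields carry the same
information. The following is a minimal sketch of the conversion, assuming the
mask is a comma-separated list of 32-bit hexadecimal words with the most
significant word first. The decode_cpu_mask name is hypothetical:

```shell
# decode_cpu_mask MASK: print the CPU numbers whose bits are set in MASK.
decode_cpu_mask() {
    groups=$(printf '%s' "$1" | tr ',' ' ')
    # Reverse the words so that the least significant one is handled first.
    reversed=""
    for g in $groups; do
        reversed="$g $reversed"
    done
    base=0      # CPU number of bit 0 in the current word
    list=""
    for g in $reversed; do
        word=$((0x$g))
        bit=0
        while [ "$bit" -lt 32 ]; do
            if [ $(( (word >> bit) & 1 )) -eq 1 ]; then
                list="${list:+$list,}$((base + bit))"
            fi
            bit=$((bit + 1))
        done
        base=$((base + 32))
    done
    echo "$list"
}

# Mask taken from the base-freq info output above.
decode_cpu_mask 00000216,00002160
```

For the mask above, the low word 0x2160 has bits 5, 6, 8 and 13 set, and the
high word 0x216 has bits 1, 2, 4 and 9 set, which map to CPUs 33, 34, 36 and 41.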

Enable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~

To enable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      base-freq
        enable:success
 package-1
  die-0
    cpu-14
      base-freq
        enable:success

In this case, the -a option is optional. If used, it not only enables Intel(R)
SST-BF, but also adjusts the priority of cores using Intel(R) Speed Select
Technology Core Power (Intel(R) SST-CP) features. This option sets the minimum
performance of each Intel(R) Speed Select Technology - Performance Profile
(Intel(R) SST-PP) class to maximum performance so that the hardware will give
the maximum performance possible for each CPU.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:

- Discover Intel(R) SST-BF and note low and high priority base frequency
- Note the high priority CPU list
- Enable CLOS using core-power feature set
- Configure CLOS parameters. Use CLOS.min to set to minimum performance
- Subscribe desired CPUs to CLOS groups
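
When subscribing CPUs to CLOS groups by hand, the CPUs outside
high-priority-cpu-list form the low priority group. The following is a minimal
sketch that derives that group, assuming a contiguous CPU range and a
comma-separated high priority list. The low_priority_cpus name and its
arguments are hypothetical, and the example covers only CPUs 0-13 of package 0:

```shell
# low_priority_cpus FIRST LAST HP_LIST: print the CPUs in FIRST..LAST
# that do not appear in the comma-separated HP_LIST.
low_priority_cpus() {
    first=$1
    last=$2
    hp=",$3,"       # pad with commas so that every entry matches as ",N,"
    cpu=$first
    out=""
    while [ "$cpu" -le "$last" ]; do
        case "$hp" in
            *",$cpu,"*) ;;                          # high priority, skip
            *) out="${out:+$out,}$cpu" ;;
        esac
        cpu=$((cpu + 1))
    done
    echo "$out"
}

# High priority CPUs of package 0 within the range 0-13, from the list above.
low_priority_cpus 0 13 5,6,8,13
```

The result is the list of CPUs that will be clipped to the low priority base
frequency once Intel(R) SST-BF is enabled.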

With this configuration, execute the same workload, pinning it to high priority
CPUs (CPU 5 and 6 in this case)::

 # taskset -c 5,6 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.627 [sec]
       5.627922 usecs/op
         177685 ops/sec

This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
improved (latency reduced) by about 7.8%. From the turbostat output, it can be
observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package	Core	CPU	Bzy_MHz
 0		0	0	2151
 0		1	1	2166
 0		2	2	2175
 0		3	3	2175
 0		4	4	2175
 0		5	5	3000
 0		6	6	3000
 0		7	7	2180
 0		8	8	2662
 0		9	9	2176
 0		10	10	2175
 0		11	11	2176
 0		12	12	2176
 0		13	13	2661
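
The latency reduction quoted above can be recomputed from the two usecs/op
values reported by perf bench. The following is a minimal sketch, assuming awk
is available. The latency_reduction_pct name is hypothetical:

```shell
# latency_reduction_pct BEFORE AFTER: percentage by which usecs/op dropped.
latency_reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (before - after) / before * 100 }'
}

# usecs/op on CPUs 3,4 (baseline) and on CPUs 5,6 (Intel(R) SST-BF enabled).
latency_reduction_pct 6.102445 5.627922
```

The same helper can be applied to any other pair of "sched pipe" runs, as long
as both values come from the same benchmark configuration.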

Disable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~~

To disable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq disable -a


Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
--------------------------------------------------------------------

This feature makes it possible to set different "All core turbo ratio limits"
for cores based on their priority. By using this feature, some cores can be
configured to get a higher turbo frequency by designating them as high priority,
at the cost of lower or no turbo frequency on the low priority cores.

For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance on
some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R)
SST-TF) depends on the Intel(R) Speed Select Technology - Performance Profile
(Intel(R) SST-PP) performance level configuration. It is possible that only a
certain performance level supports Intel(R) SST-TF. It is also possible that
only the base performance level (level = 0) supports Intel(R) SST-TF. Hence,
first select the desired performance level to enable this feature.

In the system under test here, Intel(R) SST-TF is supported at the base
performance level 0, but currently disabled::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        speed-select-turbo-freq:disabled
        ...
        ...


To check if performance can be improved using the Intel(R) SST-TF feature, get
the turbo frequency properties with Intel(R) SST-TF enabled and compare them to
the base turbo capability of this system.

Get Base turbo capability
~~~~~~~~~~~~~~~~~~~~~~~~~

To get the base turbo capability of performance level 0, execute::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100

Based on the data above, when all the CPUs are busy, a maximum frequency of 3100
MHz can be achieved. Run a busy workload (e.g. stress) on CPUs 0 - 11 and
execute the "hackbench pipe" workload on CPUs 12 and 13::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.705 [sec]
       5.705488 usecs/op
         175269 ops/sec

The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package	Core	CPU	Bzy_MHz
 0		0	0	3000
 0		1	1	3000
 0		2	2	3000
 0		3	3	3000
 0		4	4	3000
 0		5	5	3100
 0		6	6	3100
 0		7	7	3000
 0		8	8	3100
 0		9	9	3000
 0		10	10	3000
 0		11	11	3000
 0		12	12	3100
 0		13	13	3100

Based on the turbostat output, the performance is limited by the frequency cap
of 3100 MHz. To check if the hackbench performance can be improved for CPU 12
and CPU 13, first check the capability of the Intel(R) SST-TF feature for this
performance level.
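
The turbo-ratio-limits-sse buckets above can be read as a lookup table. The
following is a minimal sketch, assuming each bucket gives the limit that
applies while at most core-count cores are active. The max_turbo_mhz name and
the "core-count:MHz" pair format are made up for illustration:

```shell
# max_turbo_mhz ACTIVE_CORES "COUNT:MHZ ...": print the limit of the first
# bucket whose core-count covers the number of active cores.
max_turbo_mhz() {
    active=$1
    for bucket in $2; do
        count=${bucket%%:*}
        mhz=${bucket##*:}
        if [ "$active" -le "$count" ]; then
            echo "$mhz"
            return 0
        fi
    done
    return 1    # more active cores than the largest bucket describes
}

# Buckets copied from the perf-profile info output above.
SSE_BUCKETS="2:3200 4:3100 6:3100 8:3100 10:3100 12:3100 14:3100 16:3100"

max_turbo_mhz 2 "$SSE_BUCKETS"     # two active cores
max_turbo_mhz 14 "$SSE_BUCKETS"    # fourteen active cores, as in the test above
```

With fourteen busy cores the lookup lands in the 3100 MHz bucket, which matches
the cap seen in the turbostat output.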

Get Intel(R) SST-TF Capability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capability, the "turbo-freq info" command can be used::

 # intel-speed-select turbo-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-turbo-freq
          bucket-0
            high-priority-cores-count:2
            high-priority-max-frequency(MHz):3200
            high-priority-max-avx2-frequency(MHz):3200
            high-priority-max-avx512-frequency(MHz):3100
          bucket-1
            high-priority-cores-count:4
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          bucket-2
            high-priority-cores-count:6
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          speed-select-turbo-freq-clip-frequencies
            low-priority-max-frequency(MHz):2600
            low-priority-max-avx2-frequency(MHz):2400
            low-priority-max-avx512-frequency(MHz):2100

Based on the output above, there is an Intel(R) SST-TF bucket for which there
are two high priority cores. If only two high priority cores are set, then the
max. turbo frequency on those cores can be increased to 3200 MHz. This is 100
MHz more than the base turbo capability for all cores.

In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low priority
cores will be clipped to a lower frequency of 2600 MHz.
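
To compare buckets quickly, the "turbo-freq info" output can be reduced to one
line per bucket. The following is a minimal sketch, assuming the "name:value"
line format shown above and an available awk. The hp_buckets name is
hypothetical, and the sample input is an excerpt of the output above:

```shell
# hp_buckets: read "turbo-freq info" output on stdin and print
# "<high priority core count> <max frequency in MHz>" for each bucket.
# Only the non-AVX frequency is reported; the avx2/avx512 lines are ignored.
hp_buckets() {
    awk -F: '
        $1 ~ /high-priority-cores-count$/          { cores = $2 }
        $1 ~ /high-priority-max-frequency\(MHz\)$/ { print cores, $2 }
    '
}

hp_buckets <<'EOF'
          bucket-0
            high-priority-cores-count:2
            high-priority-max-frequency(MHz):3200
            high-priority-max-avx2-frequency(MHz):3200
          bucket-1
            high-priority-cores-count:4
            high-priority-max-frequency(MHz):3100
          speed-select-turbo-freq-clip-frequencies
            low-priority-max-frequency(MHz):2600
EOF
```

Only the bucket with two high priority cores exceeds the 3100 MHz all-core
limit, which is why two CPUs are chosen for the hackbench workload.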

Enable Intel(R) SST-TF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-TF, execute::

 # intel-speed-select -c 12,13 turbo-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
 package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
 package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success

In this case, the option "-a" is optional. If set, it enables the Intel(R)
SST-TF feature and also sets the CPUs to high and low priority using Intel(R)
Speed Select Technology Core Power (Intel(R) SST-CP) features. The CPUs passed
with the "-c" argument are marked as high priority, including their siblings.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-TF:

- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency

- Enable CLOS using core-power feature set

- Configure CLOS parameters

- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency

When executing the same hackbench workload, schedule the hackbench threads on
the high priority CPUs::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.510 [sec]
       5.510165 usecs/op
         180826 ops/sec

This improves performance by around 3.3% on a busy system. Here the turbostat
output shows that CPU 12 and CPU 13 get a 100 MHz boost.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package	Core	CPU	Bzy_MHz
 ...
 0		12	12	3200
 0		13	13	3200