.. SPDX-License-Identifier: GPL-2.0

============================================================
Intel(R) Speed Select Technology User Guide
============================================================

The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
collection of features that give more granular control over CPU performance.
With Intel(R) SST, one server can be configured for power and performance for a
variety of diverse workload requirements.

Refer to the links below for an overview of the technology:

- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf

These capabilities are further enhanced in some of the newer generations of
server platforms where these features can be enumerated and controlled
dynamically without pre-configuring via BIOS setup options. This dynamic
configuration is done via mailbox commands to the hardware. One way to enumerate
and configure these features is by using the Intel Speed Select utility.

This document explains how to use the Intel Speed Select tool to enumerate and
control Intel(R) SST features. This document gives example commands and explains
how these commands change the power and performance profile of the system under
test.
Using this tool as an example, customers can replicate the messaging
implemented in the tool in their production software.

intel-speed-select configuration tool
======================================

Linux distribution packages may include the "intel-speed-select" tool. If not,
it can be built by downloading the Linux kernel tree from kernel.org. Once
downloaded, the tool can be built without building the full kernel.

From the kernel tree, run the following commands::

 # cd tools/power/x86/intel-speed-select/
 # make
 # make install

Getting Help
------------

To get help with the tool, execute the command below::

 # intel-speed-select --help

The top-level help describes arguments and features. Note that the tool has a
multi-level help structure. For example, to get help for the feature
"perf-profile"::

 # intel-speed-select perf-profile --help

To get help on a command, another level of help is provided.
For example, for the command "info"::

 # intel-speed-select perf-profile info --help

Summary of platform capability
------------------------------

To check the current platform and driver capabilities, execute::

 # intel-speed-select --info

For example on a test system::

 # intel-speed-select --info
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Platform: API version : 1
 Platform: Driver version : 1
 Platform: mbox supported : 1
 Platform: mmio supported : 1
 Intel(R) SST-PP (feature perf-profile) is supported
 TDP level change control is unlocked, max level: 4
 Intel(R) SST-TF (feature turbo-freq) is supported
 Intel(R) SST-BF (feature base-freq) is not supported
 Intel(R) SST-CP (feature core-power) is supported

Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
------------------------------------------------------------------------

This feature allows configuration of a server dynamically based on workload
performance requirements. This helps users during deployment as they do not have
to choose a specific server configuration statically. This Intel(R) Speed Select
Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system.
Each profile
defines a set of CPUs that need to be online, with the rest offline, to sustain
a guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the user can
expect the base frequency to change dynamically. This feature is called
"perf-profile" when using the Intel Speed Select tool.

Number of performance levels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There can be multiple performance profiles on a system. To get the number of
profiles, execute the command below::

 # intel-speed-select perf-profile get-config-levels
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
        get-config-levels:4
 package-1
  die-0
    cpu-14
        get-config-levels:4

On this system under test, there are 4 performance profiles in addition to the
base performance profile (which is performance level 0).

Lock/Unlock status
~~~~~~~~~~~~~~~~~~

Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. A BIOS setup option may be available to unlock them;
otherwise, check with your system vendor.

To check if the system is locked, execute the following command::

 # intel-speed-select perf-profile get-lock-status
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
        get-lock-status:0
 package-1
  die-0
    cpu-14
        get-lock-status:0

In this case, lock status is 0, which means that the system is unlocked.

Properties of a performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the properties of a specific performance level (for example, level 0),
execute the command below::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        cpu-count:28
        enable-cpu-mask:000003ff,f0003fff
        enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
        thermal-design-power-ratio:26
        base-frequency(MHz):2600
        speed-select-turbo-freq:disabled
        speed-select-base-freq:disabled
        ...
        ...

Here the -l option is used to specify a performance level.

If the option -l is omitted, then this command will print information about all
the performance levels. The above command prints the properties of
performance level 0.

For this performance profile, at most the CPUs listed in
"enable-cpu-mask/enable-cpu-list" can be online. When that condition is met,
the base frequency of 2600 MHz can be maintained. To understand more, execute
"intel-speed-select perf-profile info" for performance level 4::

 # intel-speed-select perf-profile info -l 4
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-4
        cpu-count:28
        enable-cpu-mask:000000fa,f0000faf
        enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
        thermal-design-power-ratio:28
        base-frequency(MHz):2800
        speed-select-turbo-freq:disabled
        speed-select-base-freq:unsupported
        ...
        ...

There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
the user only keeps these CPUs online and the rest offline, then the base
frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
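
The relationship between "enable-cpu-mask" and "enable-cpu-list" in the sample
output can be checked with a short script. The following is a minimal Python
sketch, not part of the tool: the function name ``decode_cpu_mask`` is
hypothetical, and it assumes that each comma-separated group in the mask is a
32-bit hexadecimal word with the leftmost group covering the highest-numbered
CPUs, which is consistent with the level 0 and level 4 output above.

```python
def decode_cpu_mask(mask: str) -> list[int]:
    """Expand a comma-separated hex CPU mask into a sorted list of CPU numbers.

    Assumption: every group is one 32-bit word, most significant word first.
    """
    cpus = []
    for index, word in enumerate(reversed(mask.split(","))):
        bits = int(word, 16)
        # Bit N of word number `index` corresponds to CPU (index * 32 + N).
        cpus.extend(index * 32 + bit for bit in range(32) if (bits >> bit) & 1)
    return cpus


# Masks copied from the sample "perf-profile info" output for levels 0 and 4.
level0 = decode_cpu_mask("000003ff,f0003fff")
level4 = decode_cpu_mask("000000fa,f0000faf")

# CPUs that are allowed online at level 0 but must go offline for level 4.
to_offline = sorted(set(level0) - set(level4))
print("level 0 CPUs:", level0)
print("level 4 CPUs:", level4)
print("offline for level 4:", to_offline)
```

On the sample data, the decoded lists agree with the "enable-cpu-list" fields,
and the difference between the two sets is the group of CPUs that the "-o"
option of "set-config-level" would be expected to take offline.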

Get current performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the current performance level, execute::

 # intel-speed-select perf-profile get-config-current-level
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
        get-config-current_level:0

First verify that the base_frequency displayed by the cpufreq sysfs is correct::

 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
 2600000

This matches the base-frequency (MHz) field value displayed from the
"perf-profile info" command for performance level 0 (the cpufreq frequency is
in kHz).

To check if the average frequency is equal to the base frequency for a 100% busy
workload, disable turbo::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Then run a busy workload on all CPUs, for example::

 # stress -c 64

To verify the base frequency, run turbostat::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1

 Package  Core  CPU  Bzy_MHz
          -     -    2600
 0        0     0    2600
 0        1     1    2600
 0        2     2    2600
 0        3     3    2600
 0        4     4    2600
 .        .     .    .
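
The unit difference between the two sources is easy to get wrong when the check
is automated. The following minimal Python sketch compares the cpufreq value
(kHz) with the "base-frequency(MHz)" field. The helper names are hypothetical;
only the sysfs path is taken from the text above, and the file may be absent on
systems that do not expose it.

```python
from pathlib import Path

# Path taken from the example above.
BASE_FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/base_frequency")


def sysfs_khz_to_mhz(text: str) -> int:
    """Convert the content of the sysfs file (kHz, decimal) to MHz."""
    return int(text.strip()) // 1000


def matches_profile(sysfs_text: str, profile_mhz: int) -> bool:
    """True if the cpufreq base frequency equals the perf-profile value."""
    return sysfs_khz_to_mhz(sysfs_text) == profile_mhz


def read_base_frequency_mhz(path: Path = BASE_FREQ_PATH) -> int:
    """Read the live value; call this only on a system that has the file."""
    return sysfs_khz_to_mhz(path.read_text())


# Values from the sample session: 2600000 kHz versus 2600 MHz at level 0.
print(matches_profile("2600000\n", 2600))
```

The sample values are passed in as strings so that the comparison can be tried
without access to the hardware.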

Changing performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To change the performance level to 4, execute::

 # intel-speed-select -d perf-profile set-config-level -l 4 -o
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile
        set_tdp_level:success

In the command above, "-o" is optional. If it is specified, then it will also
offline CPUs which are not present in the enable_cpu_mask for this performance
level.

Now if the base_frequency is checked::

 # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
 2800000

This shows that the base frequency has increased from 2600 MHz at performance
level 0 to 2800 MHz at performance level 4. As a result, any workload that can
use fewer CPUs can see a boost of 200 MHz compared to performance level 0.

Check presence of other Intel(R) SST features
---------------------------------------------

Each of the performance profiles also specifies whether there is support for
the other two Intel(R) SST features (Intel(R) Speed Select Technology - Base
Frequency (Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo
Frequency (Intel(R) SST-TF)).

For example, from the output of "perf-profile info" above, for level 0 and level
4:

For level 0::

        speed-select-turbo-freq:disabled
        speed-select-base-freq:disabled

For level 4::

        speed-select-turbo-freq:disabled
        speed-select-base-freq:unsupported

Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
changed from "disabled" to "unsupported" compared to performance level 0.

This means that at performance level 4, the "speed-select-base-freq" feature is
not supported. However, at performance level 0, this feature is "supported", but
currently "disabled", meaning the user has not activated this feature. In
contrast, "speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both
performance levels, but currently not activated by the user.

The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
is supported on a platform.

Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
---------------------------------------------------------------

Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
allows users to define per core priority.
It defines a mechanism to distribute
power among cores in a power-constrained scenario, using a class of service
(CLOS) configuration.

The user can configure up to 4 class of service configurations. Each CLOS group
configuration allows the definition of parameters that affect how the frequency
can be limited and how power is distributed. Each CPU core can be tied to a
class of service and hence an associated priority. The granularity is at the
core level, not at the per-CPU level.

Enable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To use the CLOS based prioritization feature, firmware must be informed to
enable and use a priority type. There is a default per platform priority type,
which can be changed with an optional command line parameter.

To enable and check the options, execute::

 # intel-speed-select core-power enable --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Enable core-power for a package/die
        Clos Enable: Specify priority type with [--priority|-p]
                 0: Proportional, 1: Ordered

There are two priority types:

- Ordered

Priority for ordered throttling is defined based on the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).

Priority order is:
CLOS0 > CLOS1 > CLOS2 > CLOS3.

- Proportional

When proportional priority is used, there is an additional parameter called
frequency_weight, which can be specified per CLOS group. The goal of
proportional priority is to provide each core with the requested minimum, then
distribute all remaining (excess/deficit) budgets in proportion to a defined
weight. This proportional priority can be configured using the "core-power
config" command.

To enable with the platform default priority type, execute::

 # intel-speed-select core-power enable
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      core-power
        enable:success
 package-1
  die-0
    cpu-6
      core-power
        enable:success

This enable is scoped per package, or per die when a package contains
multiple dies. To check whether CLOS is enabled and to get the priority type,
the "core-power info" command can be used.
For example, to check the status of the core-power feature
on CPU 0, execute::

 # intel-speed-select -c 0 core-power info
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      core-power
        support-status:supported
        enable-status:enabled
        clos-enable-status:enabled
        priority-type:proportional
 package-1
  die-0
    cpu-24
      core-power
        support-status:supported
        enable-status:enabled
        clos-enable-status:enabled
        priority-type:proportional

Configuring CLOS groups
~~~~~~~~~~~~~~~~~~~~~~~

Each CLOS group has its own attributes including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config"
command. Defaults are used for any parameter the user does not set, except the
clos id, which is mandatory.
To check core-power config options, execute::

 # intel-speed-select core-power config --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Set core-power configuration for one of the four clos ids
        Specify targeted clos id with [--clos|-c]
        Specify clos Proportional Priority [--weight|-w]
        Specify clos min in MHz with [--min|-n]
        Specify clos max in MHz with [--max|-m]

For example::

 # intel-speed-select core-power config -c 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 clos epp is not specified, default: 0
 clos frequency weight is not specified, default: 0
 clos min is not specified, default: 0 MHz
 clos max is not specified, default: 25500 MHz
 clos desired is not specified, default: 0
 package-0
  die-0
    cpu-0
      core-power
        config:success
 package-1
  die-0
    cpu-6
      core-power
        config:success

The user has the option to change the defaults. For example, the user can set
"min" to the base frequency so that the guaranteed base frequency is always
available.

Get the current CLOS configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To check the current configuration, "core-power get-config" can be used.
For
example, to get the configuration of CLOS 0::

 # intel-speed-select core-power get-config -c 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      core-power
        clos:0
        epp:0
        clos-proportional-priority:0
        clos-min:0 MHz
        clos-max:Max Turbo frequency
        clos-desired:0 MHz
 package-1
  die-0
    cpu-24
      core-power
        clos:0
        epp:0
        clos-proportional-priority:0
        clos-min:0 MHz
        clos-max:Max Turbo frequency
        clos-desired:0 MHz

Associating a CPU with a CLOS group
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To associate a CPU with a CLOS group, the "core-power assoc" command can be
used::

 # intel-speed-select core-power assoc --help
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 Associate a clos id to a CPU
        Specify targeted clos id with [--clos|-c]


For example, to associate CPU 10 with CLOS group 3, execute::

 # intel-speed-select -c 10 core-power assoc -c 3
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-10
      core-power
        assoc:success

Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing the Linux "cpufreq" subsystem
scaling frequency limits.

To check the existing association for a CPU, the "core-power get-assoc" command
can be used. For example, to get the association of CPU 10, execute::

 # intel-speed-select -c 10 core-power get-assoc
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-1
  die-0
    cpu-10
      get-assoc
        clos:3

This shows that CPU 10 is part of CLOS group 3.


Disable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To disable, execute::

 # intel-speed-select core-power disable

Some features like Intel(R) SST-TF can only be enabled when CLOS based
prioritization is enabled. Disabling CLOS based prioritization while Intel(R)
SST-TF is enabled can therefore cause Intel(R) SST-TF to fail, so the "disable"
command displays an error if Intel(R) SST-TF is already enabled. To disable
CLOS based prioritization, the Intel(R) SST-TF feature must be disabled first.
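
The enable, configure and associate steps in this section can be scripted. The
following minimal Python sketch only builds the command lines; it does not run
them. The helper names are hypothetical. The options used
(``--priority``, ``--clos``, ``--weight``, ``--min``, ``--max`` and the global
``-c`` CPU selector) are the ones shown in the help output and examples above,
and the 0 to 3 range check assumes the four CLOS groups described in this
section.

```python
import shlex

TOOL = "intel-speed-select"


def _check_clos(clos_id: int) -> str:
    # The text describes four CLOS groups: CLOS0 to CLOS3.
    if clos_id not in range(4):
        raise ValueError(f"clos id must be 0..3, got {clos_id}")
    return str(clos_id)


def clos_enable_cmd(priority=None):
    """core-power enable; priority 0 is proportional, 1 is ordered."""
    cmd = [TOOL, "core-power", "enable"]
    if priority is not None:
        cmd += ["--priority", str(priority)]
    return cmd


def clos_config_cmd(clos_id, weight=None, min_mhz=None, max_mhz=None):
    """core-power config; unset parameters are left to the tool's defaults."""
    cmd = [TOOL, "core-power", "config", "--clos", _check_clos(clos_id)]
    if weight is not None:
        cmd += ["--weight", str(weight)]
    if min_mhz is not None:
        cmd += ["--min", str(min_mhz)]
    if max_mhz is not None:
        cmd += ["--max", str(max_mhz)]
    return cmd


def clos_assoc_cmd(cpu, clos_id):
    """core-power assoc for one CPU (its siblings follow, per the text)."""
    return [TOOL, "-c", str(cpu), "core-power", "assoc", "--clos", _check_clos(clos_id)]


# Example plan: enable with the platform default priority type, give CLOS 0
# a 2600 MHz minimum, and associate CPU 10 with CLOS 3 as in the example.
for cmd in (clos_enable_cmd(), clos_config_cmd(0, min_mhz=2600), clos_assoc_cmd(10, 3)):
    print(shlex.join(cmd))
```

Each list can be passed to ``subprocess.run`` on a system where the tool is
installed and the commands are run with sufficient privileges. Printing the
commands first makes it possible to review the plan before any configuration
changes.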

Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
-------------------------------------------------------------------

The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
the user control base frequency. If some critical workload threads demand
constant high guaranteed performance, then this feature can be used to execute
those threads at a higher base frequency on specific sets of CPUs (high
priority CPUs) at the cost of a lower base frequency on the other CPUs (low
priority CPUs). This feature does not require the low priority CPUs to be taken
offline.

The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports Intel(R)
SST-BF. Consequently, first select the desired performance level to enable this
feature.

In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but currently disabled. For example, for level 0::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        speed-select-base-freq:disabled
        ...

Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, run a workload and measure its performance to get a baseline to
compare against.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.


Measure baseline performance for comparison
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To compare, pick a multi-threaded workload where each thread can be scheduled on
separate CPUs. The "Hackbench pipe" test is a good example of how to improve
performance using Intel(R) SST-BF.

Below, the workload is measuring average scheduler wakeup latency, so a lower
number means better performance::

 # taskset -c 3,4 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
      Total time: 6.102 [sec]
        6.102445 usecs/op
          163868 ops/sec

If turbostat output is captured while the above test is running, it shows that
2 of the CPUs are busy and reaching the maximum frequency (which is the base
frequency, as turbo is disabled). The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    1000
 0        1     1    1005
 0        2     2    1000
 0        3     3    2600
 0        4     4    2600
 0        5     5    1000
 0        6     6    1000
 0        7     7    1005
 0        8     8    1005
 0        9     9    1000
 0        10    10   1000
 0        11    11   995
 0        12    12   1000
 0        13    13   1000

From the above turbostat output, both CPU 3 and CPU 4 are very busy and reach
the full guaranteed frequency of 2600 MHz.
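
Picking the busy CPUs out of the turbostat table can be automated. The
following minimal Python sketch is illustrative only: the function name
``cpus_at_or_above`` is hypothetical, and it assumes the four columns selected
with ``--show Package,Core,CPU,Bzy_MHz`` in the command above, separated by
whitespace.

```python
def cpus_at_or_above(turbostat_text: str, threshold_mhz: int) -> list[int]:
    """Return CPU numbers whose Bzy_MHz is at least threshold_mhz.

    Assumes rows of the form: Package Core CPU Bzy_MHz.
    """
    busy = []
    for line in turbostat_text.splitlines():
        fields = line.split()
        # Skip the header, summary rows and anything that is not 4 integers.
        if len(fields) != 4 or not all(field.isdigit() for field in fields):
            continue
        _package, _core, cpu, bzy_mhz = map(int, fields)
        if bzy_mhz >= threshold_mhz:
            busy.append(cpu)
    return busy


# Rows copied from the baseline turbostat output (turbo disabled).
SAMPLE = """\
Package Core CPU Bzy_MHz
0 0 0 1000
0 1 1 1005
0 2 2 1000
0 3 3 2600
0 4 4 2600
0 5 5 1000
0 6 6 1000
0 7 7 1005
0 8 8 1005
0 9 9 1000
0 10 10 1000
0 11 11 995
0 12 12 1000
0 13 13 1000
"""

print(cpus_at_or_above(SAMPLE, 2600))
```

With the 2600 MHz base frequency as the threshold, only the two CPUs used by
"taskset -c 3,4" are selected from the sample rows.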

Intel(R) SST-BF Capabilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get capabilities of Intel(R) SST-BF for the current performance level 0,
execute::

 # intel-speed-select base-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-base-freq
        high-priority-base-frequency(MHz):3000
        high-priority-cpu-mask:00000216,00002160
        high-priority-cpu-list:5,6,8,13,33,34,36,41
        low-priority-base-frequency(MHz):2400
        tjunction-temperature(C):125
        thermal-design-power(W):205

The above capabilities show that some CPUs on this system can offer a base
frequency of 3000 MHz, compared to the standard base frequency of 2600 MHz at
this performance level. These CPUs are fixed, and they are presented via
high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF feature
is selected, the low priority CPUs (those not in high-priority-cpu-list) can
only offer up to 2400 MHz. If this clipping of the low priority CPUs is
acceptable, then the user can enable the Intel(R) SST-BF feature. It suits the
above "sched pipe" workload in particular: since only two CPUs are used, they
can be scheduled on high priority CPUs and get a boost of 400 MHz.
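
The trade-off described above can be worked out from the "base-freq info"
fields. The following minimal Python sketch is an illustration, not part of the
tool. The helper names are hypothetical, the parser assumes the "key:value"
leaf format shown in the sample output, and it keeps only the last value seen
for each key, so it is meant for the output of a single package.

```python
def parse_leaf_values(output: str) -> dict[str, str]:
    """Collect "key:value" lines from the tool's indented output."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value:
            values[key] = value.strip()
    return values


def base_freq_tradeoff(info: dict[str, str], standard_mhz: int) -> tuple[int, int]:
    """Return (gain on high priority CPUs, loss on low priority CPUs) in MHz."""
    high = int(info["high-priority-base-frequency(MHz)"])
    low = int(info["low-priority-base-frequency(MHz)"])
    return high - standard_mhz, standard_mhz - low


# Fields copied from the sample "base-freq info -l 0" output.
SAMPLE = """\
speed-select-base-freq
  high-priority-base-frequency(MHz):3000
  high-priority-cpu-mask:00000216,00002160
  high-priority-cpu-list:5,6,8,13,33,34,36,41
  low-priority-base-frequency(MHz):2400
"""

info = parse_leaf_values(SAMPLE)
high_cpus = [int(cpu) for cpu in info["high-priority-cpu-list"].split(",")]
gain, loss = base_freq_tradeoff(info, standard_mhz=2600)

# CPUs chosen for pinning must all be high priority CPUs to get the gain.
pinned = [5, 6]
print("gain:", gain, "MHz; loss:", loss, "MHz")
print("pinning ok:", all(cpu in high_cpus for cpu in pinned))
```

The 2600 MHz value passed as ``standard_mhz`` is the base frequency reported by
"perf-profile info -l 0" earlier in this document. The same check can be used
to validate a "taskset -c" CPU list before running a benchmark.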

Enable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~

To enable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      base-freq
        enable:success
 package-1
  die-0
    cpu-14
      base-freq
        enable:success

In this case, the -a option is optional. It not only enables Intel(R) SST-BF,
but also adjusts the priority of cores using Intel(R) Speed Select Technology
Core Power (Intel(R) SST-CP) features. This option sets the minimum performance
of each Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
class to maximum performance so that the hardware will give the maximum
performance possible for each CPU.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:

- Discover Intel(R) SST-BF and note the low and high priority base frequencies
- Note the high priority CPU list
- Enable CLOS using the core-power feature set
- Configure CLOS parameters. Use CLOS.min to set the minimum performance
- Subscribe desired CPUs to CLOS groups

With this configuration, the same workload can be executed pinned to high
priority CPUs (CPU 5 and CPU 6 in this case)::

 # taskset -c 5,6 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
       Total time: 5.627 [sec]
       5.627922 usecs/op
       177685 ops/sec

This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
improved (latency reduced) by about 7.8%. From the turbostat output, it can be
observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package    Core    CPU     Bzy_MHz
 0          0       0       2151
 0          1       1       2166
 0          2       2       2175
 0          3       3       2175
 0          4       4       2175
 0          5       5       3000
 0          6       6       3000
 0          7       7       2180
 0          8       8       2662
 0          9       9       2176
 0          10      10      2175
 0          11      11      2176
 0          12      12      2176
 0          13      13      2661

Disable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~~

To disable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq disable -a


Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
--------------------------------------------------------------------

This feature makes it possible to set different "All core turbo ratio limits"
for cores based on priority. By using this feature, some cores can be
configured to get a higher turbo frequency by designating them as high
priority, at the cost of lower or no turbo frequency on the low priority cores.

For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance
on some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R)
SST-TF) depends on the Intel(R) Speed Select Technology - Performance Profile
(Intel(R) SST-PP) performance level configuration. It is possible that only a
certain performance level supports Intel(R) SST-TF. It is also possible that
only the base performance level (level = 0) supports Intel(R) SST-TF. Hence,
first select the desired performance level to enable this feature.

In the system under test here, Intel(R) SST-TF is supported at the base
performance level 0, but currently disabled::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        speed-select-turbo-freq:disabled
        ...
        ...


To check if performance can be improved using the Intel(R) SST-TF feature, get
the turbo frequency properties with Intel(R) SST-TF enabled and compare them to
the base turbo capability of this system.

Get Base turbo capability
~~~~~~~~~~~~~~~~~~~~~~~~~

To get the base turbo capability of performance level 0, execute::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100

Based on the data above, when all the CPUs are busy, a maximum frequency of
3100 MHz can be achieved. Suppose there is some busy workload on CPUs 0-11
(e.g. stress), and the "hackbench pipe" workload (run here through "perf bench
sched pipe") is executed on CPU 12 and CPU 13::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
       Total time: 5.705 [sec]
       5.705488 usecs/op
       175269 ops/sec

The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package    Core    CPU     Bzy_MHz
 0          0       0       3000
 0          1       1       3000
 0          2       2       3000
 0          3       3       3000
 0          4       4       3000
 0          5       5       3100
 0          6       6       3100
 0          7       7       3000
 0          8       8       3100
 0          9       9       3000
 0          10      10      3000
 0          11      11      3000
 0          12      12      3100
 0          13      13      3100

Based on the turbostat output, the performance is limited by the frequency cap
of 3100 MHz. To check if the hackbench performance can be improved for CPU 12
and CPU 13, first check the capability of the Intel(R) SST-TF feature for this
performance level.
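
The way the base turbo buckets translate into a frequency cap can be sketched
as a lookup. This is only an illustration: it assumes that each bucket's
core-count is the largest number of active cores for which that bucket's
frequency applies, and the names ``BASE_TURBO_BUCKETS`` and ``max_turbo_mhz``
are hypothetical.

```python
# (core-count, max-turbo-frequency in MHz) pairs copied from the
# turbo-ratio-limits-sse section of the "perf-profile info" output.
BASE_TURBO_BUCKETS = [
    (2, 3200), (4, 3100), (6, 3100), (8, 3100),
    (10, 3100), (12, 3100), (14, 3100), (16, 3100),
]


def max_turbo_mhz(active_cores, buckets=BASE_TURBO_BUCKETS):
    """Return the turbo limit of the first bucket that covers active_cores.

    Assumption: a bucket covers any active core count up to its core-count.
    """
    for core_count, mhz in sorted(buckets):
        if active_cores <= core_count:
            return mhz
    raise ValueError("no bucket covers %d active cores" % active_cores)


# Two active cores versus the 14 busy CPUs of the test above.
print(max_turbo_mhz(2), max_turbo_mhz(14))
```

Under that assumption, only a system with at most two active cores can reach
3200 MHz, which is why the busy system above is capped at 3100 MHz.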

Get Intel(R) SST-TF Capability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capability, the "turbo-freq info" command can be used::

 # intel-speed-select turbo-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-turbo-freq
          bucket-0
            high-priority-cores-count:2
            high-priority-max-frequency(MHz):3200
            high-priority-max-avx2-frequency(MHz):3200
            high-priority-max-avx512-frequency(MHz):3100
          bucket-1
            high-priority-cores-count:4
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          bucket-2
            high-priority-cores-count:6
            high-priority-max-frequency(MHz):3100
            high-priority-max-avx2-frequency(MHz):3000
            high-priority-max-avx512-frequency(MHz):2900
          speed-select-turbo-freq-clip-frequencies
            low-priority-max-frequency(MHz):2600
            low-priority-max-avx2-frequency(MHz):2400
            low-priority-max-avx512-frequency(MHz):2100

Based on the output above, there is an Intel(R) SST-TF bucket with two high
priority cores. If only two high priority cores are set, then the maximum turbo
frequency on those cores can be increased to 3200 MHz. This is 100 MHz more
than the base turbo capability for all cores.

In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low
priority cores will be clipped to a lower frequency of 2600 MHz.

Enable Intel(R) SST-TF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-TF, execute::

 # intel-speed-select -c 12,13 turbo-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
 package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
 package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success

In this case, the option "-a" is optional. If set, it enables the Intel(R)
SST-TF feature and also sets the CPUs to high and low priority using Intel(R)
Speed Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers
passed with the "-c" argument are marked as high priority, including their
siblings.
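
The trade-off read from the "turbo-freq info" output can be summarised with a
small sketch. It uses only the SSE frequencies reported for this system and
assumes that a bucket applies to any number of high priority cores up to its
high-priority-cores-count. The names ``TF_BUCKETS`` and ``tf_tradeoff`` are
hypothetical and are not part of the intel-speed-select tool.

```python
# (high-priority-cores-count, high-priority-max-frequency in MHz), SSE values
# copied from the "turbo-freq info" output.
TF_BUCKETS = [(2, 3200), (4, 3100), (6, 3100)]
LOW_PRIORITY_MAX_MHZ = 2600      # low-priority-max-frequency(MHz)
BASE_ALL_CORE_TURBO_MHZ = 3100   # base turbo limit when all cores are busy


def tf_tradeoff(high_priority_cores):
    """Return (high_priority_gain, low_priority_loss) in MHz, or None.

    Both values are relative to the base all-core turbo limit. None means
    that no bucket allows this many high priority cores.
    """
    for count, mhz in sorted(TF_BUCKETS):
        if high_priority_cores <= count:
            return (mhz - BASE_ALL_CORE_TURBO_MHZ,
                    BASE_ALL_CORE_TURBO_MHZ - LOW_PRIORITY_MAX_MHZ)
    return None


# Two high priority cores, as used for CPU 12 and CPU 13.
print(tf_tradeoff(2))
```

With these figures, only the two-core bucket raises the high priority limit
above the base all-core turbo limit, while the low priority clip applies
whichever bucket is used.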

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-TF:

- Discover Intel(R) SST-TF and note the buckets of high priority cores and
  their maximum frequency

- Enable CLOS using the core-power feature set

- Configure CLOS parameters

- Subscribe desired CPUs to CLOS groups, making sure that high priority cores
  are set to the maximum frequency

When the same hackbench workload is executed, schedule the hackbench threads on
high priority CPUs::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
       Total time: 5.510 [sec]
       5.510165 usecs/op
       180826 ops/sec

This improves performance by around 3.3% on a busy system. Here the turbostat
output shows that CPU 12 and CPU 13 are getting a 100 MHz boost.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package    Core    CPU     Bzy_MHz
 ...
 0          12      12      3200
 0          13      13      3200
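
The improvement figures quoted in this guide can be recomputed from the
"usecs/op" lines of the benchmark runs. The following is a minimal sketch; the
function name ``latency_reduction_pct`` is illustrative, and the inputs are the
values printed by the "sched pipe" runs in this document.

```python
def latency_reduction_pct(before_usecs, after_usecs):
    """Percentage by which the per-operation latency dropped."""
    return (before_usecs - after_usecs) / before_usecs * 100.0


# Intel(R) SST-BF: CPUs 3,4 at base frequency versus CPUs 5,6 at high priority.
bf_gain = latency_reduction_pct(6.102445, 5.627922)

# Intel(R) SST-TF: CPUs 12,13 before and after enabling the feature.
tf_gain = latency_reduction_pct(5.705488, 5.510165)

print("SST-BF latency reduction: %.2f%%" % bf_gain)
print("SST-TF latency reduction: %.2f%%" % tf_gain)
```

The same comparison can be made on the "ops/sec" lines instead, which gives
slightly different percentages because it measures throughput gain rather than
latency reduction.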