==========================================
ARM CPUs capacity bindings
==========================================

==========================================
1 - Introduction
==========================================

ARM systems may be configured to have cpus with different power/performance
characteristics within the same chip. In this case, additional information has
to be made available to the kernel for it to be aware of such differences and
take decisions accordingly.

==========================================
2 - CPU capacity definition
==========================================

CPU capacity is a number that provides the scheduler with information about CPU
heterogeneity. Such heterogeneity can come from micro-architectural differences
(e.g., ARM big.LITTLE systems) or from the maximum frequency at which CPUs can
run (e.g., SMP systems with multiple frequency domains). Heterogeneity in this
context is about differing performance characteristics; this binding tries to
capture a first-order approximation of the relative performance of CPUs.

CPU capacities are obtained by running a suitable benchmark. This binding makes
no guarantees on the validity or suitability of any particular benchmark; the
final capacity should, however, be:

* A "single-threaded" or CPU affine benchmark
* Divided by the running frequency of the CPU executing the benchmark
* Not subject to dynamic frequency scaling of the CPU

For the time being, however, we advise usage of the Dhrystone benchmark. The
above thus becomes:

CPU capacities are obtained by running the Dhrystone benchmark on each CPU at
max frequency (with caches enabled). The obtained DMIPS score is then divided
by the frequency (in MHz) at which the benchmark has been run, so that
DMIPS/MHz are obtained. Such values are then normalized w.r.t. the highest
score obtained in the system.

==========================================
3 - capacity-dmips-mhz
==========================================

capacity-dmips-mhz is an optional cpu node [1] property: a u32 value
representing CPU capacity expressed in normalized DMIPS/MHz. At boot time, the
maximum frequency available to the cpu is then used to calculate the capacity
value internally used by the kernel.

The capacity-dmips-mhz property is all-or-nothing: if it is specified for a cpu
node, it has to be specified for every other cpu node, or the system will
fall back to the default capacity value for every CPU. If cpufreq is not
available, final capacities are calculated by directly using the
capacity-dmips-mhz values (normalized w.r.t. the highest value found while
parsing the DT).

===========================================
4 - Examples
===========================================

Example 1 (ARM 64-bit, 6-cpu system, two clusters):
The capacity-dmips-mhz (DMIPS/MHz) values, scaled to 1024, are 1024 for
cluster0 and 578 for cluster1. Further normalization is done by the operating
system based on cluster0@max-freq=1100 and cluster1@max-freq=850; final
capacities are 1024 for cluster0 and 446 for cluster1 (578*850/1100).

cpus {
	#address-cells = <2>;
	#size-cells = <0>;

	cpu-map {
		cluster0 {
			core0 {
				cpu = <&A57_0>;
			};
			core1 {
				cpu = <&A57_1>;
			};
		};

		cluster1 {
			core0 {
				cpu = <&A53_0>;
			};
			core1 {
				cpu = <&A53_1>;
			};
			core2 {
				cpu = <&A53_2>;
			};
			core3 {
				cpu = <&A53_3>;
			};
		};
	};

	idle-states {
		entry-method = "psci";

		CPU_SLEEP_0: cpu-sleep-0 {
			compatible = "arm,idle-state";
			arm,psci-suspend-param = <0x0010000>;
			local-timer-stop;
			entry-latency-us = <100>;
			exit-latency-us = <250>;
			min-residency-us = <150>;
		};

		CLUSTER_SLEEP_0: cluster-sleep-0 {
			compatible = "arm,idle-state";
			arm,psci-suspend-param = <0x1010000>;
			local-timer-stop;
			entry-latency-us = <800>;
			exit-latency-us = <700>;
			min-residency-us = <2500>;
		};
	};

	A57_0: cpu@0 {
		compatible = "arm,cortex-a57";
		reg = <0x0 0x0>;
		device_type = "cpu";
		enable-method = "psci";
		next-level-cache = <&A57_L2>;
		clocks = <&scpi_dvfs 0>;
		cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
		capacity-dmips-mhz = <1024>;
	};

	A57_1: cpu@1 {
		compatible = "arm,cortex-a57";
		reg = <0x0 0x1>;
		device_type = "cpu";
		enable-method = "psci";
		next-level-cache = <&A57_L2>;
		clocks = <&scpi_dvfs 0>;
		cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
		capacity-dmips-mhz = <1024>;
	};

	A53_0: cpu@100 {
		compatible = "arm,cortex-a53";
		reg = <0x0 0x100>;
		device_type = "cpu";
		enable-method = "psci";
		next-level-cache = <&A53_L2>;
		clocks = <&scpi_dvfs 1>;
		cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
		capacity-dmips-mhz = <578>;
	};

	A53_1: cpu@101 {
		compatible = "arm,cortex-a53";
		reg = <0x0 0x101>;
		device_type = "cpu";
		enable-method = "psci";
		next-level-cache = <&A53_L2>;
		clocks = <&scpi_dvfs 1>;
		cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
		capacity-dmips-mhz = <578>;
	};

	A53_2: cpu@102 {
		compatible = "arm,cortex-a53";
		reg = <0x0 0x102>;
		device_type = "cpu";
		enable-method = "psci";
		next-level-cache = <&A53_L2>;
		clocks = <&scpi_dvfs 1>;
		cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
		capacity-dmips-mhz = <578>;
	};

	A53_3: cpu@103 {
		compatible = "arm,cortex-a53";
		reg = <0x0 0x103>;
		device_type = "cpu";
		enable-method = "psci";
		next-level-cache = <&A53_L2>;
		clocks = <&scpi_dvfs 1>;
		cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
		capacity-dmips-mhz = <578>;
	};

	A57_L2: l2-cache0 {
		compatible = "cache";
	};

	A53_L2: l2-cache1 {
		compatible = "cache";
	};
};

Example 2 (ARM 32-bit, 4-cpu system, two clusters,
	   cpus 0,1@1GHz, cpus 2,3@500MHz):
The capacity-dmips-mhz values are scaled w.r.t. 2 (cpu@0 and cpu@1); this means
that cpu@0 and cpu@1 are twice as fast as cpu@2 and cpu@3 (at the same
frequency).

cpus {
	#address-cells = <1>;
	#size-cells = <0>;

	cpu0: cpu@0 {
		device_type = "cpu";
		compatible = "arm,cortex-a15";
		reg = <0>;
		capacity-dmips-mhz = <2>;
	};

	cpu1: cpu@1 {
		device_type = "cpu";
		compatible = "arm,cortex-a15";
		reg = <1>;
		capacity-dmips-mhz = <2>;
	};

	cpu2: cpu@2 {
		device_type = "cpu";
		compatible = "arm,cortex-a15";
		reg = <0x100>;
		capacity-dmips-mhz = <1>;
	};

	cpu3: cpu@3 {
		device_type = "cpu";
		compatible = "arm,cortex-a15";
		reg = <0x101>;
		capacity-dmips-mhz = <1>;
	};
};

===========================================
5 - References
===========================================

[1] ARM Linux Kernel documentation - CPUs bindings
    Documentation/devicetree/bindings/arm/cpus.yaml
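
Note on the arithmetic in sections 3 and 4: the Python sketch below shows one
way the final capacities could be derived from capacity-dmips-mhz values and
per-cpu maximum frequencies. It is an illustration only, not the kernel's
implementation. The function name final_capacities, the dictionary-based
inputs and the use of truncating integer division are assumptions made here;
truncation was chosen because it reproduces the figure of 446 quoted in
Example 1.

```python
# Illustrative sketch only; names and rounding behaviour are assumptions.
CAPACITY_SCALE = 1024  # the highest-capacity cpu ends up at this value


def final_capacities(dmips_mhz, max_freq_mhz=None):
    """Return a dict mapping each cpu to its final capacity.

    dmips_mhz:    dict of cpu name -> capacity-dmips-mhz value
    max_freq_mhz: dict of cpu name -> maximum frequency in MHz, or None
                  when cpufreq is not available
    """
    if max_freq_mhz is None:
        # No cpufreq: use the capacity-dmips-mhz values directly.
        raw = dict(dmips_mhz)
    else:
        # Weight each DMIPS/MHz value by the cpu's maximum frequency.
        raw = {cpu: dmips_mhz[cpu] * max_freq_mhz[cpu] for cpu in dmips_mhz}

    # Normalize w.r.t. the highest value found in the system.
    highest = max(raw.values())
    return {cpu: value * CAPACITY_SCALE // highest for cpu, value in raw.items()}


# Example 1: one cpu per cluster is enough to show the result.
example1 = final_capacities({"A57_0": 1024, "A53_0": 578},
                            {"A57_0": 1100, "A53_0": 850})
# example1 == {"A57_0": 1024, "A53_0": 446}

# Example 2: cpus 0,1 at 1 GHz and cpus 2,3 at 500 MHz.
example2 = final_capacities({"cpu0": 2, "cpu2": 1},
                            {"cpu0": 1000, "cpu2": 500})
# example2 == {"cpu0": 1024, "cpu2": 256}
```

When no frequency table is passed (the "cpufreq is not available" case), the
same call on the Example 2 values yields 1024 and 512, since only the
capacity-dmips-mhz ratio is used.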