.. SPDX-License-Identifier: GPL-2.0

================
CPU Idle Cooling
================

Situation:
----------

Under certain circumstances a SoC can reach a critical temperature
limit and be unable to stabilize the temperature around a temperature
control point. When the SoC has to stabilize the temperature, the
kernel can act on a cooling device to mitigate the dissipated power.
When the critical temperature is reached, a decision must be taken to
reduce the temperature, which in turn impacts performance.

Another situation is when the silicon temperature continues to
increase even after the dynamic leakage is reduced to its minimum by
clock gating the component. This runaway phenomenon can continue due
to the static leakage. The only solution is to power down the
component, thus dropping both the dynamic and the static leakage,
which allows the component to cool down.

Last but not least, the system can ask for a specific power budget but,
because of the OPP density, we can only choose an OPP with a power
budget lower than the requested one and under-utilize the CPU, thus
losing performance. In other words, one OPP under-utilizes the CPU
with a power less than the requested power budget and the next OPP
exceeds the power budget. An intermediate OPP could have been used if
it were present.

Solutions:
----------

If we can remove the static and the dynamic leakage for a specific
duration in a controlled period, the SoC temperature will
decrease. By acting on the idle state duration or the idle cycle
injection period, we can mitigate the temperature by modulating the
power budget.

The Operating Performance Point (OPP) density has a great influence on
the control precision of cpufreq. However, OPP density varies widely
between vendors, and some have large power gaps between OPPs, which
results in a loss of performance during thermal control and a loss of
power in other scenarios.

At a specific OPP, we can assume that injecting idle cycles on all CPUs
belonging to the same cluster, with a duration greater than the cluster
idle state target residency, drops the static and the dynamic leakage
for this period (modulo the energy needed to enter this state). So the
sustainable power with idle cycles has a linear relation with the
OPP’s sustainable power and can be computed with a coefficient similar
to::

	    Power(IdleCycle) = Coef x Power(OPP)

Idle Injection:
---------------

The base concept of idle injection is to force the CPU to go to an
idle state for a specified time each control cycle. It provides
another way to control CPU power and heat in addition to
cpufreq.
Ideally, if all CPUs belonging to the same cluster inject
their idle cycles synchronously, the cluster can reach its power down
state with a minimum power consumption and reduce the static leakage
to almost zero. However, this idle cycle injection will add extra
latency as the CPUs will have to wake up from a deep sleep state.

We use a fixed duration of idle injection that gives an acceptable
performance penalty and a fixed latency. Mitigation can be increased
or decreased by modulating the duty cycle of the idle injection.

::

     ^
     |
     |
     |-------                         -------
     |_______|_______________________|_______|___________

     <------>
       idle  <---------------------->
                     running

     <----------------------------->
              duty cycle 25%


The implementation of the cooling device bases the number of states on
the duty cycle percentage. When no mitigation is happening the cooling
device state is zero, meaning the duty cycle is 0%.

When the mitigation begins, depending on the governor's policy, a
starting state is selected. With a fixed idle duration and the duty
cycle (aka the cooling device state), the running duration can be
computed.

The governor will change the cooling device state, and thus the duty
cycle, and this variation will modulate the cooling effect.
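
The computation of the running duration from a fixed idle duration and
a duty cycle can be sketched as follows. This is a minimal user-space
illustration, not the driver code: the function name
``running_duration_us``, the use of microseconds and the choice of
returning 0 for an out-of-range duty cycle are assumptions made for
this example.

```c
#include <stdint.h>

/*
 * Running time (in microseconds) that follows each injected idle
 * period so that the idle time represents duty_cycle_pct percent of
 * the whole period:
 *
 *   duty = Tidle / (Tidle + Trunning)
 *   =>  Trunning = (Tidle x 100 / duty) - Tidle
 *
 * Assumption of this sketch: a duty cycle of 0 (no mitigation) or
 * above 100 returns 0, meaning "nothing to compute".
 */
static uint64_t running_duration_us(uint64_t idle_duration_us,
                                    unsigned int duty_cycle_pct)
{
    if (duty_cycle_pct == 0 || duty_cycle_pct > 100)
        return 0;

    return (idle_duration_us * 100) / duty_cycle_pct - idle_duration_us;
}
```

With an idle duration of 8000 us, a duty cycle of 25% gives a running
duration of 24000 us and a duty cycle of 50% gives 8000 us. Integer
division rounds the result down for duty cycles such as 33%.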

::

     ^
     |
     |
     |-------                 -------
     |_______|_______________|_______|___________

     <------>
       idle  <-------------->
                 running

     <--------------------->
          duty cycle 33%


     ^
     |
     |
     |-------         -------
     |_______|_______|_______|___________

     <------>
       idle  <------>
             running

     <------------->
      duty cycle 50%

The idle injection duration value must comply with the constraints:

- It is less than or equal to the latency we tolerate when the
  mitigation begins. It is platform dependent and will depend on the
  trade-off we want between reactivity and performance for the user
  experience. This value should be specified.

- It is greater than the target residency of the idle state we want to
  reach for thermal mitigation, otherwise we end up consuming more
  energy.
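
The two constraints above can be expressed as a simple check. This is
only a sketch: the function name ``idle_duration_is_valid`` and its
parameters are hypothetical, and all values are assumed to be in
microseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check the two constraints on the idle injection duration:
 *  - it does not exceed the latency tolerated by the platform
 *  - it is strictly greater than the target residency of the idle
 *    state used for the mitigation
 * All names are illustrative; values are in microseconds.
 */
static bool idle_duration_is_valid(uint64_t idle_duration_us,
                                   uint64_t tolerated_latency_us,
                                   uint64_t target_residency_us)
{
    return idle_duration_us <= tolerated_latency_us &&
           idle_duration_us > target_residency_us;
}
```

For instance, with a tolerated latency of 20000 us and a target
residency of 3000 us, an idle duration of 10000 us is accepted while
2000 us and 30000 us are both rejected.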

Power considerations
--------------------

When we reach the thermal trip point, we have to sustain a specified
power for a specific temperature but at this time we consume::

 Power = Capacitance x Voltage^2 x Frequency x Utilisation

... which is more than the sustainable power (or there is something
wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are
fixed values; ‘Voltage’ and ‘Frequency’ are fixed artificially
because we don’t want to change the OPP. We can group the
‘Capacitance’ and the ‘Utilisation’ into a single term which is the
‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have::

 Pdyn = Cdyn x Voltage^2 x Frequency

The power allocator governor will ask us somehow to reduce our power
in order to target the sustainable power defined in the device
tree. So with the idle injection mechanism, we want an average power
(Ptarget) resulting from an amount of time running at full power on a
specific OPP and idling for another amount of time. That can be put in
an equation::

 P(opp)target = ((Trunning x P(opp)running) + (Tidle x P(opp)idle)) /
                        (Trunning + Tidle)

  ...

 Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)

The last form assumes that the power consumed while idle, P(opp)idle,
is negligible.

At this point if we know the running period for the CPU, that gives us
the idle injection we need.
Alternatively if we have the idle
injection duration, we can compute the running duration with::

 Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)

Practically, if the running power is less than the targeted power, we
end up with a negative time value, so the equation is only usable for
a power reduction: a higher OPP is needed to have the running power
greater than the targeted power.

However, in this demonstration we ignore three aspects:

 * The static leakage is not defined here. We can introduce it in the
   equation, but we assume it will be zero most of the time as it is
   difficult to get the values from the SoC vendors

 * The idle state wake up latency (or entry + exit latency) is not
   taken into account; it must be added in the equation in order to
   rigorously compute the idle injection

 * The injected idle duration must be greater than the idle state
   target residency, otherwise we end up consuming more energy and
   potentially invert the mitigation effect

So the final equation is::

 Trunning = (Tidle - Twakeup) /
                (((P(opp)dyn + P(opp)static) - P(opp)target) / P(opp)target)
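
As an illustration, the running time can be computed by solving the
averaged-power relation Ptarget x (Trunning + Teff) = Trunning x
(Pdyn + Pstatic) for Trunning, where Teff = Tidle - Twakeup and the
idle power is assumed to be zero. The sketch below is not taken from
any driver: the function name ``running_time_for_target``, the use of
floating point and the -1.0 error value are assumptions made for this
example.

```c
/*
 * Running time needed after each idle period to average p_target:
 *
 *   Trunning = (Tidle - Twakeup) x Ptarget /
 *              ((Pdyn + Pstatic) - Ptarget)
 *
 * Times share one unit (e.g. microseconds), powers share another
 * (e.g. milliwatts). Assumption of this sketch: -1.0 is returned when
 * no positive running time exists, i.e. when the running power does
 * not exceed the target or when the wake up latency consumes the
 * whole idle duration.
 */
static double running_time_for_target(double t_idle, double t_wakeup,
                                      double p_dyn, double p_static,
                                      double p_target)
{
    double p_running = p_dyn + p_static;
    double t_effective = t_idle - t_wakeup;

    if (p_target <= 0.0 || p_running <= p_target || t_effective <= 0.0)
        return -1.0;

    return t_effective * p_target / (p_running - p_target);
}
```

For example, with a running power three times the target power and no
wake up latency, the CPU has to be idle two thirds of the time, so the
running time is half the idle duration.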