.. SPDX-License-Identifier: GPL-2.0

=======================
Energy Model of devices
=======================

1. Overview
-----------

The Energy Model (EM) framework serves as an interface between drivers knowing
the power consumed by devices at various performance levels, and the kernel
subsystems willing to use that information to make energy-aware decisions.

The source of the information about the power consumed by devices can vary greatly
from one platform to another. These power costs can be estimated using
devicetree data in some cases. In others, the firmware will know better.
Alternatively, userspace might be best positioned. And so on. To avoid having
each client subsystem re-implement support for every possible source of
information on its own, the EM framework acts as an abstraction layer that
standardizes the format of power cost tables in the kernel, which avoids
redundant work.

The figure below depicts an example of drivers (Arm-specific here, but the
approach is applicable to any architecture) providing power costs to the EM
framework, and interested clients reading the data from it::

       +---------------+  +-----------------+  +---------------+
       | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     |
       +---------------+  +-----------------+  +---------------+
               |                   | em_cpu_energy()   |
               |                   | em_cpu_get()      |
               +---------+         |         +---------+
                         |         |         |
                         v         v         v
                        +---------------------+
                        |    Energy Model     |
                        |     Framework       |
                        +---------------------+
                           ^       ^       ^
                           |       |       | em_dev_register_perf_domain()
                +----------+       |       +---------+
                |                  |                 |
        +---------------+  +---------------+  +--------------+
        |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
        +---------------+  +---------------+  +--------------+
                ^                  ^                 ^
                |                  |                 |
        +--------------+   +---------------+  +--------------+
        | Device Tree  |   |   Firmware    |  |      ?       |
        +--------------+   +---------------+  +--------------+

In case of CPU devices the EM framework manages power cost tables per
'performance domain' in the system. A performance domain is a group of CPUs
whose performance is scaled together. Performance domains generally have a
1-to-1 mapping with CPUFreq policies. All CPUs in a performance domain are
required to have the same micro-architecture.
CPUs in different performance
domains can have different micro-architectures.


2. Core APIs
------------

2.1 Config options
^^^^^^^^^^^^^^^^^^

CONFIG_ENERGY_MODEL must be enabled to use the EM framework.


2.2 Registration of performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Drivers are expected to register performance domains into the EM framework by
calling the following API::

  int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
		struct em_data_callback *cb, cpumask_t *cpus);

Drivers must provide a callback function returning <frequency, power> tuples
for each performance state. The callback function provided by the driver is free
to fetch data from any relevant location (DT, firmware, ...), and by any means
deemed necessary. For CPU devices only, drivers must specify the CPUs of the
performance domain using a cpumask. For devices other than CPUs, the last
argument must be set to NULL.
See Section 3. for an example of a driver implementing this
callback, and kernel/power/energy_model.c for further documentation on this
API.
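
The callback contract described above can be illustrated outside the kernel.
The following is a minimal userspace sketch, not kernel code. It assumes a
hypothetical table builder, demo_build_table(), that queries a driver-style
callback once per performance state, each time asking for the next frequency
above the previous one, and stores the returned <frequency, power> tuples. The
names demo_state, demo_cb_t, demo_active_power() and the OPP values are all
invented for illustration.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for one <frequency, power> tuple. */
struct demo_state {
	unsigned long khz;	/* frequency in kHz */
	unsigned long mw;	/* power in mW */
};

/* Driver-style callback: ceil *khz to a supported frequency, report its power. */
typedef int (*demo_cb_t)(unsigned long *mw, unsigned long *khz);

/* Invented OPP data, sorted by ascending frequency. */
static const struct demo_state demo_opps[] = {
	{ 500000, 100 }, { 1000000, 250 }, { 1500000, 600 },
};

static int demo_active_power(unsigned long *mw, unsigned long *khz)
{
	size_t i;

	for (i = 0; i < sizeof(demo_opps) / sizeof(demo_opps[0]); i++) {
		if (demo_opps[i].khz >= *khz) {
			*khz = demo_opps[i].khz;
			*mw = demo_opps[i].mw;
			return 0;
		}
	}
	return -1;	/* no supported frequency at or above the request */
}

/*
 * Fill 'table' with 'nr_states' tuples by querying the callback once per
 * state. Returns 0 on success, -1 if the callback fails or the reported
 * frequencies are not strictly increasing.
 */
static int demo_build_table(struct demo_state *table, size_t nr_states,
			    demo_cb_t cb)
{
	unsigned long khz = 0, prev_khz = 0, mw = 0;
	size_t i;

	for (i = 0; i < nr_states; i++) {
		khz++;	/* request the next frequency above the previous one */
		if (cb(&mw, &khz) || khz <= prev_khz)
			return -1;
		table[i].khz = khz;
		table[i].mw = mw;
		prev_khz = khz;
	}
	return 0;
}
```

With the invented data above, demo_build_table(table, 3, demo_active_power)
fills three states, while asking for a fourth state fails because the callback
has no higher frequency to report.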


2.3 Accessing performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Two API functions provide access to the energy model: em_cpu_get(), which
takes a CPU id as an argument, and em_pd_get(), which takes a device pointer.
Each subsystem chooses the interface that suits it, but for CPU devices both
functions return the same performance domain.

Subsystems interested in the energy model of a CPU can retrieve it using the
em_cpu_get() API. The energy model tables are allocated once upon creation of
the performance domains, and kept in memory untouched.

The energy consumed by a performance domain can be estimated using the
em_cpu_energy() API. For CPU devices, the estimation assumes that the schedutil
CPUfreq governor is in use. Currently this calculation is not provided for
other types of devices.

More details about the above APIs can be found in include/linux/energy_model.h.


3. Example driver
-----------------

This section provides a simple example of a CPUFreq driver registering a
performance domain in the Energy Model framework using the (fake) 'foo'
protocol.
The driver implements an est_power() function to be provided to the
EM framework::

  -> drivers/cpufreq/foo_cpufreq.c

  static int est_power(unsigned long *mW, unsigned long *KHz,
  		struct device *dev)
  {
  	long freq, power;

  	/* Use the 'foo' protocol to ceil the frequency */
  	freq = foo_get_freq_ceil(dev, *KHz);
  	if (freq < 0)
  		return freq;

  	/* Estimate the power cost for the dev at the relevant freq. */
  	power = foo_estimate_power(dev, freq);
  	if (power < 0)
  		return power;

  	/* Return the values to the EM framework */
  	*mW = power;
  	*KHz = freq;

  	return 0;
  }

  static int foo_cpufreq_init(struct cpufreq_policy *policy)
  {
  	struct em_data_callback em_cb = EM_DATA_CB(est_power);
  	struct device *cpu_dev;
  	int nr_opp, ret;

  	cpu_dev = get_cpu_device(cpumask_first(policy->cpus));

  	/* Do the actual CPUFreq init work ... */
  	ret = do_foo_cpufreq_init(policy);
  	if (ret)
  		return ret;

  	/* Find the number of OPPs for this policy */
  	nr_opp = foo_get_nr_opp(policy);

  	/* And register the new performance domain */
  	em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);

  	return 0;
  }
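
To complement the driver side, the client side described in Section 2.3 can
also be sketched in plain userspace C. This is a simplified illustration under
stated assumptions, not the kernel implementation of em_cpu_energy(). It
assumes that the governor requests a frequency proportional to the highest
utilization in the domain plus 25% headroom, that the lowest performance state
at or above that frequency is selected, and that the cost of that state is
scaled by the summed utilization of the domain. The names demo_ps and
demo_energy() and any table values passed to them are hypothetical.

```c
#include <stddef.h>

/* Hypothetical performance state: frequency in kHz, power in mW. */
struct demo_ps {
	unsigned long khz;
	unsigned long mw;
};

/*
 * Return an energy estimate for a domain under the assumptions stated above.
 * 'table' is sorted by ascending frequency and has at least one entry.
 * 'max_util' is the highest CPU utilization in the domain and 'sum_util' the
 * sum over its CPUs, both expressed on a 0..'scale' capacity scale.
 */
static unsigned long demo_energy(const struct demo_ps *table, size_t nr_states,
				 unsigned long max_util, unsigned long sum_util,
				 unsigned long scale)
{
	unsigned long fmax = table[nr_states - 1].khz;
	/* Assumed governor policy: target = 1.25 * fmax * max_util / scale */
	unsigned long target = (fmax + (fmax >> 2)) * max_util / scale;
	const struct demo_ps *ps = &table[nr_states - 1];
	size_t i;

	/* Pick the lowest state that satisfies the target frequency. */
	for (i = 0; i < nr_states; i++) {
		if (table[i].khz >= target) {
			ps = &table[i];
			break;
		}
	}

	/*
	 * Running at ps->khz stretches the busy time by fmax / ps->khz compared
	 * with running at fmax, so scale the state's power accordingly before
	 * weighting it by the summed utilization.
	 */
	return ps->mw * fmax / ps->khz * sum_util / scale;
}
```

In this sketch a lightly loaded domain maps to a low performance state, so the
estimate depends both on how busy the CPUs are and on which state the assumed
governor policy would select.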