==================================================
Runtime Power Management Framework for I/O Devices
==================================================

(C) 2009-2011 Rafael J. Wysocki <rjw@sisk.pl>, Novell Inc.

(C) 2010 Alan Stern <stern@rowland.harvard.edu>

(C) 2014 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>

1. Introduction
===============

Support for runtime power management (runtime PM) of I/O devices is provided
at the power management core (PM core) level by means of:

* The power management workqueue pm_wq in which bus types and device drivers can
  put their PM-related work items. It is strongly recommended that pm_wq be
  used for queuing all work items related to runtime PM, because this allows
  them to be synchronized with system-wide power transitions (suspend to RAM,
  hibernation and resume from system sleep states). pm_wq is declared in
  include/linux/pm_runtime.h and defined in kernel/power/main.c.

* A number of runtime PM fields in the 'power' member of 'struct device' (which
  is of the type 'struct dev_pm_info', defined in include/linux/pm.h) that can
  be used for synchronizing runtime PM operations with one another.

* Three device runtime PM callbacks in 'struct dev_pm_ops' (defined in
  include/linux/pm.h).

* A set of helper functions defined in drivers/base/power/runtime.c that can be
  used for carrying out runtime PM operations in such a way that the
  synchronization between them is taken care of by the PM core. Bus types and
  device drivers are encouraged to use these functions.

The runtime PM callbacks present in 'struct dev_pm_ops', the device runtime PM
fields of 'struct dev_pm_info' and the core helper functions provided for
runtime PM are described below.

2. Device Runtime PM Callbacks
==============================

There are three device runtime PM callbacks defined in 'struct dev_pm_ops'::

  struct dev_pm_ops {
          ...
          int (*runtime_suspend)(struct device *dev);
          int (*runtime_resume)(struct device *dev);
          int (*runtime_idle)(struct device *dev);
          ...
  };

The ->runtime_suspend(), ->runtime_resume() and ->runtime_idle() callbacks
are executed by the PM core for the device's subsystem, which may be any of
the following:

  1. PM domain of the device, if the device's PM domain object, dev->pm_domain,
     is present.

  2. Device type of the device, if both dev->type and dev->type->pm are present.

  3. Device class of the device, if both dev->class and dev->class->pm are
     present.

  4. Bus type of the device, if both dev->bus and dev->bus->pm are present.

If the subsystem chosen by applying the above rules doesn't provide the relevant
callback, the PM core will invoke the corresponding driver callback stored in
dev->driver->pm directly (if present).

The PM core always checks which callback to use in the order given above, so the
priority order of callbacks from high to low is: PM domain, device type, class
and bus type. Moreover, a higher-priority callback always takes precedence over
a lower-priority one. The PM domain, bus type, device type and class callbacks
are referred to as subsystem-level callbacks in what follows.

By default, the callbacks are always invoked in process context with interrupts
enabled. However, the pm_runtime_irq_safe() helper function can be used to tell
the PM core that it is safe to run the ->runtime_suspend(), ->runtime_resume()
and ->runtime_idle() callbacks for the given device in atomic context with
interrupts disabled. This implies that the callback routines in question must
not block or sleep, but it also means that the synchronous helper functions
listed at the end of Section 4 may be used for that device within an interrupt
handler or generally in an atomic context.
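
The callback lookup order described earlier in this section can be sketched in
plain C. What follows is a minimal user-space model under stated assumptions,
not the PM core's implementation: the structures are cut-down stand-ins that
keep only the members named in this section, and subsystem_pm_ops(),
pick_runtime_suspend() and the three *_suspend() callbacks are hypothetical
names introduced purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct device;

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct dev_pm_ops {
	int (*runtime_suspend)(struct device *dev);
	int (*runtime_resume)(struct device *dev);
	int (*runtime_idle)(struct device *dev);
};

struct dev_pm_domain { struct dev_pm_ops ops; };
struct device_type { const struct dev_pm_ops *pm; };
struct class { const struct dev_pm_ops *pm; };
struct bus_type { const struct dev_pm_ops *pm; };
struct device_driver { const struct dev_pm_ops *pm; };

struct device {
	struct dev_pm_domain *pm_domain;
	const struct device_type *type;
	const struct class *class;
	const struct bus_type *bus;
	const struct device_driver *driver;
};

typedef int (*pm_callback_t)(struct device *dev);

/* Hypothetical callbacks; they exist only so the candidates can be told apart. */
int domain_suspend(struct device *dev) { (void)dev; return 0; }
int bus_suspend(struct device *dev) { (void)dev; return 0; }
int driver_suspend(struct device *dev) { (void)dev; return 0; }

/*
 * Choose the subsystem in priority order: PM domain, device type,
 * device class, bus type. Returns NULL if none of them applies.
 */
const struct dev_pm_ops *subsystem_pm_ops(const struct device *dev)
{
	if (dev->pm_domain)
		return &dev->pm_domain->ops;
	if (dev->type && dev->type->pm)
		return dev->type->pm;
	if (dev->class && dev->class->pm)
		return dev->class->pm;
	if (dev->bus && dev->bus->pm)
		return dev->bus->pm;
	return NULL;
}

/*
 * Return the ->runtime_suspend() callback to run: the one of the chosen
 * subsystem if it provides it, otherwise the driver's own callback, or
 * NULL if neither is available.
 */
pm_callback_t pick_runtime_suspend(const struct device *dev)
{
	const struct dev_pm_ops *ops;
	pm_callback_t cb;

	assert(dev != NULL);

	ops = subsystem_pm_ops(dev);
	cb = ops ? ops->runtime_suspend : NULL;

	if (!cb && dev->driver && dev->driver->pm)
		cb = dev->driver->pm->runtime_suspend;

	return cb;
}
```

In this model the driver fallback is applied only after a subsystem has been
chosen, so a device whose PM domain is present but lacks ->runtime_suspend()
ends up with the driver's callback rather than the bus type's one, which is how
the rules above read.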

The subsystem-level suspend callback, if present, is _entirely_ _responsible_
for handling the suspend of the device as appropriate, which may, but need not
include executing the device driver's own ->runtime_suspend() callback (from the
PM core's point of view it is not necessary to implement a ->runtime_suspend()
callback in a device driver as long as the subsystem-level suspend callback
knows what to do to handle the device).

  * Once the subsystem-level suspend callback (or the driver suspend callback,
    if invoked directly) has completed successfully for the given device, the PM
    core regards the device as suspended, which need not mean that it has been
    put into a low power state. It is supposed to mean, however, that the
    device will not process data and will not communicate with the CPU(s) and
    RAM until the appropriate resume callback is executed for it. The runtime
    PM status of a device after successful execution of the suspend callback is
    'suspended'.

  * If the suspend callback returns -EBUSY or -EAGAIN, the device's runtime PM
    status remains 'active', which means that the device _must_ be fully
    operational afterwards.

  * If the suspend callback returns an error code different from -EBUSY and
    -EAGAIN, the PM core regards this as a fatal error and will refuse to run
    the helper functions described in Section 4 for the device until its status
    is directly set to either 'active' or 'suspended' (the PM core provides
    special helper functions for this purpose).

In particular, if the driver requires remote wakeup capability (i.e. a hardware
mechanism allowing the device to request a change of its power state, such as
PCI PME) for proper functioning and device_can_wakeup() returns 'false' for the
device, then ->runtime_suspend() should return -EBUSY. On the other hand, if
device_can_wakeup() returns 'true' for the device and the device is put into a
low-power state during the execution of the suspend callback, it is expected
that remote wakeup will be enabled for the device. Generally, remote wakeup
should be enabled for all input devices put into low-power states at run time.

The subsystem-level resume callback, if present, is **entirely responsible** for
handling the resume of the device as appropriate, which may, but need not
include executing the device driver's own ->runtime_resume() callback (from the
PM core's point of view it is not necessary to implement a ->runtime_resume()
callback in a device driver as long as the subsystem-level resume callback knows
what to do to handle the device).

  * Once the subsystem-level resume callback (or the driver resume callback, if
    invoked directly) has completed successfully, the PM core regards the device
    as fully operational, which means that the device _must_ be able to complete
    I/O operations as needed. The runtime PM status of the device is then
    'active'.

  * If the resume callback returns an error code, the PM core regards this as a
    fatal error and will refuse to run the helper functions described in Section
    4 for the device, until its status is directly set to either 'active', or
    'suspended' (by means of special helper functions provided by the PM core
    for this purpose).

The idle callback (a subsystem-level one, if present, or the driver one) is
executed by the PM core whenever the device appears to be idle, which is
indicated to the PM core by two counters, the device's usage counter and the
counter of 'active' children of the device.

  * If any of these counters is decreased using a helper function provided by
    the PM core and it turns out to be equal to zero, the other counter is
    checked. If that counter also is equal to zero, the PM core executes the
    idle callback with the device as its argument.

The action performed by the idle callback is totally dependent on the subsystem
(or driver) in question, but the expected and recommended action is to check
if the device can be suspended (i.e.
if all of the conditions necessary for
suspending the device are satisfied) and to queue up a suspend request for the
device in that case. If there is no idle callback, or if the callback returns
0, then the PM core will attempt to carry out a runtime suspend of the device,
also respecting devices configured for autosuspend. In essence this means a
call to pm_runtime_autosuspend() (do note that drivers need to update the
device last busy mark, pm_runtime_mark_last_busy(), to control the delay under
this circumstance). To prevent this (for example, if the callback routine has
started a delayed suspend), the routine must return a non-zero value. Negative
error return codes are ignored by the PM core.

The helper functions provided by the PM core, described in Section 4, guarantee
that the following constraints are met with respect to runtime PM callbacks for
one device:

(1) The callbacks are mutually exclusive (e.g. it is forbidden to execute
    ->runtime_suspend() in parallel with ->runtime_resume() or with another
    instance of ->runtime_suspend() for the same device) with the exception that
    ->runtime_suspend() or ->runtime_resume() can be executed in parallel with
    ->runtime_idle() (although ->runtime_idle() will not be started while any
    of the other callbacks is being executed for the same device).

(2) ->runtime_idle() and ->runtime_suspend() can only be executed for 'active'
    devices (i.e.
    the PM core will only execute ->runtime_idle() or
    ->runtime_suspend() for the devices the runtime PM status of which is
    'active').

(3) ->runtime_idle() and ->runtime_suspend() can only be executed for a device
    the usage counter of which is equal to zero _and_ either the counter of
    'active' children of which is equal to zero, or the 'power.ignore_children'
    flag of which is set.

(4) ->runtime_resume() can only be executed for 'suspended' devices (i.e. the
    PM core will only execute ->runtime_resume() for the devices the runtime
    PM status of which is 'suspended').

Additionally, the helper functions provided by the PM core obey the following
rules:

  * If ->runtime_suspend() is about to be executed or there's a pending request
    to execute it, ->runtime_idle() will not be executed for the same device.

  * A request to execute or to schedule the execution of ->runtime_suspend()
    will cancel any pending requests to execute ->runtime_idle() for the same
    device.

  * If ->runtime_resume() is about to be executed or there's a pending request
    to execute it, the other callbacks will not be executed for the same device.

  * A request to execute ->runtime_resume() will cancel any pending or
    scheduled requests to execute the other callbacks for the same device,
    except for scheduled autosuspends.

3. Runtime PM Device Fields
===========================

The following device runtime PM fields are present in 'struct dev_pm_info', as
defined in include/linux/pm.h:

  `struct timer_list suspend_timer;`
    - timer used for scheduling (delayed) suspend and autosuspend requests

  `unsigned long timer_expires;`
    - timer expiration time, in jiffies (if this is different from zero, the
      timer is running and will expire at that time, otherwise the timer is not
      running)

  `struct work_struct work;`
    - work structure used for queuing up requests (i.e. work items in pm_wq)

  `wait_queue_head_t wait_queue;`
    - wait queue used if any of the helper functions needs to wait for another
      one to complete

  `spinlock_t lock;`
    - lock used for synchronization

  `atomic_t usage_count;`
    - the usage counter of the device

  `atomic_t child_count;`
    - the count of 'active' children of the device

  `unsigned int ignore_children;`
    - if set, the value of child_count is ignored (but still updated)

  `unsigned int disable_depth;`
    - used for disabling the helper functions (they work normally if this is
      equal to zero); the initial value of it is 1 (i.e.
      runtime PM is initially disabled for all devices)

  `int runtime_error;`
    - if set, there was a fatal error (one of the callbacks returned an error
      code as described in Section 2), so the helper functions will not work
      until this flag is cleared; this is the error code returned by the
      failing callback

  `unsigned int idle_notification;`
    - if set, ->runtime_idle() is being executed

  `unsigned int request_pending;`
    - if set, there's a pending request (i.e. a work item queued up into pm_wq)

  `enum rpm_request request;`
    - type of request that's pending (valid if request_pending is set)

  `unsigned int deferred_resume;`
    - set if ->runtime_resume() is about to be run while ->runtime_suspend() is
      being executed for that device and it is not practical to wait for the
      suspend to complete; means "start a resume as soon as you've suspended"

  `enum rpm_status runtime_status;`
    - the runtime PM status of the device; this field's initial value is
      RPM_SUSPENDED, which means that each device is initially regarded by the
      PM core as 'suspended', regardless of its real hardware status

  `unsigned int runtime_auto;`
    - if set, indicates that the user space has allowed the device driver to
      power manage the device at run time via the /sys/devices/.../power/control
      interface; it may only be modified with the help of the
      pm_runtime_allow() and pm_runtime_forbid() helper functions

  `unsigned int no_callbacks;`
    - indicates that the device does not use the runtime PM callbacks (see
      Section 8); it may be modified only by the pm_runtime_no_callbacks()
      helper function

  `unsigned int irq_safe;`
    - indicates that the ->runtime_suspend() and ->runtime_resume() callbacks
      will be invoked with the spinlock held and interrupts disabled

  `unsigned int use_autosuspend;`
    - indicates that the device's driver supports delayed autosuspend (see
      Section 9); it may be modified only by the
      pm_runtime{_dont}_use_autosuspend() helper functions

  `unsigned int timer_autosuspends;`
    - indicates that the PM core should attempt to carry out an autosuspend
      when the timer expires rather than a normal suspend

  `int autosuspend_delay;`
    - the delay time (in milliseconds) to be used for autosuspend

  `unsigned long last_busy;`
    - the time (in jiffies) when the pm_runtime_mark_last_busy() helper
      function was last called for this device; used in calculating inactivity
      periods for autosuspend

All of the above fields are members of the 'power' member of 'struct device'.

4. Runtime PM Device Helper Functions
=====================================

The following runtime PM helper functions are defined in
drivers/base/power/runtime.c and include/linux/pm_runtime.h:

  `void pm_runtime_init(struct device *dev);`
    - initialize the device runtime PM fields in 'struct dev_pm_info'

  `void pm_runtime_remove(struct device *dev);`
    - make sure that the runtime PM of the device will be disabled after
      removing the device from device hierarchy

  `int pm_runtime_idle(struct device *dev);`
    - execute the subsystem-level idle callback for the device; returns an
      error code on failure, where -EINPROGRESS means that ->runtime_idle() is
      already being executed; if there is no callback or the callback returns 0
      then run pm_runtime_autosuspend(dev) and return its result

  `int pm_runtime_suspend(struct device *dev);`
    - execute the subsystem-level suspend callback for the device; returns 0 on
      success, 1 if the device's runtime PM status was already 'suspended', or
      error code on failure, where -EAGAIN or -EBUSY means it is safe to attempt
      to suspend the device again in future and -EACCES means that
      'power.disable_depth' is different from 0

  `int pm_runtime_autosuspend(struct device *dev);`
    - same as pm_runtime_suspend() except that the autosuspend delay is taken
      into account; if pm_runtime_autosuspend_expiration() says the delay has
      not yet expired
      then an autosuspend is scheduled for the appropriate time
      and 0 is returned

  `int pm_runtime_resume(struct device *dev);`
    - execute the subsystem-level resume callback for the device; returns 0 on
      success, 1 if the device's runtime PM status was already 'active' or
      error code on failure, where -EAGAIN means it may be safe to attempt to
      resume the device again in future, but 'power.runtime_error' should be
      checked additionally, and -EACCES means that 'power.disable_depth' is
      different from 0

  `int pm_request_idle(struct device *dev);`
    - submit a request to execute the subsystem-level idle callback for the
      device (the request is represented by a work item in pm_wq); returns 0 on
      success or error code if the request has not been queued up

  `int pm_request_autosuspend(struct device *dev);`
    - schedule the execution of the subsystem-level suspend callback for the
      device when the autosuspend delay has expired; if the delay has already
      expired then the work item is queued up immediately

  `int pm_schedule_suspend(struct device *dev, unsigned int delay);`
    - schedule the execution of the subsystem-level suspend callback for the
      device in future, where 'delay' is the time to wait before queuing up a
      suspend work item in pm_wq, in milliseconds (if 'delay' is zero, the work
      item is queued up immediately); returns 0 on success, 1 if the device's PM
      runtime status was already 'suspended', or error code if the request
      hasn't been scheduled (or queued up if 'delay' is 0); if the execution of
      ->runtime_suspend() is already scheduled and not yet expired, the new
      value of 'delay' will be used as the time to wait

  `int pm_request_resume(struct device *dev);`
    - submit a request to execute the subsystem-level resume callback for the
      device (the request is represented by a work item in pm_wq); returns 0 on
      success, 1 if the device's runtime PM status was already 'active', or
      error code if the request hasn't been queued up

  `void pm_runtime_get_noresume(struct device *dev);`
    - increment the device's usage counter

  `int pm_runtime_get(struct device *dev);`
    - increment the device's usage counter, run pm_request_resume(dev) and
      return its result

  `int pm_runtime_get_sync(struct device *dev);`
    - increment the device's usage counter, run pm_runtime_resume(dev) and
      return its result

  `int pm_runtime_get_if_in_use(struct device *dev);`
    - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the
      runtime PM status is RPM_ACTIVE and the runtime PM usage counter is
      nonzero, increment the counter and return 1; otherwise return 0 without
      changing the counter

  `int pm_runtime_get_if_active(struct device *dev, bool ign_usage_count);`
    - return -EINVAL if 'power.disable_depth' is nonzero; otherwise, if the
      runtime PM status is RPM_ACTIVE, and either
      ign_usage_count is true
      or the device's usage_count is non-zero, increment the counter and
      return 1; otherwise return 0 without changing the counter

  `void pm_runtime_put_noidle(struct device *dev);`
    - decrement the device's usage counter

  `int pm_runtime_put(struct device *dev);`
    - decrement the device's usage counter; if the result is 0 then run
      pm_request_idle(dev) and return its result

  `int pm_runtime_put_autosuspend(struct device *dev);`
    - decrement the device's usage counter; if the result is 0 then run
      pm_request_autosuspend(dev) and return its result

  `int pm_runtime_put_sync(struct device *dev);`
    - decrement the device's usage counter; if the result is 0 then run
      pm_runtime_idle(dev) and return its result

  `int pm_runtime_put_sync_suspend(struct device *dev);`
    - decrement the device's usage counter; if the result is 0 then run
      pm_runtime_suspend(dev) and return its result

  `int pm_runtime_put_sync_autosuspend(struct device *dev);`
    - decrement the device's usage counter; if the result is 0 then run
      pm_runtime_autosuspend(dev) and return its result

  `void pm_runtime_enable(struct device *dev);`
    - decrement the device's 'power.disable_depth' field; if that field is equal
      to zero, the runtime PM helper functions can execute subsystem-level
      callbacks described in Section 2 for the device

  `int pm_runtime_disable(struct device *dev);`
    - increment the device's 'power.disable_depth' field (if the value of that
      field was previously zero, this prevents subsystem-level runtime PM
      callbacks from being run for the device), make sure that all of the
      pending runtime PM operations on the device are either completed or
      canceled; returns 1 if there was a resume request pending and it was
      necessary to execute the subsystem-level resume callback for the device
      to satisfy that request, otherwise 0 is returned

  `int pm_runtime_barrier(struct device *dev);`
    - check if there's a resume request pending for the device and resume it
      (synchronously) in that case, cancel any other pending runtime PM requests
      regarding it and wait for all runtime PM operations on it in progress to
      complete; returns 1 if there was a resume request pending and it was
      necessary to execute the subsystem-level resume callback for the device to
      satisfy that request, otherwise 0 is returned

  `void pm_suspend_ignore_children(struct device *dev, bool enable);`
    - set/unset the power.ignore_children flag of the device

  `int pm_runtime_set_active(struct device *dev);`
    - clear the device's 'power.runtime_error' flag, set the device's runtime
      PM status to 'active' and update its parent's counter of 'active'
      children as appropriate (it is only valid to use this function if
      'power.runtime_error' is set or 'power.disable_depth' is greater than
      zero); it will fail and return error code if the device has a parent
      which is not active and the 'power.ignore_children' flag of which is unset

  `void pm_runtime_set_suspended(struct device *dev);`
    - clear the device's 'power.runtime_error' flag, set the device's runtime
      PM status to 'suspended' and update its parent's counter of 'active'
      children as appropriate (it is only valid to use this function if
      'power.runtime_error' is set or 'power.disable_depth' is greater than
      zero)

  `bool pm_runtime_active(struct device *dev);`
    - return true if the device's runtime PM status is 'active' or its
      'power.disable_depth' field is not equal to zero, or false otherwise

  `bool pm_runtime_suspended(struct device *dev);`
    - return true if the device's runtime PM status is 'suspended' and its
      'power.disable_depth' field is equal to zero, or false otherwise

  `bool pm_runtime_status_suspended(struct device *dev);`
    - return true if the device's runtime PM status is 'suspended'

  `void pm_runtime_allow(struct device *dev);`
    - set the power.runtime_auto flag for the device and decrease its usage
      counter (used by the /sys/devices/.../power/control interface to
      effectively allow the device to be power managed at run time)

  `void pm_runtime_forbid(struct device *dev);`
    - unset the power.runtime_auto flag for the device and increase its usage
      counter (used by the /sys/devices/.../power/control
      interface to effectively prevent the device from being power managed at
      run time)

  `void pm_runtime_no_callbacks(struct device *dev);`
    - set the power.no_callbacks flag for the device and remove the runtime
      PM attributes from /sys/devices/.../power (or prevent them from being
      added when the device is registered)

  `void pm_runtime_irq_safe(struct device *dev);`
    - set the power.irq_safe flag for the device, causing the runtime-PM
      callbacks to be invoked with interrupts off

  `bool pm_runtime_is_irq_safe(struct device *dev);`
    - return true if the power.irq_safe flag was set for the device, causing
      the runtime-PM callbacks to be invoked with interrupts off

  `void pm_runtime_mark_last_busy(struct device *dev);`
    - set the power.last_busy field to the current time

  `void pm_runtime_use_autosuspend(struct device *dev);`
    - set the power.use_autosuspend flag, enabling autosuspend delays; call
      pm_runtime_get_sync if the flag was previously cleared and
      power.autosuspend_delay is negative

  `void pm_runtime_dont_use_autosuspend(struct device *dev);`
    - clear the power.use_autosuspend flag, disabling autosuspend delays;
      decrement the device's usage counter if the flag was previously set and
      power.autosuspend_delay is negative; call pm_runtime_idle

  `void pm_runtime_set_autosuspend_delay(struct device *dev, int delay);`
    - set the
      power.autosuspend_delay value to 'delay' (expressed in
      milliseconds); if 'delay' is negative then runtime suspends are
      prevented; if power.use_autosuspend is set, pm_runtime_get_sync may be
      called or the device's usage counter may be decremented and
      pm_runtime_idle called depending on whether power.autosuspend_delay is
      changed to or from a negative value; if power.use_autosuspend is clear,
      pm_runtime_idle is called

  `unsigned long pm_runtime_autosuspend_expiration(struct device *dev);`
    - calculate the time when the current autosuspend delay period will expire,
      based on power.last_busy and power.autosuspend_delay; if the delay time
      is 1000 ms or larger then the expiration time is rounded up to the
      nearest second; returns 0 if the delay period has already expired or
      power.use_autosuspend isn't set, otherwise returns the expiration time
      in jiffies

It is safe to execute the following helper functions from interrupt context:

- pm_request_idle()
- pm_request_autosuspend()
- pm_schedule_suspend()
- pm_request_resume()
- pm_runtime_get_noresume()
- pm_runtime_get()
- pm_runtime_put_noidle()
- pm_runtime_put()
- pm_runtime_put_autosuspend()
- pm_runtime_enable()
- pm_suspend_ignore_children()
- pm_runtime_set_active()
- pm_runtime_set_suspended()
- pm_runtime_suspended()
- pm_runtime_mark_last_busy()
- pm_runtime_autosuspend_expiration()

If pm_runtime_irq_safe() has been called for a device then the following helper
functions may also be used in interrupt context:

- pm_runtime_idle()
- pm_runtime_suspend()
- pm_runtime_autosuspend()
- pm_runtime_resume()
- pm_runtime_get_sync()
- pm_runtime_put_sync()
- pm_runtime_put_sync_suspend()
- pm_runtime_put_sync_autosuspend()

5. Runtime PM Initialization, Device Probing and Removal
========================================================

Initially, runtime PM is disabled for all devices, which means that the
majority of the runtime PM helper functions described in Section 4 will return
-EAGAIN until pm_runtime_enable() is called for the device.

In addition to that, the initial runtime PM status of all devices is
'suspended', but it need not reflect the actual physical state of the device.
Thus, if the device is initially active (i.e. it is able to process I/O), its
runtime PM status must be changed to 'active', with the help of
pm_runtime_set_active(), before pm_runtime_enable() is called for the device.

However, if the device has a parent and the parent's runtime PM is enabled,
calling pm_runtime_set_active() for the device will affect the parent, unless
the parent's 'power.ignore_children' flag is set.
Namely, in that case the parent won't be able to suspend at run time, using
the PM core's helper functions, as long as the child's status is 'active', even
if the child's runtime PM is still disabled (i.e. pm_runtime_enable() hasn't
been called for the child yet or pm_runtime_disable() has been called for it).
For this reason, once pm_runtime_set_active() has been called for the device,
pm_runtime_enable() should be called for it too as soon as reasonably possible
or its runtime PM status should be changed back to 'suspended' with the help of
pm_runtime_set_suspended().

If the default initial runtime PM status of the device (i.e. 'suspended')
reflects the actual state of the device, its bus type's or its driver's
->probe() callback will likely need to wake it up using one of the PM core's
helper functions described in Section 4. In that case, pm_runtime_resume()
should be used. Of course, for this purpose the device's runtime PM has to be
enabled earlier by calling pm_runtime_enable().

Note that if the device may execute pm_runtime calls during the probe (for
example, if it registers with a subsystem that may call back in), then a
pm_runtime_get_sync() call paired with a pm_runtime_put() call is appropriate
to ensure that the device is not put back to sleep during the probe. This can
happen with systems such as the network device layer.

It may be desirable to suspend the device once ->probe() has finished.
Therefore the driver core uses the asynchronous pm_request_idle() to submit a
request to execute the subsystem-level idle callback for the device at that
time. A driver that makes use of the runtime autosuspend feature may want to
update the last busy mark before returning from ->probe().

Moreover, the driver core prevents runtime PM callbacks from racing with the bus
notifier callback in __device_release_driver(), which is necessary because the
notifier is used by some subsystems to carry out operations affecting the
runtime PM functionality. It does so by calling pm_runtime_get_sync() before
driver_sysfs_remove() and the BUS_NOTIFY_UNBIND_DRIVER notifications. This
resumes the device if it's in the suspended state and prevents it from
being suspended again while those routines are being executed.

To allow bus types and drivers to put devices into the suspended state by
calling pm_runtime_suspend() from their ->remove() routines, the driver core
executes pm_runtime_put_sync() after running the BUS_NOTIFY_UNBIND_DRIVER
notifications in __device_release_driver(). This requires bus types and
drivers to make their ->remove() callbacks avoid races with runtime PM directly,
but it also allows more flexibility in the handling of devices during the
removal of their drivers.

In their ->remove() callbacks, drivers should undo the runtime PM changes made
in ->probe(). Usually this means calling pm_runtime_disable(),
pm_runtime_dont_use_autosuspend() etc.
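
As a rough illustration of this probe/remove pairing, the sketch below shows a
hypothetical driver whose ->remove() path reverses what its ->probe() path set
up. It is written so that it compiles outside the kernel: 'struct device' and
the pm_runtime_*() functions here are simplified user-space stand-ins that only
track the fields discussed in this document, and foo_probe(), foo_remove() and
the 200 ms delay are made-up names and values, not taken from any real driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins (assumptions), NOT the kernel definitions. */
enum rpm_status { RPM_SUSPENDED, RPM_ACTIVE };

struct device {
	int disable_depth;	/* models power.disable_depth */
	enum rpm_status status;	/* models the runtime PM status */
	bool use_autosuspend;	/* models power.use_autosuspend */
	int autosuspend_delay;	/* models power.autosuspend_delay (ms) */
};

/* Initial state described in this section: disabled and 'suspended'. */
#define FOO_DEVICE_INIT { .disable_depth = 1, .status = RPM_SUSPENDED }

static void pm_runtime_enable(struct device *dev) { dev->disable_depth--; }
static void pm_runtime_disable(struct device *dev) { dev->disable_depth++; }

/* The status may only be changed while runtime PM is disabled. */
static int set_status(struct device *dev, enum rpm_status status)
{
	if (dev->disable_depth == 0)
		return -EAGAIN;
	dev->status = status;
	return 0;
}

static int pm_runtime_set_active(struct device *dev)
{
	return set_status(dev, RPM_ACTIVE);
}

static void pm_runtime_set_suspended(struct device *dev)
{
	(void)set_status(dev, RPM_SUSPENDED);
}

static void pm_runtime_use_autosuspend(struct device *dev)
{
	dev->use_autosuspend = true;
}

static void pm_runtime_dont_use_autosuspend(struct device *dev)
{
	dev->use_autosuspend = false;
}

static void pm_runtime_set_autosuspend_delay(struct device *dev, int delay)
{
	dev->autosuspend_delay = delay;
}

/* Hypothetical ->probe(): the hardware is assumed to be powered up already. */
int foo_probe(struct device *dev)
{
	int ret;

	pm_runtime_set_autosuspend_delay(dev, 200);
	pm_runtime_use_autosuspend(dev);
	ret = pm_runtime_set_active(dev);	/* must precede the enable */
	if (ret) {
		pm_runtime_dont_use_autosuspend(dev);
		return ret;
	}
	pm_runtime_enable(dev);
	return 0;
}

/* Hypothetical ->remove(): undo the probe-time runtime PM changes. */
void foo_remove(struct device *dev)
{
	pm_runtime_disable(dev);
	pm_runtime_dont_use_autosuspend(dev);
	/* ... power the hardware down ... */
	pm_runtime_set_suspended(dev);
}
```

After foo_remove() returns, the modeled device is back in the state it had
before foo_probe() ran: runtime PM disabled, status 'suspended' and autosuspend
not in use.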

User space can effectively prevent the driver of the device from power managing
it at run time by changing the value of its /sys/devices/.../power/control
attribute to "on", which causes pm_runtime_forbid() to be called. In principle,
this mechanism may also be used by the driver to effectively turn off the
runtime power management of the device until user space turns it on.
Namely, during the initialization the driver can make sure that the runtime PM
status of the device is 'active' and call pm_runtime_forbid(). It should be
noted, however, that if user space has already intentionally changed the
value of /sys/devices/.../power/control to "auto" to allow the driver to power
manage the device at run time, the driver may confuse it by using
pm_runtime_forbid() this way.

6. Runtime PM and System Sleep
==============================

Runtime PM and system sleep (i.e., system suspend and hibernation, also known
as suspend-to-RAM and suspend-to-disk) interact with each other in a couple of
ways. If a device is active when a system sleep starts, everything is
straightforward. But what should happen if the device is already suspended?

The device may have different wake-up settings for runtime PM and system sleep.
For example, remote wake-up may be enabled for runtime suspend but disallowed
for system sleep (device_may_wakeup(dev) returns 'false').
When this happens, the subsystem-level system suspend callback is responsible
for changing the device's wake-up setting (it may leave that to the device
driver's system suspend routine). It may be necessary to resume the device and
suspend it again in order to do so. The same is true if the driver uses
different power levels or other settings for runtime suspend and system sleep.

During system resume, the simplest approach is to bring all devices back to full
power, even if they had been suspended before the system suspend began. There
are several reasons for this, including:

  * The device might need to switch power levels, wake-up settings, etc.

  * Remote wake-up events might have been lost by the firmware.

  * The device's children may need the device to be at full power in order
    to resume themselves.

  * The driver's idea of the device state may not agree with the device's
    physical state. This can happen during resume from hibernation.

  * The device might need to be reset.

  * Even though the device was suspended, if its usage counter was > 0 then most
    likely it would need a runtime resume in the near future anyway.

If the device had been suspended before the system suspend began and it's
brought back to full power during resume, then its runtime PM status will have
to be updated to reflect the actual post-system sleep status.
The way to do this is:

	 - pm_runtime_disable(dev);
	 - pm_runtime_set_active(dev);
	 - pm_runtime_enable(dev);

The PM core always increments the runtime usage counter before calling the
->suspend() callback and decrements it after calling the ->resume() callback.
Hence disabling runtime PM temporarily like this will not cause any runtime
suspend attempts to be permanently lost. If the usage count goes to zero
following the return of the ->resume() callback, the ->runtime_idle() callback
will be invoked as usual.

On some systems, however, system sleep is not entered through a global firmware
or hardware operation. Instead, all hardware components are put into low-power
states directly by the kernel in a coordinated way. Then, the system sleep
state effectively follows from the states the hardware components end up in
and the system is woken up from that state by a hardware interrupt or a similar
mechanism entirely under the kernel's control. As a result, the kernel never
gives control away and the states of all devices during resume are precisely
known to it. If that is the case and none of the situations listed above takes
place (in particular, if the system is not waking up from hibernation), it may
be more efficient to leave the devices that had been suspended before the system
suspend began in the suspended state.

To this end, the PM core provides a mechanism allowing some coordination between
different levels of device hierarchy. Namely, if a system suspend .prepare()
callback returns a positive number for a device, that indicates to the PM core
that the device appears to be runtime-suspended and its state is fine, so it
may be left in runtime suspend provided that all of its descendants are also
left in runtime suspend. If that happens, the PM core will not execute any
system suspend and resume callbacks for all of those devices, except for the
complete callback, which is then entirely responsible for handling the device
as appropriate. This only applies to system suspend transitions that are not
related to hibernation (see Documentation/driver-api/pm/devices.rst for more
information).

The PM core does its best to reduce the probability of race conditions between
the runtime PM and system suspend/resume (and hibernation) callbacks by carrying
out the following operations:

  * During system suspend pm_runtime_get_noresume() is called for every device
    right before executing the subsystem-level .prepare() callback for it and
    pm_runtime_barrier() is called for every device right before executing the
    subsystem-level .suspend() callback for it. In addition to that the PM core
    calls __pm_runtime_disable() with 'false' as the second argument for every
    device right before executing the subsystem-level .suspend_late() callback
    for it.

  * During system resume pm_runtime_enable() and pm_runtime_put() are called for
    every device right after executing the subsystem-level .resume_early()
    callback and right after executing the subsystem-level .complete() callback
    for it, respectively.

7. Generic subsystem callbacks
==============================

Subsystems may wish to conserve code space by using the set of generic power
management callbacks provided by the PM core, defined in
drivers/base/power/generic_ops.c:

  `int pm_generic_runtime_suspend(struct device *dev);`
    - invoke the ->runtime_suspend() callback provided by the driver of this
      device and return its result, or return 0 if not defined

  `int pm_generic_runtime_resume(struct device *dev);`
    - invoke the ->runtime_resume() callback provided by the driver of this
      device and return its result, or return 0 if not defined

  `int pm_generic_suspend(struct device *dev);`
    - if the device has not been suspended at run time, invoke the ->suspend()
      callback provided by its driver and return its result, or return 0 if not
      defined

  `int pm_generic_suspend_noirq(struct device *dev);`
    - if pm_runtime_suspended(dev) returns "false", invoke the ->suspend_noirq()
      callback provided by the device's driver and return its result, or return
      0 if not defined

  `int pm_generic_resume(struct device *dev);`
    - invoke the ->resume() callback provided by the driver of this device and,
      if successful, change the device's runtime PM status to 'active'

  `int pm_generic_resume_noirq(struct device *dev);`
    - invoke the ->resume_noirq() callback provided by the driver of this device

  `int pm_generic_freeze(struct device *dev);`
    - if the device has not been suspended at run time, invoke the ->freeze()
      callback provided by its driver and return its result, or return 0 if not
      defined

  `int pm_generic_freeze_noirq(struct device *dev);`
    - if pm_runtime_suspended(dev) returns "false", invoke the ->freeze_noirq()
      callback provided by the device's driver and return its result, or return
      0 if not defined

  `int pm_generic_thaw(struct device *dev);`
    - if the device has not been suspended at run time, invoke the ->thaw()
      callback provided by its driver and return its result, or return 0 if not
      defined

  `int pm_generic_thaw_noirq(struct device *dev);`
    - if pm_runtime_suspended(dev) returns "false", invoke the ->thaw_noirq()
      callback provided by the device's driver and return its result, or return
      0 if not defined

  `int pm_generic_poweroff(struct device *dev);`
    - if the device has not been suspended at run time, invoke the ->poweroff()
      callback provided by its driver and return its result, or return 0 if not
      defined

  `int pm_generic_poweroff_noirq(struct device *dev);`
    - if pm_runtime_suspended(dev) returns "false", run the ->poweroff_noirq()
      callback provided by the device's driver and return its result, or return
      0 if not defined

  `int pm_generic_restore(struct device *dev);`
    - invoke the ->restore() callback provided by the driver of this device and,
      if successful, change the device's runtime PM status to 'active'

  `int pm_generic_restore_noirq(struct device *dev);`
    - invoke the ->restore_noirq() callback provided by the device's driver

These functions are the defaults used by the PM core if a subsystem doesn't
provide its own callbacks for ->runtime_idle(), ->runtime_suspend(),
->runtime_resume(), ->suspend(), ->suspend_noirq(), ->resume(),
->resume_noirq(), ->freeze(), ->freeze_noirq(), ->thaw(), ->thaw_noirq(),
->poweroff(), ->poweroff_noirq(), ->restore(), ->restore_noirq() in the
subsystem-level dev_pm_ops structure.

Device drivers that wish to use the same function as a system suspend, freeze,
poweroff and runtime suspend callback, and similarly for system resume, thaw,
restore, and runtime resume, can achieve this with the help of the
UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its
last argument to NULL).

8. "No-Callback" Devices
========================

Some "devices" are only logical sub-devices of their parent and cannot be
power-managed on their own. (The prototype example is a USB interface. Entire
USB devices can go into low-power mode or send wake-up requests, but neither is
possible for individual interfaces.) The drivers for these devices have no
need of runtime PM callbacks; if the callbacks did exist, ->runtime_suspend()
and ->runtime_resume() would always return 0 without doing anything else and
->runtime_idle() would always call pm_runtime_suspend().

Subsystems can tell the PM core about these devices by calling
pm_runtime_no_callbacks(). This should be done after the device structure is
initialized and before it is registered (although after device registration is
also okay). The routine will set the device's power.no_callbacks flag and
prevent the non-debugging runtime PM sysfs attributes from being created.

When power.no_callbacks is set, the PM core will not invoke the
->runtime_idle(), ->runtime_suspend(), or ->runtime_resume() callbacks.
Instead it will assume that suspends and resumes always succeed and that idle
devices should be suspended.

As a consequence, the PM core will never directly inform the device's subsystem
or driver about runtime power changes. Instead, the driver for the device's
parent must take responsibility for telling the device's driver when the
parent's power state changes.
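
The effect of power.no_callbacks described in this section can be modeled in a
few lines. The sketch below is a toy user-space model, not kernel code: its
'struct device', model_runtime_suspend() and foo_busy_suspend() are
hypothetical, and pm_runtime_no_callbacks() is re-implemented as a stand-in
that only sets the flag (the sysfs side effects are not modeled). It shows that
once the flag is set, a suspend attempt no longer consults the callback and is
treated as always succeeding.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel structure; only the modeled fields exist. */
struct device {
	bool no_callbacks;	/* models power.no_callbacks */
	bool suspended;		/* models a 'suspended' runtime PM status */
	int callback_calls;	/* bookkeeping for this example only */
	int (*runtime_suspend)(struct device *dev);
};

/* Stand-in for the real helper: it only sets the flag here. */
static void pm_runtime_no_callbacks(struct device *dev)
{
	dev->no_callbacks = true;
}

/* Hypothetical callback that always refuses to suspend. */
int foo_busy_suspend(struct device *dev)
{
	dev->callback_calls++;
	return -EBUSY;
}

/* Toy model of a runtime suspend attempt that honors power.no_callbacks. */
int model_runtime_suspend(struct device *dev)
{
	int ret = 0;

	/* With no_callbacks set, the callback is skipped and success assumed. */
	if (!dev->no_callbacks && dev->runtime_suspend != NULL)
		ret = dev->runtime_suspend(dev);
	if (ret == 0)
		dev->suspended = true;
	return ret;
}
```

In this model the same device first fails to suspend because its callback
returns -EBUSY, and then suspends successfully after pm_runtime_no_callbacks()
has been called, without the callback being invoked a second time.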

9. Autosuspend, or automatically-delayed suspends
=================================================

Changing a device's power state isn't free; it requires both time and energy.
A device should be put in a low-power state only when there's some reason to
think it will remain in that state for a substantial time. A common heuristic
says that a device which hasn't been used for a while is liable to remain
unused; following this advice, drivers should not allow devices to be suspended
at runtime until they have been inactive for some minimum period. Even when
the heuristic ends up being non-optimal, it will still prevent devices from
"bouncing" too rapidly between low-power and full-power states.

The term "autosuspend" is a historical remnant. It doesn't mean that the
device is automatically suspended (the subsystem or driver still has to call
the appropriate PM routines); rather it means that runtime suspends will
automatically be delayed until the desired period of inactivity has elapsed.

Inactivity is determined based on the power.last_busy field. Drivers should
call pm_runtime_mark_last_busy() to update this field after carrying out I/O,
typically just before calling pm_runtime_put_autosuspend(). The desired length
of the inactivity period is a matter of policy.
Subsystems can set this length
initially by calling pm_runtime_set_autosuspend_delay(), but after device
registration the length should be controlled by user space, using the
/sys/devices/.../power/autosuspend_delay_ms attribute.

In order to use autosuspend, subsystems or drivers must call
pm_runtime_use_autosuspend() (preferably before registering the device), and
thereafter they should use the various `*_autosuspend()` helper functions
instead of the non-autosuspend counterparts::

	Instead of: pm_runtime_suspend    use: pm_runtime_autosuspend;
	Instead of: pm_schedule_suspend   use: pm_request_autosuspend;
	Instead of: pm_runtime_put        use: pm_runtime_put_autosuspend;
	Instead of: pm_runtime_put_sync   use: pm_runtime_put_sync_autosuspend.

Drivers may also continue to use the non-autosuspend helper functions; they
will behave normally, which means sometimes taking the autosuspend delay into
account (see pm_runtime_idle).

Under some circumstances a driver or subsystem may want to prevent a device
from autosuspending immediately, even though the usage counter is zero and the
autosuspend delay time has expired. If the ->runtime_suspend() callback
returns -EAGAIN or -EBUSY, and if the next autosuspend delay expiration time is
in the future (as it normally would be if the callback invoked
pm_runtime_mark_last_busy()), the PM core will automatically reschedule the
autosuspend. The ->runtime_suspend() callback can't do this rescheduling
itself because no suspend requests of any kind are accepted while the device is
suspending (i.e., while the callback is running).

The implementation is well suited for asynchronous use in interrupt contexts.
However such use inevitably involves races, because the PM core can't
synchronize ->runtime_suspend() callbacks with the arrival of I/O requests.
This synchronization must be handled by the driver, using its private lock.
Here is a schematic pseudo-code example::

	foo_read_or_write(struct foo_priv *foo, void *data)
	{
		lock(&foo->private_lock);
		add_request_to_io_queue(foo, data);
		/* First pending request: ask for an asynchronous resume. */
		if (foo->num_pending_requests++ == 0)
			pm_runtime_get(&foo->dev);
		if (!foo->is_suspended)
			foo_process_next_request(foo);
		unlock(&foo->private_lock);
	}

	foo_io_completion(struct foo_priv *foo, void *req)
	{
		lock(&foo->private_lock);
		if (--foo->num_pending_requests == 0) {
			pm_runtime_mark_last_busy(&foo->dev);
			pm_runtime_put_autosuspend(&foo->dev);
		} else {
			foo_process_next_request(foo);
		}
		unlock(&foo->private_lock);
		/* Send req result back to the user ... */
	}

	int foo_runtime_suspend(struct device *dev)
	{
		struct foo_priv *foo = container_of(dev, ...);
		int ret = 0;

		lock(&foo->private_lock);
		/* Refuse to suspend while I/O is still pending. */
		if (foo->num_pending_requests > 0) {
			ret = -EBUSY;
		} else {
			/* ... suspend the device ... */
			foo->is_suspended = 1;
		}
		unlock(&foo->private_lock);
		return ret;
	}

	int foo_runtime_resume(struct device *dev)
	{
		struct foo_priv *foo = container_of(dev, ...);

		lock(&foo->private_lock);
		/* ... resume the device ... */
		foo->is_suspended = 0;
		pm_runtime_mark_last_busy(&foo->dev);
		if (foo->num_pending_requests > 0)
			foo_process_next_request(foo);
		unlock(&foo->private_lock);
		return 0;
	}

The important point is that after foo_io_completion() asks for an autosuspend,
the foo_runtime_suspend() callback may race with foo_read_or_write().
Therefore foo_runtime_suspend() has to check whether there are any pending I/O
requests (while holding the private lock) before allowing the suspend to
proceed.

In addition, the power.autosuspend_delay field can be changed by user space at
any time.
If a driver cares about this, it can call
pm_runtime_autosuspend_expiration() from within the ->runtime_suspend()
callback while holding its private lock. If the function returns a nonzero
value then the delay has not yet expired and the callback should return
-EAGAIN.
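
A minimal sketch of that check, extending the schematic foo_runtime_suspend()
from the pseudo-code earlier in this section. To keep it compilable outside the
kernel, everything except the control flow is an assumption of this example:
'struct device' carries a fake clock in arbitrary ticks instead of jiffies,
lock()/unlock() and container_of() are trivial stand-ins, and
pm_runtime_autosuspend_expiration() is a simplified re-implementation that
ignores power.use_autosuspend and the rounding described in Section 4.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in device: times are arbitrary ticks, 'now' is a fake clock. */
struct device {
	unsigned long last_busy;		/* models power.last_busy */
	unsigned long autosuspend_delay;	/* models power.autosuspend_delay */
	unsigned long now;			/* advanced by the caller */
};

struct foo_priv {
	struct device dev;
	int private_lock;
	int num_pending_requests;
	int is_suspended;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Trivial single-threaded stand-ins for the driver's private lock. */
static void lock(int *l) { *l = 1; }
static void unlock(int *l) { *l = 0; }

/* Simplified stand-in: 0 once the delay has expired, else the expiry time. */
static unsigned long pm_runtime_autosuspend_expiration(struct device *dev)
{
	unsigned long expires = dev->last_busy + dev->autosuspend_delay;

	return expires > dev->now ? expires : 0;
}

int foo_runtime_suspend(struct device *dev)
{
	struct foo_priv *foo = container_of(dev, struct foo_priv, dev);
	int ret = 0;

	lock(&foo->private_lock);
	if (foo->num_pending_requests > 0) {
		ret = -EBUSY;
	} else if (pm_runtime_autosuspend_expiration(dev) != 0) {
		/* The delay was lengthened or last_busy moved: not yet. */
		ret = -EAGAIN;
	} else {
		/* ... suspend the device ... */
		foo->is_suspended = 1;
	}
	unlock(&foo->private_lock);
	return ret;
}
```

Returning -EAGAIN here relies on the behavior described above: because the
expiration time is still in the future, the PM core is expected to reschedule
the autosuspend rather than drop it.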