diff options
Diffstat (limited to 'Documentation/power/devices.txt')
-rw-r--r-- | Documentation/power/devices.txt | 319 |
1 files changed, 319 insertions, 0 deletions
diff --git a/Documentation/power/devices.txt b/Documentation/power/devices.txt new file mode 100644 index 00000000000..5d4ae9a39f1 --- /dev/null +++ b/Documentation/power/devices.txt @@ -0,0 +1,319 @@ + +Device Power Management + + +Device power management encompasses two areas - the ability to save +state and transition a device to a low-power state when the system is +entering a low-power state; and the ability to transition a device to +a low-power state while the system is running (and independently of +any other power management activity). + + +Methods + +The methods to suspend and resume devices reside in struct bus_type: + +struct bus_type { + ... + int (*suspend)(struct device * dev, pm_message_t state); + int (*resume)(struct device * dev); +}; + +Each bus driver is responsible implementing these methods, translating +the call into a bus-specific request and forwarding the call to the +bus-specific drivers. For example, PCI drivers implement suspend() and +resume() methods in struct pci_driver. The PCI core is simply +responsible for translating the pointers to PCI-specific ones and +calling the low-level driver. + +This is done to a) ease transition to the new power management methods +and leverage the existing PM code in various bus drivers; b) allow +buses to implement generic and default PM routines for devices, and c) +make the flow of execution obvious to the reader. + + +System Power Management + +When the system enters a low-power state, the device tree is walked in +a depth-first fashion to transition each device into a low-power +state. The ordering of the device tree is guaranteed by the order in +which devices get registered - children are never registered before +their ancestors, and devices are placed at the back of the list when +registered. By walking the list in reverse order, we are guaranteed to +suspend devices in the proper order. + +Devices are suspended once with interrupts enabled. Drivers are +expected to stop I/O transactions, save device state, and place the +device into a low-power state. Drivers may sleep, allocate memory, +etc. at will. + +Some devices are broken and will inevitably have problems powering +down or disabling themselves with interrupts enabled. For these +special cases, they may return -EAGAIN. This will put the device on a +list to be taken care of later. When interrupts are disabled, before +we enter the low-power state, their drivers are called again to put +their device to sleep. + +On resume, the devices that returned -EAGAIN will be called to power +themselves back on with interrupts disabled. Once interrupts have been +re-enabled, the rest of the drivers will be called to resume their +devices. On resume, a driver is responsible for powering back on each +device, restoring state, and re-enabling I/O transactions for that +device. + +System devices follow a slightly different API, which can be found in + + include/linux/sysdev.h + drivers/base/sys.c + +System devices will only be suspended with interrupts disabled, and +after all other devices have been suspended. On resume, they will be +resumed before any other devices, and also with interrupts disabled. + + +Runtime Power Management + +Many devices are able to dynamically power down while the system is +still running. This feature is useful for devices that are not being +used, and can offer significant power savings on a running system. + +In each device's directory, there is a 'power' directory, which +contains at least a 'state' file. Reading from this file displays what +power state the device is currently in. Writing to this file initiates +a transition to the specified power state, which must be a decimal in +the range 1-3, inclusive; or 0 for 'On'. + +The PM core will call the ->suspend() method in the bus_type object +that the device belongs to if the specified state is not 0, or +->resume() if it is. + +Nothing will happen if the specified state is the same state the +device is currently in. + +If the device is already in a low-power state, and the specified state +is another, but different, low-power state, the ->resume() method will +first be called to power the device back on, then ->suspend() will be +called again with the new state. + +The driver is responsible for saving the working state of the device +and putting it into the low-power state specified. If this was +successful, it returns 0, and the device's power_state field is +updated. + +The driver must take care to know whether or not it is able to +properly resume the device, including all step of reinitialization +necessary. (This is the hardest part, and the one most protected by +NDA'd documents). + +The driver must also take care not to suspend a device that is +currently in use. It is their responsibility to provide their own +exclusion mechanisms. + +The runtime power transition happens with interrupts enabled. If a +device cannot support being powered down with interrupts, it may +return -EAGAIN (as it would during a system power management +transition), but it will _not_ be called again, and the transaction +will fail. + +There is currently no way to know what states a device or driver +supports a priori. This will change in the future. + +pm_message_t meaning + +pm_message_t has two fields. event ("major"), and flags. If driver +does not know event code, it aborts the request, returning error. Some +drivers may need to deal with special cases based on the actual type +of suspend operation being done at the system level. This is why +there are flags. + +Event codes are: + +ON -- no need to do anything except special cases like broken +HW. + +# NOTIFICATION -- pretty much same as ON? + +FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from +scratch. That probably means stop accepting upstream requests, the +actual policy of what to do with them beeing specific to a given +driver. It's acceptable for a network driver to just drop packets +while a block driver is expected to block the queue so no request is +lost. (Use IDE as an example on how to do that). FREEZE requires no +power state change, and it's expected for drivers to be able to +quickly transition back to operating state. + +SUSPEND -- like FREEZE, but also put hardware into low-power state. If +there's need to distinguish several levels of sleep, additional flag +is probably best way to do that. + +Transitions are only from a resumed state to a suspended state, never +between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, +FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). + +All events are: + +[NOTE NOTE NOTE: If you are driver author, you should not care; you +should only look at event, and ignore flags.] + +#Prepare for suspend -- userland is still running but we are going to +#enter suspend state. This gives drivers chance to load firmware from +#disk and store it in memory, or do other activities taht require +#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these +#are forbiden once the suspend dance is started.. event = ON, flags = +#PREPARE_TO_SUSPEND + +Apm standby -- prepare for APM event. Quiesce devices to make life +easier for APM BIOS. event = FREEZE, flags = APM_STANDBY + +Apm suspend -- same as APM_STANDBY, but it we should probably avoid +spinning down disks. event = FREEZE, flags = APM_SUSPEND + +System halt, reboot -- quiesce devices to make life easier for BIOS. event += FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT + +System shutdown -- at least disks need to be spun down, or data may be +lost. Quiesce devices, just to make life easier for BIOS. event = +FREEZE, flags = SYSTEM_SHUTDOWN + +Kexec -- turn off DMAs and put hardware into some state where new +kernel can take over. event = FREEZE, flags = KEXEC + +Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake +may need to be enabled on some devices. This actually has at least 3 +subtypes, system can reboot, enter S4 and enter S5 at the end of +swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, +SYSTEM_SHUTDOWN, SYSTEM_S4 + +Suspend to ram -- put devices into low power state. event = SUSPEND, +flags = SUSPEND_TO_RAM + +Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put +devices into low power mode, but you must be able to reinitialize +device from scratch in resume method. This has two flavors, its done +once on suspending kernel, once on resuming kernel. event = FREEZE, +flags = DURING_SUSPEND or DURING_RESUME + +Device detach requested from /sys -- deinitialize device; proably same as +SYSTEM_SHUTDOWN, I do not understand this one too much. probably event += FREEZE, flags = DEV_DETACH. + +#These are not really events sent: +# +#System fully on -- device is working normally; this is probably never +#passed to suspend() method... event = ON, flags = 0 +# +#Ready after resume -- userland is now running, again. Time to free any +#memory you ate during prepare to suspend... event = ON, flags = +#READY_AFTER_RESUME +# + +Driver Detach Power Management + +The kernel now supports the ability to place a device in a low-power +state when it is detached from its driver, which happens when its +module is removed. + +Each device contains a 'detach_state' file in its sysfs directory +which can be used to control this state. Reading from this file +displays what the current detach state is set to. This is 0 (On) by +default. A user may write a positive integer value to this file in the +range of 1-4 inclusive. + +A value of 1-3 will indicate the device should be placed in that +low-power state, which will cause ->suspend() to be called for that +device. A value of 4 indicates that the device should be shutdown, so +->shutdown() will be called for that device. + +The driver is responsible for reinitializing the device when the +module is re-inserted during it's ->probe() (or equivalent) method. +The driver core will not call any extra functions when binding the +device to the driver. + +pm_message_t meaning + +pm_message_t has two fields. event ("major"), and flags. If driver +does not know event code, it aborts the request, returning error. Some +drivers may need to deal with special cases based on the actual type +of suspend operation being done at the system level. This is why +there are flags. + +Event codes are: + +ON -- no need to do anything except special cases like broken +HW. + +# NOTIFICATION -- pretty much same as ON? + +FREEZE -- stop DMA and interrupts, and be prepared to reinit HW from +scratch. That probably means stop accepting upstream requests, the +actual policy of what to do with them being specific to a given +driver. It's acceptable for a network driver to just drop packets +while a block driver is expected to block the queue so no request is +lost. (Use IDE as an example on how to do that). FREEZE requires no +power state change, and it's expected for drivers to be able to +quickly transition back to operating state. + +SUSPEND -- like FREEZE, but also put hardware into low-power state. If +there's need to distinguish several levels of sleep, additional flag +is probably best way to do that. + +Transitions are only from a resumed state to a suspended state, never +between 2 suspended states. (ON -> FREEZE or ON -> SUSPEND can happen, +FREEZE -> SUSPEND or SUSPEND -> FREEZE can not). + +All events are: + +[NOTE NOTE NOTE: If you are driver author, you should not care; you +should only look at event, and ignore flags.] + +#Prepare for suspend -- userland is still running but we are going to +#enter suspend state. This gives drivers chance to load firmware from +#disk and store it in memory, or do other activities taht require +#operating userland, ability to kmalloc GFP_KERNEL, etc... All of these +#are forbiden once the suspend dance is started.. event = ON, flags = +#PREPARE_TO_SUSPEND + +Apm standby -- prepare for APM event. Quiesce devices to make life +easier for APM BIOS. event = FREEZE, flags = APM_STANDBY + +Apm suspend -- same as APM_STANDBY, but it we should probably avoid +spinning down disks. event = FREEZE, flags = APM_SUSPEND + +System halt, reboot -- quiesce devices to make life easier for BIOS. event += FREEZE, flags = SYSTEM_HALT or SYSTEM_REBOOT + +System shutdown -- at least disks need to be spun down, or data may be +lost. Quiesce devices, just to make life easier for BIOS. event = +FREEZE, flags = SYSTEM_SHUTDOWN + +Kexec -- turn off DMAs and put hardware into some state where new +kernel can take over. event = FREEZE, flags = KEXEC + +Powerdown at end of swsusp -- very similar to SYSTEM_SHUTDOWN, except wake +may need to be enabled on some devices. This actually has at least 3 +subtypes, system can reboot, enter S4 and enter S5 at the end of +swsusp. event = FREEZE, flags = SWSUSP and one of SYSTEM_REBOOT, +SYSTEM_SHUTDOWN, SYSTEM_S4 + +Suspend to ram -- put devices into low power state. event = SUSPEND, +flags = SUSPEND_TO_RAM + +Freeze for swsusp snapshot -- stop DMA and interrupts. No need to put +devices into low power mode, but you must be able to reinitialize +device from scratch in resume method. This has two flavors, its done +once on suspending kernel, once on resuming kernel. event = FREEZE, +flags = DURING_SUSPEND or DURING_RESUME + +Device detach requested from /sys -- deinitialize device; proably same as +SYSTEM_SHUTDOWN, I do not understand this one too much. probably event += FREEZE, flags = DEV_DETACH. + +#These are not really events sent: +# +#System fully on -- device is working normally; this is probably never +#passed to suspend() method... event = ON, flags = 0 +# +#Ready after resume -- userland is now running, again. Time to free any +#memory you ate during prepare to suspend... event = ON, flags = +#READY_AFTER_RESUME +# |