=========================
Dynamic DMA mapping Guide
=========================

:Author: David S. Miller <davem@redhat.com>
:Author: Richard Henderson <rth@cygnus.com>
:Author: Jakub Jelinek <jakub@redhat.com>

This is a guide to device driver writers on how to use the DMA API
with example pseudo-code.  For a concise description of the API, see
DMA-API.txt.

CPU and DMA addresses
=====================

There are several kinds of addresses involved in the DMA API, and it's
important to understand the differences.

The kernel normally uses virtual addresses.  Any address returned by
kmalloc(), vmalloc(), and similar interfaces is a virtual address and can
be stored in a ``void *``.

The virtual memory system (TLB, page tables, etc.) translates virtual
addresses to CPU physical addresses, which are stored as "phys_addr_t" or
"resource_size_t".  The kernel manages device resources like registers as
physical addresses.  These are the addresses in /proc/iomem.  The physical
address is not directly useful to a driver; it must use ioremap() to map
the space and produce a virtual address.

I/O devices use a third kind of address: a "bus address".  If a device has
registers at an MMIO address, or if it performs DMA to read or write system
memory, the addresses used by the device are bus addresses.  In some
systems, bus addresses are identical to CPU physical addresses, but in
general they are not.  IOMMUs and host bridges can produce arbitrary
mappings between physical and bus addresses.

From a device's point of view, DMA uses the bus address space, but it may
be restricted to a subset of that space.  For example, even if a system
supports 64-bit addresses for main memory and PCI BARs, it may use an IOMMU
so devices only need to use 32-bit DMA addresses.

Here's a picture and some examples::

              CPU                  CPU                  Bus
            Virtual              Physical             Address
            Address              Address               Space
             Space                Space

           +-------+             +------+             +------+
           |       |             |MMIO  |   Offset    |      |
           |       |  Virtual    |Space |   applied   |      |
         C +-------+ --------> B +------+ ----------> +------+ A
           |       |  mapping    |      |   by host   |      |
 +-----+   |       |             |      |   bridge    |      |   +--------+
 |     |   |       |             +------+             |      |   |        |
 | CPU |   |       |             | RAM  |             |      |   | Device |
 |     |   |       |             |      |             |      |   |        |
 +-----+   +-------+             +------+             +------+   +--------+
           |       |  Virtual    |Buffer|   Mapping   |      |
         X +-------+ --------> Y +------+ <---------- +------+ Z
           |       |  mapping    | RAM  |   by IOMMU
           |       |             |      |
           |       |             |      |
           +-------+             +------+

During the enumeration process, the kernel learns about I/O devices and
their MMIO space and the host bridges that connect them to the system.  For
example, if a PCI device has a BAR, the kernel reads the bus address (A)
from the BAR and converts it to a CPU physical address (B).  The address B
is stored in a struct resource and usually exposed via /proc/iomem.  When a
driver claims a device, it typically uses ioremap() to map physical address
B at a virtual address (C).  It can then use, e.g., ioread32(C), to access
the device registers at bus address A.

If the device supports DMA, the driver sets up a buffer using kmalloc() or
a similar interface, which returns a virtual address (X).  The virtual
memory system maps X to a physical address (Y) in system RAM.  The driver
can use virtual address X to access the buffer, but the device itself
cannot because DMA doesn't go through the CPU virtual memory system.

In some simple systems, the device can do DMA directly to physical address
Y.  But in many others, there is IOMMU hardware that translates DMA
addresses to physical addresses, e.g., it translates Z to Y.  This is part
of the reason for the DMA API: the driver can give a virtual address X to
an interface like dma_map_single(), which sets up any required IOMMU
mapping and returns the DMA address Z.  The driver then tells the device to
do DMA to Z, and the IOMMU maps it to the buffer at address Y in system
RAM.

So that Linux can use the dynamic DMA mapping, it needs some help from the
drivers: DMA addresses should be mapped only for the time they are actually
used and unmapped after the DMA transfer.

The following API will work, of course, even on platforms where no such
hardware exists.
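
As a minimal sketch of that lifecycle (the interfaces used here are covered
in detail in the rest of this document; start_device_dma() and
wait_for_dma_completion() are hypothetical driver helpers, not part of the
API), a driver maps a buffer just before the transfer and unmaps it as soon
as the transfer is done::

        dma_addr_t dma_handle;

        /* Map only for the duration of the transfer. */
        dma_handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dev, dma_handle))
                goto map_error_handling;

        start_device_dma(mydev, dma_handle, len);       /* hypothetical */
        wait_for_dma_completion(mydev);                 /* hypothetical */

        /* Unmap as soon as the device is done with the buffer. */
        dma_unmap_single(dev, dma_handle, len, DMA_TO_DEVICE);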

Note that the DMA API works with any bus independent of the underlying
microprocessor architecture.  You should use the DMA API rather than the
bus-specific DMA API, i.e., use the dma_map_*() interfaces rather than the
pci_map_*() interfaces.

First of all, you should make sure::

        #include <linux/dma-mapping.h>

is in your driver, which provides the definition of dma_addr_t.  This type
can hold any valid DMA address for the platform and should be used
everywhere you hold a DMA address returned from the DMA mapping functions.

What memory is DMA'able?
========================

The first piece of information you must know is what kernel memory can
be used with the DMA mapping facilities.  There has been an unwritten
set of rules regarding this, and this text is an attempt to finally
write them down.

If you acquired your memory via the page allocator
(i.e. __get_free_page*()) or the generic memory allocators
(i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
that memory using the addresses returned from those routines.

This means specifically that you may _not_ use the memory/addresses
returned from vmalloc() for DMA.  It is possible to DMA to the
_underlying_ memory mapped into a vmalloc() area, but this requires
walking page tables to get the physical addresses, and then
translating each of those pages back to a kernel address using
something like __va().  [ EDIT: Update this when we integrate
Gerd Knorr's generic code which does this. ]

This rule also means that you may use neither kernel image addresses
(items in data/text/bss segments), nor module image addresses, nor
stack addresses for DMA.  These could all be mapped somewhere entirely
different than the rest of physical memory.  Even if those classes of
memory could physically work with DMA, you'd need to ensure the I/O
buffers were cacheline-aligned.  Without that, you'd see cacheline
sharing problems (data corruption) on CPUs with DMA-incoherent caches.
(The CPU could write to one word, DMA would write to a different one
in the same cache line, and one of them could be overwritten.)

Also, this means that you cannot take the return of a kmap()
call and DMA to/from that.  This is similar to vmalloc().

What about block I/O and networking buffers?  The block I/O and
networking subsystems make sure that the buffers they use are valid
for you to DMA from/to.
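
To make these rules concrete, here is a brief sketch (RING_BYTES is an
illustrative size, not a real constant)::

        /* DMA'able: memory from the page allocator or the slab allocators. */
        void *ring = kmalloc(RING_BYTES, GFP_KERNEL);
        void *bulk = (void *)__get_free_pages(GFP_KERNEL, 2);

        /* NOT DMA'able as-is: vmalloc()'ed, image, and stack memory. */
        void *vbuf = vmalloc(RING_BYTES);       /* do not pass to dma_map_*()   */
        static char fw_image[1024];             /* kernel/module image address  */
        char hdr[64];                           /* on-stack buffer              */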

DMA addressing capabilities
===========================

By default, the kernel assumes that your device can address 32 bits of DMA
address space.  For a 64-bit capable device, this needs to be increased, and
for a device with limitations, it needs to be decreased.

Special note about PCI: the PCI-X specification requires PCI-X devices to
support 64-bit addressing (DAC) for all transactions.  And at least one
platform (SGI SN2) requires 64-bit consistent allocations to operate
correctly when the IO bus is in PCI-X mode.

For correct operation, you must set the DMA mask to inform the kernel about
your device's DMA addressing capabilities.

This is performed via a call to dma_set_mask_and_coherent()::

        int dma_set_mask_and_coherent(struct device *dev, u64 mask);

which will set the mask for both streaming and coherent APIs together.  If you
have some special requirements, then the following two separate calls can be
used instead:

        The setup for streaming mappings is performed via a call to
        dma_set_mask()::

                int dma_set_mask(struct device *dev, u64 mask);

        The setup for consistent allocations is performed via a call
        to dma_set_coherent_mask()::

                int dma_set_coherent_mask(struct device *dev, u64 mask);

Here, dev is a pointer to the device struct of your device, and mask is a bit
mask describing which bits of an address your device supports.  Often the
device struct of your device is embedded in the bus-specific device struct of
your device.  For example, &pdev->dev is a pointer to the device struct of a
PCI device (pdev is a pointer to the PCI device struct of your device).

These calls usually return zero to indicate that your device can perform DMA
properly on the machine given the address mask you provided, but they might
return an error if the mask is too small to be supportable on the given
system.  If they return non-zero, your device cannot perform DMA properly on
this platform, and attempting to do so will result in undefined behavior.
You must not use DMA on this device unless the dma_set_mask family of
functions has returned success.

This means that in the failure case, you have two options:

1) Use some non-DMA mode for data transfer, if possible.
2) Ignore this device and do not initialize it.

It is recommended that your driver print a kernel KERN_WARNING message when
setting the DMA mask fails.
In this manner, if a user of your driver reports
that performance is bad or that the device is not even detected, you can ask
them for the kernel messages to find out exactly why.

The standard 64-bit addressing device would do something like this::

        if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64))) {
                dev_warn(dev, "mydev: No suitable DMA available\n");
                goto ignore_this_device;
        }

If the device only supports 32-bit addressing for descriptors in the
coherent allocations, but supports full 64 bits for streaming mappings,
it would look like this::

        if (dma_set_mask(dev, DMA_BIT_MASK(64))) {
                dev_warn(dev, "mydev: No suitable DMA available\n");
                goto ignore_this_device;
        }

The coherent mask can always be set to the same or a smaller mask than
the streaming mask.  However, for the rare case that a device driver only
uses consistent allocations, one would have to check the return value from
dma_set_coherent_mask().

Finally, if your device can only drive the low 24 bits of
address you might do something like::

        if (dma_set_mask(dev, DMA_BIT_MASK(24))) {
                dev_warn(dev, "mydev: 24-bit DMA addressing not available\n");
                goto ignore_this_device;
        }

When dma_set_mask() or dma_set_mask_and_coherent() is successful, and
returns zero, the kernel saves away this mask you have provided.  The
kernel will use this information later when you make DMA mappings.

There is a case which we are aware of at this time, which is worth
mentioning in this documentation.  If your device supports multiple
functions (for example a sound card provides playback and record
functions) and the various different functions have _different_
DMA addressing limitations, you may wish to probe each mask and
only provide the functionality which the machine can handle.  It
is important that the last call to dma_set_mask() be for the
most specific mask.

Here is pseudo-code showing how this might be done::

        #define PLAYBACK_ADDRESS_BITS   DMA_BIT_MASK(32)
        #define RECORD_ADDRESS_BITS     DMA_BIT_MASK(24)

        struct my_sound_card *card;
        struct device *dev;

        ...
        if (!dma_set_mask(dev, PLAYBACK_ADDRESS_BITS)) {
                card->playback_enabled = 1;
        } else {
                card->playback_enabled = 0;
                dev_warn(dev, "%s: Playback disabled due to DMA limitations\n",
                         card->name);
        }
        if (!dma_set_mask(dev, RECORD_ADDRESS_BITS)) {
                card->record_enabled = 1;
        } else {
                card->record_enabled = 0;
                dev_warn(dev, "%s: Record disabled due to DMA limitations\n",
                         card->name);
        }

A sound card was used as an example here because this genre of PCI
devices seems to be littered with ISA chips given a PCI front end,
and thus retaining the 16MB DMA addressing limitations of ISA.

Types of DMA mappings
=====================

There are two types of DMA mappings:

- Consistent DMA mappings which are usually mapped at driver
  initialization, unmapped at the end and for which the hardware should
  guarantee that the device and the CPU can access the data
  in parallel and will see updates made by each other without any
  explicit software flushing.

  Think of "consistent" as "synchronous" or "coherent".

  The current default is to return consistent memory in the low 32
  bits of the DMA space.  However, for future compatibility you should
  set the consistent mask even if this default is fine for your
  driver.

  Good examples of what to use consistent mappings for are:

        - Network card DMA ring descriptors.
        - SCSI adapter mailbox command data structures.
        - Device firmware microcode executed out of
          main memory.

  The invariant these examples all require is that any CPU store
  to memory is immediately visible to the device, and vice
  versa.  Consistent mappings guarantee this.

  .. important::

     Consistent DMA memory does not preclude the usage of
     proper memory barriers.  The CPU may reorder stores to
     consistent memory just as it may normal memory.  Example:
     if it is important for the device to see the first word
     of a descriptor updated before the second, you must do
     something like::

        desc->word0 = address;
        wmb();
        desc->word1 = DESC_VALID;

     in order to get correct behavior on all platforms.

     Also, on some platforms your driver may need to flush CPU write
     buffers in much the same way as it needs to flush write buffers
     found in PCI bridges (such as by reading a register's value
     after writing it).

- Streaming DMA mappings which are usually mapped for one DMA
  transfer, unmapped right after it (unless you use dma_sync_* below)
  and for which hardware can optimize for sequential accesses.

  Think of "streaming" as "asynchronous" or "outside the coherency
  domain".

  Good examples of what to use streaming mappings for are:

        - Networking buffers transmitted/received by a device.
        - Filesystem buffers written/read by a SCSI device.

  The interfaces for using this type of mapping were designed in
  such a way that an implementation can make whatever performance
  optimizations the hardware allows.  To this end, when using
  such mappings you must be explicit about what you want to happen.

Neither type of DMA mapping has alignment restrictions that come from
the underlying bus, although some devices may have such restrictions.
Also, systems with caches that aren't DMA-coherent will work better
when the underlying buffers don't share cache lines with other data.


Using Consistent DMA mappings
=============================

To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
you should do::

        dma_addr_t dma_handle;

        cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, gfp);

where dev is a ``struct device *``.  This may be called in interrupt
context with the GFP_ATOMIC flag.

Size is the length of the region you want to allocate, in bytes.

This routine will allocate RAM for that region, so it acts similarly to
__get_free_pages() (but takes size instead of a page order).  If your
driver needs regions sized smaller than a page, you may prefer using
the dma_pool interface, described below.

The consistent DMA mapping interfaces will by default return a DMA address
which is 32-bit addressable.  Even if the device indicates (via the DMA mask)
that it may address the upper 32 bits, consistent allocation will only
return > 32-bit addresses for DMA if the consistent DMA mask has been
explicitly changed via dma_set_coherent_mask().  This is true of the
dma_pool interface as well.
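
For example, a driver for a fully 64-bit capable device would raise the
coherent mask before allocating, so that the allocation is not needlessly
restricted to the low 32 bits (a minimal sketch reusing the calls shown
earlier; the error labels are illustrative)::

        dma_addr_t dma_handle;
        void *cpu_addr;

        /* Allow coherent allocations anywhere in the 64-bit space. */
        if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
                goto ignore_this_device;

        cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
        if (!cpu_addr)
                goto alloc_failed;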

dma_alloc_coherent() returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The CPU virtual address and the DMA address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary.

To unmap and free such a DMA region, you call::

        dma_free_coherent(dev, size, cpu_addr, dma_handle);

where dev, size are the same as in the above call and cpu_addr and
dma_handle are the values dma_alloc_coherent() returned to you.
This function may not be called in interrupt context.

If your driver needs lots of smaller memory regions, you can write
custom code to subdivide pages returned by dma_alloc_coherent(),
or you can use the dma_pool API to do that.  A dma_pool is like
a kmem_cache, but it uses dma_alloc_coherent(), not __get_free_pages().
Also, it understands common hardware constraints for alignment,
like queue heads needing to be aligned on N byte boundaries.

Create a dma_pool like this::

        struct dma_pool *pool;

        pool = dma_pool_create(name, dev, size, align, boundary);

The "name" is for diagnostics (like a kmem_cache name); dev and size
are as above.  The device's hardware alignment requirement for this
type of data is "align" (which is expressed in bytes, and must be a
power of two).  If your device has no boundary crossing restrictions,
pass 0 for boundary; passing 4096 says memory allocated from this pool
must not cross 4KByte boundaries (but in that case it may be better to
use dma_alloc_coherent() directly instead).

Allocate memory from a DMA pool like this::

        cpu_addr = dma_pool_alloc(pool, flags, &dma_handle);

flags are GFP_KERNEL if blocking is permitted (not in_interrupt nor
holding SMP locks), GFP_ATOMIC otherwise.  Like dma_alloc_coherent(),
this returns two values, cpu_addr and dma_handle.

Free memory that was allocated from a dma_pool like this::

        dma_pool_free(pool, cpu_addr, dma_handle);

where pool is what you passed to dma_pool_alloc(), and cpu_addr and
dma_handle are the values dma_pool_alloc() returned.  This function
may be called in interrupt context.
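
Tying these calls together, a driver that needs many small, equally-sized
descriptors might use a pool roughly like this (a minimal sketch; the
structure, size, helper, and label names are hypothetical, not part of the
API)::

        #define MYDEV_DESC_SIZE 64      /* hypothetical per-descriptor size */

        struct my_desc *desc;
        dma_addr_t desc_dma;

        pool = dma_pool_create("mydev_desc", dev, MYDEV_DESC_SIZE, 8, 0);
        if (!pool)
                goto err;

        desc = dma_pool_alloc(pool, GFP_KERNEL, &desc_dma);
        if (!desc)
                goto err_destroy_pool;

        /* Program the device with the DMA address, never the CPU pointer. */
        mydev_write_desc_addr(mydev, desc_dma);         /* hypothetical */

        ...

        dma_pool_free(pool, desc, desc_dma);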

Destroy a dma_pool by calling::

        dma_pool_destroy(pool);

Make sure you've called dma_pool_free() for all memory allocated
from a pool before you destroy the pool.  This function may not
be called in interrupt context.

DMA Direction
=============

The interfaces described in subsequent portions of this document
take a DMA direction argument, which is an integer and takes on
one of the following values::

        DMA_BIDIRECTIONAL
        DMA_TO_DEVICE
        DMA_FROM_DEVICE
        DMA_NONE

You should provide the exact DMA direction if you know it.

DMA_TO_DEVICE means "from main memory to the device";
DMA_FROM_DEVICE means "from the device to main memory".
It is the direction in which the data moves during the DMA
transfer.

You are _strongly_ encouraged to specify this as precisely
as you possibly can.

If you absolutely cannot know the direction of the DMA transfer,
specify DMA_BIDIRECTIONAL.  It means that the DMA can go in
either direction.  The platform guarantees that you may legally
specify this, and that it will work, but this may be at the
cost of performance, for example.

The value DMA_NONE is to be used for debugging.  You can
hold this in a data structure before you come to know the
precise direction, and this will help catch cases where your
direction tracking logic has failed to set things up properly.

Another advantage of specifying this value precisely (beyond the
potential platform-specific optimizations) is debugging.
Some platforms actually have a write permission boolean which DMA
mappings can be marked with, much like page protections in the user
program address space.  Such platforms can and do report errors in the
kernel logs when the DMA controller hardware detects violation of the
permission setting.

Only streaming mappings specify a direction; consistent mappings
implicitly have a direction attribute setting of
DMA_BIDIRECTIONAL.

The SCSI subsystem tells you the direction to use in the
'sc_data_direction' member of the SCSI command your driver is
working on.

For networking drivers, it's a rather simple affair.  For transmit
packets, map/unmap them with the DMA_TO_DEVICE direction
specifier.  For receive packets, just the opposite, map/unmap them
with the DMA_FROM_DEVICE direction specifier.
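
For example, a network driver's transmit and receive paths might map their
buffers like this (a minimal sketch; error handling is omitted here and is
shown in the next section, and skb, rx_buf and rx_len are illustrative
names)::

        dma_addr_t tx_dma, rx_dma;

        /* Transmit: data flows from main memory to the device. */
        tx_dma = dma_map_single(dev, skb->data, skb->len, DMA_TO_DEVICE);

        /* Receive: data flows from the device into main memory. */
        rx_dma = dma_map_single(dev, rx_buf, rx_len, DMA_FROM_DEVICE);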

Using Streaming DMA mappings
============================

The streaming DMA mapping routines can be called from interrupt
context.  There are two versions of each map/unmap, one which will
map/unmap a single memory region, and one which will map/unmap a
scatterlist.

To map a single region, you do::

        struct device *dev = &my_dev->dev;
        dma_addr_t dma_handle;
        void *addr = buffer->ptr;
        size_t size = buffer->len;

        dma_handle = dma_map_single(dev, addr, size, direction);
        if (dma_mapping_error(dev, dma_handle)) {
                /*
                 * reduce current DMA mapping usage,
                 * delay and try again later or
                 * reset driver.
                 */
                goto map_error_handling;
        }

and to unmap it::

        dma_unmap_single(dev, dma_handle, size, direction);

You should call dma_mapping_error() as dma_map_single() could fail and return
an error.  Doing so will ensure that the mapping code will work correctly on
all DMA implementations without any dependency on the specifics of the
underlying implementation.  Using the returned address without checking for
errors could result in failures ranging from panics to silent data
corruption.  The same applies to dma_map_page() as well.

You should call dma_unmap_single() when the DMA activity is finished, e.g.,
from the interrupt which told you that the DMA transfer is done.

Using CPU pointers like this for single mappings has a disadvantage:
you cannot reference HIGHMEM memory in this way.  Thus, there is a
map/unmap interface pair akin to dma_{map,unmap}_single().  These
interfaces deal with page/offset pairs instead of CPU pointers.
Specifically::

        struct device *dev = &my_dev->dev;
        dma_addr_t dma_handle;
        struct page *page = buffer->page;
        unsigned long offset = buffer->offset;
        size_t size = buffer->len;

        dma_handle = dma_map_page(dev, page, offset, size, direction);
        if (dma_mapping_error(dev, dma_handle)) {
                /*
                 * reduce current DMA mapping usage,
                 * delay and try again later or
                 * reset driver.
                 */
                goto map_error_handling;
        }

        ...

        dma_unmap_page(dev, dma_handle, size, direction);

Here, "offset" means byte offset within the given page.

You should call dma_mapping_error() as dma_map_page() could fail and return
an error, as outlined under the dma_map_single() discussion.

You should call dma_unmap_page() when the DMA activity is finished, e.g.,
from the interrupt which told you that the DMA transfer is done.

With scatterlists, you map a region gathered from several regions by::

        int i, count = dma_map_sg(dev, sglist, nents, direction);
        struct scatterlist *sg;

        for_each_sg(sglist, sg, count, i) {
                hw_address[i] = sg_dma_address(sg);
                hw_len[i] = sg_dma_len(sg);
        }

where nents is the number of entries in the sglist.

The implementation is free to merge several consecutive sglist entries
into one (e.g. if DMA mapping is done with PAGE_SIZE granularity, any
consecutive sglist entries can be merged into one provided the first one
ends and the second one starts on a page boundary - in fact this is a huge
advantage for cards which either cannot do scatter-gather or have a very
limited number of scatter-gather entries) and returns the actual number
of sg entries it mapped them to.  On failure, 0 is returned.

Then you should loop count times (note: this can be less than nents times)
and use the sg_dma_address() and sg_dma_len() macros where you previously
accessed sg->address and sg->length as shown above.

To unmap a scatterlist, just call::

        dma_unmap_sg(dev, sglist, nents, direction);

Again, make sure DMA activity has already finished.

.. note::

        The 'nents' argument to the dma_unmap_sg call must be
        the _same_ one you passed into the dma_map_sg call;
        it should _NOT_ be the 'count' value _returned_ from the
        dma_map_sg call.

Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}()
counterpart, because the DMA address space is a shared resource and
you could render the machine unusable by consuming all DMA addresses.

If you need to use the same streaming DMA region multiple times and touch
the data in between the DMA transfers, the buffer needs to be synced
properly in order for the CPU and device to see the most up-to-date and
correct copy of the DMA buffer.

So, firstly, just map it with dma_map_{single,sg}(), and after each DMA
transfer call either::

        dma_sync_single_for_cpu(dev, dma_handle, size, direction);

or::

        dma_sync_sg_for_cpu(dev, sglist, nents, direction);

as appropriate.

Then, if you wish to let the device get at the DMA area again,
finish accessing the data with the CPU, and then before actually
giving the buffer to the hardware call either::

        dma_sync_single_for_device(dev, dma_handle, size, direction);

or::

        dma_sync_sg_for_device(dev, sglist, nents, direction);

as appropriate.

.. note::

        The 'nents' argument to dma_sync_sg_for_cpu() and
        dma_sync_sg_for_device() must be the same passed to
        dma_map_sg().  It is _NOT_ the count returned by
        dma_map_sg().

After the last DMA transfer call one of the DMA unmap routines
dma_unmap_{single,sg}().  If you don't touch the data from the first
dma_map_*() call till dma_unmap_*(), then you don't have to call the
dma_sync_*() routines at all.

Here is pseudo-code which shows a situation in which you would need
to use the dma_sync_*() interfaces::

        my_card_setup_receive_buffer(struct my_card *cp, char *buffer, int len)
        {
                dma_addr_t mapping;

                mapping = dma_map_single(cp->dev, buffer, len, DMA_FROM_DEVICE);
                if (dma_mapping_error(cp->dev, mapping)) {
                        /*
                         * reduce current DMA mapping usage,
                         * delay and try again later or
                         * reset driver.
                         */
                        goto map_error_handling;
                }

                cp->rx_buf = buffer;
                cp->rx_len = len;
                cp->rx_dma = mapping;

                give_rx_buf_to_card(cp);
        }

        ...

        my_card_interrupt_handler(int irq, void *devid, struct pt_regs *regs)
        {
                struct my_card *cp = devid;

                ...
                if (read_card_status(cp) == RX_BUF_TRANSFERRED) {
                        struct my_card_header *hp;

                        /* Examine the header to see if we wish
                         * to accept the data.  But synchronize
                         * the DMA transfer with the CPU first
                         * so that we see updated contents.
                         */
                        dma_sync_single_for_cpu(cp->dev, cp->rx_dma,
                                                cp->rx_len,
                                                DMA_FROM_DEVICE);

                        /* Now it is safe to examine the buffer. */
                        hp = (struct my_card_header *) cp->rx_buf;
                        if (header_is_ok(hp)) {
                                dma_unmap_single(cp->dev, cp->rx_dma, cp->rx_len,
                                                 DMA_FROM_DEVICE);
                                pass_to_upper_layers(cp->rx_buf);
                                make_and_setup_new_rx_buf(cp);
                        } else {
                                /* CPU should not write to
                                 * DMA_FROM_DEVICE-mapped area,
                                 * so dma_sync_single_for_device() is
                                 * not needed here. It would be required
                                 * for DMA_BIDIRECTIONAL mapping if
                                 * the memory was modified.
                                 */
                                give_rx_buf_to_card(cp);
                        }
                }
        }

Drivers converted fully to this interface should not use virt_to_bus() any
longer, nor should they use bus_to_virt().  Some drivers have to be changed a
little bit, because there is no longer an equivalent to bus_to_virt() in the
dynamic DMA mapping scheme - you must always store the DMA addresses
returned by the dma_alloc_coherent(), dma_pool_alloc(), and dma_map_single()
calls (dma_map_sg() stores them in the scatterlist itself if the platform
supports dynamic DMA mapping in hardware) in your driver structures and/or
in the card registers.

All drivers should be using these interfaces with no exceptions.  It
is planned to completely remove virt_to_bus() and bus_to_virt() as
they are entirely deprecated.  Some ports already do not provide these
as it is impossible to correctly support them.

Handling Errors
===============

DMA address space is limited on some architectures and an allocation
failure can be determined by:

- checking if dma_alloc_coherent() returns NULL or dma_map_sg() returns 0

- checking the dma_addr_t returned from dma_map_single() and dma_map_page()
  by using dma_mapping_error()::

        dma_addr_t dma_handle;

        dma_handle = dma_map_single(dev, addr, size, direction);
        if (dma_mapping_error(dev, dma_handle)) {
                /*
                 * reduce current DMA mapping usage,
                 * delay and try again later or
                 * reset driver.
                 */
                goto map_error_handling;
        }

- unmapping pages that are already mapped, when a mapping error occurs in the
  middle of a multi-page mapping attempt.  These examples are applicable to
  dma_map_page() as well.

Example 1::

        dma_addr_t dma_handle1;
        dma_addr_t dma_handle2;

        dma_handle1 = dma_map_single(dev, addr, size, direction);
        if (dma_mapping_error(dev, dma_handle1)) {
                /*
                 * reduce current DMA mapping usage,
                 * delay and try again later or
                 * reset driver.
                 */
                goto map_error_handling1;
        }
        dma_handle2 = dma_map_single(dev, addr, size, direction);
        if (dma_mapping_error(dev, dma_handle2)) {
                /*
                 * reduce current DMA mapping usage,
                 * delay and try again later or
                 * reset driver.
                 */
                goto map_error_handling2;
        }

        ...

        map_error_handling2:
        dma_unmap_single(dev, dma_handle1, size, direction);
        map_error_handling1:

Example 2::

        /*
         * if buffers are allocated in a loop, unmap all mapped buffers when
         * a mapping error is detected in the middle
         */

        dma_addr_t dma_addr;
        dma_addr_t array[DMA_BUFFERS];
        int save_index = 0;

        for (i = 0; i < DMA_BUFFERS; i++) {

                ...

                dma_addr = dma_map_single(dev, addr, size, direction);
                if (dma_mapping_error(dev, dma_addr)) {
                        /*
                         * reduce current DMA mapping usage,
                         * delay and try again later or
                         * reset driver.
                         */
                        goto map_error_handling;
                }
                array[i] = dma_addr;
                save_index++;
        }

        ...

        map_error_handling:

        for (i = 0; i < save_index; i++) {

                ...

                dma_unmap_single(dev, array[i], size, direction);
        }

Networking drivers must call dev_kfree_skb() to free the socket buffer
and return NETDEV_TX_OK if the DMA mapping fails on the transmit hook
(ndo_start_xmit).  This means that the socket buffer is just dropped in
the failure case.

SCSI drivers must return SCSI_MLQUEUE_HOST_BUSY if the DMA mapping
fails in the queuecommand hook.  This means that the SCSI subsystem
passes the command to the driver again later.

Optimizing Unmap State Space Consumption
========================================

On many platforms, dma_unmap_{single,page}() is simply a nop.
Therefore, keeping track of the mapping address and length is a waste
of space.  Instead of filling your drivers up with ifdefs and the like
to "work around" this (which would defeat the whole purpose of a
portable API), the following facilities are provided.

Actually, instead of describing the macros one by one, we'll
transform some example code.

1) Use DEFINE_DMA_UNMAP_{ADDR,LEN} in state saving structures.
   Example, before::

        struct ring_state {
                struct sk_buff *skb;
                dma_addr_t mapping;
                __u32 len;
        };

   after::

        struct ring_state {
                struct sk_buff *skb;
                DEFINE_DMA_UNMAP_ADDR(mapping);
                DEFINE_DMA_UNMAP_LEN(len);
        };

2) Use dma_unmap_{addr,len}_set() to set these values.
   Example, before::

        ringp->mapping = FOO;
        ringp->len = BAR;

   after::

        dma_unmap_addr_set(ringp, mapping, FOO);
        dma_unmap_len_set(ringp, len, BAR);

3) Use dma_unmap_{addr,len}() to access these values.
   Example, before::

        dma_unmap_single(dev, ringp->mapping, ringp->len,
                         DMA_FROM_DEVICE);

   after::

        dma_unmap_single(dev,
                         dma_unmap_addr(ringp, mapping),
                         dma_unmap_len(ringp, len),
                         DMA_FROM_DEVICE);

It really should be self-explanatory.  We treat the ADDR and LEN
separately, because it is possible for an implementation to only
need the address in order to perform the unmap operation.

Platform Issues
===============

If you are just writing drivers for Linux and do not maintain
an architecture port for the kernel, you can safely skip down
to "Closing".

1) Struct scatterlist requirements.

   You need to enable CONFIG_NEED_SG_DMA_LENGTH if the architecture
   supports IOMMUs (including software IOMMU).

2) ARCH_DMA_MINALIGN

   Architectures must ensure that a kmalloc'ed buffer is
   DMA-safe.  Drivers and subsystems depend on it.  If an architecture
   isn't fully DMA-coherent (i.e.
   hardware doesn't ensure that data in
   the CPU cache is identical to data in main memory),
   ARCH_DMA_MINALIGN must be set so that the memory allocator
   makes sure that a kmalloc'ed buffer doesn't share a cache line with
   others.  See arch/arm/include/asm/cache.h as an example.

   Note that ARCH_DMA_MINALIGN is about DMA memory alignment
   constraints.  You don't need to worry about the architecture data
   alignment constraints (e.g. the alignment constraints about 64-bit
   objects).

Closing
=======

This document, and the API itself, would not be in its current
form without the feedback and suggestions from numerous individuals.
We would like to specifically mention, in no particular order, the
following people::

        Russell King <rmk@arm.linux.org.uk>
        Leo Dagum <dagum@barrel.engr.sgi.com>
        Ralf Baechle <ralf@oss.sgi.com>
        Grant Grundler <grundler@cup.hp.com>
        Jay Estabrook <Jay.Estabrook@compaq.com>
        Thomas Sailer <sailer@ife.ee.ethz.ch>
        Andrea Arcangeli <andrea@suse.de>
        Jens Axboe <jens.axboe@oracle.com>
        David Mosberger-Tang <davidm@hpl.hp.com>