=========================
Dynamic DMA mapping Guide
=========================

:Author: David S. Miller <davem@redhat.com>
:Author: Richard Henderson <rth@cygnus.com>
:Author: Jakub Jelinek <jakub@redhat.com>

This is a guide to device driver writers on how to use the DMA API
with example pseudo-code.  For a concise description of the API, see
Documentation/core-api/dma-api.rst.

CPU and DMA addresses
=====================

There are several kinds of addresses involved in the DMA API, and it's
important to understand the differences.

The kernel normally uses virtual addresses.  Any address returned by
kmalloc(), vmalloc(), and similar interfaces is a virtual address and can
be stored in a ``void *``.

The virtual memory system (TLB, page tables, etc.) translates virtual
addresses to CPU physical addresses, which are stored as "phys_addr_t" or
"resource_size_t".  The kernel manages device resources like registers as
physical addresses.  These are the addresses in /proc/iomem.  The physical
address is not directly useful to a driver; it must use ioremap() to map
the space and produce a virtual address.

I/O devices use a third kind of address: a "bus address".  If a device has
registers at an MMIO address, or if it performs DMA to read or write system
memory, the addresses used by the device are bus addresses.  In some
systems, bus addresses are identical to CPU physical addresses, but in
general they are not.  IOMMUs and host bridges can produce arbitrary
mappings between physical and bus addresses.

From a device's point of view, DMA uses the bus address space, but it may
be restricted to a subset of that space.  For example, even if a system
supports 64-bit addresses for main memory and PCI BARs, it may use an IOMMU
so devices only need to use 32-bit DMA addresses.

Here's a picture and some examples::

               CPU                  CPU                  Bus
             Virtual              Physical             Address
             Address              Address               Space
              Space                Space

            +-------+             +------+             +------+
            |       |             |MMIO  |   Offset    |      |
            |       |  Virtual    |Space |   applied   |      |
          C +-------+ --------> B +------+ ----------> +------+ A
            |       |  mapping    |      |   by host   |      |
  +-----+   |       |             |      |   bridge    |      |   +--------+
  |     |   |       |             +------+             |      |   |        |
  | CPU |   |       |             | RAM  |             |      |   | Device |
  |     |   |       |             |      |             |      |   |        |
  +-----+   +-------+             +------+             +------+   +--------+
            |       |  Virtual    |Buffer|   Mapping   |      |
          X +-------+ --------> Y +------+ <---------- +------+ Z
            |       |  mapping    | RAM  |   by IOMMU
            |       |             |      |
            |       |             |      |
            +-------+             +------+

During the enumeration process, the kernel learns about I/O devices and
their MMIO space and the host bridges that connect them to the system.  For
example, if a PCI device has a BAR, the kernel reads the bus address (A)
from the BAR and converts it to a CPU physical address (B).  The address B
is stored in a struct resource and usually exposed via /proc/iomem.  When a
driver claims a device, it typically uses ioremap() to map physical address
B at a virtual address (C).  It can then use, e.g., ioread32(C), to access
the device registers at bus address A.

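For example, a PCI driver might map a BAR and read a register roughly like
this (a minimal sketch; the BAR index and register offset are made up, and
pdev is assumed to be the driver's struct pci_dev)::

	void __iomem *regs;
	u32 status;

	/* Map physical address B (BAR 0) to a kernel virtual address C. */
	regs = pci_iomap(pdev, 0, 0);
	if (!regs)
		return -ENOMEM;

	/* Reads through C reach the device registers at bus address A. */
	status = ioread32(regs + 0x10);	/* hypothetical status register */
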
If the device supports DMA, the driver sets up a buffer using kmalloc() or
a similar interface, which returns a virtual address (X).  The virtual
memory system maps X to a physical address (Y) in system RAM.  The driver
can use virtual address X to access the buffer, but the device itself
cannot because DMA doesn't go through the CPU virtual memory system.

In some simple systems, the device can do DMA directly to physical address
Y.  But in many others, there is IOMMU hardware that translates DMA
addresses to physical addresses, e.g., it translates Z to Y.  This is part
of the reason for the DMA API: the driver can give a virtual address X to
an interface like dma_map_single(), which sets up any required IOMMU
mapping and returns the DMA address Z.  The driver then tells the device to
do DMA to Z, and the IOMMU maps it to the buffer at address Y in system
RAM.

So that Linux can use the dynamic DMA mapping, it needs some help from the
drivers: DMA addresses should be mapped only for the time they are actually
used and unmapped after the DMA transfer.

The following API will of course work even on platforms where no such
hardware exists.

Note that the DMA API works with any bus independent of the underlying
microprocessor architecture. You should use the DMA API rather than the
bus-specific DMA API, i.e., use the dma_map_*() interfaces rather than the
pci_map_*() interfaces.

First of all, you should make sure::

	#include <linux/dma-mapping.h>

is in your driver, which provides the definition of dma_addr_t.  This type
can hold any valid DMA address for the platform and should be used
everywhere you hold a DMA address returned from the DMA mapping functions.

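For instance, a driver typically keeps the DMA address next to the CPU
address in its private structures (a minimal sketch; the structure and
field names are only illustrative)::

	struct my_rx_desc {
		void		*cpu_addr;	/* CPU's view of the buffer */
		dma_addr_t	dma_addr;	/* what the device is given */
		size_t		len;
	};
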
What memory is DMA'able?
========================

The first piece of information you must know is what kernel memory can
be used with the DMA mapping facilities.  There has been an unwritten
set of rules regarding this, and this text is an attempt to finally
write them down.

If you acquired your memory via the page allocator
(i.e. __get_free_page*()) or the generic memory allocators
(i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
that memory using the addresses returned from those routines.

This means specifically that you may _not_ use the memory/addresses
returned from vmalloc() for DMA.  It is possible to DMA to the
_underlying_ memory mapped into a vmalloc() area, but this requires
walking page tables to get the physical addresses, and then
translating each of those pages back to a kernel address using
something like __va().  [ EDIT: Update this when we integrate
Gerd Knorr's generic code which does this. ]

This rule also means that you may use neither kernel image addresses
(items in data/text/bss segments), nor module image addresses, nor
stack addresses for DMA.  These could all be mapped somewhere entirely
different than the rest of physical memory.  Even if those classes of
memory could physically work with DMA, you'd need to ensure the I/O
buffers were cacheline-aligned.  Without that, you'd see cacheline
sharing problems (data corruption) on CPUs with DMA-incoherent caches.
(The CPU could write to one word, DMA would write to a different one
in the same cache line, and one of them could be overwritten.)

Also, this means that you cannot take the return of a kmap()
call and DMA to/from that.  This is similar to vmalloc().

What about block I/O and networking buffers?  The block I/O and
networking subsystems make sure that the buffers they use are valid
for you to DMA from/to.

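To illustrate the rule, the following sketch contrasts a buffer that may be
mapped for DMA with ones that may not (the buffer names are only
illustrative)::

	/* OK: page allocator / slab memory may be DMA-mapped. */
	void *good_buf = kmalloc(1024, GFP_KERNEL);

	/*
	 * NOT OK: vmalloc'ed, on-stack and kernel/module image buffers
	 * must not be handed to the DMA mapping routines directly.
	 */
	void *vm_buf = vmalloc(1024);		/* virtually contiguous only */
	char stack_buf[64];			/* lives on the kernel stack */
	static char static_buf[64];		/* kernel/module image address */
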
DMA addressing capabilities
===========================

By default, the kernel assumes that your device can address 32 bits of DMA
address space.  For a 64-bit capable device, this needs to be increased, and
for a device with limitations, it needs to be decreased.

Special note about PCI: PCI-X specification requires PCI-X devices to support
64-bit addressing (DAC) for all transactions.  And at least one platform (SGI
SN2) requires 64-bit consistent allocations to operate correctly when the IO
bus is in PCI-X mode.

For correct operation, you must set the DMA mask to inform the kernel about
your device's DMA addressing capabilities.

This is performed via a call to dma_set_mask_and_coherent()::

	int dma_set_mask_and_coherent(struct device *dev, u64 mask);

which will set the mask for both streaming and coherent APIs together.  If you
have some special requirements, then the following two separate calls can be
used instead:

	The setup for streaming mappings is performed via a call to
	dma_set_mask()::

		int dma_set_mask(struct device *dev, u64 mask);

	The setup for consistent allocations is performed via a call
	to dma_set_coherent_mask()::

		int dma_set_coherent_mask(struct device *dev, u64 mask);

Here, dev is a pointer to the device struct of your device, and mask is a bit
mask describing which bits of an address your device supports.  Often the
device struct of your device is embedded in the bus-specific device struct of
your device.  For example, &pdev->dev is a pointer to the device struct of a
PCI device (pdev is a pointer to the PCI device struct of your device).

These calls usually return zero to indicate that your device can perform DMA
properly on the machine given the address mask you provided, but they might
return an error if the mask is too small to be supportable on the given
system.  If the call returns non-zero, your device cannot perform DMA properly
on this platform, and attempting to do so will result in undefined behavior.
You must not use DMA on this device unless the dma_set_mask family of
functions has returned success.

This means that in the failure case, you have two options:

1) Use some non-DMA mode for data transfer, if possible.
2) Ignore this device and do not initialize it.

It is recommended that your driver print a kernel KERN_WARNING message when
setting the DMA mask fails.  In this manner, if a user of your driver reports
that performance is bad or that the device is not even detected, you can ask
them for the kernel messages to find out exactly why.

The standard 64-bit addressing device would do something like this::

	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64))) {
		dev_warn(dev, "mydev: No suitable DMA available\n");
		goto ignore_this_device;
	}

If the device only supports 32-bit addressing for descriptors in the
coherent allocations, but supports full 64 bits for streaming mappings,
it would look like this::

	if (dma_set_mask(dev, DMA_BIT_MASK(64))) {
		dev_warn(dev, "mydev: No suitable DMA available\n");
		goto ignore_this_device;
	}

The coherent mask can always be set to the same or a smaller mask than the
streaming mask.  However, for the rare case that a device driver only uses
consistent allocations, one would have to check the return value from
dma_set_coherent_mask().

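A driver that only uses consistent allocations would therefore check that
call directly, e.g. (a minimal sketch)::

	if (dma_set_coherent_mask(dev, DMA_BIT_MASK(32))) {
		dev_warn(dev, "mydev: No suitable coherent DMA available\n");
		goto ignore_this_device;
	}
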
Finally, if your device can only drive the low 24 bits of address
you might do something like::

	if (dma_set_mask(dev, DMA_BIT_MASK(24))) {
		dev_warn(dev, "mydev: 24-bit DMA addressing not available\n");
		goto ignore_this_device;
	}

When dma_set_mask() or dma_set_mask_and_coherent() is successful, and
returns zero, the kernel saves away this mask you have provided.  The
kernel will use this information later when you make DMA mappings.

There is a case which we are aware of at this time, which is worth
mentioning in this documentation.  If your device supports multiple
functions (for example a sound card provides playback and record
functions) and the various different functions have _different_
DMA addressing limitations, you may wish to probe each mask and
only provide the functionality which the machine can handle.  It
is important that the last call to dma_set_mask() be for the
most specific mask.

Here is pseudo-code showing how this might be done::

	#define PLAYBACK_ADDRESS_BITS	DMA_BIT_MASK(32)
	#define RECORD_ADDRESS_BITS	DMA_BIT_MASK(24)

	struct my_sound_card *card;
	struct device *dev;

	...
	if (!dma_set_mask(dev, PLAYBACK_ADDRESS_BITS)) {
		card->playback_enabled = 1;
	} else {
		card->playback_enabled = 0;
		dev_warn(dev, "%s: Playback disabled due to DMA limitations\n",
		       card->name);
	}
	if (!dma_set_mask(dev, RECORD_ADDRESS_BITS)) {
		card->record_enabled = 1;
	} else {
		card->record_enabled = 0;
		dev_warn(dev, "%s: Record disabled due to DMA limitations\n",
		       card->name);
	}

A sound card was used as an example here because this genre of PCI
devices seems to be littered with ISA chips given a PCI front end,
and thus retaining the 16MB DMA addressing limitations of ISA.

Types of DMA mappings
=====================

There are two types of DMA mappings:

- Consistent DMA mappings which are usually mapped at driver
  initialization, unmapped at the end and for which the hardware should
  guarantee that the device and the CPU can access the data
  in parallel and will see updates made by each other without any
  explicit software flushing.

  Think of "consistent" as "synchronous" or "coherent".

  The current default is to return consistent memory in the low 32
  bits of the DMA space.  However, for future compatibility you should
  set the consistent mask even if this default is fine for your
  driver.

  Good examples of what to use consistent mappings for are:

	- Network card DMA ring descriptors.
	- SCSI adapter mailbox command data structures.
	- Device firmware microcode executed out of
	  main memory.

  The invariant these examples all require is that any CPU store
  to memory is immediately visible to the device, and vice
  versa.  Consistent mappings guarantee this.

  .. important::

	     Consistent DMA memory does not preclude the usage of
	     proper memory barriers.  The CPU may reorder stores to
	     consistent memory just as it may reorder stores to normal
	     memory.  Example: if it is important for the device to see
	     the first word of a descriptor updated before the second,
	     you must do something like::

		desc->word0 = address;
		wmb();
		desc->word1 = DESC_VALID;

             in order to get correct behavior on all platforms.

	     Also, on some platforms your driver may need to flush CPU write
	     buffers in much the same way as it needs to flush write buffers
	     found in PCI bridges (such as by reading a register's value
	     after writing it).

- Streaming DMA mappings which are usually mapped for one DMA
  transfer, unmapped right after it (unless you use dma_sync_* below)
  and for which hardware can optimize for sequential accesses.

  Think of "streaming" as "asynchronous" or "outside the coherency
  domain".

  Good examples of what to use streaming mappings for are:

	- Networking buffers transmitted/received by a device.
	- Filesystem buffers written/read by a SCSI device.

  The interfaces for using this type of mapping were designed in
  such a way that an implementation can make whatever performance
  optimizations the hardware allows.  To this end, when using
  such mappings you must be explicit about what you want to happen.

Neither type of DMA mapping has alignment restrictions that come from
the underlying bus, although some devices may have such restrictions.
Also, systems with caches that aren't DMA-coherent will work better
when the underlying buffers don't share cache lines with other data.


Using Consistent DMA mappings
=============================

To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
you should do::

	dma_addr_t dma_handle;

	cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, gfp);

where dev is a ``struct device *``. This may be called in interrupt
context with the GFP_ATOMIC flag.

Size is the length of the region you want to allocate, in bytes.

This routine will allocate RAM for that region, so it acts similarly to
__get_free_pages() (but takes size instead of a page order).  If your
driver needs regions sized smaller than a page, you may prefer using
the dma_pool interface, described below.

The consistent DMA mapping interfaces will by default return a DMA address
which is 32-bit addressable.  Even if the device indicates (via the DMA mask)
that it may address the upper 32-bits, consistent allocation will only
return > 32-bit addresses for DMA if the consistent DMA mask has been
explicitly changed via dma_set_coherent_mask().  This is true of the
dma_pool interface as well.

dma_alloc_coherent() returns two values: the virtual address which you
can use to access it from the CPU and dma_handle which you pass to the
card.

The CPU virtual address and the DMA address are both
guaranteed to be aligned to the smallest PAGE_SIZE order which
is greater than or equal to the requested size.  This invariant
exists (for example) to guarantee that if you allocate a chunk
which is smaller than or equal to 64 kilobytes, the extent of the
buffer you receive will not cross a 64K boundary.

To unmap and free such a DMA region, you call::

	dma_free_coherent(dev, size, cpu_addr, dma_handle);

where dev, size are the same as in the above call and cpu_addr and
dma_handle are the values dma_alloc_coherent() returned to you.
This function may not be called in interrupt context.

If your driver needs lots of smaller memory regions, you can write
custom code to subdivide pages returned by dma_alloc_coherent(),
or you can use the dma_pool API to do that.  A dma_pool is like
a kmem_cache, but it uses dma_alloc_coherent(), not __get_free_pages().
Also, it understands common hardware constraints for alignment,
like queue heads needing to be aligned on N byte boundaries.

Create a dma_pool like this::

	struct dma_pool *pool;

	pool = dma_pool_create(name, dev, size, align, boundary);

The "name" is for diagnostics (like a kmem_cache name); dev and size
are as above.  The device's hardware alignment requirement for this
type of data is "align" (which is expressed in bytes, and must be a
power of two).  If your device has no boundary crossing restrictions,
pass 0 for boundary; passing 4096 says memory allocated from this pool
must not cross 4KByte boundaries (but in that case it may be better to
use dma_alloc_coherent() directly instead).

Allocate memory from a DMA pool like this::

	cpu_addr = dma_pool_alloc(pool, flags, &dma_handle);

flags are GFP_KERNEL if blocking is permitted (not in_interrupt nor
holding SMP locks), GFP_ATOMIC otherwise.  Like dma_alloc_coherent(),
this returns two values, cpu_addr and dma_handle.

Free memory that was allocated from a dma_pool like this::

	dma_pool_free(pool, cpu_addr, dma_handle);

where pool is what you passed to dma_pool_alloc(), and cpu_addr and
dma_handle are the values dma_pool_alloc() returned. This function
may be called in interrupt context.

Destroy a dma_pool by calling::

	dma_pool_destroy(pool);

Make sure you've called dma_pool_free() for all memory allocated
from a pool before you destroy the pool. This function may not
be called in interrupt context.

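Putting the calls together, a driver that keeps small, aligned descriptors
in a pool might do roughly the following (a minimal sketch; the pool name,
sizes and labels are only illustrative)::

	struct dma_pool *pool;
	void *desc;
	dma_addr_t desc_dma;

	/* 64-byte descriptors, 64-byte aligned, no boundary restriction. */
	pool = dma_pool_create("mydev_desc", dev, 64, 64, 0);
	if (!pool)
		goto no_pool;

	desc = dma_pool_alloc(pool, GFP_KERNEL, &desc_dma);
	if (!desc)
		goto no_desc;

	/* ... hand desc_dma to the device, access desc from the CPU ... */

	dma_pool_free(pool, desc, desc_dma);
	dma_pool_destroy(pool);
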
DMA Direction
=============

The interfaces described in subsequent portions of this document
take a DMA direction argument, which is an integer and takes on
one of the following values::

 DMA_BIDIRECTIONAL
 DMA_TO_DEVICE
 DMA_FROM_DEVICE
 DMA_NONE

You should provide the exact DMA direction if you know it.

DMA_TO_DEVICE means "from main memory to the device".
DMA_FROM_DEVICE means "from the device to main memory".
It is the direction in which the data moves during the DMA
transfer.

You are _strongly_ encouraged to specify this as precisely
as you possibly can.

If you absolutely cannot know the direction of the DMA transfer,
specify DMA_BIDIRECTIONAL.  It means that the DMA can go in
either direction.  The platform guarantees that you may legally
specify this, and that it will work, but this may be at the
cost of performance for example.

The value DMA_NONE is to be used for debugging.  One can
hold this in a data structure before you come to know the
precise direction, and this will help catch cases where your
direction tracking logic has failed to set things up properly.

Another advantage of specifying this value precisely (outside of
potential platform-specific optimizations of such) is for debugging.
Some platforms actually have a write permission boolean which DMA
mappings can be marked with, much like page protections in the user
program address space.  Such platforms can and do report errors in the
kernel logs when the DMA controller hardware detects violation of the
permission setting.

Only streaming mappings specify a direction; consistent mappings
implicitly have a direction attribute setting of
DMA_BIDIRECTIONAL.

The SCSI subsystem tells you the direction to use in the
'sc_data_direction' member of the SCSI command your driver is
working on.

For Networking drivers, it's a rather simple affair.  For transmit
packets, map/unmap them with the DMA_TO_DEVICE direction
specifier.  For receive packets, just the opposite, map/unmap them
with the DMA_FROM_DEVICE direction specifier.

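For example, a transmit path might map a packet roughly like this (a
minimal sketch; dev is assumed to be the underlying struct device and skb
the socket buffer, and error handling is covered in detail below)::

	dma_addr_t mapping;

	/* The device reads the packet, so the direction is DMA_TO_DEVICE. */
	mapping = dma_map_single(dev, skb->data, skb->len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, mapping))
		goto drop_packet;

	/* A receive buffer would be mapped with DMA_FROM_DEVICE instead. */
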
Using Streaming DMA mappings
============================

The streaming DMA mapping routines can be called from interrupt
context.  There are two versions of each map/unmap, one which will
map/unmap a single memory region, and one which will map/unmap a
scatterlist.

To map a single region, you do::

	struct device *dev = &my_dev->dev;
	dma_addr_t dma_handle;
	void *addr = buffer->ptr;
	size_t size = buffer->len;

	dma_handle = dma_map_single(dev, addr, size, direction);
	if (dma_mapping_error(dev, dma_handle)) {
		/*
		 * reduce current DMA mapping usage,
		 * delay and try again later or
		 * reset driver.
		 */
		goto map_error_handling;
	}

and to unmap it::

	dma_unmap_single(dev, dma_handle, size, direction);

You should call dma_mapping_error() as dma_map_single() could fail and return
an error.  Doing so will ensure that the mapping code will work correctly on
all DMA implementations without any dependency on the specifics of the
underlying implementation. Using the returned address without checking for
errors could result in failures ranging from panics to silent data
corruption.  The same applies to dma_map_page() as well.

You should call dma_unmap_single() when the DMA activity is finished, e.g.,
from the interrupt which told you that the DMA transfer is done.

Using CPU pointers like this for single mappings has a disadvantage:
you cannot reference HIGHMEM memory in this way.  Thus, there is a
map/unmap interface pair akin to dma_{map,unmap}_single().  These
interfaces deal with page/offset pairs instead of CPU pointers.
Specifically::

	struct device *dev = &my_dev->dev;
	dma_addr_t dma_handle;
	struct page *page = buffer->page;
	unsigned long offset = buffer->offset;
	size_t size = buffer->len;

	dma_handle = dma_map_page(dev, page, offset, size, direction);
	if (dma_mapping_error(dev, dma_handle)) {
		/*
		 * reduce current DMA mapping usage,
		 * delay and try again later or
		 * reset driver.
		 */
		goto map_error_handling;
	}

	...

	dma_unmap_page(dev, dma_handle, size, direction);

Here, "offset" means byte offset within the given page.

You should call dma_mapping_error() as dma_map_page() could fail and return
an error, as outlined under the dma_map_single() discussion.

You should call dma_unmap_page() when the DMA activity is finished, e.g.,
from the interrupt which told you that the DMA transfer is done.

With scatterlists, you map a region gathered from several regions by::

	int i, count = dma_map_sg(dev, sglist, nents, direction);
	struct scatterlist *sg;

	for_each_sg(sglist, sg, count, i) {
		hw_address[i] = sg_dma_address(sg);
		hw_len[i] = sg_dma_len(sg);
	}

where nents is the number of entries in the sglist.

The implementation is free to merge several consecutive sglist entries
into one (e.g. if DMA mapping is done with PAGE_SIZE granularity, any
consecutive sglist entries can be merged into one provided the first one
ends and the second one starts on a page boundary - in fact this is a huge
advantage for cards which either cannot do scatter-gather or have a very
limited number of scatter-gather entries) and returns the actual number
of sg entries it mapped them to. On failure 0 is returned.

Then you should loop count times (note: this can be less than nents times)
and use sg_dma_address() and sg_dma_len() macros where you previously
accessed sg->address and sg->length as shown above.

To unmap a scatterlist, just call::

	dma_unmap_sg(dev, sglist, nents, direction);

Again, make sure DMA activity has already finished.

.. note::

	The 'nents' argument to the dma_unmap_sg call must be
	the _same_ one you passed into the dma_map_sg call,
	it should _NOT_ be the 'count' value _returned_ from the
	dma_map_sg call.

Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}()
counterpart, because the DMA address space is a shared resource and
you could render the machine unusable by consuming all DMA addresses.

If you need to use the same streaming DMA region multiple times and touch
the data in between the DMA transfers, the buffer needs to be synced
properly in order for the CPU and device to see the most up-to-date and
correct copy of the DMA buffer.

So, firstly, just map it with dma_map_{single,sg}(), and after each DMA
transfer call either::

	dma_sync_single_for_cpu(dev, dma_handle, size, direction);

or::

	dma_sync_sg_for_cpu(dev, sglist, nents, direction);

as appropriate.

Then, if you wish to let the device get at the DMA area again,
finish accessing the data with the CPU, and then before actually
giving the buffer to the hardware call either::

	dma_sync_single_for_device(dev, dma_handle, size, direction);

or::

	dma_sync_sg_for_device(dev, sglist, nents, direction);

as appropriate.

.. note::

	      The 'nents' argument to dma_sync_sg_for_cpu() and
	      dma_sync_sg_for_device() must be the same passed to
	      dma_map_sg(). It is _NOT_ the count returned by
	      dma_map_sg().

After the last DMA transfer call one of the DMA unmap routines
dma_unmap_{single,sg}(). If you don't touch the data from the first
dma_map_*() call till dma_unmap_*(), then you don't have to call the
dma_sync_*() routines at all.

Here is pseudo code which shows a situation in which you would need
to use the dma_sync_*() interfaces::

	my_card_setup_receive_buffer(struct my_card *cp, char *buffer, int len)
	{
		dma_addr_t mapping;

		mapping = dma_map_single(cp->dev, buffer, len, DMA_FROM_DEVICE);
		if (dma_mapping_error(cp->dev, mapping)) {
			/*
			 * reduce current DMA mapping usage,
			 * delay and try again later or
			 * reset driver.
			 */
			goto map_error_handling;
		}

		cp->rx_buf = buffer;
		cp->rx_len = len;
		cp->rx_dma = mapping;

		give_rx_buf_to_card(cp);
	}

	...

	my_card_interrupt_handler(int irq, void *devid, struct pt_regs *regs)
	{
		struct my_card *cp = devid;

		...
		if (read_card_status(cp) == RX_BUF_TRANSFERRED) {
			struct my_card_header *hp;

			/* Examine the header to see if we wish
			 * to accept the data.  But synchronize
			 * the DMA transfer with the CPU first
			 * so that we see updated contents.
			 */
			dma_sync_single_for_cpu(cp->dev, cp->rx_dma,
						cp->rx_len,
						DMA_FROM_DEVICE);

			/* Now it is safe to examine the buffer. */
			hp = (struct my_card_header *) cp->rx_buf;
			if (header_is_ok(hp)) {
				dma_unmap_single(cp->dev, cp->rx_dma, cp->rx_len,
						 DMA_FROM_DEVICE);
				pass_to_upper_layers(cp->rx_buf);
				make_and_setup_new_rx_buf(cp);
			} else {
				/* CPU should not write to
				 * DMA_FROM_DEVICE-mapped area,
				 * so dma_sync_single_for_device() is
				 * not needed here. It would be required
				 * for DMA_BIDIRECTIONAL mapping if
				 * the memory was modified.
				 */
				give_rx_buf_to_card(cp);
			}
		}
	}

Drivers converted fully to this interface should not use virt_to_bus() any
longer, nor should they use bus_to_virt(). Some drivers have to be changed a
little bit, because there is no longer an equivalent to bus_to_virt() in the
dynamic DMA mapping scheme - you have to always store the DMA addresses
returned by the dma_alloc_coherent(), dma_pool_alloc(), and dma_map_single()
calls (dma_map_sg() stores them in the scatterlist itself if the platform
supports dynamic DMA mapping in hardware) in your driver structures and/or
in the card registers.

All drivers should be using these interfaces with no exceptions.  It
is planned to completely remove virt_to_bus() and bus_to_virt() as
they are entirely deprecated.  Some ports already do not provide these
as it is impossible to correctly support them.

Handling Errors
===============

DMA address space is limited on some architectures and an allocation
failure can be determined by:

- checking if dma_alloc_coherent() returns NULL or dma_map_sg returns 0

- checking the dma_addr_t returned from dma_map_single() and dma_map_page()
  by using dma_mapping_error()::

	dma_addr_t dma_handle;

	dma_handle = dma_map_single(dev, addr, size, direction);
	if (dma_mapping_error(dev, dma_handle)) {
		/*
		 * reduce current DMA mapping usage,
		 * delay and try again later or
		 * reset driver.
		 */
		goto map_error_handling;
	}

- unmap pages that are already mapped, when a mapping error occurs in the
  middle of a multiple-page mapping attempt.  These examples are applicable
  to dma_map_page() as well.

Example 1::

	dma_addr_t dma_handle1;
	dma_addr_t dma_handle2;

	dma_handle1 = dma_map_single(dev, addr, size, direction);
	if (dma_mapping_error(dev, dma_handle1)) {
		/*
		 * reduce current DMA mapping usage,
		 * delay and try again later or
		 * reset driver.
		 */
		goto map_error_handling1;
	}
	dma_handle2 = dma_map_single(dev, addr, size, direction);
	if (dma_mapping_error(dev, dma_handle2)) {
		/*
		 * reduce current DMA mapping usage,
		 * delay and try again later or
		 * reset driver.
		 */
		goto map_error_handling2;
	}

	...

	map_error_handling2:
		dma_unmap_single(dev, dma_handle1, size, direction);
	map_error_handling1:

Example 2::

	/*
	 * if buffers are allocated in a loop, unmap all mapped buffers when
	 * mapping error is detected in the middle
	 */

	dma_addr_t dma_addr;
	dma_addr_t array[DMA_BUFFERS];
	int save_index = 0;

	for (i = 0; i < DMA_BUFFERS; i++) {

		...

		dma_addr = dma_map_single(dev, addr, size, direction);
		if (dma_mapping_error(dev, dma_addr)) {
			/*
			 * reduce current DMA mapping usage,
			 * delay and try again later or
			 * reset driver.
			 */
			goto map_error_handling;
		}
		array[i] = dma_addr;
		save_index++;
	}

	...

	map_error_handling:

	for (i = 0; i < save_index; i++) {

		...

		dma_unmap_single(dev, array[i], size, direction);
	}

Networking drivers must call dev_kfree_skb() to free the socket buffer
and return NETDEV_TX_OK if the DMA mapping fails on the transmit hook
(ndo_start_xmit). This means that the socket buffer is just dropped in
the failure case.

SCSI drivers must return SCSI_MLQUEUE_HOST_BUSY if the DMA mapping
fails in the queuecommand hook. This means that the SCSI subsystem
passes the command to the driver again later.

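In a transmit hook this looks roughly as follows (a minimal sketch;
my_dev_xmit, struct my_priv and priv->dev are only illustrative, with
priv->dev assumed to hold the underlying struct device)::

	static netdev_tx_t my_dev_xmit(struct sk_buff *skb, struct net_device *ndev)
	{
		struct my_priv *priv = netdev_priv(ndev);
		dma_addr_t mapping;

		mapping = dma_map_single(priv->dev, skb->data, skb->len,
					 DMA_TO_DEVICE);
		if (dma_mapping_error(priv->dev, mapping)) {
			/* Drop the packet; do not return an error code. */
			dev_kfree_skb(skb);
			return NETDEV_TX_OK;
		}

		/* ... hand "mapping" to the hardware and start the transfer ... */
		return NETDEV_TX_OK;
	}
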
Optimizing Unmap State Space Consumption
========================================

On many platforms, dma_unmap_{single,page}() is simply a nop.
Therefore, keeping track of the mapping address and length is a waste
of space.  Instead of filling your drivers up with ifdefs and the like
to "work around" this (which would defeat the whole purpose of a
portable API) the following facilities are provided.

Actually, instead of describing the macros one by one, we'll
transform some example code.

1) Use DEFINE_DMA_UNMAP_{ADDR,LEN} in state saving structures.
   Example, before::

	struct ring_state {
		struct sk_buff *skb;
		dma_addr_t mapping;
		__u32 len;
	};

   after::

	struct ring_state {
		struct sk_buff *skb;
		DEFINE_DMA_UNMAP_ADDR(mapping);
		DEFINE_DMA_UNMAP_LEN(len);
	};

2) Use dma_unmap_{addr,len}_set() to set these values.
   Example, before::

	ringp->mapping = FOO;
	ringp->len = BAR;

   after::

	dma_unmap_addr_set(ringp, mapping, FOO);
	dma_unmap_len_set(ringp, len, BAR);

3) Use dma_unmap_{addr,len}() to access these values.
   Example, before::

	dma_unmap_single(dev, ringp->mapping, ringp->len,
			 DMA_FROM_DEVICE);

   after::

	dma_unmap_single(dev,
			 dma_unmap_addr(ringp, mapping),
			 dma_unmap_len(ringp, len),
			 DMA_FROM_DEVICE);

It really should be self-explanatory.  We treat the ADDR and LEN
separately, because it is possible for an implementation to only
need the address in order to perform the unmap operation.

Platform Issues
===============

If you are just writing drivers for Linux and do not maintain
an architecture port for the kernel, you can safely skip down
to "Closing".

1) Struct scatterlist requirements.

   You need to enable CONFIG_NEED_SG_DMA_LENGTH if the architecture
   supports IOMMUs (including software IOMMU).

2) ARCH_DMA_MINALIGN

   Architectures must ensure that kmalloc'ed buffers are
   DMA-safe. Drivers and subsystems depend on it. If an architecture
   isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in
   the CPU cache is identical to data in main memory),
   ARCH_DMA_MINALIGN must be set so that the memory allocator
   makes sure that kmalloc'ed buffers don't share a cache line with
   others. See arch/arm/include/asm/cache.h as an example.

   Note that ARCH_DMA_MINALIGN is about DMA memory alignment
   constraints. You don't need to worry about the architecture data
   alignment constraints (e.g. the alignment constraints about 64-bit
   objects).

Closing
=======

This document, and the API itself, would not be in its current
form without the feedback and suggestions from numerous individuals.
We would like to specifically mention, in no particular order, the
following people::

	Russell King <rmk@arm.linux.org.uk>
	Leo Dagum <dagum@barrel.engr.sgi.com>
	Ralf Baechle <ralf@oss.sgi.com>
	Grant Grundler <grundler@cup.hp.com>
	Jay Estabrook <Jay.Estabrook@compaq.com>
	Thomas Sailer <sailer@ife.ee.ethz.ch>
	Andrea Arcangeli <andrea@suse.de>
	Jens Axboe <jens.axboe@oracle.com>
	David Mosberger-Tang <davidm@hpl.hp.com>