USB DMA
~~~~~~~

In Linux 2.5 kernels (and later), USB device drivers have additional control
over how DMA may be used to perform I/O operations. The APIs are detailed
in the kernel usb programming guide (kerneldoc, from the source code).

API overview
============

The big picture is that USB drivers can continue to ignore most DMA issues,
though they still must provide DMA-ready buffers (see
:doc:`/core-api/dma-api-howto`). That's how they've worked through
the 2.4 (and earlier) kernels, or they can now be DMA-aware.

DMA-aware usb drivers:

- New calls enable DMA-aware drivers, letting them allocate dma buffers and
  manage dma mappings for existing dma-ready buffers (see below).

- URBs have an additional "transfer_dma" field, as well as a transfer_flags
  bit saying if it's valid. (Control requests also have "setup_dma", but
  drivers must not use it.)

- "usbcore" will map this DMA address, if a DMA-aware driver didn't do
  it first and set ``URB_NO_TRANSFER_DMA_MAP``. HCDs
  don't manage dma mappings for URBs.

- There's a new "generic DMA API", parts of which are usable by USB device
  drivers. Never use dma_set_mask() on any USB interface or device; that
  would potentially break all devices sharing that bus.
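To make the default, DMA-unaware path concrete: a driver just hands usbcore a
kmalloc'd buffer and lets the core map and unmap it around the transfer. A
minimal sketch (the function and completion-handler names here are invented
for illustration, and error handling is pared down)::

	static void example_complete(struct urb *urb)	/* hypothetical */
	{
		kfree(urb->transfer_buffer);
	}

	static int example_bulk_read(struct usb_device *udev,
				     unsigned int pipe, size_t len)
	{
		struct urb *urb = usb_alloc_urb(0, GFP_KERNEL);
		void *buf = kmalloc(len, GFP_KERNEL);	/* DMA-ready memory */
		int ret;

		if (!urb || !buf) {
			usb_free_urb(urb);
			kfree(buf);
			return -ENOMEM;
		}

		/* URB_NO_TRANSFER_DMA_MAP is *not* set, so usbcore maps
		 * buf for DMA at submit time and unmaps it on completion.
		 */
		usb_fill_bulk_urb(urb, udev, pipe, buf, len,
				  example_complete, NULL);
		ret = usb_submit_urb(urb, GFP_KERNEL);
		usb_free_urb(urb);	/* drop our ref; core holds its own */
		if (ret)
			kfree(buf);
		return ret;
	}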
Eliminating copies
==================

It's good to avoid making CPUs copy data needlessly. The costs can add up,
and effects like cache-thrashing can impose subtle penalties.

- If you're doing lots of small data transfers from the same buffer all
  the time, that can really burn up resources on systems which use an
  IOMMU to manage the DMA mappings. It can cost MUCH more to set up and
  tear down the IOMMU mappings with each request than to perform the I/O!

  For those specific cases, USB has primitives to allocate less expensive
  memory. They work like kmalloc and kfree versions that give you the right
  kind of addresses to store in urb->transfer_buffer and urb->transfer_dma.
  You'd also set ``URB_NO_TRANSFER_DMA_MAP`` in urb->transfer_flags::

	void *usb_alloc_coherent (struct usb_device *dev, size_t size,
		int mem_flags, dma_addr_t *dma);

	void usb_free_coherent (struct usb_device *dev, size_t size,
		void *addr, dma_addr_t dma);

  Most drivers should **NOT** be using these primitives; they don't need
  to use this type of memory ("dma-coherent"), and memory returned from
  :c:func:`kmalloc` will work just fine.

  The memory buffer returned is "dma-coherent"; sometimes you might need to
  force a consistent memory access ordering by using memory barriers.
  It's not using a streaming DMA mapping, so it's good for small transfers
  on systems where the I/O would otherwise thrash an IOMMU mapping. (See
  :doc:`/core-api/dma-api-howto` for definitions of "coherent" and
  "streaming" DMA mappings.)

  Asking for 1/Nth of a page (as well as asking for N pages) is reasonably
  space-efficient.

  On most systems the memory returned will be uncached, because the
  semantics of dma-coherent memory require either bypassing CPU caches
  or using cache hardware with bus-snooping support. While x86 hardware
  has such bus-snooping, many other systems use software to flush cache
  lines to prevent DMA conflicts.

- Devices on some EHCI controllers could handle DMA to/from high memory.

  Unfortunately, the current Linux DMA infrastructure doesn't have a sane
  way to expose these capabilities ... and in any case, HIGHMEM is mostly a
  design wart specific to x86_32. So your best bet is to ensure you never
  pass a highmem buffer into a USB driver. That's easy; it's the default
  behavior. Just don't override it; e.g. with ``NETIF_F_HIGHDMA``.

  This may force your callers to do some bounce buffering, copying from
  high memory to "normal" DMA memory. If you can come up with a good way
  to fix this issue (for x86_32 machines with over 1 GByte of memory),
  feel free to submit patches.
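Putting the coherent-memory primitives above together, a driver that really
does want dma-coherent memory might prepare its urb roughly like this (the
function names are invented for illustration, and error handling is
minimal)::

	static int example_setup_coherent(struct usb_device *udev,
					  struct urb *urb, size_t len)
	{
		urb->transfer_buffer = usb_alloc_coherent(udev, len,
							  GFP_KERNEL,
							  &urb->transfer_dma);
		if (!urb->transfer_buffer)
			return -ENOMEM;

		/* Tell usbcore the DMA address is already valid. */
		urb->transfer_flags |= URB_NO_TRANSFER_DMA_MAP;
		return 0;
	}

	/* ... and when the buffer is no longer needed: */
	static void example_teardown_coherent(struct usb_device *udev,
					      struct urb *urb, size_t len)
	{
		usb_free_coherent(udev, len, urb->transfer_buffer,
				  urb->transfer_dma);
	}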
Working with existing buffers
=============================

Existing buffers aren't usable for DMA without first being mapped into the
DMA address space of the device. However, most buffers passed to your
driver can safely be used with such DMA mapping. (See the first section
of :doc:`/core-api/dma-api-howto`, titled "What memory is DMA-able?")

- When you're using scatterlists, you can map everything at once. On some
  systems, this kicks in an IOMMU and turns the scatterlists into single
  DMA transactions::

	int usb_buffer_map_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int nents);

	void usb_buffer_dmasync_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int n_hw_ents);

	void usb_buffer_unmap_sg (struct usb_device *dev, unsigned pipe,
		struct scatterlist *sg, int n_hw_ents);

  It's probably easier to use the new ``usb_sg_*()`` calls, which do the
  DMA mapping and apply other tweaks to make scatterlist I/O fast.

- Some drivers may prefer to work with the model that they're mapping large
  buffers, synchronizing their safe re-use. (If there's no re-use, then let
  usbcore do the map/unmap.) Large periodic transfers make good examples
  here, since it's cheaper to just synchronize the buffer than to unmap it
  each time an urb completes and then re-map it during resubmission.
  These calls all work with initialized urbs: ``urb->dev``, ``urb->pipe``,
  ``urb->transfer_buffer``, and ``urb->transfer_buffer_length`` must all be
  valid when these calls are used (``urb->setup_packet`` must be valid too
  if urb is a control request)::

	struct urb *usb_buffer_map (struct urb *urb);

	void usb_buffer_dmasync (struct urb *urb);

	void usb_buffer_unmap (struct urb *urb);

  The calls manage ``urb->transfer_dma`` for you, and set
  ``URB_NO_TRANSFER_DMA_MAP`` so that usbcore won't map or unmap the
  buffer. They cannot be used for setup_packet buffers in control requests.

Note that several of those interfaces are currently commented out, since
they don't have current users. See the source code. Other than the dmasync
calls (where the underlying DMA primitives have changed), most of them can
easily be commented back in if you want to use them.
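For scatter/gather transfers, the ``usb_sg_*()`` helpers recommended above
are the simplest route. A sketch of a synchronous request (the function name
is hypothetical; it assumes the scatterlist has already been populated)::

	static int example_sg_xfer(struct usb_device *udev, unsigned int pipe,
				   struct scatterlist *sg, int nents)
	{
		struct usb_sg_request io;
		int ret;

		/* Maps the whole scatterlist and queues the urbs;
		 * period 0 means bulk, length 0 means "use the sg lengths".
		 */
		ret = usb_sg_init(&io, udev, pipe, 0, sg, nents, 0,
				  GFP_KERNEL);
		if (ret)
			return ret;

		usb_sg_wait(&io);	/* blocks until the request completes */
		return io.status;	/* 0 on success, else a negative errno */
	}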