Buffer Sharing and Synchronization
==================================

The dma-buf subsystem provides the framework for sharing buffers for
hardware (DMA) access across multiple device drivers and subsystems, and
for synchronizing asynchronous hardware access.

This is used, for example, by drm "prime" multi-GPU support, but is of
course not limited to GPU use cases.

The three main components of this are: (1) dma-buf, representing a
sg_table and exposed to userspace as a file descriptor to allow passing
between devices, (2) fence, which provides a mechanism to signal when
one device has finished access, and (3) reservation, which manages the
shared or exclusive fence(s) associated with the buffer.

Shared DMA Buffers
------------------

This document serves as a guide for device-driver writers on what the dma-buf
buffer sharing API is and how to use it for exporting and using shared buffers.

Any device driver that wishes to take part in DMA buffer sharing can do so as
either the 'exporter' of buffers or the 'user' or 'importer' of buffers.

Say a driver A wants to use buffers created by driver B; we then call B the
exporter and A the buffer-user/importer.

The exporter

 - implements and manages operations in :c:type:`struct dma_buf_ops
   <dma_buf_ops>` for the buffer,
 - allows other users to share the buffer by using dma_buf sharing APIs,
 - manages the details of buffer allocation, wrapped in a :c:type:`struct
   dma_buf <dma_buf>`,
 - decides about the actual backing storage where this allocation happens,
 - and takes care of any migration of scatterlist - for all (shared) users of
   this buffer.

The buffer-user

 - is one of (many) sharing users of the buffer,
 - doesn't need to worry about how the buffer is allocated, or where,
 - and needs a mechanism to get access to the scatterlist that makes up this
   buffer in memory, mapped into its own address space, so it can access the
   same area of memory. This interface is provided by :c:type:`struct
   dma_buf_attachment <dma_buf_attachment>`.

Any exporters or users of the dma-buf buffer sharing framework must have a
'select DMA_SHARED_BUFFER' in their respective Kconfigs.

Userspace Interface Notes
~~~~~~~~~~~~~~~~~~~~~~~~~

Mostly a DMA buffer file descriptor is simply an opaque object for userspace,
and hence the generic interface exposed is very minimal. There are a few things
to consider though:

- Since kernel 3.12 the dma-buf FD supports the llseek system call, but only
  with offset=0 and whence set to either SEEK_END or SEEK_SET. SEEK_SET is
  supported to allow the usual size discovery pattern size = SEEK_END(0);
  SEEK_SET(0). Every other llseek operation will report -EINVAL.

  If llseek on dma-buf FDs isn't supported, the kernel will report -ESPIPE for
  all cases. Userspace can use this to detect support for discovering the
  dma-buf size using llseek.

- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
  on the file descriptor.  This is not just a resource leak, but a
  potential security hole.  It could give the newly exec'd application
  access to buffers, via the leaked fd, to which it should otherwise
  not be permitted access.

  The problem with doing this via a separate fcntl() call, versus doing it
  atomically when the fd is created, is that this is inherently racy in a
  multi-threaded app[3].  The issue is made worse when it is library code
  opening/creating the file descriptor, as the application may not even be
  aware of the fds.

  To avoid this problem, userspace must have a way to request that the
  O_CLOEXEC flag be set when the dma-buf fd is created.  So any API provided by
  the exporting driver to create a dmabuf fd must provide a way to let
  userspace control setting of the O_CLOEXEC flag passed in to dma_buf_fd().

- Memory mapping the contents of the DMA buffer is also supported. See the
  discussion below on `CPU Access to DMA Buffer Objects`_ for the full details.

- The DMA buffer FD is also pollable, see `Implicit Fence Poll Support`_ below for
  details.

Basic Operation and Device DMA Access
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: dma buf device access

CPU Access to DMA Buffer Objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: cpu access

Implicit Fence Poll Support
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: implicit fence polling

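From userspace, polling a DMA buffer FD uses the regular poll() system call.
The sketch below is illustrative only: ``example_wait_fd()`` is a hypothetical
helper, and the mapping of POLLIN to the most recent write (exclusive) fence
and of POLLOUT to all attached fences is an assumption taken from the implicit
fence polling description in drivers/dma-buf/dma-buf.c, so it should be
checked against the kernel version in use.

```c
#include <errno.h>
#include <poll.h>

/*
 * Wait until fd reports one of the requested poll events.
 * Assumption for dma-buf fds: POLLIN waits for the exclusive fence,
 * POLLOUT waits for all attached fences.
 * Returns 1 when ready, 0 on timeout, negative errno on error.
 */
int example_wait_fd(int fd, short events, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = events };
	int ret;

	/* Simplification: an interrupted wait restarts with the full timeout. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -errno;
	if (ret == 0)
		return 0;
	/* POLLERR, POLLHUP and POLLNVAL can be reported without being requested. */
	return (pfd.revents & events) ? 1 : -EIO;
}
```

Under the stated assumption, a reader that only needs the latest contents
would call ``example_wait_fd(fd, POLLIN, timeout_ms)``, while a writer that is
about to overwrite the buffer would wait with ``POLLOUT``.
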
Kernel Functions and Structures Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :export:

.. kernel-doc:: include/linux/dma-buf.h
   :internal:

Reservation Objects
-------------------

.. kernel-doc:: drivers/dma-buf/dma-resv.c
   :doc: Reservation Object Overview

.. kernel-doc:: drivers/dma-buf/dma-resv.c
   :export:

.. kernel-doc:: include/linux/dma-resv.h
   :internal:

DMA Fences
----------

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: DMA fences overview

DMA Fence Cross-Driver Contract
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: fence cross-driver contract

DMA Fence Signalling Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: fence signalling annotation

DMA Fences Functions Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :export:

.. kernel-doc:: include/linux/dma-fence.h
   :internal:

Seqno Hardware Fences
~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: include/linux/seqno-fence.h
   :internal:

DMA Fence Array
~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence-array.c
   :export:

.. kernel-doc:: include/linux/dma-fence-array.h
   :internal:

DMA Fence uABI/Sync File
~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/sync_file.c
   :export:

.. kernel-doc:: include/linux/sync_file.h
   :internal:

Indefinite DMA Fences
~~~~~~~~~~~~~~~~~~~~~

At various times, &dma_fence objects with an indefinite time until
dma_fence_wait() finishes have been proposed. Examples include:

* Future fences, used in HWC1 to signal when a buffer isn't used by the display
  any longer, and created with the screen update that makes the buffer visible.
  The time this fence completes is entirely under userspace's control.

* Proxy fences, proposed to handle &drm_syncobj for which the fence has not yet
  been set. Used to asynchronously delay command submission.

* Userspace fences or gpu futexes, fine-grained locking within a command buffer
  that userspace uses for synchronization across engines or with the CPU, which
  are then imported as a DMA fence for integration into existing winsys
  protocols.

* Long-running compute command buffers, while still using traditional end of
  batch DMA fences for memory management instead of context preemption DMA
  fences which get reattached when the compute job is rescheduled.

Common to all these schemes is that userspace controls the dependencies of these
fences and controls when they fire. Mixing indefinite fences with normal
in-kernel DMA fences does not work, even when a fallback timeout is included to
protect against malicious userspace:

* Only the kernel knows about all DMA fence dependencies; userspace is not aware
  of dependencies injected due to memory management or scheduler decisions.

* Only userspace knows about all dependencies in indefinite fences and when
  exactly they will complete; the kernel has no visibility.

Furthermore the kernel has to be able to hold up userspace command submission
for memory management needs, which means we must support indefinite fences being
dependent upon DMA fences. If the kernel also supports indefinite fences in the
kernel like a DMA fence, as any of the above proposals would, there is the
potential for deadlocks.

.. kernel-render:: DOT
   :alt: Indefinite Fencing Dependency Cycle
   :caption: Indefinite Fencing Dependency Cycle

   digraph "Fencing Cycle" {
      node [shape=box bgcolor=grey style=filled]
      kernel [label="Kernel DMA Fences"]
      userspace [label="userspace controlled fences"]
      kernel -> userspace [label="memory management"]
      userspace -> kernel [label="Future fence, fence proxy, ..."]

      { rank=same; kernel userspace }
   }

This means that the kernel might accidentally create deadlocks
through memory management dependencies which userspace is unaware of, which
randomly hangs workloads until the timeout kicks in, even though these
workloads do not contain a deadlock from userspace's perspective.  In such a
mixed fencing architecture there is no single entity with knowledge of all
dependencies. Therefore preventing such deadlocks from within the kernel is not
possible.

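The argument can be illustrated with a toy model that is not kernel code. The
sketch below represents "waits for" relations as a small directed graph and
runs a depth-first cycle check on it. All names (``example_edge``,
``example_has_wait_cycle()``, the node identifiers and the three edge tables)
are made up for illustration. The dependencies visible to the kernel alone and
to userspace alone are each free of cycles; only the combined graph, which
neither side can see in full, contains the loop.

```c
#include <stdbool.h>
#include <stddef.h>

enum { EXAMPLE_MAX_NODES = 8 };	/* node ids must be below this value */
enum { EXAMPLE_KERNEL_FENCE = 0, EXAMPLE_USER_FENCE = 1 };

struct example_edge {
	int waiter;	/* this node waits ... */
	int signaler;	/* ... until this node has signalled */
};

/* Memory management: userspace-controlled work is held up by a DMA fence. */
const struct example_edge example_kernel_view[] = {
	{ EXAMPLE_USER_FENCE, EXAMPLE_KERNEL_FENCE },
};

/* Future/proxy fence: a DMA fence completes only once userspace fires. */
const struct example_edge example_user_view[] = {
	{ EXAMPLE_KERNEL_FENCE, EXAMPLE_USER_FENCE },
};

/* What actually exists at runtime: both sets of dependencies together. */
const struct example_edge example_combined[] = {
	{ EXAMPLE_USER_FENCE, EXAMPLE_KERNEL_FENCE },
	{ EXAMPLE_KERNEL_FENCE, EXAMPLE_USER_FENCE },
};

/* Depth-first search; state: 0 = unvisited, 1 = on the stack, 2 = done. */
static bool example_visit(int node, const struct example_edge *edges,
			  size_t n_edges, int *state)
{
	state[node] = 1;
	for (size_t i = 0; i < n_edges; i++) {
		int next;

		if (edges[i].waiter != node)
			continue;
		next = edges[i].signaler;
		if (state[next] == 1)
			return true;	/* back edge: a wait cycle */
		if (state[next] == 0 &&
		    example_visit(next, edges, n_edges, state))
			return true;
	}
	state[node] = 2;
	return false;
}

/* Returns true if the "waits for" graph contains a cycle. */
bool example_has_wait_cycle(const struct example_edge *edges, size_t n_edges)
{
	int state[EXAMPLE_MAX_NODES] = { 0 };

	for (int node = 0; node < EXAMPLE_MAX_NODES; node++)
		if (state[node] == 0 &&
		    example_visit(node, edges, n_edges, state))
			return true;
	return false;
}
```

In this model a cycle check over ``example_kernel_view`` or
``example_user_view`` alone finds nothing, while the same check over
``example_combined`` reports the loop, mirroring the point that no single
entity has the full graph to check.
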
The only solution to avoid dependency loops is to not allow indefinite
fences in the kernel. This means:

* No future fences, proxy fences or userspace fences imported as DMA fences,
  with or without a timeout.

* No DMA fences that signal end of batchbuffer for command submission where
  userspace is allowed to use userspace fencing or long running compute
  workloads. This also means no implicit fencing for shared buffers in these
  cases.