.. SPDX-License-Identifier: GPL-2.0
.. iommu:

=====================================
IOMMU Userspace API
=====================================

IOMMU UAPI is used for virtualization cases where communications are
needed between physical and virtual IOMMU drivers. For baremetal
usage, the IOMMU is a system device which does not need to communicate
with userspace directly.

The primary use cases are guest Shared Virtual Address (SVA) and
guest IO virtual address (IOVA), wherein the vIOMMU implementation
relies on the physical IOMMU and for this reason requires interactions
with the host driver.

.. contents:: :local:

Functionalities
===============
Communication between user and kernel flows in both directions. The
supported user-kernel APIs are as follows:

1. Bind/Unbind guest PASID (e.g. Intel VT-d)
2. Bind/Unbind guest PASID table (e.g. ARM SMMU)
3. Invalidate IOMMU caches upon guest requests
4. Report errors to the guest and serve page requests

Requirements
============
The IOMMU UAPIs are generic and extensible to meet the following
requirements:

1. Emulated and para-virtualised vIOMMUs
2. Multiple vendors (Intel VT-d, ARM SMMU, etc.)
3. Extensions to the UAPI shall not break existing userspace

Interfaces
==========
Although the data structures defined in IOMMU UAPI are self-contained,
no new user API functions are introduced. Instead, IOMMU UAPI is
designed to work with existing user driver frameworks such as VFIO.

Extension Rules & Precautions
-----------------------------
When IOMMU UAPI gets extended, the data structures can *only* be
modified in two ways:

1. Adding new fields by re-purposing the padding[] field. No size change.
2. Adding new union members at the end. May increase the structure sizes.

No new fields can be added *after* the variable sized union, because
doing so would break backward compatibility when the offset moves. A
new flag must be introduced whenever a change affects the structure
using either method. The IOMMU driver processes the data based on
flags, which ensures backward compatibility.

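As an illustration of the first extension method, the sketch below
carves a new field out of padding[] and guards it with a new flag. All
names here (demo_info_v1, demo_info_v2, DEMO_FLAG_NEW_FIELD,
demo_new_field_or_default) are hypothetical and do not appear in the
kernel headers; the layout is only an assumption chosen to show that the
size and the union offset stay unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure in its original layout. */
struct demo_info_v1 {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  padding[8];
	union {
		uint64_t a;
	} u;
};

/* Hypothetical flag announcing that new_field carries valid data. */
#define DEMO_FLAG_NEW_FIELD	(1u << 0)

/* Extended layout: new_field re-purposes the first 4 bytes of padding[]. */
struct demo_info_v2 {
	uint32_t argsz;
	uint32_t flags;
	uint32_t new_field;
	uint8_t  padding[4];
	union {
		uint64_t a;
	} u;
};

/* The extension must change neither the size nor the union offset. */
static_assert(sizeof(struct demo_info_v1) == sizeof(struct demo_info_v2),
	      "size changed");
static_assert(offsetof(struct demo_info_v1, u) ==
	      offsetof(struct demo_info_v2, u), "union offset moved");

/* The consumer looks at new_field only when the flag says it is valid. */
static uint32_t demo_new_field_or_default(const struct demo_info_v2 *info,
					  uint32_t def)
{
	return (info->flags & DEMO_FLAG_NEW_FIELD) ? info->new_field : def;
}
```

An older user that never sets the flag keeps working, because the
bytes it leaves zeroed in padding[] are simply not interpreted.
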
The version field is reserved only for the unlikely event of a UAPI
upgrade in its entirety.

It's *always* the caller's responsibility to indicate the size of the
structure passed by setting argsz appropriately.
At the same time, argsz is user-provided data and is not trusted. The
argsz field allows the user app to indicate how much data it is
providing; it's still the kernel's responsibility to validate whether
it's correct and sufficient for the requested operation.

Compatibility Checking
----------------------
When an IOMMU UAPI extension results in a structure size increase,
IOMMU UAPI code shall handle the following cases:

1. User and kernel have an exact size match
2. An older user with an older kernel header (smaller UAPI size) running
   on a newer kernel (larger UAPI size)
3. A newer user with a newer kernel header (larger UAPI size) running
   on an older kernel
4. A malicious/misbehaving user passing an illegal/invalid size that is
   still within range. The data may contain garbage.

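The four cases above can be sketched in plain C as follows. This is a
userspace model, not the kernel implementation: struct demo_uapi and
demo_copy_uapi() are hypothetical, memcpy() stands in for
copy_from_user(), and src_len models how much user memory is actually
readable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical UAPI structure as known to "this kernel". */
struct demo_uapi {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  padding[8];
	union {
		uint64_t small;
		uint64_t large[2];
	} u;
};

/* Copy user data into *dst; return bytes copied or a negative errno. */
static int demo_copy_uapi(struct demo_uapi *dst, const void *src,
			  size_t src_len)
{
	/* Everything before the variable sized union is mandatory. */
	const size_t minsz = offsetof(struct demo_uapi, u);
	uint32_t argsz;
	size_t n;

	if (src_len < sizeof(argsz))
		return -EFAULT;
	memcpy(&argsz, src, sizeof(argsz));

	/* Case 4: size too small, or larger than the readable buffer. */
	if (argsz < minsz || argsz > src_len)
		return -EINVAL;

	/*
	 * Cases 1-3: copy no more than this kernel understands. A smaller
	 * (older) user leaves the tail zeroed; the tail of a larger (newer)
	 * user is ignored and must be covered by flag checks.
	 */
	n = argsz < sizeof(*dst) ? argsz : sizeof(*dst);
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, n);
	return (int)n;
}
```

Passing the size check says nothing about the content; the flags in the
copied data still have to be validated before any field is consumed.
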
Feature Checking
----------------
When launching a guest with a vIOMMU, it is strongly advised to check
compatibility upfront, because some errors that occur later during
vIOMMU operation, such as cache invalidation failures, cannot be nicely
escalated to the guest due to IOMMU specifications. This can lead to
catastrophic failures for the users.

User applications such as QEMU are expected to import kernel UAPI
headers. Backward compatibility is supported per feature flag.
For example, an older QEMU (with an older kernel header) can run on a
newer kernel. A newer QEMU (with a newer kernel header) may refuse to
initialize on an older kernel if the new feature flags are not supported
by that kernel. Simply recompiling existing code with a newer kernel
header should not be an issue, because only existing flags are used.

The IOMMU vendor driver should report the features below to IOMMU UAPI
consumers (e.g. via VFIO).

1. IOMMU_NESTING_FEAT_SYSWIDE_PASID
2. IOMMU_NESTING_FEAT_BIND_PGTBL
3. IOMMU_NESTING_FEAT_BIND_PASID_TABLE
4. IOMMU_NESTING_FEAT_CACHE_INVLD
5. IOMMU_NESTING_FEAT_PAGE_REQUEST

Taking VFIO as an example, upon request from VFIO userspace (e.g. QEMU),
VFIO kernel code shall query the IOMMU vendor driver for support of
the above features. The query result can then be reported back to the
userspace caller. Details can be found in
Documentation/driver-api/vfio.rst.

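A user application could perform the upfront check described above
along these lines. This is a minimal sketch: the DEMO_FEAT_* names only
mirror the feature list, their bit positions are placeholders rather
than values taken from a kernel header, and demo_features_ok() is a
hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions; real values would come from the UAPI header. */
#define DEMO_FEAT_SYSWIDE_PASID		(1u << 0)
#define DEMO_FEAT_BIND_PGTBL		(1u << 1)
#define DEMO_FEAT_BIND_PASID_TABLE	(1u << 2)
#define DEMO_FEAT_CACHE_INVLD		(1u << 3)
#define DEMO_FEAT_PAGE_REQUEST		(1u << 4)

/* True only if every feature the vIOMMU needs is reported by the host. */
static bool demo_features_ok(uint32_t reported, uint32_t required)
{
	return (reported & required) == required;
}
```

A caller would refuse to launch the guest when the check fails, rather
than discover the missing feature during vIOMMU operation.
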
Data Passing Example with VFIO
------------------------------
As the ubiquitous userspace driver framework, VFIO is already IOMMU
aware and shares many key concepts such as device model, group, and
protection domain. Other user driver frameworks can also be extended
to support IOMMU UAPI, but that is outside the scope of this document.

In this tight-knit VFIO-IOMMU interface, the ultimate consumer of the
IOMMU UAPI data is the host IOMMU driver. VFIO facilitates user-kernel
transport, capability checking, security, and life cycle management of
process address space ID (PASID).

The VFIO layer conveys the data structures down to the IOMMU driver. It
follows the pattern below::

   struct {
	__u32 argsz;
	__u32 flags;
	__u8  data[];
   };

Here data[] contains the IOMMU UAPI data structures. VFIO is free to
bundle the data and to parse the data size based on its own flags.

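The wrapper pattern above can be unpacked roughly as follows. The
sketch is illustrative only: struct demo_vfio_wrapper, demo_payload()
and the buf_len parameter are hypothetical names, and real VFIO code
would also interpret its own flags before handing data[] to the IOMMU
layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wrapper layout from the text; the tag name is hypothetical. */
struct demo_vfio_wrapper {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  data[];
};

/*
 * Return a pointer to the embedded payload and store its length in *len.
 * Return NULL when argsz cannot cover the wrapper header or claims more
 * bytes than the buffer actually holds (buf_len).
 */
static const uint8_t *demo_payload(const struct demo_vfio_wrapper *w,
				   size_t buf_len, size_t *len)
{
	const size_t hdr = offsetof(struct demo_vfio_wrapper, data);

	if (buf_len < hdr || w->argsz < hdr || w->argsz > buf_len)
		return NULL;
	*len = w->argsz - hdr;
	return w->data;
}
```
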
In order to determine the size and feature set of the user data, argsz
and flags (or the equivalent) are also embedded in the IOMMU UAPI data
structures.

A "__u32 argsz" field is *always* at the beginning of each structure.

For example::

   struct iommu_cache_invalidate_info {
	__u32	argsz;
	#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
	__u32	version;
	/* IOMMU paging structure cache */
	#define IOMMU_CACHE_INV_TYPE_IOTLB	(1 << 0) /* IOMMU IOTLB */
	#define IOMMU_CACHE_INV_TYPE_DEV_IOTLB	(1 << 1) /* Device IOTLB */
	#define IOMMU_CACHE_INV_TYPE_PASID	(1 << 2) /* PASID cache */
	#define IOMMU_CACHE_INV_TYPE_NR		(3)
	__u8	cache;
	__u8	granularity;
	__u8	padding[6];
	union {
		struct iommu_inv_pasid_info pasid_info;
		struct iommu_inv_addr_info addr_info;
	} granu;
   };

VFIO is responsible for checking its own argsz and flags. It then
invokes the appropriate IOMMU UAPI functions. The user pointers are
passed to the IOMMU layer for further processing. The responsibilities
are divided as follows:

- The generic IOMMU layer checks the argsz range based on the UAPI data
  in the current kernel version.

- The generic IOMMU layer checks the content of the UAPI data for
  non-zero reserved bits in flags, non-zero padding fields, and an
  unsupported version. This ensures that userspace is not broken in the
  future when these fields or flags are put to use.

- The vendor IOMMU driver checks argsz based on vendor flags. UAPI data
  is consumed based on flags. The vendor driver has access to the
  unadulterated argsz value in case of vendor specific future
  extensions. Currently, it does not perform the copy_from_user()
  itself. A __user pointer can be provided in some future scenarios
  where there's vendor data outside of the structure definition.

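The content check performed by the generic layer can be pictured as
below. This is a sketch under stated assumptions, not the kernel code:
struct demo_hdr, DEMO_VERSION_1, DEMO_FLAGS_KNOWN and
demo_check_reserved() are hypothetical, and the set of known flag bits
is invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VERSION_1		1
#define DEMO_FLAGS_KNOWN	0x3u	/* flag bits this kernel understands */

/* Hypothetical fixed part of a UAPI structure. */
struct demo_hdr {
	uint32_t argsz;
	uint32_t version;
	uint32_t flags;
	uint8_t  padding[4];
};

/* Reject anything that uses fields or bits reserved for the future. */
static int demo_check_reserved(const struct demo_hdr *h)
{
	size_t i;

	if (h->version != DEMO_VERSION_1)
		return -EINVAL;
	if (h->flags & ~DEMO_FLAGS_KNOWN)
		return -EINVAL;
	for (i = 0; i < sizeof(h->padding); i++)
		if (h->padding[i])
			return -EINVAL;
	return 0;
}
```

Rejecting non-zero reserved content today is what allows those bits and
bytes to be given a meaning later without misreading garbage from
existing users.
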
IOMMU code treats UAPI data in two categories:

- structure contains vendor data
  (Example: iommu_uapi_cache_invalidate())

- structure contains only generic data
  (Example: iommu_uapi_sva_bind_gpasid())

Sharing UAPI with in-kernel users
---------------------------------
For UAPIs that are shared with in-kernel users, a wrapper function is
provided to distinguish the callers. For example,

Userspace caller ::

  int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
                                   struct device *dev,
                                   void __user *udata);

In-kernel caller ::

  int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
                              struct device *dev, ioasid_t ioasid);
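
The split between the two entry points can be modelled in userspace as
shown below. Everything in the sketch is hypothetical (struct
demo_unbind_data, demo_unbind(), demo_uapi_unbind(),
demo_last_unbound); memcpy() stands in for copy_from_user(), and the
real functions take a domain and a device, which are omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the data a user would pass in. */
struct demo_unbind_data {
	uint32_t argsz;
	uint32_t flags;
	uint64_t pasid;
};

static uint64_t demo_last_unbound;	/* records the call for illustration */

/* In-kernel entry point: a trusted caller passes the ID directly. */
static int demo_unbind(uint64_t pasid)
{
	demo_last_unbound = pasid;
	return 0;
}

/* User-facing wrapper: validate the untrusted buffer, then delegate. */
static int demo_uapi_unbind(const void *udata, size_t len)
{
	struct demo_unbind_data d;

	if (len < sizeof(d))
		return -EFAULT;
	memcpy(&d, udata, sizeof(d));	/* models copy_from_user() */
	if (d.argsz < sizeof(d) || d.flags)
		return -EINVAL;
	return demo_unbind(d.pasid);
}
```

Keeping the validation in the wrapper means in-kernel callers never pay
for, or depend on, the handling of untrusted user data.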