===============================
Adjunct Processor (AP) facility
===============================


Introduction
============
The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprising
three AP instructions and from 1 up to 256 PCIe cryptographic adapter cards.
The AP devices provide cryptographic functions to all CPUs assigned to a
Linux system running in an IBM Z system LPAR.

The AP adapter cards are exposed via the AP bus. The motivation for vfio-ap
is to make AP cards available to KVM guests using the VFIO mediated device
framework. This implementation relies considerably on the s390 virtualization
facilities which do most of the hard work of providing direct access to AP
devices.

AP Architectural Overview
=========================
To facilitate the comprehension of the design, let's start with some
definitions:

* AP adapter

  An AP adapter is an IBM Z adapter card that can perform cryptographic
  functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
  assigned to the LPAR in which a Linux host is running will be available to
  the Linux host. Each adapter is identified by a number from 0 to 255; however,
  the maximum adapter number is determined by machine model and/or adapter type.
  When installed, an AP adapter is accessed by AP instructions executed by any
  CPU.

  The AP adapter cards are assigned to a given LPAR via the system's Activation
  Profile which can be edited via the HMC. When the Linux host system is IPL'd
  in the LPAR, the AP bus detects the AP adapter cards assigned to the LPAR and
  creates a sysfs device for each assigned adapter. For example, if AP adapters
  4 and 10 (0x0a) are assigned to the LPAR, the AP bus will create the following
  sysfs device entries::

    /sys/devices/ap/card04
    /sys/devices/ap/card0a

  Symbolic links to these devices will also be created in the AP bus devices
  sub-directory::

    /sys/bus/ap/devices/[card04]
    /sys/bus/ap/devices/[card0a]

* AP domain

  An adapter is partitioned into domains. An adapter can hold up to 256 domains
  depending upon the adapter type and hardware configuration. A domain is
  identified by a number from 0 to 255; however, the maximum domain number is
  determined by machine model and/or adapter type. A domain can be thought of
  as a set of hardware registers and memory used for processing AP commands. A
  domain can be configured with a secure private key used for clear key
  encryption. A domain is classified in one of two ways depending upon how it
  may be accessed:

    * Usage domains are domains that are targeted by an AP instruction to
      process an AP command.

    * Control domains are domains that are changed by an AP command sent to a
      usage domain; for example, to set the secure private key for the control
      domain.

  The AP usage and control domains are assigned to a given LPAR via the system's
  Activation Profile which can be edited via the HMC. When a Linux host system
  is IPL'd in the LPAR, the AP bus module detects the AP usage and control
  domains assigned to the LPAR. The domain number of each usage domain and
  adapter number of each AP adapter are combined to create AP queue devices
  (see AP Queue section below). The domain number of each control domain will be
  represented in a bitmask and stored in the sysfs file
  /sys/bus/ap/ap_control_domain_mask. The bits in the mask, from most to least
  significant bit, correspond to domains 0-255.

* AP Queue

  An AP queue is the means by which an AP command is sent to a usage domain
  inside a specific adapter. An AP queue is identified by a tuple
  consisting of an AP adapter ID (APID) and an AP queue index (APQI). The
  APQI corresponds to a given usage domain number within the adapter. This tuple
  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
  instructions include a field containing the APQN to identify the AP queue to
  which the AP command is to be sent for processing.

  The AP bus will create a sysfs device for each APQN that can be derived from
  the cross product of the AP adapter and usage domain numbers detected when the
  AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
  domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
  following sysfs entries::

    /sys/devices/ap/card04/04.0006
    /sys/devices/ap/card04/04.0047
    /sys/devices/ap/card0a/0a.0006
    /sys/devices/ap/card0a/0a.0047

  The following symbolic links to these devices will be created in the AP bus
  devices subdirectory::

    /sys/bus/ap/devices/[04.0006]
    /sys/bus/ap/devices/[04.0047]
    /sys/bus/ap/devices/[0a.0006]
    /sys/bus/ap/devices/[0a.0047]

* AP Instructions

  There are three AP instructions:

  * NQAP: to enqueue an AP command-request message to a queue
  * DQAP: to dequeue an AP command-reply message from a queue
  * PQAP: to administer the queues

  AP instructions identify the domain that is targeted to process the AP
  command; this must be one of the usage domains. An AP command may modify a
  domain that is not one of the usage domains, but the modified domain
  must be one of the control domains.

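The sysfs naming described in the definitions above can be sketched in a few
lines of Python. This is only an illustrative model, not kernel code: the helper
names ``card_name`` and ``queue_names`` are hypothetical, and the two-digit and
four-digit lowercase hexadecimal formatting is inferred from the example entries
listed above.

```python
from itertools import product


def _check_id(number, what):
    """Reject IDs outside the architected 0-255 range."""
    if not 0 <= number <= 255:
        raise ValueError(f"{what} must be in 0..255, got {number}")
    return number


def card_name(apid):
    """Return the card device name for an adapter number, e.g. 10 -> 'card0a'."""
    return f"card{_check_id(apid, 'adapter number'):02x}"


def queue_names(adapters, usage_domains):
    """Return queue device names for the cross product of adapters and domains."""
    return [
        f"{_check_id(apid, 'adapter number'):02x}."
        f"{_check_id(apqi, 'domain number'):04x}"
        for apid, apqi in product(sorted(adapters), sorted(usage_domains))
    ]
```

Calling ``queue_names([4, 10], [6, 71])`` yields the four queue names used in
the AP Queue example above.
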
AP and SIE
==========
Let's now take a look at how AP instructions executed on a guest are interpreted
by the hardware.

A satellite control block called the Crypto Control Block (CRYCB) is attached to
our main hardware virtualization control block. The CRYCB contains three fields
to identify the adapters, usage domains and control domains assigned to the KVM
guest:

* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
  to the KVM guest. Each bit in the mask, from left to right (i.e. from most
  significant to least significant bit in big endian order), corresponds to
  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
  use by the KVM guest.

* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
  assigned to the KVM guest. Each bit in the mask, from left to right (i.e. from
  most significant to least significant bit in big endian order), corresponds to
  an AP queue index (APQI) from 0-255. If a bit is set, the corresponding queue
  is valid for use by the KVM guest.

* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains can
  be changed by an AP command-request message sent to a usage domain from the
  guest. Each bit in the mask, from left to right (i.e. from most significant to
  least significant bit in big endian order), corresponds to a domain from
  0-255. If a bit is set, the corresponding domain can be modified by an AP
  command-request message sent to a usage domain.

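The left-to-right bit numbering shared by the three masks is easy to get
backwards, so here is a minimal Python sketch of it. It models a 256-bit mask as
a plain integer purely for illustration; the helper names are hypothetical and
this is not how the kernel or the hardware stores the CRYCB fields.

```python
MASK_BITS = 256  # each mask covers IDs 0-255


def mask_from_ids(ids):
    """Build a mask in which ID n is the n-th bit counted from the left."""
    mask = 0
    for n in ids:
        if not 0 <= n < MASK_BITS:
            raise ValueError(f"ID {n} is outside 0..{MASK_BITS - 1}")
        # ID 0 is the most significant bit, ID 255 the least significant.
        mask |= 1 << (MASK_BITS - 1 - n)
    return mask


def ids_from_mask(mask):
    """Return the sorted IDs whose bits are set in the mask."""
    return [n for n in range(MASK_BITS) if mask & (1 << (MASK_BITS - 1 - n))]


def mask_to_hex(mask):
    """Render the mask as 64 hex digits, most significant digit first."""
    return f"0x{mask:064x}"
```

For adapters 1 and 2, the second and third bits from the left are set, so the
leading hex digit of the rendered mask is 6 and all remaining digits are 0.
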
If you recall from the description of an AP Queue, AP instructions include
an APQN to identify the AP queue to which an AP command-request message is to be
sent (NQAP and PQAP instructions), or from which a command-reply message is to
be received (DQAP instruction). The validity of an APQN is defined by the matrix
calculated from the APM and AQM; it is the cross product of all assigned adapter
numbers (APM) with all assigned queue indexes (AQM). For example, if adapters 1
and 2 and usage domains 5 and 6 are assigned to a guest, the APQNs (1,5), (1,6),
(2,5) and (2,6) will be valid for the guest.

The APQNs can provide secure key functionality - i.e., a private key is stored
on the adapter card for each of its domains - so each APQN must be assigned to
at most one guest or to the Linux host::

   Example 1: Valid configuration:
   ------------------------------
   Guest1: adapters 1,2  domains 5,6
   Guest2: adapters 1,2  domain 7

   This is valid because both guests have a unique set of APQNs:
      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
      Guest2 has APQNs (1,7), (2,7)

   Example 2: Valid configuration:
   ------------------------------
   Guest1: adapters 1,2 domains 5,6
   Guest2: adapters 3,4 domains 5,6

   This is also valid because both guests have a unique set of APQNs:
      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
      Guest2 has APQNs (3,5), (3,6), (4,5), (4,6)

   Example 3: Invalid configuration:
   --------------------------------
   Guest1: adapters 1,2  domains 5,6
   Guest2: adapter  1    domains 6,7

   This is an invalid configuration because both guests have access to
   APQN (1,6).

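The three examples can be checked mechanically. The following Python sketch
computes each guest's matrix as the cross product of its adapters and domains
and reports any APQN that two guests would share. The function names and the
dictionary layout are assumptions made for this illustration, and the Linux host
is not modelled.

```python
from itertools import combinations, product


def apqns(adapters, domains):
    """Return the set of (APID, APQI) pairs in a guest's matrix."""
    return set(product(adapters, domains))


def shared_apqns(guests):
    """Map each pair of guest names to the APQNs they would share.

    `guests` maps a guest name to an (adapters, domains) tuple. An empty
    result means every APQN belongs to at most one guest.
    """
    conflicts = {}
    for (name_a, cfg_a), (name_b, cfg_b) in combinations(guests.items(), 2):
        common = apqns(*cfg_a) & apqns(*cfg_b)
        if common:
            conflicts[(name_a, name_b)] = sorted(common)
    return conflicts


example_1 = {"Guest1": ([1, 2], [5, 6]), "Guest2": ([1, 2], [7])}
example_2 = {"Guest1": ([1, 2], [5, 6]), "Guest2": ([3, 4], [5, 6])}
example_3 = {"Guest1": ([1, 2], [5, 6]), "Guest2": ([1], [6, 7])}
```

``shared_apqns`` returns an empty dictionary for the first two examples and
reports APQN (1,6) for the third.
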
The Design
==========
The design introduces three new objects:

1. AP matrix device
2. VFIO AP device driver (vfio_ap.ko)
3. VFIO AP mediated matrix pass-through device

The VFIO AP device driver
-------------------------
The VFIO AP (vfio_ap) device driver serves the following purposes:

1. Provides the interfaces to secure APQNs for exclusive use of KVM guests.

2. Sets up the VFIO mediated device interfaces to manage a mediated matrix
   device and creates the sysfs interfaces for assigning adapters, usage
   domains, and control domains comprising the matrix for a KVM guest.

3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's
   SIE state description to grant the guest access to a matrix of AP devices.

Reserve APQNs for exclusive use of KVM guests
---------------------------------------------
The following block diagram illustrates the mechanism by which APQNs are
reserved::

                                +------------------+
                 7 remove       |                  |
           +--------------------> cex4queue driver |
           |                    |                  |
           |                    +------------------+
           |
           |
           |                    +------------------+          +----------------+
           |  5 register driver |                  | 3 create |                |
           |   +---------------->   Device core    +---------->  matrix device |
           |   |                |                  |          |                |
           |   |                +--------^---------+          +----------------+
           |   |                         |
           |   |                         +-------------------+
           |   | +-----------------------------------+       |
           |   | |      4 register AP driver         |       | 2 register device
           |   | |                                   |       |
  +--------+---+-v---+                      +--------+-------+-+
  |                  |                      |                  |
  |      ap_bus      +--------------------- >  vfio_ap driver  |
  |                  |       8 probe        |                  |
  +--------^---------+                      +--^--^------------+
  6 edit   |                                   |  |
    apmask |     +-----------------------------+  | 9 mdev create
    aqmask |     |           1 modprobe           |
  +--------+-----+---+           +----------------+-+         +----------------+
  |                  |           |                  |8 create |     mediated   |
  |      admin       |           | VFIO device core |--------->     matrix     |
  |                  +           |                  |         |     device     |
  +------+-+---------+           +--------^---------+         +--------^-------+
         | |                              |                            |
         | | 9 create vfio_ap-passthrough |                            |
         | +------------------------------+                            |
         +-------------------------------------------------------------+
                     10  assign adapter/domain/control domain

The process for reserving an AP queue for use by a KVM guest is:

1. The administrator loads the vfio_ap device driver.
2. The vfio_ap driver during its initialization will register a single 'matrix'
   device with the device core. This will serve as the parent device for
   all mediated matrix devices used to configure an AP matrix for a guest.
3. The /sys/devices/vfio_ap/matrix device is created by the device core.
4. The vfio_ap device driver will register with the AP bus for AP queue devices
   of type 10 and higher (CEX4 and newer). The driver will provide the vfio_ap
   driver's probe and remove callback interfaces. Queue devices older than CEX4
   are not supported: supporting them would complicate the design for hardware
   that will go out of service in the relatively near future and for which few
   systems remain on which to test.
5. The AP bus registers the vfio_ap device driver with the device core.
6. The administrator edits the AP adapter and queue masks to reserve AP queues
   for use by the vfio_ap device driver.
7. The AP bus removes the AP queues reserved for the vfio_ap driver from the
   default zcrypt cex4queue driver.
8. The AP bus probes the vfio_ap device driver to bind the queues reserved for
   it.
9. The administrator creates a passthrough type mediated matrix device to be
   used by a guest.
10. The administrator assigns the adapters, usage domains and control domains
    to be exclusively used by a guest.

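Steps 6 to 8 can be pictured with the following Python model. It rests on an
assumption that is not spelled out in this section: a queue stays with the
default zcrypt driver only when both its adapter and its domain are still marked
for the default drivers in the masks, and is otherwise offered to vfio_ap. The
function name is hypothetical, and the masks are represented as plain sets of
IDs rather than as the bit masks the administrator edits.

```python
def split_queues(queues, host_adapters, host_domains):
    """Partition (APID, APQI) queues between the default driver and vfio_ap.

    host_adapters and host_domains hold the IDs still marked for the default
    drivers after the administrator edited the masks (assumed semantics).
    """
    default_driver, vfio_ap = [], []
    for apid, apqi in sorted(queues):
        if apid in host_adapters and apqi in host_domains:
            default_driver.append((apid, apqi))  # stays with cex4queue
        else:
            vfio_ap.append((apid, apqi))  # removed, then probed by vfio_ap
    return default_driver, vfio_ap
```

With queues on adapters 4 and 10 and domains 6 and 71, dropping adapter 10 from
the host's set hands both of that adapter's queues to vfio_ap while adapter 4's
queues stay with the default driver.
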
Set up the VFIO mediated device interfaces
------------------------------------------
The VFIO AP device driver utilizes the common interface of the VFIO mediated
device core driver to:

* Register an AP mediated bus driver to add a mediated matrix device to and
  remove it from a VFIO group.
* Create and destroy a mediated matrix device
* Add a mediated matrix device to and remove it from the AP mediated bus driver
* Add a mediated matrix device to and remove it from an IOMMU group

The following high-level block diagram shows the main components and interfaces
of the VFIO AP mediated matrix device driver::

   +-------------+
   |             |
   | +---------+ | mdev_register_driver() +--------------+
   | |  Mdev   | +<-----------------------+              |
   | |  bus    | |                        | vfio_mdev.ko |
   | | driver  | +----------------------->+              |<-> VFIO user
   | +---------+ |    probe()/remove()    +--------------+    APIs
   |             |
   |  MDEV CORE  |
   |   MODULE    |
   |   mdev.ko   |
   | +---------+ | mdev_register_device() +--------------+
   | |Physical | +<-----------------------+              |
   | | device  | |                        |  vfio_ap.ko  |<-> matrix
   | |interface| +----------------------->+              |    device
   | +---------+ |       callback         +--------------+
   +-------------+

During initialization of the vfio_ap module, the matrix device is registered
with an 'mdev_parent_ops' structure that provides the sysfs attribute
structures, mdev functions and callback interfaces for managing the mediated
matrix device.

* sysfs attribute structures:

  supported_type_groups
    The VFIO mediated device framework supports creation of user-defined
    mediated device types. These mediated device types are specified
    via the 'supported_type_groups' structure when a device is registered
    with the mediated device framework. The registration process creates the
    sysfs structures for each mediated device type specified in the
    'mdev_supported_types' sub-directory of the device being registered. Along
    with the device type, the sysfs attributes of the mediated device type are
    provided.

    The VFIO AP device driver will register one mediated device type for
    passthrough devices::

      /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough

    Only the read-only attributes required by the VFIO mdev framework will
    be provided::

        ... name
        ... device_api
        ... available_instances

    Where:

        * name:
            specifies the name of the mediated device type
        * device_api:
            specifies the VFIO API of the mediated device type
        * available_instances:
            the number of mediated matrix passthrough devices
            that can be created

  mdev_attr_groups
    This attribute group identifies the user-defined sysfs attributes of the
    mediated device. When a device is registered with the VFIO mediated device
    framework, the sysfs attribute files identified in the 'mdev_attr_groups'
    structure will be created in the mediated matrix device's directory. The
    sysfs attributes for a mediated matrix device are:

    assign_adapter / unassign_adapter:
      Write-only attributes for assigning/unassigning an AP adapter to/from the
      mediated matrix device. To assign/unassign an adapter, the APID of the
      adapter is echoed to the respective attribute file.
    assign_domain / unassign_domain:
      Write-only attributes for assigning/unassigning an AP usage domain to/from
      the mediated matrix device. To assign/unassign a domain, the domain
      number of the usage domain is echoed to the respective attribute
      file.
    matrix:
      A read-only file for displaying the APQNs derived from the cross product
      of the adapter and domain numbers assigned to the mediated matrix device.
    assign_control_domain / unassign_control_domain:
      Write-only attributes for assigning/unassigning an AP control domain
      to/from the mediated matrix device. To assign/unassign a control domain,
      the ID of the domain to be assigned/unassigned is echoed to the respective
      attribute file.
    control_domains:
      A read-only file for displaying the control domain numbers assigned to the
      mediated matrix device.

* functions:

  create:
    allocates the ap_matrix_mdev structure used by the vfio_ap driver to:

    * Store the reference to the KVM structure for the guest using the mdev
    * Store the AP matrix configuration for the adapters, domains, and control
      domains assigned via the corresponding sysfs attributes files

  remove:
    deallocates the mediated matrix device's ap_matrix_mdev structure. This will
    be allowed only if a running guest is not using the mdev.

* callback interfaces:

  open:
    The vfio_ap driver uses this callback to register a
    VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the mdev matrix
    device. The open is invoked when QEMU connects the VFIO IOMMU group
    for the mdev matrix device to the MDEV bus. Access to the KVM structure used
    to configure the KVM guest is provided via this callback. The KVM structure
    is used to configure the guest's access to the AP matrix defined via the
    mediated matrix device's sysfs attribute files.
  release:
    unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the
    mdev matrix device and deconfigures the guest's AP matrix.

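The behaviour of the sysfs attributes and of the remove function can be pictured
with a small in-memory model. This Python class is a toy written for this
explanation, not the driver's ``ap_matrix_mdev`` structure: the class name,
its fields and the ``xx.yyyy`` formatting of the matrix entries are all
assumptions.

```python
class MatrixMdevModel:
    """Toy stand-in for the state behind a mediated matrix device's sysfs files."""

    def __init__(self):
        self.adapters = set()
        self.domains = set()
        self.assigned_control_domains = set()
        self.in_use = False  # True while a running guest uses the mdev

    @staticmethod
    def _check(number):
        if not 0 <= number <= 255:
            raise ValueError("ID must be in 0..255")
        return number

    def assign_adapter(self, apid):
        self.adapters.add(self._check(apid))

    def unassign_adapter(self, apid):
        self.adapters.discard(apid)

    def assign_domain(self, apqi):
        self.domains.add(self._check(apqi))

    def unassign_domain(self, apqi):
        self.domains.discard(apqi)

    def assign_control_domain(self, domain):
        self.assigned_control_domains.add(self._check(domain))

    def unassign_control_domain(self, domain):
        self.assigned_control_domains.discard(domain)

    def matrix(self):
        """APQNs from the cross product of assigned adapters and domains."""
        return [f"{a:02x}.{q:04x}"
                for a in sorted(self.adapters) for q in sorted(self.domains)]

    def control_domains(self):
        return sorted(self.assigned_control_domains)

    def can_remove(self):
        """remove is allowed only while no running guest uses the mdev."""
        return not self.in_use
```

Assigning adapters 5 and 6 and domains 4 and 0xab produces a four-entry matrix;
unassigning adapter 6 afterwards leaves only the two entries for adapter 5.
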
Configure the APM, AQM and ADM in the CRYCB
-------------------------------------------
Configuring the AP matrix for a KVM guest will be performed when the
VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier
function is called when QEMU connects to KVM. The guest's AP matrix is
configured via its CRYCB by:

* Setting the bits in the APM corresponding to the APIDs assigned to the
  mediated matrix device via its 'assign_adapter' interface.
* Setting the bits in the AQM corresponding to the domains assigned to the
  mediated matrix device via its 'assign_domain' interface.
* Setting the bits in the ADM corresponding to the domain IDs assigned to the
  mediated matrix device via its 'assign_control_domain' interface.

The CPU model features for AP
-----------------------------
The AP stack relies on the presence of the AP instructions as well as two
facilities: the AP Facilities Test (APFT) facility and the AP Query
Configuration Information (QCI) facility. These features/facilities are made
available to a KVM guest via the following CPU model features:

1. ap: Indicates whether the AP instructions are installed on the guest. This
   feature will be enabled by KVM only if the AP instructions are installed
   on the host.

2. apft: Indicates the APFT facility is available on the guest. This facility
   can be made available to the guest only if it is available on the host (i.e.,
   facility bit 15 is set).

3. apqci: Indicates the AP QCI facility is available on the guest. This facility
   can be made available to the guest only if it is available on the host (i.e.,
   facility bit 12 is set).

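Whether a host offers the two facilities can be derived from the facility bit
numbers quoted above. The sketch below assumes the installed facility bits are
available as a whitespace-separated list of numbers after a colon, in the style
of the ``facilities`` line that s390 Linux prints in /proc/cpuinfo. The function
names are hypothetical and the sample lines in the test are invented. The ``ap``
feature is not covered, because the text ties it to the AP instructions rather
than to a facility bit.

```python
AP_FACILITY_BITS = {"apqci": 12, "apft": 15}  # bit numbers quoted in the text


def parse_facilities(line):
    """Parse a 'facilities : 0 1 2 ...' style line into a set of bit numbers."""
    _, _, numbers = line.partition(":")
    return {int(token) for token in numbers.split()}


def ap_facilities_available(installed_bits):
    """Report, per CPU model feature, whether its facility bit is installed."""
    return {name: bit in installed_bits
            for name, bit in AP_FACILITY_BITS.items()}
```
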
Note: If the user chooses to specify a CPU model other than the 'host'
model to QEMU, the CPU model features and facilities need to be turned on
explicitly; for example::

     /usr/bin/qemu-system-s390x ... -cpu z13,ap=on,apqci=on,apft=on

A guest can be precluded from using AP features/facilities by turning them off
explicitly; for example::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=off,apqci=off,apft=off

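When guests are started from a script, the value passed to ``-cpu`` can be
assembled rather than typed by hand. A minimal Python sketch, assuming all three
AP features are switched together; the helper name ``cpu_option`` is
hypothetical and the rest of the QEMU command line is left out.

```python
AP_FEATURES = ("ap", "apqci", "apft")  # CPU model features named in the text


def cpu_option(model, enable=True):
    """Return a -cpu value with all AP features turned on or off."""
    state = "on" if enable else "off"
    return ",".join([model] + [f"{feature}={state}" for feature in AP_FEATURES])
```

``cpu_option("z13")`` and ``cpu_option("host", enable=False)`` reproduce the
two ``-cpu`` values shown in the commands above.
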
Note: If the APFT facility is turned off (apft=off) for the guest, the guest
will not see any AP devices. The zcrypt device drivers that register for type 10
and newer AP devices - i.e., the cex4card and cex4queue device drivers - need
the APFT facility to ascertain the facilities installed on a given AP device. If
the APFT facility is not installed on the guest, then the probe of device
drivers will fail since only type 10 and newer devices can be configured for
guest use.

Example
=======
Let's now provide an example to illustrate how KVM guests may be given
access to AP facilities. For this example, we will show how to configure
three guests such that executing the lszcrypt command on the guests would
look like this:

Guest1
------
=========== ===== ============
CARD.DOMAIN TYPE  MODE
=========== ===== ============
05          CEX5C CCA-Coproc
05.0004     CEX5C CCA-Coproc
05.00ab     CEX5C CCA-Coproc
06          CEX5A Accelerator
06.0004     CEX5A Accelerator
06.00ab     CEX5A Accelerator
=========== ===== ============

Guest2
------
=========== ===== ============
CARD.DOMAIN TYPE  MODE
=========== ===== ============
05          CEX5A Accelerator
05.0047     CEX5A Accelerator
05.00ff     CEX5A Accelerator
=========== ===== ============

Guest3
------
=========== ===== ============
CARD.DOMAIN TYPE  MODE
=========== ===== ============
06          CEX5A Accelerator
06.0047     CEX5A Accelerator
06.00ff     CEX5A Accelerator
=========== ===== ============

497*4882a593SmuzhiyunThese are the steps:
498*4882a593Smuzhiyun
1. Install the vfio_ap module on the linux host. The dependency chain for the
   vfio_ap module is:

   * iommu
   * s390
   * zcrypt
   * vfio
   * vfio_mdev
   * vfio_mdev_device
   * KVM

   To build the vfio_ap module, the kernel build must be configured with the
   following Kconfig elements selected:

   * IOMMU_SUPPORT
   * S390
   * ZCRYPT
   * S390_AP_IOMMU
   * VFIO
   * VFIO_MDEV
   * VFIO_MDEV_DEVICE
   * KVM

   If using make menuconfig, select the following to build the vfio_ap module::

     -> Device Drivers
        -> IOMMU Hardware Support
           select S390 AP IOMMU Support
        -> VFIO Non-Privileged userspace driver framework
           -> Mediated device driver framework
              -> VFIO driver for Mediated devices
     -> I/O subsystem
        -> VFIO support for AP devices

2. Secure the AP queues to be used by the three guests so that the host cannot
   access them. To secure them, there are two sysfs files that specify
   bitmasks marking a subset of the APQN range as either 'usable only by the
   default AP queue device drivers' or 'not usable by the default device
   drivers and thus available for use by the vfio_ap device driver'. The
   sysfs files containing the masks are located at::

     /sys/bus/ap/apmask
     /sys/bus/ap/aqmask

   The 'apmask' is a 256-bit mask that identifies a set of AP adapter IDs
   (APID). Each bit in the mask, from left to right (i.e., from most significant
   to least significant bit in big endian order), corresponds to an APID from
   0-255. If a bit is set, the APID is marked as usable only by the default AP
   queue device drivers; otherwise, the APID is usable by the vfio_ap
   device driver.

   The 'aqmask' is a 256-bit mask that identifies a set of AP queue indexes
   (APQI). Each bit in the mask, from left to right (i.e., from most significant
   to least significant bit in big endian order), corresponds to an APQI from
   0-255. If a bit is set, the APQI is marked as usable only by the default AP
   queue device drivers; otherwise, the APQI is usable by the vfio_ap device
   driver.

   Take, for example, the following mask::

      0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

   It indicates:

      1, 2, 3, 4, 5, and 7-255 belong to the default drivers' pool, and 0 and 6
      belong to the vfio_ap device driver's pool.

   The APQN of each AP queue device assigned to the linux host is checked by the
   AP bus against the set of APQNs derived from the cross product of APIDs
   and APQIs marked as usable only by the default AP queue device drivers. If a
   match is detected, only the default AP queue device drivers will be probed;
   otherwise, the vfio_ap device driver will be probed.

   By default, the two masks are set to reserve all APQNs for use by the default
   AP queue device drivers. There are two ways the default masks can be changed:

   1. The sysfs mask files can be edited by echoing a string into the
      respective sysfs mask file in one of two formats:

      * An absolute hex string starting with 0x - like "0x12345678" - sets
        the mask. If the given string is shorter than the mask, it is padded
        with 0s on the right; for example, specifying a mask value of 0x41 is
        the same as specifying::

           0x4100000000000000000000000000000000000000000000000000000000000000

        Keep in mind that the mask reads from left to right (i.e., most
        significant to least significant bit in big endian order), so the mask
        above identifies device numbers 1 and 7 (01000001).

        If the string is longer than the mask, the operation is terminated with
        an error (EINVAL).

      * Individual bits in the mask can be switched on and off by specifying
        each bit number to be switched in a comma separated list. Each bit
        number string must be prepended with a plus ('+') or minus ('-') to
        indicate that the corresponding bit is to be switched on ('+') or off
        ('-'). Some valid values are:

           - "+0"    switches bit 0 on
           - "-13"   switches bit 13 off
           - "+0x41" switches bit 65 on
           - "-0xff" switches bit 255 off

        The following example::

              +0,-6,+0x47,-0xf0

        switches bits 0 and 71 (0x47) on and switches bits 6 and 240 (0xf0)
        off.

        Note that the bits not specified in the list remain as they were before
        the operation.

   2. The masks can also be changed at boot time via parameters on the kernel
      command line like this::

         ap.apmask=0xffff ap.aqmask=0x40

      This would create the following masks::

         apmask:
         0xffff000000000000000000000000000000000000000000000000000000000000

         aqmask:
         0x4000000000000000000000000000000000000000000000000000000000000000

      Resulting in these two pools::

         default drivers pool:    adapter 0-15, domain 1
         alternate drivers pool:  adapter 16-255, domains 0, 2-255

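The mask layout described above is easy to misread because bit 0 is the
leftmost bit. The following is a minimal POSIX shell sketch, not part of the
kernel or of any existing tool, that pads a short mask the way the text
describes and lists the bit numbers that are set or clear. The function names
pad_mask and mask_bits are hypothetical helpers introduced only for
illustration.

```shell
#!/bin/sh
# Hypothetical helpers for reasoning about 256-bit AP masks.
# Bit 0 is the leftmost (most significant) bit of the hex string.

# pad_mask HEX: right-pad a 0x-prefixed hex string with zeros to 64 digits.
# Fails if the string is longer than the mask.
pad_mask() (
    digits=${1#0x}
    [ "${#digits}" -le 64 ] || { echo "mask too long" >&2; exit 1; }
    while [ "${#digits}" -lt 64 ]; do
        digits="${digits}0"
    done
    printf '0x%s\n' "$digits"
)

# mask_bits VALUE HEX: print the bit numbers whose value is VALUE (0 or 1).
mask_bits() (
    want=$1
    digits=$(pad_mask "$2") || exit 1
    digits=${digits#0x}
    pos=0
    out=
    while [ -n "$digits" ]; do
        rest=${digits#?}                  # everything after the first digit
        nibble=$((0x${digits%"$rest"}))   # numeric value of the first digit
        digits=$rest
        for weight in 8 4 2 1; do         # leftmost bit of the nibble first
            if [ $(( (nibble & weight) != 0 )) -eq "$want" ]; then
                out="$out $pos"
            fi
            pos=$((pos + 1))
        done
    done
    printf '%s\n' "${out# }"
)
```

For example, ``mask_bits 1 0x41`` prints ``1 7``, and ``mask_bits 0`` applied
to the 0x7dff... mask shown earlier prints ``0 6``, the two IDs left to the
vfio_ap device driver.
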
Securing the APQNs for our example
----------------------------------
   To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004, 06.0047,
   06.00ab, and 06.00ff for use by the vfio_ap device driver, the corresponding
   APQNs can either be removed from the default masks::

      echo -5,-6 > /sys/bus/ap/apmask

      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask

   Or the masks can be set as follows::

      echo 0xf9ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff \
      > apmask

      echo 0xf7fffffffffffffffeffffffffffffffffffffffffeffffffffffffffffffffe \
      > aqmask

   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
   sysfs directory for the vfio_ap device driver will now contain symbolic links
   to the AP queue devices bound to it::

     /sys/bus/ap
     ... [drivers]
     ...... [vfio_ap]
     ......... [05.0004]
     ......... [05.0047]
     ......... [05.00ab]
     ......... [05.00ff]
     ......... [06.0004]
     ......... [06.0047]
     ......... [06.00ab]
     ......... [06.00ff]

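The two ways of securing the example queues are meant to be equivalent,
provided the masks still hold their all-ones default when the relative form is
applied. As a rough check, the POSIX shell sketch below builds the absolute
mask that results from clearing a list of bits, using the left-to-right bit
numbering described earlier. The helper name mask_without is hypothetical and
is not an existing tool.

```shell
#!/bin/sh
# mask_without BIT...: print a 256-bit mask with every bit set except the
# listed bit numbers (decimal or 0x-prefixed hex). Bit 0 is the leftmost bit.
mask_without() (
    out=0x
    nib=0
    while [ "$nib" -lt 64 ]; do           # one hex digit per 4 bits
        value=15
        for bit in "$@"; do
            bit=$(($bit))                 # normalise 0x47 -> 71
            if [ $((bit / 4)) -eq "$nib" ]; then
                # clear the bit inside this hex digit, leftmost bit = 8
                value=$((value & ~(8 >> (bit % 4))))
            fi
        done
        out="$out$(printf '%x' "$value")"
        nib=$((nib + 1))
    done
    printf '%s\n' "$out"
)

mask_without 5 6                  # candidate value for apmask
mask_without 4 0x47 0xab 0xff     # candidate value for aqmask
```

The first call yields 0xf9 followed by 62 'f' digits; the second can be
compared digit by digit with the aqmask value used in the example.
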
   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver. This restriction keeps the
   implementation simple by not supporting older devices that will go out of
   service in the relatively near future and for which there are few older
   systems on which to test.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004::

     cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.

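The hardware type check can be scripted before the masks are changed. The
sketch below is a minimal POSIX shell example; queue_card and vfio_ap_capable
are hypothetical helper names, and the devices directory is passed as a
parameter so that the logic can be tried against a scratch directory. It
assumes that the hwtype attribute contains a plain decimal number.

```shell
#!/bin/sh
# queue_card APQN: map a queue name such as 05.0004 to the name of its
# parent card directory (card05).
queue_card() {
    printf 'card%s\n' "${1%%.*}"
}

# vfio_ap_capable DEVICES_DIR APQN: succeed only if the hwtype of the
# parent card can be read and is 10 or higher.
vfio_ap_capable() (
    hwtype=$(cat "$1/$(queue_card "$2")/hwtype") || exit 1
    [ "$hwtype" -ge 10 ]
)
```

On a host, a call such as ``vfio_ap_capable /sys/bus/ap/devices 05.0004``
would be expected to succeed for each queue that is about to be secured.
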
3. Create the mediated devices needed to configure the AP matrixes for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     --- [mdev_supported_types]
     ------ [vfio_ap-passthrough] (passthrough mediated matrix device type)
     --------- create
     --------- [devices]

   To create the mediated devices for the three guests::

        uuidgen > create
        uuidgen > create
        uuidgen > create

        or

        echo $uuid1 > create
        echo $uuid2 > create
        echo $uuid3 > create

   This will create three mediated devices in the [devices] subdirectory named
   after the UUID written to the create attribute file. We call them $uuid1,
   $uuid2 and $uuid3 and this is the sysfs directory structure after creation::

     /sys/devices/vfio_ap/matrix/
     --- [mdev_supported_types]
     ------ [vfio_ap-passthrough]
     --------- [devices]
     ------------ [$uuid1]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

     ------------ [$uuid2]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

     ------------ [$uuid3]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

4. The administrator now needs to configure the matrixes for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1::

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the unassign_xxx files to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

         cat matrix

   This is how the matrix is configured for Guest2::

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3::

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   In order to successfully assign an adapter:

   * The adapter number specified must represent a value from 0 up to the
     maximum adapter number configured for the system. If an adapter number
     higher than the maximum is specified, the operation will terminate with
     an error (ENODEV).

   * All APQNs that can be derived from the adapter ID and the IDs of
     the previously assigned domains must be bound to the vfio_ap device
     driver. If no domains have yet been assigned, then there must be at least
     one APQN with the specified APID bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).

     No APQN that can be derived from the adapter ID and the IDs of the
     previously assigned domains can be assigned to another mediated matrix
     device. If an APQN is assigned to another mediated matrix device, the
     operation will terminate with an error (EADDRINUSE).

   In order to successfully assign a domain:

   * The domain number specified must represent a value from 0 up to the
     maximum domain number configured for the system. If a domain number
     higher than the maximum is specified, the operation will terminate with
     an error (ENODEV).

   * All APQNs that can be derived from the domain ID and the IDs of
     the previously assigned adapters must be bound to the vfio_ap device
     driver. If no adapters have yet been assigned, then there must be at least
     one APQN with the specified APQI bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).

     No APQN that can be derived from the domain ID and the IDs of the
     previously assigned adapters can be assigned to another mediated matrix
     device. If an APQN is assigned to another mediated matrix device, the
     operation will terminate with an error (EADDRINUSE).

   In order to successfully assign a control domain, the domain number
   specified must represent a value from 0 up to the maximum domain number
   configured for the system. If a control domain number higher than the maximum
   is specified, the operation will terminate with an error (ENODEV).

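The assignment rules are stated in terms of the APQNs derived from the cross
product of adapters and domains. The following POSIX shell sketch enumerates
that cross product for each guest's matrix and reports any APQN that appears
in more than one matrix, which is the situation that would be rejected with
EADDRINUSE. The helpers apqns and shared_apqns are hypothetical names, and the
xx.yyyy formatting mirrors the queue names used in this example.

```shell
#!/bin/sh
# apqns "ADAPTERS" "DOMAINS": print the cross product of the two
# space-separated lists as queue names, one per line (e.g. 05.00ab).
apqns() (
    for a in $1; do
        for d in $2; do
            printf '%02x.%04x\n' "$(($a))" "$(($d))"
        done
    done
)

# shared_apqns: read queue names on stdin and print those seen more than once.
shared_apqns() {
    sort | uniq -d
}

# Matrices of the three example guests; any output marks a conflict.
{
    apqns "5 6" "4 0xab"        # Guest1
    apqns "5"   "0x47 0xff"     # Guest2
    apqns "6"   "0x47 0xff"     # Guest3
} | shared_apqns
```

For the three example guests the final pipeline prints nothing, since no APQN
is shared between the matrices.
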
5. Start Guest1::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...

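The three invocations differ only in the mediated device passed to QEMU. A
small POSIX shell sketch that prints the AP-related options for a given UUID
may help keep the CPU model flags consistent across guests. The helper name
ap_guest_args is hypothetical, and the remaining QEMU options, which are
elided in the commands above, are left to the caller.

```shell
#!/bin/sh
# ap_guest_args UUID: print the AP-related QEMU options for the mediated
# matrix device with the given UUID.
ap_guest_args() {
    printf '%s %s\n' \
        "-cpu host,ap=on,apqci=on,apft=on" \
        "-device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$1"
}
```

The output is intended to be spliced into a full command line, for example by
calling ``ap_guest_args "$uuid1"`` when assembling the options for Guest1.
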
When the guest is shut down, the mediated matrix devices may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

   /sys/devices/vfio_ap/matrix/
      --- [mdev_supported_types]
      ------ [vfio_ap-passthrough]
      --------- [devices]
      ------------ [$uuid1]
      --------------- remove

::

   echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures including
the mdev device itself. To recreate and reconfigure the mdev matrix device,
all of the steps starting with step 3 will have to be performed again. Note
that the remove will fail if a guest using the mdev is still running.

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.

Limitations
===========
* The KVM/kernel interfaces do not provide a way to prevent the APQN of a
  queue that is still assigned to a mediated device in use by a guest from
  being restored to the default drivers pool. It is incumbent upon the
  administrator to ensure that no mediated device to which the APQN is
  assigned is in use by a guest, lest the host be given access to the private
  data of the AP queue device, such as a private key configured specifically
  for the guest.

* Dynamically modifying the AP matrix for a running guest (which would amount to
  hot(un)plug of AP devices for the guest) is currently not supported.

* Live guest migration is not supported for guests using AP devices.