===============================
Adjunct Processor (AP) facility
===============================


Introduction
============
The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprised
of three AP instructions and from 1 up to 256 PCIe cryptographic adapter cards.
The AP devices provide cryptographic functions to all CPUs assigned to a
linux system running in an IBM Z system LPAR.

The AP adapter cards are exposed via the AP bus. The motivation for vfio-ap
is to make AP cards available to KVM guests using the VFIO mediated device
framework. This implementation relies considerably on the s390 virtualization
facilities which do most of the hard work of providing direct access to AP
devices.

AP Architectural Overview
=========================
To facilitate the comprehension of the design, let's start with some
definitions:

* AP adapter

  An AP adapter is an IBM Z adapter card that can perform cryptographic
  functions. There can be from 0 to 256 adapters assigned to an LPAR. Adapters
  assigned to the LPAR in which a linux host is running will be available to
  the linux host. Each adapter is identified by a number from 0 to 255; however,
  the maximum adapter number is determined by machine model and/or adapter type.
  When installed, an AP adapter is accessed by AP instructions executed by any
  CPU.

  The AP adapter cards are assigned to a given LPAR via the system's Activation
  Profile which can be edited via the HMC. When the linux host system is IPL'd
  in the LPAR, the AP bus detects the AP adapter cards assigned to the LPAR and
  creates a sysfs device for each assigned adapter. For example, if AP adapters
  4 and 10 (0x0a) are assigned to the LPAR, the AP bus will create the following
  sysfs device entries::

    /sys/devices/ap/card04
    /sys/devices/ap/card0a

  Symbolic links to these devices will also be created in the AP bus devices
  sub-directory::

    /sys/bus/ap/devices/[card04]
    /sys/bus/ap/devices/[card0a]

* AP domain

  An adapter is partitioned into domains. An adapter can hold up to 256 domains
  depending upon the adapter type and hardware configuration. A domain is
  identified by a number from 0 to 255; however, the maximum domain number is
  determined by machine model and/or adapter type. A domain can be thought of
  as a set of hardware registers and memory used for processing AP commands. A
  domain can be configured with a secure private key used for clear key
  encryption. A domain is classified in one of two ways depending upon how it
  may be accessed:

    * Usage domains are domains that are targeted by an AP instruction to
      process an AP command.

    * Control domains are domains that are changed by an AP command sent to a
      usage domain; for example, to set the secure private key for the control
      domain.

  The AP usage and control domains are assigned to a given LPAR via the system's
  Activation Profile which can be edited via the HMC. When a linux host system
  is IPL'd in the LPAR, the AP bus module detects the AP usage and control
  domains assigned to the LPAR. The domain number of each usage domain and
  adapter number of each AP adapter are combined to create AP queue devices
  (see AP Queue section below). The domain number of each control domain will be
  represented in a bitmask and stored in a sysfs file
  /sys/bus/ap/ap_control_domain_mask. The bits in the mask, from most to least
  significant bit, correspond to domains 0-255.

* AP Queue

  An AP queue is the means by which an AP command is sent to a usage domain
  inside a specific adapter. An AP queue is identified by a tuple
  comprised of an AP adapter ID (APID) and an AP queue index (APQI). The
  APQI corresponds to a given usage domain number within the adapter. This tuple
  forms an AP Queue Number (APQN) uniquely identifying an AP queue. AP
  instructions include a field containing the APQN to identify the AP queue to
  which the AP command is to be sent for processing.

  The AP bus will create a sysfs device for each APQN that can be derived from
  the cross product of the AP adapter and usage domain numbers detected when the
  AP bus module is loaded. For example, if adapters 4 and 10 (0x0a) and usage
  domains 6 and 71 (0x47) are assigned to the LPAR, the AP bus will create the
  following sysfs entries::

    /sys/devices/ap/card04/04.0006
    /sys/devices/ap/card04/04.0047
    /sys/devices/ap/card0a/0a.0006
    /sys/devices/ap/card0a/0a.0047

  The following symbolic links to these devices will be created in the AP bus
  devices subdirectory::

    /sys/bus/ap/devices/[04.0006]
    /sys/bus/ap/devices/[04.0047]
    /sys/bus/ap/devices/[0a.0006]
    /sys/bus/ap/devices/[0a.0047]

* AP Instructions:

  There are three AP instructions:

  * NQAP: to enqueue an AP command-request message to a queue
  * DQAP: to dequeue an AP command-reply message from a queue
  * PQAP: to administer the queues

  AP instructions identify the domain that is targeted to process the AP
  command; this must be one of the usage domains. An AP command may modify a
  domain that is not one of the usage domains, but the modified domain
  must be one of the control domains.
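
The naming of AP queue devices described above can be illustrated with a short
Python sketch. It only mirrors the naming pattern visible in the sysfs examples
(two hex digits for the APID, four for the APQI); the function name
``queue_device_names`` is hypothetical and is not part of any kernel or
user-space interface.

```python
from itertools import product


def queue_device_names(adapters, usage_domains):
    """Return the queue device names for every (APID, APQI) pair.

    The names follow the pattern seen in the sysfs examples:
    a two-digit hex APID, a dot, and a four-digit hex APQI.
    """
    return [
        f"{apid:02x}.{apqi:04x}"
        for apid, apqi in product(sorted(adapters), sorted(usage_domains))
    ]


# Adapters 4 and 10 (0x0a) with usage domains 6 and 71 (0x47),
# as in the example above.
names = queue_device_names({4, 10}, {6, 71})
for name in names:
    print(name)
```

With the adapters and usage domains from the example, the sketch produces the
same four names that appear under /sys/bus/ap/devices.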

AP and SIE
==========
Let's now take a look at how AP instructions executed on a guest are interpreted
by the hardware.

A satellite control block called the Crypto Control Block (CRYCB) is attached to
our main hardware virtualization control block. The CRYCB contains three fields
to identify the adapters, usage domains and control domains assigned to the KVM
guest:

* The AP Mask (APM) field is a bit mask that identifies the AP adapters assigned
  to the KVM guest. Each bit in the mask, from left to right (i.e. from most
  significant to least significant bit in big endian order), corresponds to
  an APID from 0-255. If a bit is set, the corresponding adapter is valid for
  use by the KVM guest.

* The AP Queue Mask (AQM) field is a bit mask identifying the AP usage domains
  assigned to the KVM guest. Each bit in the mask, from left to right (i.e. from
  most significant to least significant bit in big endian order), corresponds to
  an AP queue index (APQI) from 0-255. If a bit is set, the corresponding queue
  is valid for use by the KVM guest.

* The AP Domain Mask (ADM) field is a bit mask that identifies the AP control
  domains assigned to the KVM guest. The ADM bit mask controls which domains can
  be changed by an AP command-request message sent to a usage domain from the
  guest. Each bit in the mask, from left to right (i.e. from most significant to
  least significant bit in big endian order), corresponds to a domain from
  0-255. If a bit is set, the corresponding domain can be modified by an AP
  command-request message sent to a usage domain.

If you recall from the description of an AP Queue, AP instructions include
an APQN to identify the AP queue to which an AP command-request message is to be
sent (NQAP and PQAP instructions), or from which a command-reply message is to
be received (DQAP instruction). The validity of an APQN is defined by the matrix
calculated from the APM and AQM; it is the cross product of all assigned adapter
numbers (APM) with all assigned queue indexes (AQM). For example, if adapters 1
and 2 and usage domains 5 and 6 are assigned to a guest, the APQNs (1,5), (1,6),
(2,5) and (2,6) will be valid for the guest.
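
The bit numbering of the masks and the APQN validity rule can be sketched in
Python as follows. This is only an illustration of the arithmetic under the
assumption that each mask is modelled as a 256-bit integer whose most
significant bit stands for ID 0; the helper names (``build_mask``,
``bit_is_set``, ``apqn_valid``, ``mask_hex``) are hypothetical and do not
correspond to the CRYCB layout or to any kernel function.

```python
MASK_BITS = 256  # each mask covers IDs 0-255


def build_mask(ids):
    """Build a 256-bit mask in which ID 0 is the most significant bit."""
    mask = 0
    for ident in ids:
        if not 0 <= ident < MASK_BITS:
            raise ValueError(f"ID {ident} is outside 0-255")
        mask |= 1 << (MASK_BITS - 1 - ident)
    return mask


def bit_is_set(mask, ident):
    """Return True if the bit for the given ID is set in the mask."""
    return bool((mask >> (MASK_BITS - 1 - ident)) & 1)


def apqn_valid(apm, aqm, apid, apqi):
    """An APQN is valid only if its APID is in the APM and its APQI in the AQM."""
    return bit_is_set(apm, apid) and bit_is_set(aqm, apqi)


def mask_hex(mask):
    """Render the mask as a 64-digit hex string, leftmost digit first."""
    return "0x" + format(mask, "064x")


# Adapters 1 and 2 with usage domains 5 and 6, as in the example above.
apm = build_mask([1, 2])
aqm = build_mask([5, 6])

print(mask_hex(apm))
print(apqn_valid(apm, aqm, 1, 5))   # APQN (1,5) is in the matrix
print(apqn_valid(apm, aqm, 3, 5))   # adapter 3 is not assigned
```

Because ID 0 is the leftmost bit, adapters 1 and 2 set the second and third
bits of the first byte, so the hex rendering of the APM begins with ``60``.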

The APQNs can provide secure key functionality - i.e., a private key is stored
on the adapter card for each of its domains - so each APQN must be assigned to
at most one guest or to the linux host::

   Example 1: Valid configuration:
   ------------------------------
   Guest1: adapters 1,2  domains 5,6
   Guest2: adapters 1,2  domain 7

   This is valid because both guests have a unique set of APQNs:
      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
      Guest2 has APQNs (1,7), (2,7)

   Example 2: Valid configuration:
   ------------------------------
   Guest1: adapters 1,2 domains 5,6
   Guest2: adapters 3,4 domains 5,6

   This is also valid because both guests have a unique set of APQNs:
      Guest1 has APQNs (1,5), (1,6), (2,5), (2,6);
      Guest2 has APQNs (3,5), (3,6), (4,5), (4,6)

   Example 3: Invalid configuration:
   --------------------------------
   Guest1: adapters 1,2  domains 5,6
   Guest2: adapter  1    domains 6,7

   This is an invalid configuration because both guests have access to
   APQN (1,6).

The Design
==========
The design introduces three new objects:

1. AP matrix device
2. VFIO AP device driver (vfio_ap.ko)
3. VFIO AP mediated matrix pass-through device

The VFIO AP device driver
-------------------------
The VFIO AP (vfio_ap) device driver serves the following purposes:

1. Provides the interfaces to secure APQNs for exclusive use of KVM guests.

2. Sets up the VFIO mediated device interfaces to manage a mediated matrix
   device and creates the sysfs interfaces for assigning adapters, usage
   domains, and control domains comprising the matrix for a KVM guest.

3. Configures the APM, AQM and ADM in the CRYCB referenced by a KVM guest's
   SIE state description to grant the guest access to a matrix of AP devices.

Reserve APQNs for exclusive use of KVM guests
---------------------------------------------
The following block diagram illustrates the mechanism by which APQNs are
reserved::

                                +------------------+
                 7 remove       |                  |
           +--------------------> cex4queue driver |
           |                    |                  |
           |                    +------------------+
           |
           |
           |                    +------------------+          +----------------+
           |  5 register driver |                  | 3 create |                |
           |   +---------------->   Device core    +---------->  matrix device |
           |   |                |                  |          |                |
           |   |                +--------^---------+          +----------------+
           |   |                         |
           |   |                         +-------------------+
           |   | +-----------------------------------+       |
           |   | |      4 register AP driver         |       | 2 register device
           |   | |                                   |       |
  +--------+---+-v---+                      +--------+-------+-+
  |                  |                      |                  |
  |      ap_bus      +--------------------- >  vfio_ap driver  |
  |                  |       8 probe        |                  |
  +--------^---------+                      +--^--^------------+
  6 edit   |                                   |  |
    apmask |     +-----------------------------+  | 9 mdev create
    aqmask |     |           1 modprobe           |
  +--------+-----+---+           +----------------+-+         +----------------+
  |                  |           |                  |8 create |     mediated   |
  |      admin       |           | VFIO device core |--------->     matrix     |
  |                  +           |                  |         |     device     |
  +------+-+---------+           +--------^---------+         +--------^-------+
         | |                              |                            |
         | | 9 create vfio_ap-passthrough |                            |
         | +------------------------------+                            |
         +-------------------------------------------------------------+
                     10  assign adapter/domain/control domain

The process for reserving an AP queue for use by a KVM guest is:

1. The administrator loads the vfio_ap device driver
2. The vfio_ap driver during its initialization will register a single 'matrix'
   device with the device core. This will serve as the parent device for
   all mediated matrix devices used to configure an AP matrix for a guest.
3. The /sys/devices/vfio_ap/matrix device is created by the device core
4. The vfio_ap device driver will register with the AP bus for AP queue devices
   of type 10 and higher (CEX4 and newer). The driver will provide the vfio_ap
   driver's probe and remove callback interfaces.
   Devices older than CEX4 queues are not supported. This keeps the
   implementation simple: older devices will go out of service in the
   relatively near future, and there are few older systems around on which
   to test.
5. The AP bus registers the vfio_ap device driver with the device core
6. The administrator edits the AP adapter and queue masks to reserve AP queues
   for use by the vfio_ap device driver.
7. The AP bus removes the AP queues reserved for the vfio_ap driver from the
   default zcrypt cex4queue driver.
8. The AP bus probes the vfio_ap device driver to bind the queues reserved for
   it.
9. The administrator creates a passthrough type mediated matrix device to be
   used by a guest
10. The administrator assigns the adapters, usage domains and control domains
    to be exclusively used by a guest.

Set up the VFIO mediated device interfaces
------------------------------------------
The VFIO AP device driver utilizes the common interface of the VFIO mediated
device core driver to:

* Register an AP mediated bus driver to add a mediated matrix device to and
  remove it from a VFIO group.
* Create and destroy a mediated matrix device
* Add a mediated matrix device to and remove it from the AP mediated bus driver
* Add a mediated matrix device to and remove it from an IOMMU group

The following high-level block diagram shows the main components and interfaces
of the VFIO AP mediated matrix device driver::

   +-------------+
   |             |
   | +---------+ | mdev_register_driver() +--------------+
   | |  Mdev   | +<-----------------------+              |
   | |  bus    | |                        | vfio_mdev.ko |
   | | driver  | +----------------------->+              |<-> VFIO user
   | +---------+ |    probe()/remove()    +--------------+    APIs
   |             |
   |  MDEV CORE  |
   |   MODULE    |
   |   mdev.ko   |
   | +---------+ | mdev_register_device() +--------------+
   | |Physical | +<-----------------------+              |
   | | device  | |                        |  vfio_ap.ko  |<-> matrix
   | |interface| +----------------------->+              |    device
   | +---------+ |       callback         +--------------+
   +-------------+

During initialization of the vfio_ap module, the matrix device is registered
with an 'mdev_parent_ops' structure that provides the sysfs attribute
structures, mdev functions and callback interfaces for managing the mediated
matrix device.

* sysfs attribute structures:

  supported_type_groups
    The VFIO mediated device framework supports creation of user-defined
    mediated device types. These mediated device types are specified
    via the 'supported_type_groups' structure when a device is registered
    with the mediated device framework. The registration process creates the
    sysfs structures for each mediated device type specified in the
    'mdev_supported_types' sub-directory of the device being registered. Along
    with the device type, the sysfs attributes of the mediated device type are
    provided.

    The VFIO AP device driver will register one mediated device type for
    passthrough devices:

      /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough

    Only the read-only attributes required by the VFIO mdev framework will
    be provided::

        ... name
        ... device_api
        ... available_instances

    Where:

        * name:
            specifies the name of the mediated device type
        * device_api:
            specifies the VFIO API of the mediated device type
        * available_instances:
            the number of mediated matrix passthrough devices
            that can be created

  mdev_attr_groups
    This attribute group identifies the user-defined sysfs attributes of the
    mediated device. When a device is registered with the VFIO mediated device
    framework, the sysfs attribute files identified in the 'mdev_attr_groups'
    structure will be created in the mediated matrix device's directory. The
    sysfs attributes for a mediated matrix device are:

    assign_adapter / unassign_adapter:
      Write-only attributes for assigning/unassigning an AP adapter to/from the
      mediated matrix device. To assign/unassign an adapter, the APID of the
      adapter is echoed to the respective attribute file.
    assign_domain / unassign_domain:
      Write-only attributes for assigning/unassigning an AP usage domain to/from
      the mediated matrix device. To assign/unassign a domain, the domain
      number of the usage domain is echoed to the respective attribute
      file.
    matrix:
      A read-only file for displaying the APQNs derived from the cross product
      of the adapter and domain numbers assigned to the mediated matrix device.
    assign_control_domain / unassign_control_domain:
      Write-only attributes for assigning/unassigning an AP control domain
      to/from the mediated matrix device. To assign/unassign a control domain,
      the ID of the domain to be assigned/unassigned is echoed to the respective
      attribute file.
    control_domains:
      A read-only file for displaying the control domain numbers assigned to the
      mediated matrix device.

* functions:

  create:
    allocates the ap_matrix_mdev structure used by the vfio_ap driver to:

    * Store the reference to the KVM structure for the guest using the mdev
    * Store the AP matrix configuration for the adapters, domains, and control
      domains assigned via the corresponding sysfs attributes files

  remove:
    deallocates the mediated matrix device's ap_matrix_mdev structure. This will
    be allowed only if a running guest is not using the mdev.

* callback interfaces

  open:
    The vfio_ap driver uses this callback to register a
    VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the mdev matrix
    device. The open is invoked when QEMU connects the VFIO iommu group
    for the mdev matrix device to the MDEV bus. Access to the KVM structure used
    to configure the KVM guest is provided via this callback.
    The KVM structure is used to configure the guest's access to the AP matrix
    defined via the mediated matrix device's sysfs attribute files.
  release:
    unregisters the VFIO_GROUP_NOTIFY_SET_KVM notifier callback function for the
    mdev matrix device and deconfigures the guest's AP matrix.

Configure the APM, AQM and ADM in the CRYCB
-------------------------------------------
Configuring the AP matrix for a KVM guest will be performed when the
VFIO_GROUP_NOTIFY_SET_KVM notifier callback is invoked. The notifier
function is called when QEMU connects to KVM. The guest's AP matrix is
configured via its CRYCB by:

* Setting the bits in the APM corresponding to the APIDs assigned to the
  mediated matrix device via its 'assign_adapter' interface.
* Setting the bits in the AQM corresponding to the domains assigned to the
  mediated matrix device via its 'assign_domain' interface.
* Setting the bits in the ADM corresponding to the domain IDs assigned to the
  mediated matrix device via its 'assign_control_domain' interface.

The CPU model features for AP
-----------------------------
The AP stack relies on the presence of the AP instructions as well as two
facilities: The AP Facilities Test (APFT) facility; and the AP Query
Configuration Information (QCI) facility. These features/facilities are made
available to a KVM guest via the following CPU model features:

1. ap: Indicates whether the AP instructions are installed on the guest. This
   feature will be enabled by KVM only if the AP instructions are installed
   on the host.

2. apft: Indicates the APFT facility is available on the guest. This facility
   can be made available to the guest only if it is available on the host (i.e.,
   facility bit 15 is set).

3. apqci: Indicates the AP QCI facility is available on the guest. This facility
   can be made available to the guest only if it is available on the host (i.e.,
   facility bit 12 is set).

Note: If the user chooses to specify a CPU model different than the 'host'
model to QEMU, the CPU model features and facilities need to be turned on
explicitly; for example::

     /usr/bin/qemu-system-s390x ... -cpu z13,ap=on,apqci=on,apft=on

A guest can be precluded from using AP features/facilities by turning them off
explicitly; for example::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=off,apqci=off,apft=off

Note: If the APFT facility is turned off (apft=off) for the guest, the guest
will not see any AP devices. The zcrypt device drivers that register for type 10
and newer AP devices - i.e., the cex4card and cex4queue device drivers - need
the APFT facility to ascertain the facilities installed on a given AP device.
If the APFT facility is not installed on the guest, then the probe of device
drivers will fail since only type 10 and newer devices can be configured for
guest use.

Example
=======
Let's now provide an example to illustrate how KVM guests may be given
access to AP facilities. For this example, we will show how to configure
three guests such that executing the lszcrypt command on the guests would
look like this:

Guest1
------
=========== ===== ============
CARD.DOMAIN TYPE  MODE
=========== ===== ============
05          CEX5C CCA-Coproc
05.0004     CEX5C CCA-Coproc
05.00ab     CEX5C CCA-Coproc
06          CEX5A Accelerator
06.0004     CEX5A Accelerator
06.00ab     CEX5A Accelerator
=========== ===== ============

Guest2
------
=========== ===== ============
CARD.DOMAIN TYPE  MODE
=========== ===== ============
05          CEX5A Accelerator
05.0047     CEX5A Accelerator
05.00ff     CEX5A Accelerator
=========== ===== ============

Guest3
------
=========== ===== ============
CARD.DOMAIN TYPE  MODE
=========== ===== ============
06          CEX5A Accelerator
06.0047     CEX5A Accelerator
06.00ff     CEX5A Accelerator
=========== ===== ============
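
Before configuring anything, it can be useful to confirm that the three guests
in this example do not share an APQN, since each APQN must be assigned to at
most one guest. The following Python sketch performs that check on plain sets
of adapter and domain numbers taken from the tables above. The function names
(``apqns``, ``shared_apqns``) and the dictionary layout are hypothetical and
serve only as an illustration; the sketch does not read sysfs or talk to the
vfio_ap driver.

```python
from itertools import combinations, product


def apqns(adapters, domains):
    """Return the set of APQNs formed by the cross product of the inputs."""
    return set(product(adapters, domains))


def shared_apqns(guests):
    """Map each pair of guest names to the APQNs both of them would own."""
    conflicts = {}
    for (name_a, cfg_a), (name_b, cfg_b) in combinations(guests.items(), 2):
        common = apqns(*cfg_a) & apqns(*cfg_b)
        if common:
            conflicts[(name_a, name_b)] = common
    return conflicts


# (adapters, usage domains) per guest, taken from the lszcrypt tables above.
example_guests = {
    "Guest1": ({0x05, 0x06}, {0x04, 0xAB}),
    "Guest2": ({0x05}, {0x47, 0xFF}),
    "Guest3": ({0x06}, {0x47, 0xFF}),
}

# The invalid configuration from the "AP and SIE" section, for comparison.
invalid_guests = {
    "Guest1": ({1, 2}, {5, 6}),
    "Guest2": ({1}, {6, 7}),
}

print(shared_apqns(example_guests))
print(shared_apqns(invalid_guests))
```

An empty result means every APQN belongs to exactly one guest; for the invalid
configuration the sketch reports APQN (1,6) as shared.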

These are the steps:

1. Install the vfio_ap module on the linux host. The dependency chain for the
   vfio_ap module is:

   * iommu
   * s390
   * zcrypt
   * vfio
   * vfio_mdev
   * vfio_mdev_device
   * KVM

   To build the vfio_ap module, the kernel build must be configured with the
   following Kconfig elements selected:

   * IOMMU_SUPPORT
   * S390
   * ZCRYPT
   * S390_AP_IOMMU
   * VFIO
   * VFIO_MDEV
   * VFIO_MDEV_DEVICE
   * KVM

   If using make menuconfig select the following to build the vfio_ap module::

     -> Device Drivers
        -> IOMMU Hardware Support
           select S390 AP IOMMU Support
        -> VFIO Non-Privileged userspace driver framework
           -> Mediated device driver framework
              -> VFIO driver for Mediated devices
     -> I/O subsystem
        -> VFIO support for AP devices

2. Secure the AP queues to be used by the three guests so that the host can not
   access them. To secure them, there are two sysfs files that specify
   bitmasks marking a subset of the APQN range as 'usable by the default AP
   queue device drivers' or 'not usable by the default device drivers' and thus
   available for use by the vfio_ap device driver.
   The locations of the sysfs files containing the masks are::

     /sys/bus/ap/apmask
     /sys/bus/ap/aqmask

   The 'apmask' is a 256-bit mask that identifies a set of AP adapter IDs
   (APID). Each bit in the mask, from left to right (i.e., from most significant
   to least significant bit in big endian order), corresponds to an APID from
   0-255. If a bit is set, the APID is marked as usable only by the default AP
   queue device drivers; otherwise, the APID is usable by the vfio_ap
   device driver.

   The 'aqmask' is a 256-bit mask that identifies a set of AP queue indexes
   (APQI). Each bit in the mask, from left to right (i.e., from most significant
   to least significant bit in big endian order), corresponds to an APQI from
   0-255. If a bit is set, the APQI is marked as usable only by the default AP
   queue device drivers; otherwise, the APQI is usable by the vfio_ap device
   driver.

   Take, for example, the following mask::

      0x7dffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

   It indicates that IDs 1, 2, 3, 4, 5, and 7-255 belong to the default
   drivers' pool, and IDs 0 and 6 belong to the vfio_ap device driver's pool.
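To make the bit numbering concrete, here is a small Python sketch (an
illustration, not kernel code) that splits a mask value into the IDs reserved
for the default drivers and the IDs left to the vfio_ap device driver. The name
``split_mask`` is hypothetical, and the sketch assumes a full 64-digit hex
string like the one shown above.

```python
MASK_BITS = 256


def split_mask(mask_hex):
    """Split an apmask/aqmask value into (default_ids, vfio_ap_ids).

    Bit 0 is the leftmost (most significant) bit of the mask, so ID n is
    tested with a right shift of (MASK_BITS - 1 - n).
    """
    digits = mask_hex[2:] if mask_hex.lower().startswith("0x") else mask_hex
    if len(digits) != MASK_BITS // 4:
        # Assumption of this sketch: only full-length masks are accepted.
        raise ValueError("expected a full 64-digit hex mask")
    value = int(digits, 16)
    default_ids, vfio_ap_ids = [], []
    for n in range(MASK_BITS):
        if (value >> (MASK_BITS - 1 - n)) & 1:
            default_ids.append(n)
        else:
            vfio_ap_ids.append(n)
    return default_ids, vfio_ap_ids


if __name__ == "__main__":
    example = "0x7d" + "ff" * 31  # the example mask from the text
    default_ids, vfio_ap_ids = split_mask(example)
    print("vfio_ap pool:", vfio_ap_ids)
```

The leading byte 0x7d is 01111101 in binary, which is why only the IDs at bit
positions 0 and 6 end up outside the default drivers' pool.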

   The APQN of each AP queue device assigned to the linux host is checked by the
   AP bus against the set of APQNs derived from the cross product of APIDs
   and APQIs marked as usable only by the default AP queue device drivers. If a
   match is detected, only the default AP queue device drivers will be probed;
   otherwise, the vfio_ap device driver will be probed.

   By default, the two masks are set to reserve all APQNs for use by the default
   AP queue device drivers. There are two ways the default masks can be changed:

   1. The sysfs mask files can be edited by echoing a string into the
      respective sysfs mask file in one of two formats:

      * An absolute hex string starting with 0x - like "0x12345678" - sets
        the mask. If the given string is shorter than the mask, it is padded
        with 0s on the right; for example, specifying a mask value of 0x41 is
        the same as specifying::

           0x4100000000000000000000000000000000000000000000000000000000000000

        Keep in mind that the mask reads from left to right (i.e., most
        significant to least significant bit in big endian order), so the mask
        above identifies device numbers 1 and 7 (01000001).

        If the string is longer than the mask, the operation is terminated with
        an error (EINVAL).

      * Individual bits in the mask can be switched on and off by specifying
        each bit number to be switched in a comma separated list.
        Each bit number string must be prepended with a plus ('+') or minus
        ('-') to indicate the corresponding bit is to be switched on ('+') or
        off ('-'). Some valid values are:

           - "+0"    switches bit 0 on
           - "-13"   switches bit 13 off
           - "+0x41" switches bit 65 on
           - "-0xff" switches bit 255 off

        The following example switches bits 0 and 71 (0x47) on and bits 6 and
        240 (0xf0) off::

           +0,-6,+0x47,-0xf0

        Note that the bits not specified in the list remain as they were before
        the operation.

   2. The masks can also be changed at boot time via parameters on the kernel
      command line like this::

         ap.apmask=0xffff ap.aqmask=0x40

      This would create the following masks::

         apmask:
         0xffff000000000000000000000000000000000000000000000000000000000000

         aqmask:
         0x4000000000000000000000000000000000000000000000000000000000000000

      Resulting in these two pools::

         default drivers pool:    adapter 0-15, domain 1
         alternate drivers pool:  adapter 16-255, domains 0, 2-255

Securing the APQNs for our example
----------------------------------
   To secure the AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004, 06.0047,
   06.00ab, and 06.00ff for use by the vfio_ap device driver, the corresponding
   APQNs can either be removed from the default masks::

      echo -5,-6 > /sys/bus/ap/apmask

      echo -4,-0x47,-0xab,-0xff > /sys/bus/ap/aqmask

   Or the masks can be set as follows::

      echo 0xf9ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff \
      > /sys/bus/ap/apmask

      echo 0xf7fffffffffffffffeffffffffffffffffffffffffeffffffffffffffffffffe \
      > /sys/bus/ap/aqmask

   This will result in AP queues 05.0004, 05.0047, 05.00ab, 05.00ff, 06.0004,
   06.0047, 06.00ab, and 06.00ff getting bound to the vfio_ap device driver. The
   sysfs directory for the vfio_ap device driver will now contain symbolic links
   to the AP queue devices bound to it::

     /sys/bus/ap
     ... [drivers]
     ...... [vfio_ap]
     ......... [05.0004]
     ......... [05.0047]
     ......... [05.00ab]
     ......... [05.00ff]
     ......... [06.0004]
     ......... [06.0047]
     ......... [06.00ab]
     ......... [06.00ff]

   Keep in mind that only type 10 and newer adapters (i.e., CEX4 and later)
   can be bound to the vfio_ap device driver.
   The reason for this is to keep the implementation simple: supporting older
   devices would complicate the design, and those devices will go out of
   service in the relatively near future and there are few older systems on
   which to test them.

   The administrator, therefore, must take care to secure only AP queues that
   can be bound to the vfio_ap device driver. The device type for a given AP
   queue device can be read from the parent card's sysfs directory. For example,
   to see the hardware type of the queue 05.0004::

     cat /sys/bus/ap/devices/card05/hwtype

   The hwtype must be 10 or higher (CEX4 or newer) in order to be bound to the
   vfio_ap device driver.

3. Create the mediated devices needed to configure the AP matrixes for the
   three guests and to provide an interface to the vfio_ap driver for
   use by the guests::

     /sys/devices/vfio_ap/matrix/
     --- [mdev_supported_types]
     ------ [vfio_ap-passthrough] (passthrough mediated matrix device type)
     --------- create
     --------- [devices]

   To create the mediated devices for the three guests::

     uuidgen > create
     uuidgen > create
     uuidgen > create

   or::

     echo $uuid1 > create
     echo $uuid2 > create
     echo $uuid3 > create

   This will create three mediated
   devices in the [devices] subdirectory named after the UUID written to the
   create attribute file. We call them $uuid1, $uuid2 and $uuid3 and this is the
   sysfs directory structure after creation::

     /sys/devices/vfio_ap/matrix/
     --- [mdev_supported_types]
     ------ [vfio_ap-passthrough]
     --------- [devices]
     ------------ [$uuid1]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

     ------------ [$uuid2]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

     ------------ [$uuid3]
     --------------- assign_adapter
     --------------- assign_control_domain
     --------------- assign_domain
     --------------- matrix
     --------------- unassign_adapter
     --------------- unassign_control_domain
     --------------- unassign_domain

4. The administrator now needs to configure the matrixes for the mediated
   devices $uuid1 (for Guest1), $uuid2 (for Guest2) and $uuid3 (for Guest3).

   This is how the matrix is configured for Guest1::

      echo 5 > assign_adapter
      echo 6 > assign_adapter
      echo 4 > assign_domain
      echo 0xab > assign_domain

   Control domains can similarly be assigned using the assign_control_domain
   sysfs file.

   If a mistake is made configuring an adapter, domain or control domain,
   you can use the unassign_xxx files to unassign the adapter, domain or
   control domain.

   To display the matrix configuration for Guest1::

      cat matrix

   This is how the matrix is configured for Guest2::

      echo 5 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   This is how the matrix is configured for Guest3::

      echo 6 > assign_adapter
      echo 0x47 > assign_domain
      echo 0xff > assign_domain

   In order to successfully assign an adapter:

   * The adapter number specified must represent a value from 0 up to the
     maximum adapter number configured for the system. If an adapter number
     higher than the maximum is specified, the operation will terminate with
     an error (ENODEV).

   * All APQNs that can be derived from the adapter ID and the IDs of
     the previously assigned domains must be bound to the vfio_ap device
     driver. If no domains have yet been assigned, then there must be at least
     one APQN with the specified APID bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).

   * No APQN that can be derived from the adapter ID and the IDs of the
     previously assigned domains can be assigned to another mediated matrix
     device. If an APQN is assigned to another mediated matrix device, the
     operation will terminate with an error (EADDRINUSE).

   In order to successfully assign a domain:

   * The domain number specified must represent a value from 0 up to the
     maximum domain number configured for the system. If a domain number
     higher than the maximum is specified, the operation will terminate with
     an error (ENODEV).

   * All APQNs that can be derived from the domain ID and the IDs of
     the previously assigned adapters must be bound to the vfio_ap device
     driver. If no adapters have yet been assigned, then there must be at least
     one APQN with the specified APQI bound to the vfio_ap driver. If no such
     APQNs are bound to the driver, the operation will terminate with an
     error (EADDRNOTAVAIL).

   * No APQN that can be derived from the domain ID and the IDs of the
     previously assigned adapters can be assigned to another mediated matrix
     device. If an APQN is assigned to another mediated matrix device, the
     operation will terminate with an error (EADDRINUSE).

   In order to successfully assign a control domain, the domain number
   specified must represent a value from 0 up to the maximum domain number
   configured for the system. If a control domain number higher than the maximum
   is specified, the operation will terminate with an error (ENODEV).

5. Start Guest1::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid1 ...

6. Start Guest2::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid2 ...

7. Start Guest3::

     /usr/bin/qemu-system-s390x ... -cpu host,ap=on,apqci=on,apft=on \
        -device vfio-ap,sysfsdev=/sys/devices/vfio_ap/matrix/$uuid3 ...

When a guest is shut down, its mediated matrix device may be removed.

Using our example again, to remove the mediated matrix device $uuid1::

   /sys/devices/vfio_ap/matrix/
   --- [mdev_supported_types]
   ------ [vfio_ap-passthrough]
   --------- [devices]
   ------------ [$uuid1]
   --------------- remove

::

   echo 1 > remove

This will remove all of the mdev matrix device's sysfs structures including
the mdev device itself. To recreate and reconfigure the mdev matrix device,
all of the steps starting with step 3 will have to be performed again. Note
that the remove will fail if a guest using the mdev is still running.

It is not necessary to remove an mdev matrix device, but one may want to
remove it if no guest will use it during the remaining lifetime of the linux
host. If the mdev matrix device is removed, one may want to also reconfigure
the pool of adapters and queues reserved for use by the default drivers.

Limitations
===========
* The KVM/kernel interfaces do not provide a way to prevent an APQN from being
  restored to the default drivers pool while the queue is still assigned to a
  mediated device in use by a guest. It is incumbent upon the administrator to
  ensure that no mediated device to which the APQN is assigned is in use by a
  guest; otherwise, the host could be given access to the private data of the
  AP queue device, such as a private key configured specifically for the guest.

* Dynamically modifying the AP matrix for a running guest (which would amount to
  hot(un)plug of AP devices for the guest) is currently not supported.

* Live guest migration is not supported for guests using AP devices.
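
The first limitation leaves the safety check to the administrator. As a rough
sketch of what such a check could look like, the Python below reports which
assigned queues would fall back into the default drivers pool under a proposed
pair of masks. The function names, and the idea of passing the APQNs of in-use
mediated devices as plain ``(apid, apqi)`` tuples, are assumptions of this
example and not an existing kernel or tooling interface. It relies only on the
rule described in step 2: a queue belongs to the default pool when both its
APID bit in apmask and its APQI bit in aqmask are set.

```python
MASK_BITS = 256


def bit_is_set(mask_hex, n):
    """Return True if bit n is set; bit 0 is the leftmost bit of the mask.

    A short hex string is padded with zeros on the right, as described
    for the absolute mask format.
    """
    digits = mask_hex[2:] if mask_hex.lower().startswith("0x") else mask_hex
    if len(digits) * 4 > MASK_BITS:
        raise ValueError("mask is longer than 256 bits")
    value = int(digits.ljust(MASK_BITS // 4, "0"), 16)
    return bool((value >> (MASK_BITS - 1 - n)) & 1)


def queues_returning_to_host(apmask, aqmask, assigned):
    """List the (apid, apqi) pairs from 'assigned' that the proposed masks
    would place in the default drivers pool."""
    return [
        (apid, apqi)
        for apid, apqi in assigned
        if bit_is_set(apmask, apid) and bit_is_set(aqmask, apqi)
    ]


if __name__ == "__main__":
    # APQNs of Guest2 in the example: adapter 5, domains 0x47 and 0xff.
    in_use = [(0x05, 0x47), (0x05, 0xff)]
    # Hypothetical proposal: hand every adapter and domain back to the host.
    all_ones = "0x" + "ff" * 32
    print(queues_returning_to_host(all_ones, all_ones, in_use))
```

An empty result means the proposed masks leave every listed queue with the
vfio_ap device driver; any entry in the result is a queue the host would regain
while a guest may still be using it.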