.. SPDX-License-Identifier: GPL-2.0

===================================================================
The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description
======================

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine.  The ioctls belong to the following classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem.  In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout.  In addition a VM ioctl is used to
   create virtual cpus (vcpus) and devices.

   VM ioctls must be issued from the same process (address space) that was
   used to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   vcpu ioctls should be issued from the same thread that was used to create
   the vcpu, except for asynchronous vcpu ioctls that are marked as such in
   the documentation.  Otherwise, the first ioctl after switching threads
   could see a performance impact.

 - device ioctls: These query and set attributes that control the operation
   of a single device.

   device ioctls must be issued from the same process (address space) that
   was used to create the VM.

2. File descriptors
===================

The kvm API is centered around file descriptors.  An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls.  A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls.  A KVM_CREATE_VCPU or KVM_CREATE_DEVICE ioctl on a VM fd will
create a virtual cpu or device and return a file descriptor pointing to
the new resource.  Finally, ioctls on a vcpu or device fd can be used
to control the vcpu or device.  For vcpus, this includes the important
task of actually running guest code.

In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of unix domain sockets.  These
kinds of tricks are explicitly not supported by kvm.  While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API.  See "General description" for details on the ioctl usage
model that is supported by KVM.

It is important to note that although VM ioctls may only be issued from
the process that created the VM, a VM's lifecycle is associated with its
file descriptor, not its creator (process).
In other words, the VM and
its resources, *including the associated address space*, are not freed
until the last reference to the VM's file descriptor has been released.
For example, if fork() is issued after ioctl(KVM_CREATE_VM), the VM will
not be freed until both the parent (original) process and its child have
put their references to the VM's file descriptor.

Because a VM's resources are not freed until the last reference to its
file descriptor is released, creating additional references to a VM
via fork(), dup(), etc. without careful consideration is strongly
discouraged and may have unwanted side effects, e.g. memory allocated
by and on behalf of the VM's process may not be freed/unaccounted when
the VM is shut down.


3. Extensions
=============

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed.  However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available.  If it is, a
set of ioctls is available for application use.


4. API description
==================

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability:
      which KVM extension provides this ioctl.  Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4), or 'none' which means that while not all kernels
      support this ioctl, there's no capability bit to check its
      availability: for kernels that don't support the ioctl,
      the ioctl returns -ENOTTY.

  Architectures:
      which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type:
      system, vm, or vcpu.

  Parameters:
      what parameters are accepted by the ioctl.

  Returns:
      the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


4.1 KVM_GET_API_VERSION
-----------------------

:Capability: basic
:Architectures: all
:Type: system ioctl
:Parameters: none
:Returns: the constant KVM_API_VERSION (=12)

This identifies the API version as the stable kvm API. It is not
expected that this number will change.
However, Linux 2.6.20 and
2.6.21 report earlier versions; these are not documented and not
supported.  Applications should refuse to run if KVM_GET_API_VERSION
returns a value other than 12.  If this check passes, all ioctls
described as 'basic' will be available.


4.2 KVM_CREATE_VM
-----------------

:Capability: basic
:Architectures: all
:Type: system ioctl
:Parameters: machine type identifier (KVM_VM_*)
:Returns: a VM fd that can be used to control the new virtual machine.

The new VM has no virtual cpus and no memory.
You probably want to use 0 as machine type.

In order to create user controlled virtual machines on S390, check
KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
privileged user (CAP_SYS_ADMIN).

To use hardware assisted virtualization on MIPS (VZ ASE) rather than
the default trap & emulate implementation (which changes the virtual
memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
flag KVM_VM_MIPS_VZ.


On arm64, the physical address size for a VM (IPA Size limit) is limited
to 40 bits by default. The limit can be configured if the host supports the
extension KVM_CAP_ARM_VM_IPA_SIZE.
When supported, use
KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
identifier, where IPA_Bits is the maximum width of any physical
address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
machine type identifier.

E.g., to configure a guest to use a 48-bit physical address size::

    vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));

The requested size (IPA_Bits) must be:

 ==   ============================================================
 0    Implies default size, 40 bits (for backward compatibility)
 N    Implies N bits, where N is a positive integer such that,
      32 <= N <= Host_IPA_Limit
 ==   ============================================================

Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
is dependent on the CPU capability and the kernel configuration. The limit can
be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the KVM_CHECK_EXTENSION
ioctl() at run-time.

Creation of the VM will fail if the requested IPA size (whether it is
implicit or explicit) is unsupported on the host.

Please note that configuring the IPA size does not affect the capability
exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
the size of the address translated by the stage2 level (guest physical to
host physical address translations).
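
The version check from section 4.1 and the VM creation rules above can be
combined into a small helper. The following is only a sketch: the names
``ipa_bits_valid()`` and ``create_vm()`` are hypothetical and not part of
the KVM API, the caller is assumed to have already opened ``/dev/kvm``, and
error handling is reduced to returning -1.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Size selected when 0 is requested (see the table above). */
#define DEFAULT_IPA_BITS 40

/*
 * Hypothetical helper: returns 1 if `ipa_bits` is an acceptable request on
 * a host whose limit is `host_limit`, 0 otherwise.  A request of 0 selects
 * the 40-bit default, which must itself be supported by the host.
 */
static int ipa_bits_valid(unsigned int ipa_bits, unsigned int host_limit)
{
	unsigned int effective = ipa_bits ? ipa_bits : DEFAULT_IPA_BITS;

	return effective >= 32 && effective <= host_limit;
}

/*
 * Hypothetical helper: creates a VM on an already opened /dev/kvm fd.
 * `ipa_bits` is only meaningful on arm64; pass 0 on other architectures.
 * Returns the new VM fd, or -1 on failure.
 */
static int create_vm(int kvm_fd, unsigned int ipa_bits)
{
	int limit;

	/* Refuse to run on anything but the stable API (section 4.1). */
	if (ioctl(kvm_fd, KVM_GET_API_VERSION, 0) != KVM_API_VERSION)
		return -1;

	if (ipa_bits != 0) {
		/* The capability value is Host_IPA_Limit, or 0 if unsupported. */
		limit = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_VM_IPA_SIZE);
		if (limit <= 0 || !ipa_bits_valid(ipa_bits, (unsigned int)limit))
			return -1;
	}

	return ioctl(kvm_fd, KVM_CREATE_VM,
		     (unsigned long)KVM_VM_TYPE_ARM_IPA_SIZE(ipa_bits));
}
```

A caller would typically obtain ``kvm_fd`` with
``open("/dev/kvm", O_RDWR | O_CLOEXEC)`` and keep it open for later system
ioctls. Checking the requested size in userspace first is a convenience
only; the kernel still rejects unsupported sizes on its own.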


4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
----------------------------------------------------------

:Capability: basic, KVM_CAP_GET_MSR_FEATURES for KVM_GET_MSR_FEATURE_INDEX_LIST
:Architectures: x86
:Type: system ioctl
:Parameters: struct kvm_msr_list (in/out)
:Returns: 0 on success; -1 on error

Errors:

  ======     ================================================================
  EFAULT     the msr index list cannot be read from or written to
  E2BIG      the msr index list is too big to fit in the array specified by
             the user.
  ======     ================================================================

::

  struct kvm_msr_list {
        __u32 nmsrs; /* number of msrs in entries */
        __u32 indices[0];
  };

The user fills in the size of the indices array in nmsrs, and in return
kvm adjusts nmsrs to reflect the actual number of msrs and fills in the
indices array with their numbers.

KVM_GET_MSR_INDEX_LIST returns the guest msrs that are supported.  The list
varies by kvm version and host processor, but does not change otherwise.

Note: if kvm indicates support for MCE (KVM_CAP_MCE), then the MCE bank MSRs
are not returned in the MSR list, as different vcpus can have a different
number of banks, as set via the KVM_X86_SETUP_MCE ioctl.
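
Because the number of MSRs is not known in advance, userspace commonly calls
KVM_GET_MSR_INDEX_LIST twice: once with an empty array to learn the count,
and once with a buffer of the right size. The sketch below assumes that the
kernel writes the required count back into nmsrs even when it fails with
E2BIG; current x86 kernels behave this way, but the text above does not spell
it out. ``get_msr_indices()`` is a hypothetical helper name.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: fetches the supported guest MSR indices into a
 * malloc()ed array.  Returns the number of indices, or -1 on error.
 * `kvm_fd` is an open /dev/kvm descriptor; the caller frees *indices_out.
 */
static int get_msr_indices(int kvm_fd, __u32 **indices_out)
{
	*indices_out = NULL;
#if defined(__x86_64__) || defined(__i386__)
	struct kvm_msr_list probe = { .nmsrs = 0 };
	struct kvm_msr_list *list;
	__u32 *indices;
	__u32 i, count;

	/* Probe with an empty array; E2BIG is the expected outcome
	 * (assumption: nmsrs is updated even on E2BIG). */
	if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, &probe) < 0 && errno != E2BIG)
		return -1;

	list = calloc(1, sizeof(*list) + probe.nmsrs * sizeof(list->indices[0]));
	indices = calloc(probe.nmsrs ? probe.nmsrs : 1, sizeof(*indices));
	if (!list || !indices ||
	    (list->nmsrs = probe.nmsrs,
	     ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, list) < 0)) {
		free(list);
		free(indices);
		return -1;
	}

	/* Copy out the indices so the caller gets a plain array. */
	count = list->nmsrs;
	for (i = 0; i < count; i++)
		indices[i] = list->indices[i];

	free(list);
	*indices_out = indices;
	return (int)count;
#else
	/* struct kvm_msr_list only exists on x86. */
	(void)kvm_fd;
	errno = ENOSYS;
	return -1;
#endif
}
```

The same two-call pattern should apply to KVM_GET_MSR_FEATURE_INDEX_LIST,
which takes the same structure.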

KVM_GET_MSR_FEATURE_INDEX_LIST returns the list of MSRs that can be passed
to the KVM_GET_MSRS system ioctl.  This lets userspace probe host capabilities
and processor features that are exposed via MSRs (e.g., VMX capabilities).
This list also varies by kvm version and host processor, but does not change
otherwise.


4.4 KVM_CHECK_EXTENSION
-----------------------

:Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
:Architectures: all
:Type: system ioctl, vm ioctl
:Parameters: extension identifier (KVM_CAP_*)
:Returns: 0 if unsupported; 1 (or some other positive integer) if supported

The API allows the application to query about extensions to the core
kvm API.  Userspace passes an extension identifier (an integer) and
receives an integer that describes the extension availability.
Generally 0 means no and 1 means yes, but some extensions may report
additional information in the integer return value.

Based on their initialization different VMs may have different capabilities.
It is thus encouraged to use the vm ioctl to query for capabilities (available
with KVM_CAP_CHECK_EXTENSION_VM on the vm fd).

4.5 KVM_GET_VCPU_MMAP_SIZE
--------------------------

:Capability: basic
:Architectures: all
:Type: system ioctl
:Parameters: none
:Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl (see section 4.10) communicates with userspace via a shared
memory region.  This ioctl returns the size of that region.  See the
KVM_RUN documentation for details.


4.6 KVM_SET_MEMORY_REGION
-------------------------

:Capability: basic
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_memory_region (in)
:Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.


4.7 KVM_CREATE_VCPU
-------------------

:Capability: basic
:Architectures: all
:Type: vm ioctl
:Parameters: vcpu id (apic id on x86)
:Returns: vcpu fd on success, -1 on error

This API adds a vcpu to a virtual machine. No more than max_vcpus may be added.
The vcpu id is an integer in the range [0, max_vcpu_id).
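
The limits that apply to this ioctl can be resolved with three
KVM_CHECK_EXTENSION queries, applying the fallbacks this section specifies
for kernels that lack a given capability. A minimal sketch, in which
``struct vcpu_limits`` and both function names are hypothetical:

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical container for the three limits discussed in this section. */
struct vcpu_limits {
	int recommended_vcpus;	/* KVM_CAP_NR_VCPUS */
	int max_vcpus;		/* KVM_CAP_MAX_VCPUS */
	int max_vcpu_id;	/* exclusive upper bound for vcpu ids */
};

/*
 * Applies the documented fallbacks.  Each argument is the value returned by
 * KVM_CHECK_EXTENSION; anything <= 0 is treated as "capability absent".
 */
static struct vcpu_limits resolve_vcpu_limits(int nr, int max, int max_id)
{
	struct vcpu_limits l;

	l.recommended_vcpus = nr > 0 ? nr : 4;
	l.max_vcpus = max > 0 ? max : l.recommended_vcpus;
	l.max_vcpu_id = max_id > 0 ? max_id : l.max_vcpus;
	return l;
}

/* Queries the capabilities on a /dev/kvm or VM fd and resolves the limits. */
static struct vcpu_limits query_vcpu_limits(int fd)
{
	int nr = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);
	int max = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);
	int max_id = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPU_ID);

	return resolve_vcpu_limits(nr, max, max_id);
}
```

Keeping the fallback logic in a function that does no I/O makes it easy to
test without access to ``/dev/kvm``.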

The recommended max_vcpus value can be retrieved using the KVM_CAP_NR_VCPUS of
the KVM_CHECK_EXTENSION ioctl() at run-time.
The maximum possible value for max_vcpus can be retrieved using the
KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.

If the KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is 4
cpus max.
If the KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
the same as the value returned from KVM_CAP_NR_VCPUS.

The maximum possible value for max_vcpu_id can be retrieved using the
KVM_CAP_MAX_VCPU_ID of the KVM_CHECK_EXTENSION ioctl() at run-time.

If the KVM_CAP_MAX_VCPU_ID does not exist, you should assume that max_vcpu_id
is the same as the value returned from KVM_CAP_MAX_VCPUS.

On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
threads in one or more virtual CPU cores.  (This is because the
hardware requires all the hardware threads in a CPU core to be in the
same partition.)  The KVM_CAP_PPC_SMT capability indicates the number
of vcpus per virtual core (vcore).  The vcore id is obtained by
dividing the vcpu id by the number of vcpus per vcore.  The vcpus in a
given vcore will always be in the same physical core as each other
(though that might be a different physical core from time to time).
Userspace can control the threading (SMT) mode of the guest by its
allocation of vcpu ids.
For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.

For virtual cpus that have been created with S390 user controlled virtual
machines, the resulting vcpu fd can be memory mapped at page offset
KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
cpu's hardware control block.


4.8 KVM_GET_DIRTY_LOG (vm ioctl)
--------------------------------

:Capability: basic
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_dirty_log (in/out)
:Returns: 0 on success, -1 on error

::

  /* for KVM_GET_DIRTY_LOG */
  struct kvm_dirty_log {
        __u32 slot;
        __u32 padding1;
        union {
                void __user *dirty_bitmap; /* one bit per page */
                __u64 padding2;
        };
  };

Given a memory slot, return a bitmap containing any pages dirtied
since the last call to this ioctl.  Bit 0 is the first page in the
memory slot.  Ensure the entire structure is cleared to avoid padding
issues.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of the slot field
specify the address space for which you want to return the dirty bitmap.
They must be less than the value that KVM_CHECK_EXTENSION returns for
the KVM_CAP_MULTI_ADDRESS_SPACE capability.

The bits in the dirty bitmap are cleared before the ioctl returns, unless
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is enabled.  For more information,
see the description of the capability.

4.9 KVM_SET_MEMORY_ALIAS
------------------------

:Capability: basic
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_memory_alias (in)
:Returns: 0 (success), -1 (error)

This ioctl is obsolete and has been removed.


4.10 KVM_RUN
------------

:Capability: basic
:Architectures: all
:Type: vcpu ioctl
:Parameters: none
:Returns: 0 on success, -1 on error

Errors:

  =======    ==============================================================
  EINTR      an unmasked signal is pending
  ENOEXEC    the vcpu hasn't been initialized or the guest tried to execute
             instructions from device memory (arm64)
  ENOSYS     data abort outside memslots with no syndrome info and
             KVM_CAP_ARM_NISV_TO_USER not enabled (arm64)
  EPERM      SVE feature set but not finalized (arm64)
  =======    ==============================================================

This ioctl is used to run a
guest virtual cpu.  While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE.  The parameter block is formatted as a 'struct
kvm_run' (see below).


4.11 KVM_GET_REGS
-----------------

:Capability: basic
:Architectures: all except ARM, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_regs (out)
:Returns: 0 on success, -1 on error

Reads the general purpose registers from the vcpu.

::

  /* x86 */
  struct kvm_regs {
        /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
        __u64 rax, rbx, rcx, rdx;
        __u64 rsi, rdi, rsp, rbp;
        __u64 r8,  r9,  r10, r11;
        __u64 r12, r13, r14, r15;
        __u64 rip, rflags;
  };

  /* mips */
  struct kvm_regs {
        /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
        __u64 gpr[32];
        __u64 hi;
        __u64 lo;
        __u64 pc;
  };


4.12 KVM_SET_REGS
-----------------

:Capability: basic
:Architectures: all except ARM, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_regs (in)
:Returns: 0 on success, -1 on error

Writes the general purpose registers into the vcpu.

See KVM_GET_REGS for the data structure.


4.13 KVM_GET_SREGS
------------------

:Capability: basic
:Architectures: x86, ppc
:Type: vcpu ioctl
:Parameters: struct kvm_sregs (out)
:Returns: 0 on success, -1 on error

Reads special registers from the vcpu.

::

  /* x86 */
  struct kvm_sregs {
        struct kvm_segment cs, ds, es, fs, gs, ss;
        struct kvm_segment tr, ldt;
        struct kvm_dtable gdt, idt;
        __u64 cr0, cr2, cr3, cr4, cr8;
        __u64 efer;
        __u64 apic_base;
        __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
  };

  /* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */

interrupt_bitmap is a bitmap of pending external interrupts.  At most
one bit may be set.  This interrupt has been acknowledged by the APIC
but not yet injected into the cpu core.
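
Since at most one bit of interrupt_bitmap is set, the pending vector can be
recovered by locating that bit: vector ``v`` lives in word ``v / 64`` at bit
position ``v % 64``. The helper below is a hypothetical illustration that
works on a plain array of 64-bit words, so it does not depend on the x86
headers; a caller would pass ``sregs.interrupt_bitmap`` and its word count.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the vector number of the pending external
 * interrupt recorded in `bitmap` (`nwords` 64-bit words), or -1 if no bit
 * is set.  If several bits were set, the lowest vector is returned.
 */
static int pending_interrupt_vector(const uint64_t *bitmap, size_t nwords)
{
	size_t w;
	int bit;

	for (w = 0; w < nwords; w++) {
		if (bitmap[w] == 0)
			continue;
		for (bit = 0; bit < 64; bit++) {
			if (bitmap[w] & (UINT64_C(1) << bit))
				return (int)(w * 64) + bit;
		}
	}
	return -1;
}
```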


4.14 KVM_SET_SREGS
------------------

:Capability: basic
:Architectures: x86, ppc
:Type: vcpu ioctl
:Parameters: struct kvm_sregs (in)
:Returns: 0 on success, -1 on error

Writes special registers into the vcpu.  See KVM_GET_SREGS for the
data structures.


4.15 KVM_TRANSLATE
------------------

:Capability: basic
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_translation (in/out)
:Returns: 0 on success, -1 on error

Translates a virtual address according to the vcpu's current address
translation mode.

::

  struct kvm_translation {
        /* in */
        __u64 linear_address;

        /* out */
        __u64 physical_address;
        __u8  valid;
        __u8  writeable;
        __u8  usermode;
        __u8  pad[5];
  };


4.16 KVM_INTERRUPT
------------------

:Capability: basic
:Architectures: x86, ppc, mips
:Type: vcpu ioctl
:Parameters: struct kvm_interrupt (in)
:Returns: 0 on success, negative on failure.

Queues a hardware interrupt vector to be injected.

::

  /* for KVM_INTERRUPT */
  struct kvm_interrupt {
        /* in */
        __u32 irq;
  };

X86:
^^^^

:Returns:

        =========   ===================================
        0           on success,
        -EEXIST     if an interrupt is already enqueued
        -EINVAL     the irq number is invalid
        -ENXIO      if the PIC is in the kernel
        -EFAULT     if the pointer is invalid
        =========   ===================================

Note 'irq' is an interrupt vector, not an interrupt pin or line. This
ioctl is useful if the in-kernel PIC is not used.

PPC:
^^^^

Queues an external interrupt to be injected. This ioctl is overloaded
with 3 different irq values:

a) KVM_INTERRUPT_SET

   This injects an edge type external interrupt into the guest once it's ready
   to receive interrupts. When injected, the interrupt is done.

b) KVM_INTERRUPT_UNSET

   This unsets any pending interrupt.

   Only available with KVM_CAP_PPC_UNSET_IRQ.

c) KVM_INTERRUPT_SET_LEVEL

   This injects a level type external interrupt into the guest context. The
   interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
   is triggered.

   Only available with KVM_CAP_PPC_IRQ_LEVEL.

Note that any value for 'irq' other than the ones stated above is invalid
and incurs unexpected behavior.

This is an asynchronous vcpu ioctl and can be invoked from any thread.

MIPS:
^^^^^

Queues an external interrupt to be injected into the virtual CPU. A negative
interrupt number dequeues the interrupt.

This is an asynchronous vcpu ioctl and can be invoked from any thread.


4.17 KVM_DEBUG_GUEST
--------------------

:Capability: basic
:Architectures: none
:Type: vcpu ioctl
:Parameters: none
:Returns: -1 on error

Support for this has been removed.  Use KVM_SET_GUEST_DEBUG instead.


4.18 KVM_GET_MSRS
-----------------

:Capability: basic (vcpu), KVM_CAP_GET_MSR_FEATURES (system)
:Architectures: x86
:Type: system ioctl, vcpu ioctl
:Parameters: struct kvm_msrs (in/out)
:Returns: number of msrs successfully returned;
          -1 on error

When used as a system ioctl:
Reads the values of MSR-based features that are available for the VM.  This
is similar to KVM_GET_SUPPORTED_CPUID, but it returns MSR indices and values.
The list of msr-based features can be obtained using KVM_GET_MSR_FEATURE_INDEX_LIST
in a system ioctl.

When used as a vcpu ioctl:
Reads model-specific registers from the vcpu.  Supported msr indices can
be obtained using KVM_GET_MSR_INDEX_LIST in a system ioctl.

::

  struct kvm_msrs {
        __u32 nmsrs; /* number of msrs in entries */
        __u32 pad;

        struct kvm_msr_entry entries[0];
  };

  struct kvm_msr_entry {
        __u32 index;
        __u32 reserved;
        __u64 data;
  };

Application code should set the 'nmsrs' member (which indicates the
size of the entries array) and the 'index' member of each array entry.
kvm will fill in the 'data' member.
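
Because struct kvm_msrs ends in a variable-length array, the caller has to
allocate the header and the entries as one block. The following sketch shows
one way to do that; ``read_msrs()`` is a hypothetical helper name, and the
function works on either a vcpu fd or, for feature MSRs, the ``/dev/kvm`` fd.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: reads the `n` MSRs named in `indices` and stores
 * their values in `values`.  Returns the number of MSRs actually read
 * (which may be less than `n`), or -1 on error.
 */
static int read_msrs(int fd, const __u32 *indices, __u64 *values, __u32 n)
{
#if defined(__x86_64__) || defined(__i386__)
	struct kvm_msrs *msrs;
	int ret, saved_errno, i;

	/* Header plus n entries in a single zeroed allocation. */
	msrs = calloc(1, sizeof(*msrs) + n * sizeof(msrs->entries[0]));
	if (!msrs)
		return -1;

	msrs->nmsrs = n;
	for (i = 0; i < (int)n; i++)
		msrs->entries[i].index = indices[i];

	ret = ioctl(fd, KVM_GET_MSRS, msrs);
	saved_errno = errno;

	/* Only the first `ret` entries carry valid data. */
	for (i = 0; i < ret; i++)
		values[i] = msrs->entries[i].data;

	free(msrs);
	errno = saved_errno;
	return ret;
#else
	/* struct kvm_msrs only exists on x86. */
	(void)fd; (void)indices; (void)values; (void)n;
	errno = ENOSYS;
	return -1;
#endif
}
```

A return value smaller than ``n`` means the kernel stopped at the first MSR
it could not read, so callers should compare the result against the number
of indices they requested.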


4.19 KVM_SET_MSRS
-----------------

:Capability: basic
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_msrs (in)
:Returns: number of msrs successfully set (see below), -1 on error

Writes model-specific registers to the vcpu.  See KVM_GET_MSRS for the
data structures.

Application code should set the 'nmsrs' member (which indicates the
size of the entries array), and the 'index' and 'data' members of each
array entry.

KVM tries to set the MSRs in the entries[] array one by one. If setting an
MSR fails, e.g. because reserved bits are set or because the MSR is not
supported or emulated by KVM, it stops processing the MSR list and returns
the number of MSRs that have been set successfully.


4.20 KVM_SET_CPUID
------------------

:Capability: basic
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_cpuid (in)
:Returns: 0 on success, -1 on error

Defines the vcpu responses to the cpuid instruction.  Applications
should use the KVM_SET_CPUID2 ioctl if available.

Note that when this ioctl fails, KVM gives no guarantee that the previous
valid CPUID configuration (if any) is left uncorrupted.
Userspace can retrieve the resulting CPUID configuration with
KVM_GET_CPUID2 if needed.

::

  struct kvm_cpuid_entry {
        __u32 function;
        __u32 eax;
        __u32 ebx;
        __u32 ecx;
        __u32 edx;
        __u32 padding;
  };

  /* for KVM_SET_CPUID */
  struct kvm_cpuid {
        __u32 nent;
        __u32 padding;
        struct kvm_cpuid_entry entries[0];
  };


4.21 KVM_SET_SIGNAL_MASK
------------------------

:Capability: basic
:Architectures: all
:Type: vcpu ioctl
:Parameters: struct kvm_signal_mask (in)
:Returns: 0 on success, -1 on error

Defines which signals are blocked during execution of KVM_RUN.  This
signal mask temporarily overrides the thread's signal mask.  Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.

Note that the signal will only be delivered if it is not blocked by the
original signal mask.

::

  /* for KVM_SET_SIGNAL_MASK */
  struct kvm_signal_mask {
        __u32 len;
        __u8  sigset[0];
  };


4.22 KVM_GET_FPU
----------------

:Capability: basic
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_fpu (out)
:Returns: 0 on success, -1 on error

Reads the floating point state from the vcpu.

::

  /* for KVM_GET_FPU and KVM_SET_FPU */
  struct kvm_fpu {
        __u8  fpr[8][16];
        __u16 fcw;
        __u16 fsw;
        __u8  ftwx;  /* in fxsave format */
        __u8  pad1;
        __u16 last_opcode;
        __u64 last_ip;
        __u64 last_dp;
        __u8  xmm[16][16];
        __u32 mxcsr;
        __u32 pad2;
  };


4.23 KVM_SET_FPU
----------------

:Capability: basic
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_fpu (in)
:Returns: 0 on success, -1 on error

Writes the floating point state to the vcpu.

::

  /* for KVM_GET_FPU and KVM_SET_FPU */
  struct kvm_fpu {
        __u8  fpr[8][16];
        __u16 fcw;
        __u16 fsw;
        __u8  ftwx;  /* in fxsave format */
        __u8  pad1;
        __u16 last_opcode;
        __u64 last_ip;
        __u64 last_dp;
        __u8  xmm[16][16];
        __u32 mxcsr;
        __u32 pad2;
  };


4.24 KVM_CREATE_IRQCHIP
-----------------------

:Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
:Architectures: x86, ARM, arm64, s390
:Type: vm ioctl
:Parameters: none
:Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel.
On x86, creates a virtual ioapic, a virtual PIC (two PICs, nested), and sets up
future vcpus to have a local APIC.  IRQ routing for GSIs 0-15 is set to both
PIC and IOAPIC; GSI 16-23 only go to the IOAPIC.
On ARM/arm64, a GICv2 is created. Any other GIC versions require the usage of
KVM_CREATE_DEVICE, which also supports creating a GICv2.  Using
KVM_CREATE_DEVICE is preferred over KVM_CREATE_IRQCHIP for GICv2.
On s390, a dummy irq routing table is created.

Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.


4.25 KVM_IRQ_LINE
-----------------

:Capability: KVM_CAP_IRQCHIP
:Architectures: x86, arm, arm64
:Type: vm ioctl
:Parameters: struct kvm_irq_level
:Returns: 0 on success, -1 on error

Sets the level of a GSI input to the interrupt controller model in the kernel.
On some architectures it is required that an interrupt controller model has
been previously created with KVM_CREATE_IRQCHIP.  Note that edge-triggered
interrupts require the level to be set to 1 and then back to 0.

On real hardware, interrupt pins can be active-low or active-high.  This
does not matter for the level field of struct kvm_irq_level: 1 always
means active (asserted), 0 means inactive (deasserted).

x86 allows the operating system to program the interrupt polarity
(active-low/active-high) for level-triggered interrupts, and KVM used
to consider the polarity.  However, due to bitrot in the handling of
active-low interrupts, the above convention is now valid on x86 too.
This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED.  Userspace
should not present interrupts to the guest as active-low unless this
capability is present (or unless it is not using the in-kernel irqchip,
of course).


ARM/arm64 can signal an interrupt either at the CPU level, or at the
in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
use PPIs designated for specific cpus.  The irq field is interpreted
like this::

  bits:  |  31 ... 28  | 27 ... 24 | 23  ... 16 | 15 ... 0 |
  field: | vcpu2_index | irq_type  | vcpu_index |  irq_id  |

The irq_type field has the following values:

- irq_type[0]:
               out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
- irq_type[1]:
               in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
               (the vcpu_index field is ignored)
- irq_type[2]:
               in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)

(The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)

In both cases, level is used to assert/deassert the line.

When KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported, the target vcpu is
identified as (256 * vcpu2_index + vcpu_index). Otherwise, vcpu2_index
must be zero.

Note that on arm/arm64, the KVM_CAP_IRQCHIP capability only conditions
injection of interrupts for the in-kernel irqchip. KVM_IRQ_LINE can always
be used for a userspace interrupt controller.
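
The ARM/arm64 bit layout of the irq field described above can be packed with
a few shifts. The sketch below is only an illustration: the enum and the
function name arm_irq_encode() are hypothetical and not part of the KVM
headers, and the split of the vcpu number into vcpu2_index and vcpu_index
assumes KVM_CAP_ARM_IRQ_LINE_LAYOUT_2 is supported whenever the vcpu number
is 256 or larger.

```c
#include <assert.h>
#include <stdint.h>

/* irq_type values from the list above; the names are illustrative only. */
enum arm_irq_type {
	ARM_IRQ_TYPE_CPU = 0,	/* out-of-kernel GIC: irq_id 0 = IRQ, 1 = FIQ */
	ARM_IRQ_TYPE_SPI = 1,	/* in-kernel GIC: SPI, irq_id 32..1019 */
	ARM_IRQ_TYPE_PPI = 2,	/* in-kernel GIC: PPI, irq_id 16..31 */
};

/*
 * Pack the 32-bit irq value:
 *   bits 31..28 vcpu2_index, 27..24 irq_type,
 *   bits 23..16 vcpu_index,  15..0  irq_id.
 * The target vcpu is 256 * vcpu2_index + vcpu_index.
 */
static uint32_t arm_irq_encode(enum arm_irq_type type, uint32_t vcpu,
			       uint32_t irq_id)
{
	uint32_t vcpu_index = vcpu % 256;
	uint32_t vcpu2_index = vcpu / 256;

	assert(irq_id <= 0xffff);	/* irq_id is a 16-bit field */
	assert(vcpu2_index <= 0xf);	/* vcpu2_index is a 4-bit field */

	return (vcpu2_index << 28) | ((uint32_t)type << 24) |
	       (vcpu_index << 16) | irq_id;
}
```

For example, an SPI with irq_id 32 encodes to 0x01000020, and a PPI with
irq_id 27 targeting vcpu 3 encodes to 0x0203001b. The result is the value to
store in the irq member of struct kvm_irq_level.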

::

  struct kvm_irq_level {
        union {
                __u32 irq;     /* GSI */
                __s32 status;  /* not used for KVM_IRQ_LINE */
        };
        __u32 level;           /* 0 or 1 */
  };


4.26 KVM_GET_IRQCHIP
--------------------

:Capability: KVM_CAP_IRQCHIP
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_irqchip (in/out)
:Returns: 0 on success, -1 on error

Reads the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP into a buffer provided by the caller.

::

  struct kvm_irqchip {
        __u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
        __u32 pad;
        union {
                char dummy[512];  /* reserving space */
                struct kvm_pic_state pic;
                struct kvm_ioapic_state ioapic;
        } chip;
  };


4.27 KVM_SET_IRQCHIP
--------------------

:Capability: KVM_CAP_IRQCHIP
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_irqchip (in)
:Returns: 0 on success, -1 on error

Sets the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP from a buffer provided by the caller.

::

  struct kvm_irqchip {
        __u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
        __u32 pad;
        union {
                char dummy[512];  /* reserving space */
                struct kvm_pic_state pic;
                struct kvm_ioapic_state ioapic;
        } chip;
  };


4.28 KVM_XEN_HVM_CONFIG
-----------------------

:Capability: KVM_CAP_XEN_HVM
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_xen_hvm_config (in)
:Returns: 0 on success, -1 on error

Sets the MSR that the Xen HVM guest uses to initialize its hypercall
page, and provides the starting address and size of the hypercall
blobs in userspace.  When the guest writes the MSR, kvm copies one
page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
memory.

::

  struct kvm_xen_hvm_config {
        __u32 flags;
        __u32 msr;
        __u64 blob_addr_32;
        __u64 blob_addr_64;
        __u8 blob_size_32;
        __u8 blob_size_64;
        __u8 pad2[30];
  };


4.29 KVM_GET_CLOCK
------------------

:Capability: KVM_CAP_ADJUST_CLOCK
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_clock_data (out)
:Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.

When KVM_CAP_ADJUST_CLOCK is passed to KVM_CHECK_EXTENSION, it returns the
set of bits that KVM can return in struct kvm_clock_data's flags member.

The only flag defined now is KVM_CLOCK_TSC_STABLE.  If set, the returned
value is the exact kvmclock value seen by all VCPUs at the instant
when KVM_GET_CLOCK was called.  If clear, the returned value is simply
CLOCK_MONOTONIC plus a constant offset; the offset can be modified
with KVM_SET_CLOCK.  KVM will try to make all VCPUs follow this clock,
but the exact value read by each VCPU could differ, because the host
TSC is not stable.

::

  struct kvm_clock_data {
        __u64 clock;  /* kvmclock current value */
        __u32 flags;
        __u32 pad[9];
  };


4.30 KVM_SET_CLOCK
------------------

:Capability: KVM_CAP_ADJUST_CLOCK
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_clock_data (in)
:Returns: 0 on success, -1 on error

Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.

::

  struct kvm_clock_data {
        __u64 clock;  /* kvmclock current value */
        __u32 flags;
        __u32 pad[9];
  };


4.31 KVM_GET_VCPU_EVENTS
------------------------

:Capability: KVM_CAP_VCPU_EVENTS
:Extended by: KVM_CAP_INTR_SHADOW
:Architectures: x86, arm, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_vcpu_events (out)
:Returns: 0 on success, -1 on error

X86:
^^^^

Gets currently pending exceptions, interrupts, and NMIs as well as related
states of the vcpu.

::

  struct kvm_vcpu_events {
        struct {
                __u8 injected;
                __u8 nr;
                __u8 has_error_code;
                __u8 pending;
                __u32 error_code;
        } exception;
        struct {
                __u8 injected;
                __u8 nr;
                __u8 soft;
                __u8 shadow;
        } interrupt;
        struct {
                __u8 injected;
                __u8 pending;
                __u8 masked;
                __u8 pad;
        } nmi;
        __u32 sipi_vector;
        __u32 flags;
        struct {
                __u8 smm;
                __u8 pending;
                __u8 smm_inside_nmi;
                __u8 latched_init;
        } smi;
        __u8 reserved[27];
        __u8 exception_has_payload;
        __u64 exception_payload;
  };

The following bits are defined in the flags field:

- KVM_VCPUEVENT_VALID_SHADOW may be set to signal that
  interrupt.shadow contains a valid state.

- KVM_VCPUEVENT_VALID_SMM may be set to signal that smi contains a
  valid state.

- KVM_VCPUEVENT_VALID_PAYLOAD may be set to signal that the
  exception_has_payload, exception_payload, and exception.pending
  fields contain a valid state. This bit will be set whenever
  KVM_CAP_EXCEPTION_PAYLOAD is enabled.

ARM/ARM64:
^^^^^^^^^^

If the guest accesses a device that is being emulated by the host kernel in
such a way that a real device would generate a physical SError, KVM may make
a virtual SError pending for that VCPU. This system error interrupt remains
pending until the guest takes the exception by unmasking PSTATE.A.

Running the VCPU may cause it to take a pending SError, or make an access that
causes an SError to become pending. The event's description is only valid while
the VCPU is not running.

This API provides a way to read and write the pending 'event' state that is not
visible to the guest. To save, restore or migrate a VCPU the struct representing
the state can be read then written using this GET/SET API, along with the other
guest-visible registers. It is not possible to 'cancel' an SError that has been
made pending.

A device being emulated in user-space may also wish to generate an SError. To do
this the events structure can be populated by user-space. The current state
should be read first, to ensure no existing SError is pending. If an existing
SError is pending, the architecture's 'Multiple SError interrupts' rules should
be followed. (2.5.3 of DDI0587.a "ARM Reliability, Availability, and
Serviceability (RAS) Specification").

SError exceptions always have an ESR value.
Some CPUs have the ability to
specify what the virtual SError's ESR value should be. These systems will
advertise KVM_CAP_ARM_INJECT_SERROR_ESR. In this case exception.serror_has_esr
will always have a non-zero value when read, and the agent making an SError
pending should specify the ISS field in the lower 24 bits of
exception.serror_esr. If the system supports KVM_CAP_ARM_INJECT_SERROR_ESR, but
user-space sets the events with exception.serror_has_esr as zero, KVM will
choose an ESR.

Specifying exception.serror_has_esr on a system that does not support it will
return -EINVAL. Setting anything other than the lower 24 bits of
exception.serror_esr will return -EINVAL.

It is not possible to read back a pending external abort (injected via
KVM_SET_VCPU_EVENTS or otherwise) because such an exception is always delivered
directly to the virtual CPU.

::

  struct kvm_vcpu_events {
        struct {
                __u8 serror_pending;
                __u8 serror_has_esr;
                __u8 ext_dabt_pending;
                /* Align it to 8 bytes */
                __u8 pad[5];
                __u64 serror_esr;
        } exception;
        __u32 reserved[12];
  };

4.32 KVM_SET_VCPU_EVENTS
------------------------

:Capability: KVM_CAP_VCPU_EVENTS
:Extended by: KVM_CAP_INTR_SHADOW
:Architectures: x86, arm, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_vcpu_events (in)
:Returns: 0 on success, -1 on error

X86:
^^^^

Set pending exceptions, interrupts, and NMIs as well as related states of the
vcpu.

See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending, sipi_vector, smi.smm,
smi.pending. Keep the corresponding bits in the flags field cleared to
suppress overwriting the current in-kernel state.
The bits are:

=============================== ==================================
KVM_VCPUEVENT_VALID_NMI_PENDING transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR transfer sipi_vector
KVM_VCPUEVENT_VALID_SMM         transfer the smi sub-struct.
=============================== ==================================

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.

KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.

If KVM_CAP_EXCEPTION_PAYLOAD is enabled, KVM_VCPUEVENT_VALID_PAYLOAD
can be set in the flags field to signal that the
exception_has_payload, exception_payload, and exception.pending fields
contain a valid state and shall be written into the VCPU.

ARM/ARM64:
^^^^^^^^^^

User space may need to inject several types of events to the guest.

Set the pending SError exception state for this VCPU. It is not possible to
'cancel' an SError that has been made pending.

If the guest performed an access to I/O memory which could not be handled by
userspace, for example because of missing instruction syndrome decode
information or because there is no device mapped at the accessed IPA, then
userspace can ask the kernel to inject an external abort using the address
from the exiting fault on the VCPU. It is a programming error to set
ext_dabt_pending after an exit that was neither KVM_EXIT_MMIO nor
KVM_EXIT_ARM_NISV. This feature is only available if the system supports
KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
how userspace reports accesses for the above cases to guests, across different
userspace implementations. Nevertheless, userspace can still emulate all Arm
exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.

See KVM_GET_VCPU_EVENTS for the data structure.


4.33 KVM_GET_DEBUGREGS
----------------------

:Capability: KVM_CAP_DEBUGREGS
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_debugregs (out)
:Returns: 0 on success, -1 on error

Reads debug registers from the vcpu.

::

  struct kvm_debugregs {
        __u64 db[4];
        __u64 dr6;
        __u64 dr7;
        __u64 flags;
        __u64 reserved[9];
  };


4.34 KVM_SET_DEBUGREGS
----------------------

:Capability: KVM_CAP_DEBUGREGS
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_debugregs (in)
:Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not used
yet and must be cleared on entry.


4.35 KVM_SET_USER_MEMORY_REGION
-------------------------------

:Capability: KVM_CAP_USER_MEMORY
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_userspace_memory_region (in)
:Returns: 0 on success, -1 on error

::

  struct kvm_userspace_memory_region {
        __u32 slot;
        __u32 flags;
        __u64 guest_phys_addr;
        __u64 memory_size; /* bytes */
        __u64 userspace_addr; /* start of the userspace allocated memory */
  };

  /* for kvm_memory_region::flags */
  #define KVM_MEM_LOG_DIRTY_PAGES       (1UL << 0)
  #define KVM_MEM_READONLY      (1UL << 1)

This ioctl allows the user to create, modify or delete a guest physical
memory slot.  Bits 0-15 of "slot" specify the slot id and this value
should be less than the maximum number of user memory slots supported per
VM.  The maximum allowed slots can be queried using KVM_CAP_NR_MEMSLOTS.
Slots may not overlap in guest physical address space.

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of "slot"
specify the address space which is being modified.  They must be
less than the value that KVM_CHECK_EXTENSION returns for the
KVM_CAP_MULTI_ADDRESS_SPACE capability.  Slots in separate address spaces
are unrelated; the restriction on overlapping slots only applies within
each address space.

Deleting a slot is done by passing zero for memory_size.  When changing
an existing slot, it may be moved in the guest physical memory space,
or its flags may be modified, but it may not be resized.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size.  Any object may back this memory, including
anonymous memory, ordinary files, and hugetlbfs.

On architectures that support a form of address tagging, userspace_addr must
be an untagged address.

It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical.  This allows large pages in the guest to be backed by large
pages in the host.

The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
KVM_MEM_READONLY.  The former can be set to instruct KVM to keep track of
writes to memory within the slot.  See the KVM_GET_DIRTY_LOG ioctl for how to
use it.  The latter can be set, if the KVM_CAP_READONLY_MEM capability allows
it, to make a new slot read-only.  In this case, writes to this memory will be
posted to userspace as KVM_EXIT_MMIO exits.

When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest.  For example, an
mmap() that affects the region will be made visible immediately.  Another
example is madvise(MADV_DROP).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION ioctl does not allow fine-grained control over memory
allocation and is deprecated.
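
The "slot" encoding and the 21-bit alignment recommendation can be checked
with plain integer arithmetic before the ioctl is issued. The sketch below is
illustrative only: memslot_encode() and memslot_large_page_friendly() are
hypothetical helper names, not part of the KVM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the "slot" value: bits 0-15 hold the slot id and bits 16-31 the
 * address space id (only meaningful with KVM_CAP_MULTI_ADDRESS_SPACE).
 */
static uint32_t memslot_encode(uint32_t as_id, uint32_t slot_id)
{
	assert(as_id <= 0xffff);
	assert(slot_id <= 0xffff);

	return (as_id << 16) | slot_id;
}

/*
 * Return true if the lower 21 bits (the offset within a 2 MiB page) of
 * guest_phys_addr and userspace_addr are identical.
 */
static bool memslot_large_page_friendly(uint64_t guest_phys_addr,
					uint64_t userspace_addr)
{
	const uint64_t mask = (UINT64_C(1) << 21) - 1;

	return ((guest_phys_addr ^ userspace_addr) & mask) == 0;
}
```

A caller would store the result of memslot_encode() in the slot member of
struct kvm_userspace_memory_region, and could use the second helper as a
sanity check on a mapping it is about to register.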


4.36 KVM_SET_TSS_ADDR
---------------------

:Capability: KVM_CAP_SET_TSS_ADDR
:Architectures: x86
:Type: vm ioctl
:Parameters: unsigned long tss_address (in)
:Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).


4.37 KVM_ENABLE_CAP
-------------------

:Capability: KVM_CAP_ENABLE_CAP
:Architectures: mips, ppc, s390
:Type: vcpu ioctl
:Parameters: struct kvm_enable_cap (in)
:Returns: 0 on success; -1 on error

:Capability: KVM_CAP_ENABLE_CAP_VM
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_enable_cap (in)
:Returns: 0 on success; -1 on error

.. note::

   Not all extensions are enabled by default.
   Using this ioctl the application
   can enable an extension, making it available to the guest.

On systems that do not support this ioctl, it always fails. On systems that
do support it, it only works for extensions that are supported for enablement.

To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
be used.

::

  struct kvm_enable_cap {
         /* in */
         __u32 cap;

The capability that is supposed to get enabled.

::

         __u32 flags;

A bitfield indicating future enhancements. Has to be 0 for now.

::

         __u64 args[4];

Arguments for enabling a feature. If a feature needs initial values to
function properly, this is the place to put them.

::

         __u8  pad[64];
  };

The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.

4.38 KVM_GET_MP_STATE
---------------------

:Capability: KVM_CAP_MP_STATE
:Architectures: x86, s390, arm, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_mp_state (out)
:Returns: 0 on success; -1 on error

::

  struct kvm_mp_state {
        __u32 mp_state;
  };

Returns the vcpu's current "multiprocessing state" (though also valid on
uniprocessor guests).

Possible values are:

   ========================== ===============================================
   KVM_MP_STATE_RUNNABLE      the vcpu is currently running [x86,arm/arm64]
   KVM_MP_STATE_UNINITIALIZED the vcpu is an application processor (AP)
                              which has not yet received an INIT signal [x86]
   KVM_MP_STATE_INIT_RECEIVED the vcpu has received an INIT signal, and is
                              now ready for a SIPI [x86]
   KVM_MP_STATE_HALTED        the vcpu has executed a HLT instruction and
                              is waiting for an interrupt [x86]
   KVM_MP_STATE_SIPI_RECEIVED the vcpu has just received a SIPI (vector
                              accessible via KVM_GET_VCPU_EVENTS) [x86]
   KVM_MP_STATE_STOPPED       the vcpu is stopped [s390,arm/arm64]
   KVM_MP_STATE_CHECK_STOP    the vcpu is in a special error state [s390]
   KVM_MP_STATE_OPERATING     the vcpu is operating (running or halted)
                              [s390]
   KVM_MP_STATE_LOAD          the vcpu is in a special
                              load/startup state
                              [s390]
   ========================== ===============================================

On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
this architecture.

For arm/arm64:
^^^^^^^^^^^^^^

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE, which reflect whether the vcpu is paused or not.

4.39 KVM_SET_MP_STATE
---------------------

:Capability: KVM_CAP_MP_STATE
:Architectures: x86, s390, arm, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_mp_state (in)
:Returns: 0 on success; -1 on error

Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
arguments.

On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace on
this architecture.

For arm/arm64:
^^^^^^^^^^^^^^

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE, which reflect whether the vcpu should be paused or not.

4.40 KVM_SET_IDENTITY_MAP_ADDR
------------------------------

:Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
:Architectures: x86
:Type: vm ioctl
:Parameters: unsigned long identity (in)
:Returns: 0 on success, -1 on error

This ioctl defines the physical address of a one-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

Setting the address to 0 will result in resetting the address to its default
(0xfffbc000).

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).

Fails if any VCPU has already been created.

4.41 KVM_SET_BOOT_CPU_ID
------------------------

:Capability: KVM_CAP_SET_BOOT_CPU_ID
:Architectures: x86
:Type: vm ioctl
:Parameters: unsigned long vcpu_id
:Returns: 0 on success, -1 on error

Define which vcpu is the Bootstrap Processor (BSP). Values are the same
as the vcpu id in KVM_CREATE_VCPU.
If this ioctl is not called, the default
is vcpu 0.


4.42 KVM_GET_XSAVE
------------------

:Capability: KVM_CAP_XSAVE
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_xsave (out)
:Returns: 0 on success, -1 on error


::

  struct kvm_xsave {
        __u32 region[1024];
  };

This ioctl copies the current vcpu's xsave struct to userspace.


4.43 KVM_SET_XSAVE
------------------

:Capability: KVM_CAP_XSAVE
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_xsave (in)
:Returns: 0 on success, -1 on error

::


  struct kvm_xsave {
        __u32 region[1024];
  };

This ioctl copies userspace's xsave struct to the kernel.


4.44 KVM_GET_XCRS
-----------------

:Capability: KVM_CAP_XCRS
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_xcrs (out)
:Returns: 0 on success, -1 on error

::

  struct kvm_xcr {
        __u32 xcr;
        __u32 reserved;
        __u64 value;
  };

  struct kvm_xcrs {
        __u32 nr_xcrs;
        __u32 flags;
        struct kvm_xcr xcrs[KVM_MAX_XCRS];
        __u64 padding[16];
  };

This ioctl copies the current vcpu's xcrs to userspace.


4.45 KVM_SET_XCRS
-----------------

:Capability: KVM_CAP_XCRS
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_xcrs (in)
:Returns: 0 on success, -1 on error

::

  struct kvm_xcr {
        __u32 xcr;
        __u32 reserved;
        __u64 value;
  };

  struct kvm_xcrs {
        __u32 nr_xcrs;
        __u32 flags;
        struct kvm_xcr xcrs[KVM_MAX_XCRS];
        __u64 padding[16];
  };

This ioctl sets the vcpu's xcrs to the values specified by userspace.


4.46 KVM_GET_SUPPORTED_CPUID
----------------------------

:Capability: KVM_CAP_EXT_CPUID
:Architectures: x86
:Type: system ioctl
:Parameters: struct kvm_cpuid2 (in/out)
:Returns: 0 on success, -1 on error

::

  struct kvm_cpuid2 {
        __u32 nent;
        __u32 padding;
        struct kvm_cpuid_entry2 entries[0];
  };

  #define KVM_CPUID_FLAG_SIGNIFCANT_INDEX       BIT(0)
  #define KVM_CPUID_FLAG_STATEFUL_FUNC          BIT(1) /* deprecated */
  #define KVM_CPUID_FLAG_STATE_READ_NEXT        BIT(2) /* deprecated */

  struct kvm_cpuid_entry2 {
        __u32 function;
        __u32 index;
        __u32 flags;
        __u32 eax;
        __u32 ebx;
        __u32 ecx;
        __u32 edx;
        __u32 padding[3];
  };

This ioctl returns x86 cpuid features which are supported by both the
hardware and kvm in its default configuration. Userspace can use the
information returned by this ioctl to construct cpuid information (for
KVM_SET_CPUID2) that is consistent with hardware, kernel, and
userspace capabilities, and with user requirements (for example, the
user may wish to constrain cpuid to emulate older hardware, or for
feature consistency across a cluster).

Note that certain capabilities, such as KVM_CAP_X86_DISABLE_EXITS, may
expose cpuid features (e.g. MONITOR) which are not supported by kvm in
its default configuration. If userspace enables such capabilities, it
is responsible for modifying the results of this ioctl appropriately.

Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
with the 'nent' field indicating the number of entries in the variable-size
array 'entries'. If the number of entries is too low to describe the cpu
capabilities, an error (E2BIG) is returned. If the number is too high,
the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
number is just right, the 'nent' field is adjusted to the number of valid
entries in the 'entries' array, which is then filled.

The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out. Some features (for example,
x2apic), may not be present in the host cpu, but are exposed by kvm if it can
emulate them efficiently.
The fields in each entry are defined as follows:

  function:
         the eax value used to obtain the entry

  index:
         the ecx value used to obtain the entry (for entries that are
         affected by ecx)

  flags:
         an OR of zero or more of the following:

         KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
                if the index field is valid

  eax, ebx, ecx, edx:
         the values returned by the cpuid instruction for
         this function/index combination

The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
support. Instead it is reported via::

  ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)

if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
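
Enabling the TSC deadline timer bit before calling KVM_SET_CPUID2 could look
roughly like the sketch below. It is illustrative only: struct cpuid_entry2 is
a local mirror of the struct kvm_cpuid_entry2 layout shown for this ioctl (so
that the example compiles without the kernel headers), and
enable_tsc_deadline is a hypothetical helper name, not a KVM interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct kvm_cpuid_entry2; real code should use the
 * definition from <linux/kvm.h> instead. */
struct cpuid_entry2 {
	uint32_t function;
	uint32_t index;
	uint32_t flags;
	uint32_t eax, ebx, ecx, edx;
	uint32_t padding[3];
};

#define TSC_DEADLINE_BIT (UINT32_C(1) << 24)	/* CPUID leaf 1, ecx[24] */

/* Hypothetical helper: set ecx[24] in the leaf 1 entry.
 * Returns 0 on success, -1 if the array holds no leaf 1 entry. */
static int enable_tsc_deadline(struct cpuid_entry2 *entries, size_t nent)
{
	for (size_t i = 0; i < nent; i++) {
		if (entries[i].function == 1) {
			entries[i].ecx |= TSC_DEADLINE_BIT;
			return 0;
		}
	}
	return -1;
}
```

The helper is meant to be called only after the KVM_CAP_TSC_DEADLINE_TIMER
check succeeds and an in-kernel irqchip (or a userspace emulation) is in place.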


4.47 KVM_PPC_GET_PVINFO
-----------------------

:Capability: KVM_CAP_PPC_GET_PVINFO
:Architectures: ppc
:Type: vm ioctl
:Parameters: struct kvm_ppc_pvinfo (out)
:Returns: 0 on success, !0 on error

::

  struct kvm_ppc_pvinfo {
        __u32 flags;
        __u32 hcall[4];
        __u8  pad[108];
  };

This ioctl fetches PV specific information that needs to be passed to the guest
using the device tree or other means from vm context.

The hcall array defines 4 instructions that make up a hypercall.

If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.

The flags bitmap is defined as::

   /* the host supports the ePAPR idle hcall */
   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)

4.52 KVM_SET_GSI_ROUTING
------------------------

:Capability: KVM_CAP_IRQ_ROUTING
:Architectures: x86, s390, arm, arm64
:Type: vm ioctl
:Parameters: struct kvm_irq_routing (in)
:Returns: 0 on success, -1 on error

Sets the GSI routing table entries, overwriting any previously set entries.

On arm/arm64, GSI routing has the following limitation:

- GSI routing does not apply to KVM_IRQ_LINE but only to KVM_IRQFD.

::

  struct kvm_irq_routing {
        __u32 nr;
        __u32 flags;
        struct kvm_irq_routing_entry entries[0];
  };

No flags are specified so far; the corresponding field must be set to zero.

::

  struct kvm_irq_routing_entry {
        __u32 gsi;
        __u32 type;
        __u32 flags;
        __u32 pad;
        union {
                struct kvm_irq_routing_irqchip irqchip;
                struct kvm_irq_routing_msi msi;
                struct kvm_irq_routing_s390_adapter adapter;
                struct kvm_irq_routing_hv_sint hv_sint;
                __u32 pad[8];
        } u;
  };

  /* gsi routing entry types */
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
  #define KVM_IRQ_ROUTING_HV_SINT 4

flags:

- KVM_MSI_VALID_DEVID: used along with KVM_IRQ_ROUTING_MSI routing entry
  type, specifies that the devid field contains a valid value. The per-VM
  KVM_CAP_MSI_DEVID capability advertises the requirement to provide
  the device ID.
  If this capability is not available, userspace should
  never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.
- zero otherwise

::

  struct kvm_irq_routing_irqchip {
        __u32 irqchip;
        __u32 pin;
  };

  struct kvm_irq_routing_msi {
        __u32 address_lo;
        __u32 address_hi;
        __u32 data;
        union {
                __u32 pad;
                __u32 devid;
        };
  };

If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
for the device that wrote the MSI message. For PCI, this is usually a
BDF (bus/device/function) identifier in the lower 16 bits.

On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled. If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id. Bits 7-0 of
address_hi must be zero.
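
The address_hi rule for x86 can be expressed as a one-line mask. The helper
below is a hypothetical sketch (msi_address_hi is not a KVM function); it only
covers the address_hi field and assumes KVM_X2APIC_API_USE_32BIT_IDS has been
enabled. How the low 8 bits of the destination id are conveyed is outside the
scope of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: value for address_hi when
 * KVM_X2APIC_API_USE_32BIT_IDS is enabled.  Bits 31-8 carry bits 31-8 of
 * the destination id; bits 7-0 must be zero. */
static uint32_t msi_address_hi(uint32_t dest_id)
{
	return dest_id & UINT32_C(0xffffff00);
}
```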

::

  struct kvm_irq_routing_s390_adapter {
        __u64 ind_addr;
        __u64 summary_addr;
        __u64 ind_offset;
        __u32 summary_offset;
        __u32 adapter_id;
  };

  struct kvm_irq_routing_hv_sint {
        __u32 vcpu;
        __u32 sint;
  };


4.55 KVM_SET_TSC_KHZ
--------------------

:Capability: KVM_CAP_TSC_CONTROL
:Architectures: x86
:Type: vcpu ioctl
:Parameters: virtual tsc_khz
:Returns: 0 on success, -1 on error

Specifies the tsc frequency for the virtual machine. The unit of the
frequency is kHz.


4.56 KVM_GET_TSC_KHZ
--------------------

:Capability: KVM_CAP_GET_TSC_KHZ
:Architectures: x86
:Type: vcpu ioctl
:Parameters: none
:Returns: virtual tsc-khz on success, negative value on error

Returns the tsc frequency of the guest. The unit of the return value is
kHz. If the host has an unstable tsc, this ioctl returns -EIO instead as an
error.


4.57 KVM_GET_LAPIC
------------------

:Capability: KVM_CAP_IRQCHIP
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_lapic_state (out)
:Returns: 0 on success, -1 on error

::

  #define KVM_APIC_REG_SIZE 0x400
  struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
  };

Reads the Local APIC registers and copies them into the input argument. The
data format and layout are the same as documented in the architecture manual.

If the KVM_X2APIC_API_USE_32BIT_IDS feature of KVM_CAP_X2APIC_API is
enabled, then the format of the APIC_ID register depends on the APIC mode
(reported by MSR_IA32_APICBASE) of its VCPU. x2APIC stores the APIC ID in
the APIC_ID register (bytes 32-35). xAPIC only allows an 8-bit APIC ID
which is stored in bits 31-24 of the APIC_ID register, or equivalently in
byte 35 of struct kvm_lapic_state's regs field. KVM_GET_LAPIC must then
be called after MSR_IA32_APICBASE has been set with KVM_SET_MSRS.

If the KVM_X2APIC_API_USE_32BIT_IDS feature is disabled, struct kvm_lapic_state
always uses xAPIC format.
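
The two APIC_ID layouts can be decoded as sketched below. This is a minimal
illustration, not KVM code: lapic_id is a hypothetical helper, regs stands for
the regs field of struct kvm_lapic_state, and the caller is assumed to know
(from MSR_IA32_APICBASE and the enabled KVM_CAP_X2APIC_API features) which
format applies.

```c
#include <stdint.h>

#define APIC_ID_OFFSET 32	/* bytes 32-35 of regs hold the APIC_ID register */

/* Hypothetical helper: extract the APIC ID from the regs field.
 * x2apic_format is nonzero when the vcpu is in x2APIC mode and
 * KVM_X2APIC_API_USE_32BIT_IDS is enabled. */
static uint32_t lapic_id(const unsigned char *regs, int x2apic_format)
{
	if (x2apic_format) {
		/* full 32-bit id, least significant byte first (x86) */
		return (uint32_t)regs[APIC_ID_OFFSET] |
		       (uint32_t)regs[APIC_ID_OFFSET + 1] << 8 |
		       (uint32_t)regs[APIC_ID_OFFSET + 2] << 16 |
		       (uint32_t)regs[APIC_ID_OFFSET + 3] << 24;
	}
	/* xAPIC: 8-bit id in bits 31-24 of the register, i.e. byte 35 */
	return regs[APIC_ID_OFFSET + 3];
}
```

Reading the bytes individually keeps the sketch independent of the byte order
of the machine that runs it.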


4.58 KVM_SET_LAPIC
------------------

:Capability: KVM_CAP_IRQCHIP
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_lapic_state (in)
:Returns: 0 on success, -1 on error

::

  #define KVM_APIC_REG_SIZE 0x400
  struct kvm_lapic_state {
        char regs[KVM_APIC_REG_SIZE];
  };

Copies the input argument into the Local APIC registers. The data format
and layout are the same as documented in the architecture manual.

The format of the APIC ID register (bytes 32-35 of struct kvm_lapic_state's
regs field) depends on the state of the KVM_CAP_X2APIC_API capability.
See the note in KVM_GET_LAPIC.


4.59 KVM_IOEVENTFD
------------------

:Capability: KVM_CAP_IOEVENTFD
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_ioeventfd (in)
:Returns: 0 on success, !0 on error

This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
within the guest. A guest write in the registered address will signal the
provided event instead of triggering an exit.

::

  struct kvm_ioeventfd {
        __u64 datamatch;
        __u64 addr;        /* legal pio/mmio address */
        __u32 len;         /* 0, 1, 2, 4, or 8 bytes    */
        __s32 fd;
        __u32 flags;
        __u8  pad[36];
  };

For the special case of virtio-ccw devices on s390, the ioevent is matched
to a subchannel/virtqueue tuple instead.

The following flags are defined::

  #define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
  #define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
  #define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
  #define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
        (1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)

If the datamatch flag is set, the event will be signaled only if the value
written to the registered address is equal to datamatch in struct
kvm_ioeventfd.

For virtio-ccw devices, addr contains the subchannel id and datamatch the
virtqueue index.

With KVM_CAP_IOEVENTFD_ANY_LENGTH, a zero length ioeventfd is allowed, and
the kernel will ignore the length of the guest write and may get a faster
vmexit. The speedup may only apply to specific architectures, but the
ioeventfd will work anyway.

4.60 KVM_DIRTY_TLB
------------------

:Capability: KVM_CAP_SW_TLB
:Architectures: ppc
:Type: vcpu ioctl
:Parameters: struct kvm_dirty_tlb (in)
:Returns: 0 on success, -1 on error

::

  struct kvm_dirty_tlb {
        __u64 bitmap;
        __u32 num_dirty;
  };

This must be called whenever userspace has changed an entry in the shared
TLB, prior to calling KVM_RUN on the associated vcpu.

The "bitmap" field is the userspace address of an array. This array
consists of a number of bits, equal to the total number of TLB entries as
determined by the last successful call to KVM_CONFIG_TLB, rounded up to the
nearest multiple of 64.

Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
array.

The array is little-endian: bit 0 is the least significant bit of the
first byte, bit 8 is the least significant bit of the second byte, etc.
This avoids any complications with differing word sizes.

The "num_dirty" field is a performance hint for KVM to determine whether it
should skip processing the bitmap and just invalidate everything. It must
be set to the number of set bits in the bitmap.
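
The bitmap layout and the num_dirty rule can be illustrated with the helpers
below. They are a sketch under stated assumptions: the function names are
hypothetical, and the bitmap is handled as a plain byte array in userspace
whose address would then be stored in the "bitmap" field.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a bitmap covering n_entries TLB entries, with the bit
 * count rounded up to the nearest multiple of 64. */
static size_t tlb_bitmap_bytes(size_t n_entries)
{
	return (n_entries + 63) / 64 * 8;
}

/* Mark one TLB entry dirty: bit 0 is the least significant bit of byte 0,
 * bit 8 the least significant bit of byte 1, and so on. */
static void tlb_mark_dirty(uint8_t *bitmap, size_t entry)
{
	bitmap[entry / 8] |= (uint8_t)(1u << (entry % 8));
}

/* Count the set bits; this is the value expected in num_dirty. */
static uint32_t tlb_count_dirty(const uint8_t *bitmap, size_t nbytes)
{
	uint32_t count = 0;

	for (size_t i = 0; i < nbytes; i++)
		for (uint8_t b = bitmap[i]; b; b &= (uint8_t)(b - 1))
			count++;
	return count;
}
```

Addressing the bitmap byte by byte matches the little-endian description and
gives the same result on hosts of either byte order.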


4.62 KVM_CREATE_SPAPR_TCE
-------------------------

:Capability: KVM_CAP_SPAPR_TCE
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_create_spapr_tce (in)
:Returns: file descriptor for manipulating the created TCE table

This creates a virtual TCE (translation control entry) table, which
is an IOMMU for PAPR-style virtual I/O. It is used to translate
logical addresses used in virtual I/O into guest physical addresses,
and provides a scatter/gather capability for PAPR virtual I/O.

::

  /* for KVM_CAP_SPAPR_TCE */
  struct kvm_create_spapr_tce {
        __u64 liobn;
        __u32 window_size;
  };

The liobn field gives the logical IO bus number for which to create a
TCE table. The window_size field specifies the size of the DMA window
which this TCE table will translate - the table will contain one 64-bit
TCE entry for every 4 KiB of the DMA window.

When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
table has been created using this ioctl(), the kernel will handle it
in real mode, updating the TCE table. H_PUT_TCE calls for other
liobns will cause a vm exit and must be handled by userspace.
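
As a worked example of the sizing rule (one 64-bit entry for every 4 KiB of
DMA window), the amount of table data for a given window_size can be computed
as follows. The helper name tce_table_bytes is hypothetical, and the sketch
only counts the entries themselves; any rounding applied by the kernel when
the table is mapped is not modelled here.

```c
#include <stdint.h>

/* Size in bytes of the TCE entries for a DMA window: one 64-bit (8-byte)
 * entry for every 4 KiB of window. */
static uint64_t tce_table_bytes(uint32_t window_size)
{
	return (uint64_t)(window_size / 4096) * 8;
}
```

For a 256 MiB window this gives 65536 entries, or 512 KiB of table data.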

The return value is a file descriptor which can be passed to mmap(2)
to map the created TCE table into userspace.  This lets userspace read
the entries written by kernel-handled H_PUT_TCE calls, and also lets
userspace update the TCE table directly which is useful in some
circumstances.


4.63 KVM_ALLOCATE_RMA
---------------------

:Capability: KVM_CAP_PPC_RMA
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_allocate_rma (out)
:Returns: file descriptor for mapping the allocated RMA

This allocates a Real Mode Area (RMA) from the pool allocated at boot
time by the kernel.  An RMA is a physically-contiguous, aligned region
of memory used on older POWER processors to provide the memory which
will be accessed by real-mode (MMU off) accesses in a KVM guest.
POWER processors support a set of sizes for the RMA that usually
includes 64MB, 128MB, 256MB and some larger powers of two.

::

  /* for KVM_ALLOCATE_RMA */
  struct kvm_allocate_rma {
        __u64 rma_size;
  };

The return value is a file descriptor which can be passed to mmap(2)
to map the allocated RMA into userspace.  The mapped area can then be
passed to the KVM_SET_USER_MEMORY_REGION ioctl to establish it as the
RMA for a virtual machine.
The size of the RMA in bytes (which is
fixed at host kernel boot time) is returned in the rma_size field of
the argument structure.

The KVM_CAP_PPC_RMA capability is 1 or 2 if the KVM_ALLOCATE_RMA ioctl
is supported; 2 if the processor requires all virtual machines to have
an RMA, or 1 if the processor can use an RMA but doesn't require it,
because it supports the Virtual RMA (VRMA) facility.


4.64 KVM_NMI
------------

:Capability: KVM_CAP_USER_NMI
:Architectures: x86
:Type: vcpu ioctl
:Parameters: none
:Returns: 0 on success, -1 on error

Queues an NMI on the thread's vcpu.  Note this is well defined only
when KVM_CREATE_IRQCHIP has not been called, since this is an interface
between the virtual cpu core and virtual local APIC.  After KVM_CREATE_IRQCHIP
has been called, this interface is completely emulated within the kernel.

To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the
following algorithm:

  - pause the vcpu
  - read the local APIC's state (KVM_GET_LAPIC)
  - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
  - if so, issue KVM_NMI
  - resume the vcpu

Some guests configure the LINT1 NMI input to cause a panic, aiding in
debugging.
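
The "check whether changing LINT1 will queue an NMI" step of the algorithm
above could look roughly like this.  The sketch assumes that the register
buffer returned by KVM_GET_LAPIC mirrors the architectural APIC register
page, with the LVT LINT1 register at offset 0x360, the delivery mode in bits
10:8 (100b meaning NMI) and the mask in bit 16; these constants come from the
x86 APIC architecture rather than from this document and should be checked
against the processor manual.  The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Assumed architectural local APIC layout (not defined in this document). */
#define APIC_LVT_LINT1_OFFSET 0x360u
#define APIC_LVT_MASKED       (1u << 16)
#define APIC_LVT_MODE_MASK    (7u << 8)
#define APIC_LVT_MODE_NMI     (4u << 8)

/* Returns 1 if the LINT1 LVT entry is unmasked and set to NMI delivery.
 * apic_regs points at the raw register bytes read with KVM_GET_LAPIC. */
static int lint1_delivers_nmi(const uint8_t *apic_regs)
{
    uint32_t lvt;

    memcpy(&lvt, apic_regs + APIC_LVT_LINT1_OFFSET, sizeof(lvt));
    return !(lvt & APIC_LVT_MASKED) &&
           (lvt & APIC_LVT_MODE_MASK) == APIC_LVT_MODE_NMI;
}
```

Userspace would call this between pausing and resuming the vcpu and issue
KVM_NMI only when it returns 1.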


4.65 KVM_S390_UCAS_MAP
----------------------

:Capability: KVM_CAP_S390_UCONTROL
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_ucas_mapping (in)
:Returns: 0 in case of success

The parameter is defined like this::

        struct kvm_s390_ucas_mapping {
                __u64 user_addr;
                __u64 vcpu_addr;
                __u64 length;
        };

This ioctl maps the memory at "user_addr" with the length "length" to
the vcpu's address space starting at "vcpu_addr". All parameters need to
be aligned by 1 megabyte.


4.66 KVM_S390_UCAS_UNMAP
------------------------

:Capability: KVM_CAP_S390_UCONTROL
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_ucas_mapping (in)
:Returns: 0 in case of success

The parameter is defined like this::

        struct kvm_s390_ucas_mapping {
                __u64 user_addr;
                __u64 vcpu_addr;
                __u64 length;
        };

This ioctl unmaps the memory in the vcpu's address space starting at
"vcpu_addr" with the length "length". The field "user_addr" is ignored.
All parameters need to be aligned by 1 megabyte.
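
Because both ioctls require 1 megabyte alignment, userspace may want to
validate the arguments before calling them.  A minimal sketch follows; the
helper name is hypothetical, and "1 megabyte" is taken to mean 2^20 bytes.

```c
#include <stdint.h>

#define UCAS_ALIGN (1ull << 20) /* assumed: 1 megabyte = 2^20 bytes */

/* Returns 1 if all three kvm_s390_ucas_mapping fields are 1 MB aligned. */
static int ucas_args_aligned(uint64_t user_addr, uint64_t vcpu_addr,
                             uint64_t length)
{
    return ((user_addr | vcpu_addr | length) & (UCAS_ALIGN - 1)) == 0;
}
```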


4.67 KVM_S390_VCPU_FAULT
------------------------

:Capability: KVM_CAP_S390_UCONTROL
:Architectures: s390
:Type: vcpu ioctl
:Parameters: vcpu absolute address (in)
:Returns: 0 in case of success

This call creates a page table entry on the virtual cpu's address space
(for user controlled virtual machines) or the virtual machine's address
space (for regular virtual machines). This only works for minor faults,
so it is recommended to access the memory page in question via the user
page table beforehand. This is useful to handle validity intercepts for user
controlled virtual machines to fault in the virtual cpu's lowcore pages
prior to calling the KVM_RUN ioctl.


4.68 KVM_SET_ONE_REG
--------------------

:Capability: KVM_CAP_ONE_REG
:Architectures: all
:Type: vcpu ioctl
:Parameters: struct kvm_one_reg (in)
:Returns: 0 on success, negative value on failure

Errors:

  ======   ============================================================
  ENOENT   no such register
  EINVAL   invalid register ID, or no such register or used with VMs in
           protected virtualization mode on s390
  EPERM    (arm64) register access not allowed before vcpu finalization
  ======   ============================================================

(These error codes are indicative only: do not rely on a specific error
code being returned in a specific situation.)

::

  struct kvm_one_reg {
       __u64 id;
       __u64 addr;
  };

Using this ioctl, a single vcpu register can be set to a specific value
defined by user space with the passed in struct kvm_one_reg, where id
refers to the register identifier as described below and addr is a pointer
to a variable with the respective size. There can be architecture agnostic
and architecture specific registers. Each has its own range of operation
and its own constants and width.
To keep track of the implemented
registers, find a list below:

  ======= =============================== ============
  Arch    Register                        Width (bits)
  ======= =============================== ============
  PPC     KVM_REG_PPC_HIOR                64
  PPC     KVM_REG_PPC_IAC1                64
  PPC     KVM_REG_PPC_IAC2                64
  PPC     KVM_REG_PPC_IAC3                64
  PPC     KVM_REG_PPC_IAC4                64
  PPC     KVM_REG_PPC_DAC1                64
  PPC     KVM_REG_PPC_DAC2                64
  PPC     KVM_REG_PPC_DABR                64
  PPC     KVM_REG_PPC_DSCR                64
  PPC     KVM_REG_PPC_PURR                64
  PPC     KVM_REG_PPC_SPURR               64
  PPC     KVM_REG_PPC_DAR                 64
  PPC     KVM_REG_PPC_DSISR               32
  PPC     KVM_REG_PPC_AMR                 64
  PPC     KVM_REG_PPC_UAMOR               64
  PPC     KVM_REG_PPC_MMCR0               64
  PPC     KVM_REG_PPC_MMCR1               64
  PPC     KVM_REG_PPC_MMCRA               64
  PPC     KVM_REG_PPC_MMCR2               64
  PPC     KVM_REG_PPC_MMCRS               64
  PPC     KVM_REG_PPC_MMCR3               64
  PPC     KVM_REG_PPC_SIAR                64
  PPC     KVM_REG_PPC_SDAR                64
  PPC     KVM_REG_PPC_SIER                64
  PPC     KVM_REG_PPC_SIER2               64
  PPC     KVM_REG_PPC_SIER3               64
  PPC     KVM_REG_PPC_PMC1                32
  PPC     KVM_REG_PPC_PMC2                32
  PPC     KVM_REG_PPC_PMC3                32
  PPC     KVM_REG_PPC_PMC4                32
  PPC     KVM_REG_PPC_PMC5                32
  PPC     KVM_REG_PPC_PMC6                32
  PPC     KVM_REG_PPC_PMC7                32
  PPC     KVM_REG_PPC_PMC8                32
  PPC     KVM_REG_PPC_FPR0                64
  ...
  PPC     KVM_REG_PPC_FPR31               64
  PPC     KVM_REG_PPC_VR0                 128
  ...
  PPC     KVM_REG_PPC_VR31                128
  PPC     KVM_REG_PPC_VSR0                128
  ...
  PPC     KVM_REG_PPC_VSR31               128
  PPC     KVM_REG_PPC_FPSCR               64
  PPC     KVM_REG_PPC_VSCR                32
  PPC     KVM_REG_PPC_VPA_ADDR            64
  PPC     KVM_REG_PPC_VPA_SLB             128
  PPC     KVM_REG_PPC_VPA_DTL             128
  PPC     KVM_REG_PPC_EPCR                32
  PPC     KVM_REG_PPC_EPR                 32
  PPC     KVM_REG_PPC_TCR                 32
  PPC     KVM_REG_PPC_TSR                 32
  PPC     KVM_REG_PPC_OR_TSR              32
  PPC     KVM_REG_PPC_CLEAR_TSR           32
  PPC     KVM_REG_PPC_MAS0                32
  PPC     KVM_REG_PPC_MAS1                32
  PPC     KVM_REG_PPC_MAS2                64
  PPC     KVM_REG_PPC_MAS7_3              64
  PPC     KVM_REG_PPC_MAS4                32
  PPC     KVM_REG_PPC_MAS6                32
  PPC     KVM_REG_PPC_MMUCFG              32
  PPC     KVM_REG_PPC_TLB0CFG             32
  PPC     KVM_REG_PPC_TLB1CFG             32
  PPC     KVM_REG_PPC_TLB2CFG             32
  PPC     KVM_REG_PPC_TLB3CFG             32
  PPC     KVM_REG_PPC_TLB0PS              32
  PPC     KVM_REG_PPC_TLB1PS              32
  PPC     KVM_REG_PPC_TLB2PS              32
  PPC     KVM_REG_PPC_TLB3PS              32
  PPC     KVM_REG_PPC_EPTCFG              32
  PPC     KVM_REG_PPC_ICP_STATE           64
  PPC     KVM_REG_PPC_VP_STATE            128
  PPC     KVM_REG_PPC_TB_OFFSET           64
  PPC     KVM_REG_PPC_SPMC1               32
  PPC     KVM_REG_PPC_SPMC2               32
  PPC     KVM_REG_PPC_IAMR                64
  PPC     KVM_REG_PPC_TFHAR               64
  PPC     KVM_REG_PPC_TFIAR               64
  PPC     KVM_REG_PPC_TEXASR              64
  PPC     KVM_REG_PPC_FSCR                64
  PPC     KVM_REG_PPC_PSPB                32
  PPC     KVM_REG_PPC_EBBHR               64
  PPC     KVM_REG_PPC_EBBRR               64
  PPC     KVM_REG_PPC_BESCR               64
  PPC     KVM_REG_PPC_TAR                 64
  PPC     KVM_REG_PPC_DPDES               64
  PPC     KVM_REG_PPC_DAWR                64
  PPC     KVM_REG_PPC_DAWRX               64
  PPC     KVM_REG_PPC_CIABR               64
  PPC     KVM_REG_PPC_IC                  64
  PPC     KVM_REG_PPC_VTB                 64
  PPC     KVM_REG_PPC_CSIGR               64
  PPC     KVM_REG_PPC_TACR                64
  PPC     KVM_REG_PPC_TCSCR               64
  PPC     KVM_REG_PPC_PID                 64
  PPC     KVM_REG_PPC_ACOP                64
  PPC     KVM_REG_PPC_VRSAVE              32
  PPC     KVM_REG_PPC_LPCR                32
  PPC     KVM_REG_PPC_LPCR_64             64
  PPC     KVM_REG_PPC_PPR                 64
  PPC     KVM_REG_PPC_ARCH_COMPAT         32
  PPC     KVM_REG_PPC_DABRX               32
  PPC     KVM_REG_PPC_WORT                64
  PPC     KVM_REG_PPC_SPRG9               64
  PPC     KVM_REG_PPC_DBSR                32
  PPC     KVM_REG_PPC_TIDR                64
  PPC     KVM_REG_PPC_PSSCR               64
  PPC     KVM_REG_PPC_DEC_EXPIRY          64
  PPC     KVM_REG_PPC_PTCR                64
  PPC     KVM_REG_PPC_TM_GPR0             64
  ...
  PPC     KVM_REG_PPC_TM_GPR31            64
  PPC     KVM_REG_PPC_TM_VSR0             128
  ...
  PPC     KVM_REG_PPC_TM_VSR63            128
  PPC     KVM_REG_PPC_TM_CR               64
  PPC     KVM_REG_PPC_TM_LR               64
  PPC     KVM_REG_PPC_TM_CTR              64
  PPC     KVM_REG_PPC_TM_FPSCR            64
  PPC     KVM_REG_PPC_TM_AMR              64
  PPC     KVM_REG_PPC_TM_PPR              64
  PPC     KVM_REG_PPC_TM_VRSAVE           64
  PPC     KVM_REG_PPC_TM_VSCR             32
  PPC     KVM_REG_PPC_TM_DSCR             64
  PPC     KVM_REG_PPC_TM_TAR              64
  PPC     KVM_REG_PPC_TM_XER              64

  MIPS    KVM_REG_MIPS_R0                 64
  ...
  MIPS    KVM_REG_MIPS_R31                64
  MIPS    KVM_REG_MIPS_HI                 64
  MIPS    KVM_REG_MIPS_LO                 64
  MIPS    KVM_REG_MIPS_PC                 64
  MIPS    KVM_REG_MIPS_CP0_INDEX          32
  MIPS    KVM_REG_MIPS_CP0_ENTRYLO0       64
  MIPS    KVM_REG_MIPS_CP0_ENTRYLO1       64
  MIPS    KVM_REG_MIPS_CP0_CONTEXT        64
  MIPS    KVM_REG_MIPS_CP0_CONTEXTCONFIG  32
  MIPS    KVM_REG_MIPS_CP0_USERLOCAL      64
  MIPS    KVM_REG_MIPS_CP0_XCONTEXTCONFIG 64
  MIPS    KVM_REG_MIPS_CP0_PAGEMASK       32
  MIPS    KVM_REG_MIPS_CP0_PAGEGRAIN      32
  MIPS    KVM_REG_MIPS_CP0_SEGCTL0        64
  MIPS    KVM_REG_MIPS_CP0_SEGCTL1        64
  MIPS    KVM_REG_MIPS_CP0_SEGCTL2        64
  MIPS    KVM_REG_MIPS_CP0_PWBASE         64
  MIPS    KVM_REG_MIPS_CP0_PWFIELD        64
  MIPS    KVM_REG_MIPS_CP0_PWSIZE         64
  MIPS    KVM_REG_MIPS_CP0_WIRED          32
  MIPS    KVM_REG_MIPS_CP0_PWCTL          32
  MIPS    KVM_REG_MIPS_CP0_HWRENA         32
  MIPS    KVM_REG_MIPS_CP0_BADVADDR       64
  MIPS    KVM_REG_MIPS_CP0_BADINSTR       32
  MIPS    KVM_REG_MIPS_CP0_BADINSTRP      32
  MIPS    KVM_REG_MIPS_CP0_COUNT          32
  MIPS    KVM_REG_MIPS_CP0_ENTRYHI        64
  MIPS    KVM_REG_MIPS_CP0_COMPARE        32
  MIPS    KVM_REG_MIPS_CP0_STATUS         32
  MIPS    KVM_REG_MIPS_CP0_INTCTL         32
  MIPS    KVM_REG_MIPS_CP0_CAUSE          32
  MIPS    KVM_REG_MIPS_CP0_EPC            64
  MIPS    KVM_REG_MIPS_CP0_PRID           32
  MIPS    KVM_REG_MIPS_CP0_EBASE          64
  MIPS    KVM_REG_MIPS_CP0_CONFIG         32
  MIPS    KVM_REG_MIPS_CP0_CONFIG1        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG2        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG3        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG4        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG5        32
  MIPS    KVM_REG_MIPS_CP0_CONFIG7        32
  MIPS    KVM_REG_MIPS_CP0_XCONTEXT       64
  MIPS    KVM_REG_MIPS_CP0_ERROREPC       64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH1      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH2      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH3      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH4      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH5      64
  MIPS    KVM_REG_MIPS_CP0_KSCRATCH6      64
  MIPS    KVM_REG_MIPS_CP0_MAAR(0..63)    64
  MIPS    KVM_REG_MIPS_COUNT_CTL          64
  MIPS    KVM_REG_MIPS_COUNT_RESUME       64
  MIPS    KVM_REG_MIPS_COUNT_HZ           64
  MIPS    KVM_REG_MIPS_FPR_32(0..31)      32
  MIPS    KVM_REG_MIPS_FPR_64(0..31)      64
  MIPS    KVM_REG_MIPS_VEC_128(0..31)     128
  MIPS    KVM_REG_MIPS_FCR_IR             32
  MIPS    KVM_REG_MIPS_FCR_CSR            32
  MIPS    KVM_REG_MIPS_MSA_IR             32
  MIPS    KVM_REG_MIPS_MSA_CSR            32
  ======= =============================== ============

ARM registers are mapped using the lower 32 bits.  The upper 16 of that
is the register group type, or coprocessor number:

ARM core registers have the following id bit patterns::

  0x4020 0000 0010 <index into the kvm_regs struct:16>

ARM 32-bit CP15 registers have the following id bit patterns::

  0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>

ARM 64-bit CP15 registers have the following id bit patterns::

  0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>

ARM CCSIDR registers are demultiplexed by CSSELR value::

  0x4020 0000 0011 00 <csselr:8>

ARM 32-bit VFP control registers have the following id bit patterns::

  0x4020 0000 0012 1 <regno:12>

ARM 64-bit FP registers have the following id bit patterns::

  0x4030 0000 0012 0 <regno:12>

ARM firmware pseudo-registers have the following bit pattern::

  0x4030 0000 0014 <regno:16>


arm64 registers are mapped using the lower 32 bits.
The upper 16 of
that is the register group type, or coprocessor number:

arm64 core/FP-SIMD registers have the following id bit patterns. Note
that the size of the access is variable, as the kvm_regs structure
contains elements ranging from 32 to 128 bits. The index is a 32bit
value in the kvm_regs structure seen as a 32bit array::

  0x60x0 0000 0010 <index into the kvm_regs struct:16>

Specifically:

======================= ========= ===== =======================================
    Encoding            Register  Bits  kvm_regs member
======================= ========= ===== =======================================
  0x6030 0000 0010 0000 X0          64  regs.regs[0]
  0x6030 0000 0010 0002 X1          64  regs.regs[1]
  ...
  0x6030 0000 0010 003c X30         64  regs.regs[30]
  0x6030 0000 0010 003e SP          64  regs.sp
  0x6030 0000 0010 0040 PC          64  regs.pc
  0x6030 0000 0010 0042 PSTATE      64  regs.pstate
  0x6030 0000 0010 0044 SP_EL1      64  sp_el1
  0x6030 0000 0010 0046 ELR_EL1     64  elr_el1
  0x6030 0000 0010 0048 SPSR_EL1    64  spsr[KVM_SPSR_EL1] (alias SPSR_SVC)
  0x6030 0000 0010 004a SPSR_ABT    64  spsr[KVM_SPSR_ABT]
  0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
  0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
  0x6030 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    [1]_
  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    [1]_
  ...
  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   [1]_
  0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
  0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
======================= ========= ===== =======================================

.. [1] These encodings are not accepted for SVE-enabled vcpus.  See
       KVM_ARM_VCPU_INIT.

       The equivalent register content can be accessed via bits [127:0] of
       the corresponding SVE Zn registers instead for vcpus that have SVE
       enabled (see below).
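
The arm64 core register encodings in the table above can be reproduced with
a few lines of arithmetic.  The macro and function names below are local to
this sketch (the kernel's uapi headers provide their own definitions, which
remain authoritative); the numeric values are read directly off the bit
patterns shown above.

```c
#include <stdint.h>

/* Values taken from the "0x60x0 0000 0010 <index>" pattern above. */
#define REG_ARCH_ARM64 0x6000000000000000ull
#define REG_SIZE_U32   0x0020000000000000ull
#define REG_SIZE_U64   0x0030000000000000ull
#define REG_SIZE_U128  0x0040000000000000ull
#define REG_GROUP_CORE (0x0010ull << 16)

/* Build a core register id from a size code and an index counted in
 * 32-bit units into the kvm_regs structure. */
static uint64_t arm64_core_reg_id(uint64_t size, uint16_t index32)
{
    return REG_ARCH_ARM64 | size | REG_GROUP_CORE | index32;
}

/* Index of general-purpose register Xn (n = 0..30): each 64-bit
 * register occupies two 32-bit slots. */
static uint16_t arm64_xreg_index(unsigned int n)
{
    return (uint16_t)(n * 2);
}
```

For example, ``arm64_core_reg_id(REG_SIZE_U64, arm64_xreg_index(30))``
yields 0x6030 0000 0010 003c, matching the X30 row.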

arm64 CCSIDR registers are demultiplexed by CSSELR value::

  0x6020 0000 0011 00 <csselr:8>

arm64 system registers have the following id bit patterns::

  0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>

.. warning::

     Two system register IDs do not follow the specified pattern.  These
     are KVM_REG_ARM_TIMER_CVAL and KVM_REG_ARM_TIMER_CNT, which map to
     system registers CNTV_CVAL_EL0 and CNTVCT_EL0 respectively.  These
     two had their values accidentally swapped, which means TIMER_CVAL is
     derived from the register encoding for CNTVCT_EL0 and TIMER_CNT is
     derived from the register encoding for CNTV_CVAL_EL0.  As this is
     API, it must remain this way.

arm64 firmware pseudo-registers have the following bit pattern::

  0x6030 0000 0014 <regno:16>

arm64 SVE registers have the following bit patterns::

  0x6080 0000 0015 00 <n:5> <slice:5>   Zn bits[2048*slice + 2047 : 2048*slice]
  0x6050 0000 0015 04 <n:4> <slice:5>   Pn bits[256*slice + 255 : 256*slice]
  0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
  0x6060 0000 0015 ffff                 KVM_REG_ARM64_SVE_VLS pseudo-register

Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
ENOENT.  max_vq is the vcpu's maximum supported vector length in 128-bit
quadwords: see [2]_ below.
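
The SVE patterns and the slice visibility rule can be sketched as below.
This reflects one reading of the bit layout above (n shifted left by 5 bits,
slice in the low 5 bits, predicate registers offset by 0x400); the function
names are hypothetical, and the macros in the kernel's uapi headers should be
treated as the reference.

```c
#include <stdint.h>

/* Id of slice `slice` of vector register Zn (n = 0..31). */
static uint64_t sve_zreg_id(unsigned int n, unsigned int slice)
{
    return 0x6080000000150000ull | ((uint64_t)(n & 31) << 5) | (slice & 31);
}

/* Id of slice `slice` of predicate register Pn (n = 0..15). */
static uint64_t sve_preg_id(unsigned int n, unsigned int slice)
{
    return 0x6050000000150400ull | ((uint64_t)(n & 15) << 5) | (slice & 31);
}

/* A slice is accessible only while 2048 * slice < 128 * max_vq;
 * otherwise the text above says the access fails with ENOENT. */
static int sve_slice_visible(unsigned int slice, unsigned int max_vq)
{
    return 2048u * slice < 128u * max_vq;
}
```

With max_vq = 16 (a 2048-bit vector length) only slice 0 is visible, since
slice 1 would start at bit 2048.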

These registers are only accessible on vcpus for which SVE is enabled.
See KVM_ARM_VCPU_INIT for details.

In addition, except for KVM_REG_ARM64_SVE_VLS, these registers are not
accessible until the vcpu's SVE configuration has been finalized
using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).  See KVM_ARM_VCPU_INIT
and KVM_ARM_VCPU_FINALIZE for more information about this procedure.

KVM_REG_ARM64_SVE_VLS is a pseudo-register that allows the set of vector
lengths supported by the vcpu to be discovered and configured by
userspace.  When transferred to or from user memory via KVM_GET_ONE_REG
or KVM_SET_ONE_REG, the value of this register is of type
__u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
follows::

  __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];

  if (vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX &&
      ((vector_lengths[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] >>
                ((vq - KVM_ARM64_SVE_VQ_MIN) % 64)) & 1))
        /* Vector length vq * 16 bytes supported */
  else
        /* Vector length vq * 16 bytes not supported */

.. [2] The maximum value vq for which the above condition is true is
       max_vq.  This is the maximum vector length available to the guest on
       this vcpu, and determines which register slices are visible through
       this ioctl interface.
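
A compilable version of the test above, together with a helper that derives
max_vq, might look like this.  The constants VQ_MIN = 1 and VQ_MAX = 512 are
assumptions standing in for SVE_VQ_MIN / SVE_VQ_MAX, and the function names
are hypothetical.

```c
#include <stdint.h>

#define VQ_MIN    1u   /* assumed value of SVE_VQ_MIN */
#define VQ_MAX    512u /* assumed value of SVE_VQ_MAX */
#define VLS_WORDS ((VQ_MAX - VQ_MIN) / 64 + 1)

/* Returns 1 if vector length vq * 16 bytes is present in the set. */
static int sve_vq_supported(const uint64_t vls[VLS_WORDS], unsigned int vq)
{
    if (vq < VQ_MIN || vq > VQ_MAX)
        return 0;
    return (int)((vls[(vq - VQ_MIN) / 64] >> ((vq - VQ_MIN) % 64)) & 1);
}

/* Largest supported vq, or 0 if the set is empty. */
static unsigned int sve_max_vq(const uint64_t vls[VLS_WORDS])
{
    unsigned int vq;

    for (vq = VQ_MAX; vq >= VQ_MIN; vq--)
        if (sve_vq_supported(vls, vq))
            return vq;
    return 0;
}
```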

(See Documentation/arm64/sve.rst for an explanation of the "vq"
nomenclature.)

KVM_REG_ARM64_SVE_VLS is only accessible after KVM_ARM_VCPU_INIT.
KVM_ARM_VCPU_INIT initialises it to the best set of vector lengths that
the host supports.

Userspace may subsequently modify it if desired until the vcpu's SVE
configuration is finalized using KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE).

Apart from simply removing all vector lengths from the host set that
exceed some value, support for arbitrarily chosen sets of vector lengths
is hardware-dependent and may not be available.  Attempting to configure
an invalid set of vector lengths via KVM_SET_ONE_REG will fail with
EINVAL.

After the vcpu's SVE configuration is finalized, further attempts to
write this register will fail with EPERM.


MIPS registers are mapped using the lower 32 bits.
The upper 16 of that is
the register group type:

MIPS core registers (see above) have the following id bit patterns::

  0x7030 0000 0000 <reg:16>

MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
patterns depending on whether they're 32-bit or 64-bit registers::

  0x7020 0000 0001 00 <reg:5> <sel:3>   (32-bit)
  0x7030 0000 0001 00 <reg:5> <sel:3>   (64-bit)

Note: KVM_REG_MIPS_CP0_ENTRYLO0 and KVM_REG_MIPS_CP0_ENTRYLO1 are the MIPS64
versions of the EntryLo registers regardless of the word size of the host
hardware, host kernel, guest, and whether XPA is present in the guest, i.e.
with the RI and XI bits (if they exist) in bits 63 and 62 respectively, and
the PFNX field starting at bit 30.

MIPS MAARs (see KVM_REG_MIPS_CP0_MAAR(*) above) have the following id bit
patterns::

  0x7030 0000 0001 01 <reg:8>

MIPS KVM control registers (see above) have the following id bit patterns::

  0x7030 0000 0002 <reg:16>

MIPS FPU registers (see KVM_REG_MIPS_FPR_{32,64}() above) have the following
id bit patterns depending on the size of the register being accessed. They are
always accessed according to the current guest FPU mode (Status.FR and
Config5.FRE), i.e. as the guest would see them, and they become unpredictable
if the guest FPU mode is changed.
MIPS SIMD Architecture (MSA) vector 2536*4882a593Smuzhiyunregisters (see KVM_REG_MIPS_VEC_128() above) have similar patterns as they 2537*4882a593Smuzhiyunoverlap the FPU registers:: 2538*4882a593Smuzhiyun 2539*4882a593Smuzhiyun 0x7020 0000 0003 00 <0:3> <reg:5> (32-bit FPU registers) 2540*4882a593Smuzhiyun 0x7030 0000 0003 00 <0:3> <reg:5> (64-bit FPU registers) 2541*4882a593Smuzhiyun 0x7040 0000 0003 00 <0:3> <reg:5> (128-bit MSA vector registers) 2542*4882a593Smuzhiyun 2543*4882a593SmuzhiyunMIPS FPU control registers (see KVM_REG_MIPS_FCR_{IR,CSR} above) have the 2544*4882a593Smuzhiyunfollowing id bit patterns:: 2545*4882a593Smuzhiyun 2546*4882a593Smuzhiyun 0x7020 0000 0003 01 <0:3> <reg:5> 2547*4882a593Smuzhiyun 2548*4882a593SmuzhiyunMIPS MSA control registers (see KVM_REG_MIPS_MSA_{IR,CSR} above) have the 2549*4882a593Smuzhiyunfollowing id bit patterns:: 2550*4882a593Smuzhiyun 2551*4882a593Smuzhiyun 0x7020 0000 0003 02 <0:3> <reg:5> 2552*4882a593Smuzhiyun 2553*4882a593Smuzhiyun 2554*4882a593Smuzhiyun4.69 KVM_GET_ONE_REG 2555*4882a593Smuzhiyun-------------------- 2556*4882a593Smuzhiyun 2557*4882a593Smuzhiyun:Capability: KVM_CAP_ONE_REG 2558*4882a593Smuzhiyun:Architectures: all 2559*4882a593Smuzhiyun:Type: vcpu ioctl 2560*4882a593Smuzhiyun:Parameters: struct kvm_one_reg (in and out) 2561*4882a593Smuzhiyun:Returns: 0 on success, negative value on failure 2562*4882a593Smuzhiyun 2563*4882a593SmuzhiyunErrors include: 2564*4882a593Smuzhiyun 2565*4882a593Smuzhiyun ======== ============================================================ 2566*4882a593Smuzhiyun ENOENT no such register 2567*4882a593Smuzhiyun EINVAL invalid register ID, or no such register or used with VMs in 2568*4882a593Smuzhiyun protected virtualization mode on s390 2569*4882a593Smuzhiyun EPERM (arm64) register access not allowed before vcpu finalization 2570*4882a593Smuzhiyun ======== ============================================================ 2571*4882a593Smuzhiyun 2572*4882a593Smuzhiyun(These error 
codes are indicative only: do not rely on a specific error
code being returned in a specific situation.)

This ioctl reads the value of a single register implemented
in a vcpu. The register to read is indicated by the "id" field of the
kvm_one_reg struct passed in. On success, the register value can be found
at the memory location pointed to by "addr".

The list of registers accessible using this interface is identical to the
list in 4.68.


4.70 KVM_KVMCLOCK_CTRL
----------------------

:Capability: KVM_CAP_KVMCLOCK_CTRL
:Architectures: Any that implement pvclocks (currently x86 only)
:Type: vcpu ioctl
:Parameters: None
:Returns: 0 on success, -1 on error

This ioctl sets a flag accessible to the guest indicating that the specified
vCPU has been paused by the host userspace.

The host will set a flag in the pvclock structure that is checked from the
soft lockup watchdog.  The flag is part of the pvclock structure that is
shared between guest and host, specifically the second bit of the flags
field of the pvclock_vcpu_time_info structure.  It will be set exclusively by
the host and read/cleared exclusively by the guest.  The guest operation of
checking and clearing the flag must be an atomic operation, so
load-link/store-conditional or an equivalent must be used.
There are two cases
where the guest will clear the flag: when the soft lockup watchdog timer resets
itself or when a soft lockup is detected.  This ioctl can be called any time
after pausing the vcpu, but before it is resumed.


4.71 KVM_SIGNAL_MSI
-------------------

:Capability: KVM_CAP_SIGNAL_MSI
:Architectures: x86 arm arm64
:Type: vm ioctl
:Parameters: struct kvm_msi (in)
:Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error

Directly inject an MSI message. Only valid with an in-kernel irqchip that
handles MSI messages.

::

  struct kvm_msi {
        __u32 address_lo;
        __u32 address_hi;
        __u32 data;
        __u32 flags;
        __u32 devid;
        __u8  pad[12];
  };

flags:
  KVM_MSI_VALID_DEVID: devid contains a valid value.  The per-VM
  KVM_CAP_MSI_DEVID capability advertises the requirement to provide
  the device ID.  If this capability is not available, userspace
  should never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.

If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
for the device that wrote the MSI message.  For PCI, this is usually a
BDF identifier in the lower 16 bits.
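
As a minimal sketch of the above, the following shows one way userspace
might fill struct kvm_msi for a PCI device.  The helpers pci_bdf() and
fill_msi() are hypothetical names introduced for illustration, and the BDF
packing (bus in bits 15-8, device in bits 7-3, function in bits 2-0) is the
conventional PCI layout, assumed here rather than mandated by this API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical helper: pack bus/device/function into the conventional
 * 16-bit BDF layout (bus 15:8, device 7:3, function 2:0). */
static uint32_t pci_bdf(unsigned int bus, unsigned int dev, unsigned int fn)
{
	assert(bus < 256 && dev < 32 && fn < 8);
	return (bus << 8) | (dev << 3) | fn;
}

/* Hypothetical helper: prepare a request for KVM_SIGNAL_MSI.  The devid
 * is only passed when the caller has checked KVM_CAP_MSI_DEVID. */
static void fill_msi(struct kvm_msi *msi, uint64_t addr, uint32_t data,
		     int have_devid, uint32_t devid)
{
	memset(msi, 0, sizeof(*msi));	/* also zeroes pad[] */
	msi->address_lo = (uint32_t)addr;
	msi->address_hi = (uint32_t)(addr >> 32);
	msi->data = data;
	if (have_devid) {
		msi->flags |= KVM_MSI_VALID_DEVID;
		msi->devid = devid;
	}
}
```

The prepared structure would then be passed to the VM file descriptor, for
example ``ioctl(vm_fd, KVM_SIGNAL_MSI, &msi)``.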

On x86, address_hi is ignored unless the KVM_X2APIC_API_USE_32BIT_IDS
feature of KVM_CAP_X2APIC_API capability is enabled.  If it is enabled,
address_hi bits 31-8 provide bits 31-8 of the destination id.  Bits 7-0 of
address_hi must be zero.


4.71 KVM_CREATE_PIT2
--------------------

:Capability: KVM_CAP_PIT2
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pit_config (in)
:Returns: 0 on success, -1 on error

Creates an in-kernel device model for the i8254 PIT. This call is only valid
after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
parameters have to be passed::

  struct kvm_pit_config {
        __u32 flags;
        __u32 pad[15];
  };

Valid flags are::

  #define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */

PIT timer interrupts may use a per-VM kernel thread for injection. If it
exists, this thread will have a name of the following pattern::

  kvm-pit/<owner-process-pid>

When running a guest with elevated priorities, the scheduling parameters of
this thread may have to be adjusted accordingly.

This IOCTL replaces the obsolete KVM_CREATE_PIT.
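
The call sequence described above can be sketched as follows.  The helper
names create_pit2() and pit_thread_name() are hypothetical; the sketch
assumes the caller already holds a VM file descriptor on which
KVM_CREATE_IRQCHIP has succeeded, and it only formats the expected thread
name rather than locating the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <linux/kvm.h>

/* Hypothetical helper: create the in-kernel PIT on an existing VM fd.
 * Returns the raw ioctl result (0 on success, -1 with errno set). */
static int create_pit2(int vm_fd, unsigned int flags)
{
	struct kvm_pit_config cfg;

	memset(&cfg, 0, sizeof(cfg));	/* pad[] must not carry garbage */
	cfg.flags = flags;
	return ioctl(vm_fd, KVM_CREATE_PIT2, &cfg);
}

/* Hypothetical helper: format the thread name pattern documented above,
 * e.g. to look the thread up before adjusting its scheduling parameters. */
static int pit_thread_name(char *buf, size_t len, pid_t owner)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "kvm-pit/%ld", (long)owner);
}
```

A caller emulating the speaker port stub might use
``create_pit2(vm_fd, KVM_PIT_SPEAKER_DUMMY)``.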


4.72 KVM_GET_PIT2
-----------------

:Capability: KVM_CAP_PIT_STATE2
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pit_state2 (out)
:Returns: 0 on success, -1 on error

Retrieves the state of the in-kernel PIT model. Only valid after
KVM_CREATE_PIT2. The state is returned in the following structure::

  struct kvm_pit_state2 {
        struct kvm_pit_channel_state channels[3];
        __u32 flags;
        __u32 reserved[9];
  };

Valid flags are::

  /* disable PIT in HPET legacy mode */
  #define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001

This IOCTL replaces the obsolete KVM_GET_PIT.


4.73 KVM_SET_PIT2
-----------------

:Capability: KVM_CAP_PIT_STATE2
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pit_state2 (in)
:Returns: 0 on success, -1 on error

Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
See KVM_GET_PIT2 for details on struct kvm_pit_state2.

This IOCTL replaces the obsolete KVM_SET_PIT.


4.74 KVM_PPC_GET_SMMU_INFO
--------------------------

:Capability: KVM_CAP_PPC_GET_SMMU_INFO
:Architectures: powerpc
:Type: vm ioctl
:Parameters: None
:Returns: 0 on success, -1 on error

This populates and returns a structure describing the features of
the "Server" class MMU emulation supported by KVM.
This can in turn be used by userspace to generate the appropriate
device-tree properties for the guest operating system.

The structure contains some global information, followed by an
array of supported segment page sizes::

      struct kvm_ppc_smmu_info {
             __u64 flags;
             __u32 slb_size;
             __u32 pad;
             struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
      };

The supported flags are:

    - KVM_PPC_PAGE_SIZES_REAL:
        When that flag is set, guest page sizes must "fit" the backing
        store page sizes. When not set, any page size in the list can
        be used regardless of how they are backed by userspace.

    - KVM_PPC_1T_SEGMENTS
        The emulated MMU supports 1T segments in addition to the
        standard 256M ones.

    - KVM_PPC_NO_HASH
        This flag indicates that HPT guests are not supported by KVM,
        thus all guests must use radix MMU mode.

The "slb_size" field indicates how many SLB entries are supported.

The "sps" array contains 8 entries indicating the supported base
page sizes for a segment in increasing order. Each entry is defined
as follows::

   struct kvm_ppc_one_seg_page_size {
        __u32 page_shift;       /* Base page shift of segment (or 0) */
        __u32 slb_enc;          /* SLB encoding for BookS */
        struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
   };

An entry with a "page_shift" of 0 is unused. Because the array is
organized in increasing order, a lookup can stop when encountering
such an entry.

The "slb_enc" field provides the encoding to use in the SLB for the
page size. The bits are in positions such that the value can directly
be OR'ed into the "vsid" argument of the slbmte instruction.

The "enc" array is a list which, for each of those segment base page
sizes, provides the list of supported actual page sizes (which can
only be larger than or equal to the base page size), along with the
corresponding encoding in the hash PTE.
Similarly, the array is
8 entries sorted by increasing sizes and an entry with a "0" shift
is an empty entry and a terminator::

   struct kvm_ppc_one_page_size {
        __u32 page_shift;       /* Page shift (or 0) */
        __u32 pte_enc;          /* Encoding in the HPTE (>>12) */
   };

The "pte_enc" field provides a value that can be OR'ed into the hash
PTE's RPN field (i.e., it needs to be shifted left by 12 to OR it
into the hash PTE second double word).

4.75 KVM_IRQFD
--------------

:Capability: KVM_CAP_IRQFD
:Architectures: x86 s390 arm arm64
:Type: vm ioctl
:Parameters: struct kvm_irqfd (in)
:Returns: 0 on success, -1 on error

Allows setting an eventfd to directly trigger a guest interrupt.
kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
kvm_irqfd.gsi specifies the irqchip pin toggled by this event.  When
an event is triggered on the eventfd, an interrupt is injected into
the guest using the specified gsi pin.  The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.

With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
mechanism allowing emulation of level-triggered, irqfd-based
interrupts.
When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field.  When operating
in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
the specified gsi in the irqchip.  When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd.  It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.

On arm/arm64, where gsi routing is supported, the following can happen:

- in case no routing entry is associated to this gsi, injection fails
- in case the gsi is associated to an irqchip routing entry,
  irqchip.pin + 32 corresponds to the injected SPI ID.
- in case the gsi is associated to an MSI routing entry, the MSI
  message and device ID are translated into an LPI (support restricted
  to GICv3 ITS in-kernel emulation).
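
The assignment and removal rules above can be sketched as two request
builders.  The helper names irqfd_assign(), irqfd_deassign() and
arm_spi_id() are hypothetical; the sketch assumes the eventfds were created
by the caller (for example with eventfd(2)) and that a negative resamplefd
means "no resample mode".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical helper: build an assignment request.  Pass resamplefd < 0
 * for a plain irqfd, or a second eventfd to request resample mode. */
static void irqfd_assign(struct kvm_irqfd *req, int eventfd, uint32_t gsi,
			 int resamplefd)
{
	assert(eventfd >= 0);
	memset(req, 0, sizeof(*req));
	req->fd = (uint32_t)eventfd;
	req->gsi = gsi;
	if (resamplefd >= 0) {
		req->flags |= KVM_IRQFD_FLAG_RESAMPLE;
		req->resamplefd = (uint32_t)resamplefd;
	}
}

/* Hypothetical helper: build a removal request.  Both fd and gsi must
 * match the assignment; the RESAMPLE flag is not needed here. */
static void irqfd_deassign(struct kvm_irqfd *req, int eventfd, uint32_t gsi)
{
	assert(eventfd >= 0);
	memset(req, 0, sizeof(*req));
	req->fd = (uint32_t)eventfd;
	req->gsi = gsi;
	req->flags = KVM_IRQFD_FLAG_DEASSIGN;
}

/* On arm/arm64, an irqchip routing entry maps to SPI ID pin + 32. */
static uint32_t arm_spi_id(uint32_t irqchip_pin)
{
	return irqchip_pin + 32;
}
```

Either request would then be submitted with
``ioctl(vm_fd, KVM_IRQFD, &req)``.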

4.76 KVM_PPC_ALLOCATE_HTAB
--------------------------

:Capability: KVM_CAP_PPC_ALLOC_HTAB
:Architectures: powerpc
:Type: vm ioctl
:Parameters: Pointer to u32 containing hash table order (in/out)
:Returns: 0 on success, -1 on error

This requests the host kernel to allocate an MMU hash table for a
guest using the PAPR paravirtualization interface.  This only does
anything if the kernel is configured to use the Book 3S HV style of
virtualization.  Otherwise the capability doesn't exist and the ioctl
returns an ENOTTY error.  The rest of this description assumes Book 3S
HV.

There must be no vcpus running when this ioctl is called; if there
are, it will do nothing and return an EBUSY error.

The parameter is a pointer to a 32-bit unsigned integer variable
containing the order (log base 2) of the desired size of the hash
table, which must be between 18 and 46.  On successful return from the
ioctl, the value will not be changed by the kernel.

If no hash table has been allocated when any vcpu is asked to run
(with the KVM_RUN ioctl), the host kernel will allocate a
default-sized hash table (16 MB).

If this ioctl is called when a hash table has already been allocated,
with a different order from the existing hash table, the existing hash
table will be freed and a new one allocated.
If this ioctl is
called when a hash table has already been allocated of the same order
as specified, the kernel will clear out the existing hash table (zero
all HPTEs).  In either case, if the guest is using the virtualized
real-mode area (VRMA) facility, the kernel will re-create the VRMA
HPTEs on the next KVM_RUN of any vcpu.

4.77 KVM_S390_INTERRUPT
-----------------------

:Capability: basic
:Architectures: s390
:Type: vm ioctl, vcpu ioctl
:Parameters: struct kvm_s390_interrupt (in)
:Returns: 0 on success, -1 on error

Injects an interrupt into the guest. Interrupts can be floating
(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.

Interrupt parameters are passed via kvm_s390_interrupt::

  struct kvm_s390_interrupt {
        __u32 type;
        __u32 parm;
        __u64 parm64;
  };

type can be one of the following:

KVM_S390_SIGP_STOP (vcpu)
    - sigp stop; optional flags in parm
KVM_S390_PROGRAM_INT (vcpu)
    - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu)
    - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu)
    - restart
KVM_S390_INT_CLOCK_COMP (vcpu)
    - clock comparator interrupt
KVM_S390_INT_CPU_TIMER (vcpu)
    - CPU timer interrupt
KVM_S390_INT_VIRTIO (vm)
    - virtio external interrupt; external interrupt
      parameters in parm and parm64
KVM_S390_INT_SERVICE (vm)
    - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu)
    - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu)
    - sigp external call; source cpu in parm
KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm)
    - compound value to indicate an
      I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
      I/O interruption parameters in parm (subchannel) and parm64 (intparm,
      interruption subclass)
KVM_S390_MCHK (vm, vcpu)
    - machine check interrupt; cr 14
      bits in parm, machine check interrupt
      code in parm64 (note that machine checks needing further payload are not
      supported by this ioctl)

This is an asynchronous vcpu ioctl and can be invoked from any thread.

4.78 KVM_PPC_GET_HTAB_FD
------------------------

:Capability: KVM_CAP_PPC_HTAB_FD
:Architectures: powerpc
:Type: vm ioctl
:Parameters: Pointer to struct kvm_get_htab_fd (in)
:Returns: file descriptor number (>= 0) on success, -1 on error

This returns a file descriptor that can be used either to read out the
entries in the guest's hashed page table (HPT), or to write entries to
initialize the HPT.  The returned fd can only be written to if the
KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
can only be read if that bit is clear.  The argument struct looks like
this::

  /* For KVM_PPC_GET_HTAB_FD */
  struct kvm_get_htab_fd {
        __u64   flags;
        __u64   start_index;
        __u64   reserved[2];
  };

  /* Values for kvm_get_htab_fd.flags */
  #define KVM_GET_HTAB_BOLTED_ONLY      ((__u64)0x1)
  #define KVM_GET_HTAB_WRITE            ((__u64)0x2)

The 'start_index' field gives the index in the HPT of the entry at
which to start reading.  It is ignored when writing.

Reads on the fd will initially supply information about all
"interesting" HPT entries.  Interesting entries are those with the
bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
all entries.  When the end of the HPT is reached, the read() will
return.  If read() is called again on the fd, it will start again from
the beginning of the HPT, but will only return HPT entries that have
changed since they were last read.

Data read or written is structured as a header (8 bytes) followed by a
series of valid HPT entries (16 bytes each).  The header indicates how
many valid HPT entries there are and how many invalid entries follow
the valid entries.  The invalid entries are not represented explicitly
in the stream.  The header format is::

  struct kvm_get_htab_header {
        __u32   index;
        __u16   n_valid;
        __u16   n_invalid;
  };

Writes to the fd create HPT entries starting at the index given in the
header; first 'n_valid' valid entries with contents from the data
written, then 'n_invalid' invalid entries, invalidating any previously
valid entries found.
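
The stream layout described above can be walked as in the following sketch.
It operates on a buffer already filled by read() and uses a local mirror of
the header (struct htab_header) so that it builds on any architecture; the
names htab_header, HPTE_SIZE and htab_walk() are hypothetical and exist
only for this illustration.  Fields are interpreted in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_get_htab_header as shown above. */
struct htab_header {
	uint32_t index;
	uint16_t n_valid;
	uint16_t n_invalid;
};

#define HPTE_SIZE 16	/* each valid HPT entry occupies 16 bytes */

/* Walk a buffer of header + valid-entry records.  Returns the number of
 * records seen, or -1 if the buffer ends in the middle of a record.
 * Invalid entries are counted from the header only; they carry no data. */
static int htab_walk(const unsigned char *buf, size_t len,
		     uint64_t *valid_total, uint64_t *invalid_total)
{
	size_t off = 0;
	int records = 0;

	assert(sizeof(struct htab_header) == 8);
	*valid_total = 0;
	*invalid_total = 0;
	while (off < len) {
		struct htab_header hdr;
		size_t payload;

		if (len - off < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + off, sizeof(hdr));
		off += sizeof(hdr);

		payload = (size_t)hdr.n_valid * HPTE_SIZE;
		if (len - off < payload)
			return -1;
		off += payload;		/* skip the valid entries */

		*valid_total += hdr.n_valid;
		*invalid_total += hdr.n_invalid;
		records++;
	}
	return records;
}
```

A real consumer would process the entries at ``buf + off`` instead of
skipping them; the bounds checks stay the same.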

4.79 KVM_CREATE_DEVICE
----------------------

:Capability: KVM_CAP_DEVICE_CTRL
:Type: vm ioctl
:Parameters: struct kvm_create_device (in/out)
:Returns: 0 on success, -1 on error

Errors:

  ====== =======================================================
  ENODEV The device type is unknown or unsupported
  EEXIST Device already created, and this type of device may not
         be instantiated multiple times
  ====== =======================================================

  Other error conditions may be defined by individual device types or
  have their standard meanings.

Creates an emulated device in the kernel.  The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.

If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).

Individual devices should not define flags.  Attributes should be used
for specifying any behavior that is not implied by the device type
number.

::

  struct kvm_create_device {
        __u32   type;   /* in: KVM_DEV_TYPE_xxx */
        __u32   fd;     /* out: device handle */
        __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
  };

4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
--------------------------------------------

:Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
             KVM_CAP_VCPU_ATTRIBUTES for vcpu device
:Type: device ioctl, vm ioctl, vcpu ioctl
:Parameters: struct kvm_device_attr
:Returns: 0 on success, -1 on error

Errors:

  ===== =============================================================
  ENXIO The group or attribute is unknown/unsupported for this device
        or hardware support is missing.
  EPERM The attribute cannot (currently) be accessed this way
        (e.g. read-only attribute, or attribute that only makes
        sense when the device is in a different state)
  ===== =============================================================

  Other error conditions may be defined by individual device types.

Gets/sets a specified piece of device configuration and/or state.  The
semantics are device-specific.  See individual device documentation in
the "devices" directory.  As with ONE_REG, the size of the data
transferred is defined by the particular attribute.

::

  struct kvm_device_attr {
        __u32   flags;          /* no flags currently defined */
        __u32   group;          /* device-defined */
        __u64   attr;           /* group-defined */
        __u64   addr;           /* userspace address of attr data */
  };

4.81 KVM_HAS_DEVICE_ATTR
------------------------

:Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
             KVM_CAP_VCPU_ATTRIBUTES for vcpu device
:Type: device ioctl, vm ioctl, vcpu ioctl
:Parameters: struct kvm_device_attr
:Returns: 0 on success, -1 on error

Errors:

  ===== =============================================================
  ENXIO The group or attribute is unknown/unsupported for this device
        or hardware support is missing.
  ===== =============================================================

Tests whether a device supports a particular attribute.  A successful
return indicates the attribute is implemented.  It does not necessarily
indicate that the attribute can be read or written in the device's
current state.  "addr" is ignored.
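
A probe-then-read pattern built on the two ioctls above might look like the
following sketch.  The helper names device_has_attr() and device_get_attr()
are hypothetical, as is the choice to map ENXIO to "not implemented" and
every other failure to -1; group and attribute numbers are device-defined
and must come from the individual device documentation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: returns 1 if the attribute is implemented, 0 if
 * the kernel reports ENXIO, and -1 on any other error. */
static int device_has_attr(int fd, uint32_t group, uint64_t attr)
{
	struct kvm_device_attr req;

	memset(&req, 0, sizeof(req));	/* flags = 0, addr is ignored */
	req.group = group;
	req.attr = attr;
	if (ioctl(fd, KVM_HAS_DEVICE_ATTR, &req) == 0)
		return 1;
	return errno == ENXIO ? 0 : -1;
}

/* Hypothetical helper: read an attribute into a caller-supplied buffer.
 * The buffer size is defined by the attribute, not by this API. */
static int device_get_attr(int fd, uint32_t group, uint64_t attr, void *out)
{
	struct kvm_device_attr req;

	assert(out != NULL);
	memset(&req, 0, sizeof(req));
	req.group = group;
	req.attr = attr;
	req.addr = (uint64_t)(uintptr_t)out;
	return ioctl(fd, KVM_GET_DEVICE_ATTR, &req);
}
```

Since a successful probe does not guarantee that the attribute is readable
in the device's current state, callers still need to check the result of
device_get_attr().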

4.82 KVM_ARM_VCPU_INIT
----------------------

:Capability: basic
:Architectures: arm, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_vcpu_init (in)
:Returns: 0 on success; -1 on error

Errors:

  ====== =================================================================
  EINVAL the target is unknown, or the combination of features is invalid.
  ENOENT a features bit specified is unknown.
  ====== =================================================================

This tells KVM what type of CPU to present to the guest, and what
optional features it should have.  This will cause a reset of the cpu
registers to their initial values.  If this is not called, KVM_RUN will
return ENOEXEC for that vcpu.

Note that because some registers reflect machine topology, all vcpus
should be created before this ioctl is invoked.

Userspace can call this function multiple times for a given vcpu, including
after the vcpu has been run. This will reset the vcpu to its initial
state. All calls to this function after the initial call must use the same
target and same set of feature flags, otherwise EINVAL will be returned.

Possible features:

        - KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
          Depends on KVM_CAP_ARM_PSCI.
          If not set, the CPU will be powered on
          and execute guest code when KVM_RUN is called.
        - KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
          Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
        - KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 (or a future revision
          backward compatible with v0.2) for the CPU.
          Depends on KVM_CAP_ARM_PSCI_0_2.
        - KVM_ARM_VCPU_PMU_V3: Emulate PMUv3 for the CPU.
          Depends on KVM_CAP_ARM_PMU_V3.

        - KVM_ARM_VCPU_PTRAUTH_ADDRESS: Enables Address Pointer authentication
          for arm64 only.
          Depends on KVM_CAP_ARM_PTRAUTH_ADDRESS.
          If KVM_CAP_ARM_PTRAUTH_ADDRESS and KVM_CAP_ARM_PTRAUTH_GENERIC are
          both present, then both KVM_ARM_VCPU_PTRAUTH_ADDRESS and
          KVM_ARM_VCPU_PTRAUTH_GENERIC must be requested or neither must be
          requested.

        - KVM_ARM_VCPU_PTRAUTH_GENERIC: Enables Generic Pointer authentication
          for arm64 only.
          Depends on KVM_CAP_ARM_PTRAUTH_GENERIC.
          If KVM_CAP_ARM_PTRAUTH_ADDRESS and KVM_CAP_ARM_PTRAUTH_GENERIC are
          both present, then both KVM_ARM_VCPU_PTRAUTH_ADDRESS and
          KVM_ARM_VCPU_PTRAUTH_GENERIC must be requested or neither must be
          requested.

        - KVM_ARM_VCPU_SVE: Enables SVE for the CPU (arm64 only).
          Depends on KVM_CAP_ARM_SVE.
    Requires KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):

    * After KVM_ARM_VCPU_INIT:

      - KVM_REG_ARM64_SVE_VLS may be read using KVM_GET_ONE_REG: the
        initial value of this pseudo-register indicates the best set of
        vector lengths possible for a vcpu on this host.

    * Before KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):

      - KVM_RUN and KVM_GET_REG_LIST are not available;

      - KVM_GET_ONE_REG and KVM_SET_ONE_REG cannot be used to access
        the scalable architectural SVE registers
        KVM_REG_ARM64_SVE_ZREG(), KVM_REG_ARM64_SVE_PREG() or
        KVM_REG_ARM64_SVE_FFR;

      - KVM_REG_ARM64_SVE_VLS may optionally be written using
        KVM_SET_ONE_REG, to modify the set of vector lengths available
        for the vcpu.

    * After KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_SVE):

      - the KVM_REG_ARM64_SVE_VLS pseudo-register is immutable, and can
        no longer be written using KVM_SET_ONE_REG.
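
The pointer authentication rule for KVM_ARM_VCPU_INIT can be checked in
userspace before the ioctl is issued. The following is a minimal sketch
under stated assumptions: the bit positions EX_VCPU_PTRAUTH_ADDRESS and
EX_VCPU_PTRAUTH_GENERIC are illustrative placeholders (real code should
use the constants from the architecture's KVM header), the helper name is
hypothetical, and both feature bits are assumed to live in the first word
of the features bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, assumed for illustration only. */
#define EX_VCPU_PTRAUTH_ADDRESS 5
#define EX_VCPU_PTRAUTH_GENERIC 6

/*
 * Check a requested feature word against the pointer authentication
 * rules: each feature needs its capability, and when both capabilities
 * are present the two features must be requested together or not at all.
 */
static bool ex_ptrauth_request_ok(uint32_t features,
                                  bool cap_address, bool cap_generic)
{
    bool want_address = (features & (1u << EX_VCPU_PTRAUTH_ADDRESS)) != 0;
    bool want_generic = (features & (1u << EX_VCPU_PTRAUTH_GENERIC)) != 0;

    if (want_address && !cap_address)
        return false;
    if (want_generic && !cap_generic)
        return false;
    if (cap_address && cap_generic && want_address != want_generic)
        return false;
    return true;
}
```

A caller would typically derive cap_address and cap_generic from
KVM_CAP_ARM_PTRAUTH_ADDRESS and KVM_CAP_ARM_PTRAUTH_GENERIC and skip the
ioctl when the helper returns false, since such a request is expected to
fail with EINVAL.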

4.83 KVM_ARM_PREFERRED_TARGET
-----------------------------

:Capability: basic
:Architectures: arm, arm64
:Type: vm ioctl
:Parameters: struct kvm_vcpu_init (out)
:Returns: 0 on success; -1 on error

Errors:

  ======  ==========================================
  ENODEV  no preferred target available for the host
  ======  ==========================================

This queries KVM for the preferred CPU target type that KVM can emulate
on the underlying host.

The ioctl returns a struct kvm_vcpu_init instance containing information
about the preferred CPU target type and recommended features for it.  The
kvm_vcpu_init->features bitmap returned will have feature bits set if
the preferred target recommends setting these features, but this is
not mandatory.

The information returned by this ioctl can be used to prepare an instance
of struct kvm_vcpu_init for the KVM_ARM_VCPU_INIT ioctl, which will result
in a VCPU matching the underlying host.


4.84 KVM_GET_REG_LIST
---------------------

:Capability: basic
:Architectures: arm, arm64, mips
:Type: vcpu ioctl
:Parameters: struct kvm_reg_list (in/out)
:Returns: 0 on success; -1 on error

Errors:

  =====  ==============================================================
  E2BIG  the reg index list is too big to fit in the array specified by
         the user (the number required will be written into n).
  =====  ==============================================================

::

  struct kvm_reg_list {
        __u64 n; /* number of registers in reg[] */
        __u64 reg[0];
  };

This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
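
Because struct kvm_reg_list ends in a variable-length array, userspace has
to size the allocation itself. The following is a minimal sketch of that
allocation, assuming a local mirror of the structure (struct ex_reg_list)
and a hypothetical helper name; real code would use the definition from the
KVM header instead.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of struct kvm_reg_list, for illustration only. */
struct ex_reg_list {
    uint64_t n;      /* number of registers in reg[] */
    uint64_t reg[];  /* register indices follow the header */
};

/* Allocate a zeroed list with room for n register indices and set n. */
static struct ex_reg_list *ex_reg_list_alloc(uint64_t n)
{
    struct ex_reg_list *list;

    /* Refuse sizes that would overflow the byte count. */
    if (n > (SIZE_MAX - sizeof(*list)) / sizeof(list->reg[0]))
        return NULL;

    list = calloc(1, sizeof(*list) + (size_t)n * sizeof(list->reg[0]));
    if (list)
        list->n = n;
    return list;
}
```

One plausible calling pattern is to issue KVM_GET_REG_LIST once with n set
to 0, read the required count back from n when E2BIG is reported, allocate
with ex_reg_list_alloc() and then repeat the call.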


4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)
-----------------------------------------

:Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
:Architectures: arm, arm64
:Type: vm ioctl
:Parameters: struct kvm_arm_device_addr (in)
:Returns: 0 on success, -1 on error

Errors:

  ======  ============================================
  ENODEV  The device id is unknown
  ENXIO   Device not supported on current system
  EEXIST  Address already set
  E2BIG   Address outside guest physical address space
  EBUSY   Address overlaps with other device range
  ======  ============================================

::

  struct kvm_arm_device_addr {
        __u64 id;
        __u64 addr;
  };

Specify a device address in the guest's physical address space where guests
can access emulated or directly exposed devices, which the host kernel needs
to know about. The id field is an architecture specific identifier for a
specific device.

ARM/arm64 divides the id field into two parts, a device id and an
address type id specific to the individual device::

  bits:  | 63    ...    32 | 31   ...   16 | 15   ...    0 |
  field: |   0x00000000    |   device id   | addr type id  |

ARM/arm64 currently only require this when using the in-kernel GIC
support for the hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2
as the device id.  When setting the base address for the guest's
mapping of the VGIC virtual CPU and distributor interface, the ioctl
must be called after calling KVM_CREATE_IRQCHIP, but before calling
KVM_RUN on any of the VCPUs.  Calling this ioctl twice for any of the
base addresses will return -EEXIST.

Note, this IOCTL is deprecated and the more flexible SET/GET_DEVICE_ATTR API
should be used instead.


4.86 KVM_PPC_RTAS_DEFINE_TOKEN
------------------------------

:Capability: KVM_CAP_PPC_RTAS
:Architectures: ppc
:Type: vm ioctl
:Parameters: struct kvm_rtas_token_args
:Returns: 0 on success, -1 on error

Defines a token value for a RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel.  The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation.  If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel.
If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.

4.87 KVM_SET_GUEST_DEBUG
------------------------

:Capability: KVM_CAP_SET_GUEST_DEBUG
:Architectures: x86, s390, ppc, arm64
:Type: vcpu ioctl
:Parameters: struct kvm_guest_debug (in)
:Returns: 0 on success; -1 on error

::

  struct kvm_guest_debug {
        __u32 control;
        __u32 pad;
        struct kvm_guest_debug_arch arch;
  };

Set up the processor specific debug registers and configure the vcpu for
handling guest debug events. There are two parts to the structure. The
first is a control bitfield that indicates the type of debug events to
handle when running.
Common control bits are:

  - KVM_GUESTDBG_ENABLE:        guest debugging is enabled
  - KVM_GUESTDBG_SINGLESTEP:    the next run should single-step

The top 16 bits of the control field are architecture specific control
flags which can include the following:

  - KVM_GUESTDBG_USE_SW_BP:     using software breakpoints [x86, arm64]
  - KVM_GUESTDBG_USE_HW_BP:     using hardware breakpoints [x86, s390, arm64]
  - KVM_GUESTDBG_INJECT_DB:     inject DB type exception [x86]
  - KVM_GUESTDBG_INJECT_BP:     inject BP type exception [x86]
  - KVM_GUESTDBG_EXIT_PENDING:  trigger an immediate guest exit [s390]

For example KVM_GUESTDBG_USE_SW_BP indicates that software breakpoints
are enabled in memory, so we need to ensure breakpoint exceptions are
correctly trapped and the KVM run loop exits at the breakpoint rather than
running off into the normal guest vector. For KVM_GUESTDBG_USE_HW_BP
we need to ensure the guest vCPU's architecture specific registers are
updated to the correct (supplied) values.

The second part of the structure is architecture specific and
typically contains a set of debug registers.

For arm64 the number of debug registers is implementation defined and
can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and
KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number
indicating the number of supported registers.
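
The split of the control field into common and architecture specific
halves can be illustrated with two small helpers. This is a sketch under
stated assumptions: the EX_GUESTDBG_* values are placeholders chosen for
illustration (real code should use the KVM_GUESTDBG_* constants from the
KVM headers), the helper names are hypothetical, and the common flags are
assumed to occupy the low 16 bits.

```c
#include <stdint.h>

/* Placeholder flag values, assumed for illustration only. */
#define EX_GUESTDBG_ENABLE      0x00000001u
#define EX_GUESTDBG_SINGLESTEP  0x00000002u
#define EX_GUESTDBG_USE_SW_BP   0x00010000u

/* Architecture specific flags live in the top 16 bits of control. */
static uint32_t ex_guestdbg_arch_flags(uint32_t control)
{
    return control & 0xffff0000u;
}

/* Common flags are assumed to occupy the remaining low 16 bits. */
static uint32_t ex_guestdbg_common_flags(uint32_t control)
{
    return control & 0x0000ffffu;
}
```

A control value for software breakpoints would then be built as
EX_GUESTDBG_ENABLE | EX_GUESTDBG_USE_SW_BP and stored in the control member
of struct kvm_guest_debug before calling KVM_SET_GUEST_DEBUG.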

For ppc, the KVM_CAP_PPC_GUEST_DEBUG_SSTEP capability indicates whether
the single-step debug event (KVM_GUESTDBG_SINGLESTEP) is supported.

When a debug event occurs, the main run loop exits with the reason
KVM_EXIT_DEBUG, and the kvm_debug_exit_arch part of the kvm_run
structure contains architecture specific debug information.

4.88 KVM_GET_EMULATED_CPUID
---------------------------

:Capability: KVM_CAP_EXT_EMUL_CPUID
:Architectures: x86
:Type: system ioctl
:Parameters: struct kvm_cpuid2 (in/out)
:Returns: 0 on success, -1 on error

::

  struct kvm_cpuid2 {
        __u32 nent;
        __u32 flags;
        struct kvm_cpuid_entry2 entries[0];
  };

The member 'flags' is used for passing flags from userspace.

::

  #define KVM_CPUID_FLAG_SIGNIFCANT_INDEX       BIT(0)
  #define KVM_CPUID_FLAG_STATEFUL_FUNC          BIT(1) /* deprecated */
  #define KVM_CPUID_FLAG_STATE_READ_NEXT        BIT(2) /* deprecated */

  struct kvm_cpuid_entry2 {
        __u32 function;
        __u32 index;
        __u32 flags;
        __u32 eax;
        __u32 ebx;
        __u32 ecx;
        __u32 edx;
        __u32 padding[3];
  };

This ioctl returns x86 cpuid features which are emulated by
kvm. Userspace can use the information returned by this ioctl to query
which features are emulated by kvm instead of being present natively.

Userspace invokes KVM_GET_EMULATED_CPUID by passing a kvm_cpuid2
structure with the 'nent' field indicating the number of entries in
the variable-size array 'entries'. If the number of entries is too low
to describe the cpu capabilities, an error (E2BIG) is returned. If the
number is too high, the 'nent' field is adjusted and an error (ENOMEM)
is returned. If the number is just right, the 'nent' field is adjusted
to the number of valid entries in the 'entries' array, which is then
filled.

The entries returned are the set CPUID bits of the respective features
which kvm emulates, as returned by the CPUID instruction, with unknown
or unsupported feature bits cleared.
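
Once the 'entries' array has been filled, userspace usually needs to find
the entry for a particular function and index. The following is a minimal
sketch of such a lookup, assuming a local mirror of struct kvm_cpuid_entry2
(struct ex_cpuid_entry2), a local copy of the flag bit and a hypothetical
helper name. The index is only compared for entries that mark it as
significant.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors KVM_CPUID_FLAG_SIGNIFCANT_INDEX, i.e. BIT(0). */
#define EX_CPUID_FLAG_SIGNIFCANT_INDEX (1u << 0)

/* Local mirror of struct kvm_cpuid_entry2, for illustration only. */
struct ex_cpuid_entry2 {
    uint32_t function;
    uint32_t index;
    uint32_t flags;
    uint32_t eax, ebx, ecx, edx;
    uint32_t padding[3];
};

/* Return the first entry matching function (and index, where valid). */
static const struct ex_cpuid_entry2 *
ex_cpuid_find(const struct ex_cpuid_entry2 *entries, uint32_t nent,
              uint32_t function, uint32_t index)
{
    for (uint32_t i = 0; i < nent; i++) {
        const struct ex_cpuid_entry2 *e = &entries[i];

        if (e->function != function)
            continue;
        /* The index only matters when the entry says it is valid. */
        if ((e->flags & EX_CPUID_FLAG_SIGNIFCANT_INDEX) && e->index != index)
            continue;
        return e;
    }
    return NULL;
}
```

A NULL result means the requested leaf is not among the returned entries.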

Features like x2apic, for example, may not be present in the host cpu
but are exposed by kvm in KVM_GET_SUPPORTED_CPUID because they can be
emulated efficiently, and are thus not included here.

The fields in each entry are defined as follows:

  function:
        the eax value used to obtain the entry
  index:
        the ecx value used to obtain the entry (for entries that are
        affected by ecx)
  flags:
        an OR of zero or more of the following:

        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid

  eax, ebx, ecx, edx:
        the values returned by the cpuid instruction for
        this function/index combination

4.89 KVM_S390_MEM_OP
--------------------

:Capability: KVM_CAP_S390_MEM_OP
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_mem_op (in)
:Returns: = 0 on success,
          < 0 on generic error (e.g. -EFAULT or -ENOMEM),
          > 0 if an exception occurred while walking the page tables

Read or write data from/to the logical (virtual) memory of a VCPU.

Parameters are specified via the following structure::

  struct kvm_s390_mem_op {
        __u64 gaddr;            /* the guest address */
        __u64 flags;            /* flags */
        __u32 size;             /* amount of bytes */
        __u32 op;               /* type of operation */
        __u64 buf;              /* buffer in userspace */
        __u8 ar;                /* the access register number */
        __u8 reserved[31];      /* should be set to 0 */
  };

The type of operation is specified in the "op" field. It is either
KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
KVM_S390_MEMOP_LOGICAL_WRITE for writing to logical memory space. The
KVM_S390_MEMOP_F_CHECK_ONLY flag can be set in the "flags" field to check
whether the corresponding memory access would create an access exception
(without touching the data in the memory at the destination). In case an
access exception occurred while walking the MMU tables of the guest, the
ioctl returns a positive error number to indicate the type of exception.
This exception is also raised directly at the corresponding VCPU if the
flag KVM_S390_MEMOP_F_INJECT_EXCEPTION is set in the "flags" field.

The start address of the memory region has to be specified in the "gaddr"
field, and the length of the region in the "size" field (which must not
be 0). The maximum value for "size" can be obtained by checking the
KVM_CAP_S390_MEM_OP capability.
"buf" is the buffer supplied by the
userspace application where the read data should be written to for
KVM_S390_MEMOP_LOGICAL_READ, or where the data that should be written is
stored for a KVM_S390_MEMOP_LOGICAL_WRITE. When KVM_S390_MEMOP_F_CHECK_ONLY
is specified, "buf" is unused and can be NULL. "ar" designates the access
register number to be used; the valid range is 0..15.

The "reserved" field is meant for future extensions. It is not used by
KVM with the currently defined set of flags.

4.90 KVM_S390_GET_SKEYS
-----------------------

:Capability: KVM_CAP_S390_SKEYS
:Architectures: s390
:Type: vm ioctl
:Parameters: struct kvm_s390_skeys
:Returns: 0 on success, KVM_S390_GET_KEYS_NONE if guest is not using storage
          keys, negative value on error

This ioctl is used to get guest storage key values on the s390
architecture. The ioctl takes parameters via the kvm_s390_skeys struct::

  struct kvm_s390_skeys {
        __u64 start_gfn;
        __u64 count;
        __u64 skeydata_addr;
        __u32 flags;
        __u32 reserved[9];
  };

The start_gfn field is the number of the first guest frame whose storage keys
you want to get.

The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to get.
The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
will cause the ioctl to return -EINVAL.

The skeydata_addr field is the address to a buffer large enough to hold count
bytes. This buffer will be filled with storage key data by the ioctl.

4.91 KVM_S390_SET_SKEYS
-----------------------

:Capability: KVM_CAP_S390_SKEYS
:Architectures: s390
:Type: vm ioctl
:Parameters: struct kvm_s390_skeys
:Returns: 0 on success, negative value on error

This ioctl is used to set guest storage key values on the s390
architecture. The ioctl takes parameters via the kvm_s390_skeys struct.
See section on KVM_S390_GET_SKEYS for struct definition.

The start_gfn field is the number of the first guest frame whose storage keys
you want to set.

The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to set. The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX. Values outside this range
will cause the ioctl to return -EINVAL.

The skeydata_addr field is the address to a buffer containing count bytes of
storage keys. Each byte in the buffer will be set as the storage key for a
single frame starting at start_gfn for count frames.

Note: If any architecturally invalid key value is found in the given data then
the ioctl will return -EINVAL.

4.92 KVM_S390_IRQ
-----------------

:Capability: KVM_CAP_S390_INJECT_IRQ
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_irq (in)
:Returns: 0 on success, -1 on error

Errors:


  ======  =================================================================
  EINVAL  interrupt type is invalid
          type is KVM_S390_SIGP_STOP and flag parameter is invalid value,
          type is KVM_S390_INT_EXTERNAL_CALL and code is bigger
          than the maximum of VCPUs
  EBUSY   type is KVM_S390_SIGP_SET_PREFIX and vcpu is not stopped,
          type is KVM_S390_SIGP_STOP and a stop irq is already pending,
          type is KVM_S390_INT_EXTERNAL_CALL and an external call interrupt
          is already pending
  ======  =================================================================

Allows userspace to inject an interrupt into the guest.

Using struct kvm_s390_irq as a parameter allows
additional payload to be injected, which is not
possible via KVM_S390_INTERRUPT.

Interrupt parameters are passed via kvm_s390_irq::

  struct kvm_s390_irq {
        __u64 type;
        union {
                struct kvm_s390_io_info io;
                struct kvm_s390_ext_info ext;
                struct kvm_s390_pgm_info pgm;
                struct kvm_s390_emerg_info emerg;
                struct kvm_s390_extcall_info extcall;
                struct kvm_s390_prefix_info prefix;
                struct kvm_s390_stop_info stop;
                struct kvm_s390_mchk_info mchk;
                char reserved[64];
        } u;
  };

type can be one of the following:

- KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
- KVM_S390_PROGRAM_INT - program check; parameters in .pgm
- KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
- KVM_S390_RESTART - restart; no parameters
- KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
- KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
- KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
- KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
- KVM_S390_MCHK - machine check interrupt; parameters in .mchk

This is an asynchronous vcpu ioctl and can be invoked from any thread.

4.94 KVM_S390_GET_IRQ_STATE
---------------------------

:Capability: KVM_CAP_S390_IRQ_STATE
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_irq_state (out)
:Returns: >= number of bytes copied into buffer,
          -EINVAL if buffer size is 0,
          -ENOBUFS if buffer size is too small to fit all pending interrupts,
          -EFAULT if the buffer address was invalid

This ioctl allows userspace to retrieve the complete state of all currently
pending interrupts in a single buffer. Use cases include migration
and introspection. The parameter structure contains the address of a
userspace buffer and its length::

  struct kvm_s390_irq_state {
        __u64 buf;
        __u32 flags;        /* will stay unused for compatibility reasons */
        __u32 len;
        __u32 reserved[4];  /* will stay unused for compatibility reasons */
  };

Userspace passes in the above struct and for each pending interrupt a
struct kvm_s390_irq is copied to the provided buffer.

The structure contains a flags and a reserved field for future extensions. As
the kernel never checked for flags == 0 and QEMU never pre-zeroed flags and
reserved, these fields can not be used in the future without breaking
compatibility.

If -ENOBUFS is returned the buffer provided was too small and userspace
may retry with a bigger buffer.

4.95 KVM_S390_SET_IRQ_STATE
---------------------------

:Capability: KVM_CAP_S390_IRQ_STATE
:Architectures: s390
:Type: vcpu ioctl
:Parameters: struct kvm_s390_irq_state (in)
:Returns: 0 on success,
          -EFAULT if the buffer address was invalid,
          -EINVAL for an invalid buffer length (see below),
          -EBUSY if there were already interrupts pending,
          errors occurring when actually injecting the
          interrupt. See KVM_S390_IRQ.

This ioctl allows userspace to set the complete state of all cpu-local
interrupts currently pending for the vcpu. It is intended for restoring
interrupt state after a migration. The input parameter is a userspace buffer
containing a struct kvm_s390_irq_state::

  struct kvm_s390_irq_state {
        __u64 buf;
        __u32 flags;        /* will stay unused for compatibility reasons */
        __u32 len;
        __u32 reserved[4];  /* will stay unused for compatibility reasons */
  };

The restrictions for flags and reserved apply as well.
(see KVM_S390_GET_IRQ_STATE)

The userspace memory referenced by buf contains a struct kvm_s390_irq
for each interrupt to be injected into the guest.
If one of the interrupts could not be injected for some reason the
ioctl aborts.

len must be a multiple of sizeof(struct kvm_s390_irq). It must be > 0
and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
which is the maximum number of possibly pending cpu-local interrupts.

4.96 KVM_SMI
------------

:Capability: KVM_CAP_X86_SMM
:Architectures: x86
:Type: vcpu ioctl
:Parameters: none
:Returns: 0 on success, -1 on error

Queues an SMI on the thread's vcpu.

4.97 KVM_CAP_PPC_MULTITCE
-------------------------

:Capability: KVM_CAP_PPC_MULTITCE
:Architectures: ppc
:Type: vm

This capability means the kernel is capable of handling hypercalls
H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
space. This significantly accelerates DMA operations for PPC KVM guests.
User space should expect that its handlers for these hypercalls
are not going to be called if user space previously registered LIOBN
in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).

In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
user space might have to advertise it for the guest.
For example,
IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
present in the "ibm,hypertas-functions" device-tree property.

The hypercalls mentioned above may or may not be processed successfully
in the kernel based fast path. If they can not be handled by the kernel,
they will get passed on to user space. So user space still has to have
an implementation for these despite the in kernel acceleration.

This capability is always enabled.

4.98 KVM_CREATE_SPAPR_TCE_64
----------------------------

:Capability: KVM_CAP_SPAPR_TCE_64
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_create_spapr_tce_64 (in)
:Returns: file descriptor for manipulating the created TCE table

This is an extension for KVM_CAP_SPAPR_TCE which only supports 32bit
windows, described in 4.62 KVM_CREATE_SPAPR_TCE.

This capability uses an extended struct in the ioctl interface::

  /* for KVM_CAP_SPAPR_TCE_64 */
  struct kvm_create_spapr_tce_64 {
        __u64 liobn;
        __u32 page_shift;
        __u32 flags;
        __u64 offset;   /* in pages */
        __u64 size;     /* in pages */
  };

The aim of the extension is to support an additional bigger DMA window with
a variable page size.
KVM_CREATE_SPAPR_TCE_64 receives a 64bit window size, an IOMMU page shift and
a bus offset of the corresponding DMA window; @size and @offset are numbers
of IOMMU pages.

@flags are not used at the moment.

The rest of the functionality is identical to KVM_CREATE_SPAPR_TCE.

4.99 KVM_REINJECT_CONTROL
-------------------------

:Capability: KVM_CAP_REINJECT_CONTROL
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_reinject_control (in)
:Returns: 0 on success,
          -EFAULT if struct kvm_reinject_control cannot be read,
          -ENXIO if KVM_CREATE_PIT or KVM_CREATE_PIT2 didn't succeed earlier.

i8254 (PIT) has two modes, reinject and !reinject. The default is reinject,
where KVM queues elapsed i8254 ticks and monitors completion of interrupt from
vector(s) that i8254 injects. Reinject mode dequeues a tick and injects its
interrupt whenever there isn't a pending interrupt from i8254.
!reinject mode injects an interrupt as soon as a tick arrives.

::

  struct kvm_reinject_control {
        __u8 pit_reinject;
        __u8 reserved[31];
  };

pit_reinject = 0 (!reinject mode) is recommended, unless running an old
operating system that uses the PIT for timing (e.g. Linux 2.4.x).
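
As a minimal sketch of how user space might select the recommended
!reinject mode, assuming an x86 host whose <linux/kvm.h> provides
struct kvm_reinject_control, and assuming vm_fd is a VM file descriptor on
which KVM_CREATE_PIT or KVM_CREATE_PIT2 has already succeeded. The helper
name set_pit_reinject is hypothetical and not part of the API:

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: select reinject (non-zero) or !reinject (0) mode
 * for the in-kernel i8254.  Returns 0 on success or a negative errno.
 */
static int set_pit_reinject(int vm_fd, int reinject)
{
        struct kvm_reinject_control ctl;

        /* Zero the whole structure so that reserved[] is cleared too. */
        memset(&ctl, 0, sizeof(ctl));
        ctl.pit_reinject = reinject ? 1 : 0;

        if (ioctl(vm_fd, KVM_REINJECT_CONTROL, &ctl) < 0)
                return -errno;  /* e.g. -ENXIO if no PIT was created */
        return 0;
}
```

A caller would typically invoke set_pit_reinject(vm_fd, 0) once, right
after creating the PIT and before running any vcpu.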

4.100 KVM_PPC_CONFIGURE_V3_MMU
------------------------------

:Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3
:Architectures: ppc
:Type: vm ioctl
:Parameters: struct kvm_ppc_mmuv3_cfg (in)
:Returns: 0 on success,
          -EFAULT if struct kvm_ppc_mmuv3_cfg cannot be read,
          -EINVAL if the configuration is invalid

This ioctl controls whether the guest will use radix or HPT (hashed
page table) translation, and sets the pointer to the process table for
the guest.

::

  struct kvm_ppc_mmuv3_cfg {
        __u64 flags;
        __u64 process_table;
  };

There are two bits that can be set in flags: KVM_PPC_MMUV3_RADIX and
KVM_PPC_MMUV3_GTSE. KVM_PPC_MMUV3_RADIX, if set, configures the guest
to use radix tree translation, and if clear, to use HPT translation.
KVM_PPC_MMUV3_GTSE, if set and if KVM permits it, configures the guest
to be able to use the global TLB and SLB invalidation instructions;
if clear, the guest may not use these instructions.

The process_table field specifies the address and size of the guest
process table, which is in the guest's space. This field is formatted
as the second doubleword of the partition table entry, as defined in
the Power ISA V3.00, Book III section 5.7.6.1.

4.101 KVM_PPC_GET_RMMU_INFO
---------------------------

:Capability: KVM_CAP_PPC_RADIX_MMU
:Architectures: ppc
:Type: vm ioctl
:Parameters: struct kvm_ppc_rmmu_info (out)
:Returns: 0 on success,
          -EFAULT if struct kvm_ppc_rmmu_info cannot be written,
          -EINVAL if no useful information can be returned

This ioctl returns a structure containing two things: (a) a list
containing supported radix tree geometries, and (b) a list that maps
page sizes to put in the "AP" (actual page size) field for the tlbie
(TLB invalidate entry) instruction.

::

  struct kvm_ppc_rmmu_info {
        struct kvm_ppc_radix_geom {
                __u8 page_shift;
                __u8 level_bits[4];
                __u8 pad[3];
        } geometries[8];
        __u32 ap_encodings[8];
  };

The geometries[] field gives up to 8 supported geometries for the
radix page table, in terms of the log base 2 of the smallest page
size, and the number of bits indexed at each level of the tree, from
the PTE level up to the PGD level in that order. Any unused entries
will have 0 in the page_shift field.
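
The geometries[] layout can be walked as in the following sketch. It uses
local mirror types (radix_geom and rmmu_info, both hypothetical names) with
the same fields as the structure shown above so that it builds without the
powerpc kernel headers; real code would use the kernel's own definitions.
The helper names are likewise illustrative only. The second helper assumes
that the address bits covered by one geometry are the page offset bits plus
the bits indexed at every level, which follows from the description above
but is not stated by it:

```c
#include <stdint.h>

/* Local mirrors of the documented layout (illustrative, not kernel types). */
struct radix_geom {
        uint8_t page_shift;     /* log2 of the smallest page size */
        uint8_t level_bits[4];  /* index bits per level, PTE level first */
        uint8_t pad[3];
};

struct rmmu_info {
        struct radix_geom geometries[8];
        uint32_t ap_encodings[8];
};

/* Count the geometries that are in use; unused entries have page_shift 0. */
static int rmmu_count_geometries(const struct rmmu_info *info)
{
        int n = 0;

        for (int i = 0; i < 8; i++)
                if (info->geometries[i].page_shift != 0)
                        n++;
        return n;
}

/* Page offset bits plus the index bits of all four tree levels. */
static unsigned int radix_geom_addr_bits(const struct radix_geom *g)
{
        unsigned int bits = g->page_shift;

        for (int i = 0; i < 4; i++)
                bits += g->level_bits[i];
        return bits;
}
```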

The ap_encodings gives the supported page sizes and their AP field
encodings, encoded with the AP value in the top 3 bits and the log
base 2 of the page size in the bottom 6 bits.

4.102 KVM_PPC_RESIZE_HPT_PREPARE
--------------------------------

:Capability: KVM_CAP_SPAPR_RESIZE_HPT
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_ppc_resize_hpt (in)
:Returns: 0 on successful completion,
          >0 if a new HPT is being prepared, the value is an estimated
          number of milliseconds until preparation is complete,
          -EFAULT if struct kvm_ppc_resize_hpt cannot be read,
          -EINVAL if the supplied shift or flags are invalid,
          -ENOMEM if unable to allocate the new HPT,
          -ENOSPC if there was a hash collision when moving existing
          HPT entries to the new HPT,
          -EIO on other error conditions

Used to implement the PAPR extension for runtime resizing of a guest's
Hashed Page Table (HPT). Specifically this starts, stops or monitors
the preparation of a new potential HPT for the guest, essentially
implementing the H_RESIZE_HPT_PREPARE hypercall.

If called with shift > 0 when there is no pending HPT for the guest,
this begins preparation of a new pending HPT of size 2^(shift) bytes.
It then returns a positive integer with the estimated number of
milliseconds until preparation is complete.

If called when there is a pending HPT whose size does not match that
requested in the parameters, discards the existing pending HPT and
creates a new one as above.

If called when there is a pending HPT of the size requested, will:

  * If preparation of the pending HPT is already complete, return 0
  * If preparation of the pending HPT has failed, return an error
    code, then discard the pending HPT.
  * If preparation of the pending HPT is still in progress, return an
    estimated number of milliseconds until preparation is complete.

If called with shift == 0, discards any currently pending HPT and
returns 0 (i.e. cancels any in-progress preparation).

flags is reserved for future expansion, currently setting any bits in
flags will result in an -EINVAL.

Normally this will be called repeatedly with the same parameters until
it returns <= 0. The first call will initiate preparation, subsequent
ones will monitor preparation until it completes or fails.

::

  struct kvm_ppc_resize_hpt {
        __u64 flags;
        __u32 shift;
        __u32 pad;
  };

4.103 KVM_PPC_RESIZE_HPT_COMMIT
-------------------------------

:Capability: KVM_CAP_SPAPR_RESIZE_HPT
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_ppc_resize_hpt (in)
:Returns: 0 on successful completion,
          -EFAULT if struct kvm_ppc_resize_hpt cannot be read,
          -EINVAL if the supplied shift or flags are invalid,
          -ENXIO if there is no pending HPT, or the pending HPT doesn't
          have the requested size,
          -EBUSY if the pending HPT is not fully prepared,
          -ENOSPC if there was a hash collision when moving existing
          HPT entries to the new HPT,
          -EIO on other error conditions

Used to implement the PAPR extension for runtime resizing of a guest's
Hashed Page Table (HPT). Specifically this requests that the guest be
transferred to working with the new HPT, essentially implementing the
H_RESIZE_HPT_COMMIT hypercall.

This should only be called after KVM_PPC_RESIZE_HPT_PREPARE has
returned 0 with the same parameters. In other cases
KVM_PPC_RESIZE_HPT_COMMIT will return an error (usually -ENXIO or
-EBUSY, though others may be possible if the preparation was started,
but failed).
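
The prepare-then-commit sequence can be sketched as below. This is an
illustration under stated assumptions rather than a reference
implementation: the helper name resize_hpt_blocking is hypothetical, vm_fd
is assumed to be a VM file descriptor, shift is assumed to be > 0 (a shift
of 0 would cancel preparation instead), and the caller is assumed to have
already placed the guest in the quiescent state that the commit step
requires. Sleeping for the returned estimate between polls is a design
choice of this sketch, not something the API mandates:

```c
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: prepare a new HPT of 2^shift bytes, wait until the
 * preparation has finished, then commit it.  Returns 0 on success or a
 * negative errno.
 */
static int resize_hpt_blocking(int vm_fd, uint32_t shift)
{
        struct kvm_ppc_resize_hpt rhpt;
        int ret;

        memset(&rhpt, 0, sizeof(rhpt));  /* flags must stay 0 */
        rhpt.shift = shift;

        /* Call PREPARE with the same parameters until it returns <= 0. */
        for (;;) {
                ret = ioctl(vm_fd, KVM_PPC_RESIZE_HPT_PREPARE, &rhpt);
                if (ret < 0)
                        return -errno;
                if (ret == 0)
                        break;
                /* ret is the estimated number of milliseconds remaining. */
                poll(NULL, 0, ret);
        }

        /* PREPARE returned 0: commit with the same parameters. */
        if (ioctl(vm_fd, KVM_PPC_RESIZE_HPT_COMMIT, &rhpt) < 0)
                return -errno;
        return 0;
}
```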

This will have undefined effects on the guest if it has not already
placed itself in a quiescent state where no vcpu will make MMU enabled
memory accesses.

On successful completion, the pending HPT will become the guest's active
HPT and the previous HPT will be discarded.

On failure, the guest will still be operating on its previous HPT.

::

  struct kvm_ppc_resize_hpt {
        __u64 flags;
        __u32 shift;
        __u32 pad;
  };

4.104 KVM_X86_GET_MCE_CAP_SUPPORTED
-----------------------------------

:Capability: KVM_CAP_MCE
:Architectures: x86
:Type: system ioctl
:Parameters: u64 mce_cap (out)
:Returns: 0 on success, -1 on error

Returns supported MCE capabilities. The u64 mce_cap parameter
has the same format as the MSR_IA32_MCG_CAP register. Supported
capabilities will have the corresponding bits set.
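
A minimal sketch of querying and inspecting the capability word follows.
The helper names are hypothetical, and kvm_fd is assumed to be a file
descriptor obtained by opening /dev/kvm. The bank-count helper relies on
the architectural layout of IA32_MCG_CAP, in which bits 7:0 hold the number
of error-reporting banks:

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: fetch the supported MCE capabilities through the
 * system ioctl.  Returns 0 on success or a negative errno.
 */
static int get_mce_cap_supported(int kvm_fd, uint64_t *mce_cap)
{
        __u64 cap = 0;

        if (ioctl(kvm_fd, KVM_X86_GET_MCE_CAP_SUPPORTED, &cap) < 0)
                return -errno;
        *mce_cap = cap;
        return 0;
}

/* Bits 7:0 of an MCG_CAP-format value: number of error-reporting banks. */
static unsigned int mcg_cap_bank_count(uint64_t mcg_cap)
{
        return (unsigned int)(mcg_cap & 0xff);
}

/* Test a single capability bit (bit index 0..63) in an MCG_CAP value. */
static int mcg_cap_has_bit(uint64_t mcg_cap, unsigned int bit)
{
        return (int)((mcg_cap >> bit) & 1);
}
```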

4.105 KVM_X86_SETUP_MCE
-----------------------

:Capability: KVM_CAP_MCE
:Architectures: x86
:Type: vcpu ioctl
:Parameters: u64 mcg_cap (in)
:Returns: 0 on success,
          -EFAULT if u64 mcg_cap cannot be read,
          -EINVAL if the requested number of banks is invalid,
          -EINVAL if requested MCE capability is not supported.

Initializes MCE support for use. The u64 mcg_cap parameter
has the same format as the MSR_IA32_MCG_CAP register and
specifies which capabilities should be enabled. The maximum
supported number of error-reporting banks can be retrieved when
checking for KVM_CAP_MCE. The supported capabilities can be
retrieved with KVM_X86_GET_MCE_CAP_SUPPORTED.

4.106 KVM_X86_SET_MCE
---------------------

:Capability: KVM_CAP_MCE
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_x86_mce (in)
:Returns: 0 on success,
          -EFAULT if struct kvm_x86_mce cannot be read,
          -EINVAL if the bank number is invalid,
          -EINVAL if VAL bit is not set in status field.

Inject a machine check error (MCE) into the guest.
The input
parameter is::

  struct kvm_x86_mce {
        __u64 status;
        __u64 addr;
        __u64 misc;
        __u64 mcg_status;
        __u8 bank;
        __u8 pad1[7];
        __u64 pad2[3];
  };

If the MCE being reported is an uncorrected error, KVM will
inject it as an MCE exception into the guest. If the guest
MCG_STATUS register reports that an MCE is in progress, KVM
causes a KVM_EXIT_SHUTDOWN vmexit.

Otherwise, if the MCE is a corrected error, KVM will just
store it in the corresponding bank (provided this bank is
not holding a previously reported uncorrected error).

4.107 KVM_S390_GET_CMMA_BITS
----------------------------

:Capability: KVM_CAP_S390_CMMA_MIGRATION
:Architectures: s390
:Type: vm ioctl
:Parameters: struct kvm_s390_cmma_log (in, out)
:Returns: 0 on success, a negative value on error

This ioctl is used to get the values of the CMMA bits on the s390
architecture. It is meant to be used in two scenarios:

- During live migration to save the CMMA values. Live migration needs
  to be enabled via the KVM_REQ_START_MIGRATION VM property.
- To non-destructively peek at the CMMA values, with the flag
  KVM_S390_CMMA_PEEK set.

The ioctl takes parameters via the kvm_s390_cmma_log struct. The desired
values are written to a buffer whose location is indicated via the "values"
member in the kvm_s390_cmma_log struct. The values in the input struct are
also updated as needed.

Each CMMA value takes up one byte.

::

  struct kvm_s390_cmma_log {
        __u64 start_gfn;
        __u32 count;
        __u32 flags;
        union {
                __u64 remaining;
                __u64 mask;
        };
        __u64 values;
  };

start_gfn is the number of the first guest frame whose CMMA values are
to be retrieved,

count is the length of the buffer in bytes,

values points to the buffer where the result will be written to.

If count is greater than KVM_S390_SKEYS_MAX, then it is considered to be
KVM_S390_SKEYS_MAX. KVM_S390_SKEYS_MAX is re-used for consistency with
other ioctls.

The result is written in the buffer pointed to by the field values, and
the values of the input parameter are updated as follows.

Depending on the flags, different actions are performed. The only
supported flag so far is KVM_S390_CMMA_PEEK.

The default behaviour if KVM_S390_CMMA_PEEK is not set is:
start_gfn will indicate the first page frame whose CMMA bits were dirty.
It is not necessarily the same as the one passed as input, as clean pages
are skipped.

count will indicate the number of bytes actually written in the buffer.
It can (and very often will) be smaller than the input value, since the
buffer is only filled until 16 bytes of clean values are found (which
are then not copied in the buffer). Since a CMMA migration block needs
the base address and the length, for a total of 16 bytes, we will send
back some clean data if there is some dirty data afterwards, as long as
the size of the clean data does not exceed the size of the header. This
minimizes the amount of data to be saved or transferred over
the network at the expense of more roundtrips to userspace. The next
invocation of the ioctl will skip over all the clean values, saving
potentially more than just the 16 bytes we found.

If KVM_S390_CMMA_PEEK is set:
the existing storage attributes are read even when not in migration
mode, and no other action is performed;

the output start_gfn will be equal to the input start_gfn,

the output count will be equal to the input count, except if the end of
memory has been reached.

In both cases:
the field "remaining" will indicate the total number of dirty CMMA values
still remaining, or 0 if KVM_S390_CMMA_PEEK is set and migration mode is
not enabled.

mask is unused.

values points to the userspace buffer where the result will be stored.

This ioctl can fail with -ENOMEM if not enough memory can be allocated to
complete the task, with -ENXIO if CMMA is not enabled, with -EINVAL if
KVM_S390_CMMA_PEEK is not set and migration mode is not enabled, with
-EFAULT if the userspace address is invalid or if no page table is
present for the addresses (e.g. when using hugepages).

4.108 KVM_S390_SET_CMMA_BITS
----------------------------

:Capability: KVM_CAP_S390_CMMA_MIGRATION
:Architectures: s390
:Type: vm ioctl
:Parameters: struct kvm_s390_cmma_log (in)
:Returns: 0 on success, a negative value on error

This ioctl is used to set the values of the CMMA bits on the s390
architecture. It is meant to be used during live migration to restore
the CMMA values, but there are no restrictions on its use.
The ioctl takes parameters via the kvm_s390_cmma_log struct.
Each CMMA value takes up one byte.

::

  struct kvm_s390_cmma_log {
        __u64 start_gfn;
        __u32 count;
        __u32 flags;
        union {
                __u64 remaining;
                __u64 mask;
        };
        __u64 values;
  };

start_gfn indicates the starting guest frame number,

count indicates how many values are to be considered in the buffer,

flags is not used and must be 0.

mask indicates which PGSTE bits are to be considered.

remaining is not used.

values points to the userspace buffer that holds the values to be set.

This ioctl can fail with -ENOMEM if not enough memory can be allocated to
complete the task, with -ENXIO if CMMA is not enabled, with -EINVAL if
the count field is too large (e.g. more than KVM_S390_CMMA_SIZE_MAX) or
if the flags field was not 0, with -EFAULT if the userspace address is
invalid, if invalid pages are written to (e.g. after the end of memory)
or if no page table is present for the addresses (e.g. when using
hugepages).
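
As a rough sketch of how the request structure might be filled in, assuming
<linux/kvm.h> provides struct kvm_s390_cmma_log and KVM_S390_SET_CMMA_BITS.
Both helper names are hypothetical, and the mask value is left entirely to
the caller because the valid PGSTE bits are not listed here:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: describe `count` one-byte CMMA values, stored in
 * `values`, that apply to guest frames starting at `start_gfn`.
 */
static void cmma_set_request(struct kvm_s390_cmma_log *log,
                             uint64_t start_gfn, const uint8_t *values,
                             uint32_t count, uint64_t mask)
{
        memset(log, 0, sizeof(*log));   /* flags must be 0 */
        log->start_gfn = start_gfn;
        log->count = count;
        log->mask = mask;               /* caller-chosen PGSTE bits */
        log->values = (uint64_t)(uintptr_t)values;
}

/* Hypothetical helper: issue the ioctl; 0 on success, negative errno. */
static int set_cmma_bits(int vm_fd, struct kvm_s390_cmma_log *log)
{
        if (ioctl(vm_fd, KVM_S390_SET_CMMA_BITS, log) < 0)
                return -errno;
        return 0;
}
```

Storing the buffer address through uintptr_t keeps the conversion to the
64-bit values field well defined on both 32-bit and 64-bit user space.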

4.109 KVM_PPC_GET_CPU_CHAR
--------------------------

:Capability: KVM_CAP_PPC_GET_CPU_CHAR
:Architectures: powerpc
:Type: vm ioctl
:Parameters: struct kvm_ppc_cpu_char (out)
:Returns: 0 on successful completion,
          -EFAULT if struct kvm_ppc_cpu_char cannot be written

This ioctl gives userspace information about certain characteristics
of the CPU relating to speculative execution of instructions and
possible information leakage resulting from speculative execution (see
CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754). The information is
returned in struct kvm_ppc_cpu_char, which looks like this::

  struct kvm_ppc_cpu_char {
        __u64 character;        /* characteristics of the CPU */
        __u64 behaviour;        /* recommended software behaviour */
        __u64 character_mask;   /* valid bits in character */
        __u64 behaviour_mask;   /* valid bits in behaviour */
  };

For extensibility, the character_mask and behaviour_mask fields
indicate which bits of character and behaviour have been filled in by
the kernel. If the set of defined bits is extended in future then
userspace will be able to tell whether it is running on a kernel that
knows about the new bits.

The character field describes attributes of the CPU which can help
with preventing inadvertent information disclosure - specifically,
whether there is an instruction to flash-invalidate the L1 data cache
(ori 30,30,0 or mtspr SPRN_TRIG2,rN), whether the L1 data cache is set
to a mode where entries can only be used by the thread that created
them, whether the bcctr[l] instruction prevents speculation, and
whether a speculation barrier instruction (ori 31,31,0) is provided.

The behaviour field describes actions that software should take to
prevent inadvertent information disclosure, and thus describes which
vulnerabilities the hardware is subject to; specifically whether the
L1 data cache should be flushed when returning to user mode from the
kernel, and whether a speculation barrier should be placed between an
array bounds check and the array access.

These fields use the same bit definitions as the new
H_GET_CPU_CHARACTERISTICS hypercall.
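
The mask fields are meant to be consulted before trusting a bit in
character or behaviour. A small sketch of that check follows; the helper
names are hypothetical, and the bit value passed in would come from the
H_GET_CPU_CHARACTERISTICS definitions, which are not reproduced here:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the kernel filled in `bit`, i.e. the bit is valid in the mask. */
static bool cpu_char_bit_known(uint64_t mask, uint64_t bit)
{
        return (mask & bit) != 0;
}

/*
 * True only if `bit` is both reported as valid in `mask` and set in
 * `value`.  A bit the kernel does not know about is treated as not set,
 * so callers may want to check cpu_char_bit_known() separately and pick
 * a conservative default for unknown bits.
 */
static bool cpu_char_bit_set(uint64_t value, uint64_t mask, uint64_t bit)
{
        return cpu_char_bit_known(mask, bit) && (value & bit) != 0;
}
```

For example, a caller could pass the character and character_mask fields
of struct kvm_ppc_cpu_char as value and mask respectively.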

4.110 KVM_MEMORY_ENCRYPT_OP
---------------------------

:Capability: basic
:Architectures: x86
:Type: vm
:Parameters: an opaque platform specific structure (in/out)
:Returns: 0 on success; -1 on error

If the platform supports creating encrypted VMs then this ioctl can be used
for issuing platform-specific memory encryption commands to manage those
encrypted VMs.

Currently, this ioctl is used for issuing Secure Encrypted Virtualization
(SEV) commands on AMD Processors. The SEV commands are defined in
Documentation/virt/kvm/amd-memory-encryption.rst.

4.111 KVM_MEMORY_ENCRYPT_REG_REGION
-----------------------------------

:Capability: basic
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_enc_region (in)
:Returns: 0 on success; -1 on error

This ioctl can be used to register a guest memory region which may
contain encrypted data (e.g. guest RAM, SMRAM etc).

It is used in the SEV-enabled guest. When encryption is enabled, a guest
memory region may contain encrypted data. The SEV memory encryption
engine uses a tweak such that two identical plaintext pages, each at
different locations will have differing ciphertexts.
So swapping or
moving ciphertext of those pages will not result in plaintext being
swapped. So relocating (or migrating) physical backing pages for the SEV
guest will require some additional steps.

Note: The current SEV key management spec does not provide commands to
swap or migrate (move) ciphertext pages. Hence, for now we pin the guest
memory region registered with the ioctl.

4.112 KVM_MEMORY_ENCRYPT_UNREG_REGION
-------------------------------------

:Capability: basic
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_enc_region (in)
:Returns: 0 on success; -1 on error

This ioctl can be used to unregister the guest memory region registered
with KVM_MEMORY_ENCRYPT_REG_REGION ioctl above.

4.113 KVM_HYPERV_EVENTFD
------------------------

:Capability: KVM_CAP_HYPERV_EVENTFD
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_hyperv_eventfd (in)

This ioctl (un)registers an eventfd to receive notifications from the guest on
the specified Hyper-V connection id through the SIGNAL_EVENT hypercall, without
causing a user exit. SIGNAL_EVENT hypercall with non-zero event flag number
(bits 24-31) still triggers a KVM_EXIT_HYPERV_HCALL user exit.

::

  struct kvm_hyperv_eventfd {
        __u32 conn_id;
        __s32 fd;
        __u32 flags;
        __u32 padding[3];
  };

The conn_id field should fit within 24 bits::

  #define KVM_HYPERV_CONN_ID_MASK         0x00ffffff

The acceptable values for the flags field are::

  #define KVM_HYPERV_EVENTFD_DEASSIGN     (1 << 0)

:Returns: 0 on success,
          -EINVAL if conn_id or flags is outside the allowed range,
          -ENOENT on deassign if the conn_id isn't registered,
          -EEXIST on assign if the conn_id is already registered

4.114 KVM_GET_NESTED_STATE
--------------------------

:Capability: KVM_CAP_NESTED_STATE
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_nested_state (in/out)
:Returns: 0 on success, -1 on error

Errors:

  =====  =============================================================
  E2BIG  the total state size exceeds the value of 'size' specified by
         the user; the size required will be written into size.
  =====  =============================================================

::

  struct kvm_nested_state {
        __u16 flags;
        __u16 format;
        __u32 size;

        union {
                struct kvm_vmx_nested_state_hdr vmx;
                struct kvm_svm_nested_state_hdr svm;

                /* Pad the header to 128 bytes. */
                __u8 pad[120];
        } hdr;

        union {
                struct kvm_vmx_nested_state_data vmx[0];
                struct kvm_svm_nested_state_data svm[0];
        } data;
  };

  #define KVM_STATE_NESTED_GUEST_MODE             0x00000001
  #define KVM_STATE_NESTED_RUN_PENDING            0x00000002
  #define KVM_STATE_NESTED_EVMCS                  0x00000004

  #define KVM_STATE_NESTED_FORMAT_VMX             0
  #define KVM_STATE_NESTED_FORMAT_SVM             1

  #define KVM_STATE_NESTED_VMX_VMCS_SIZE          0x1000

  #define KVM_STATE_NESTED_VMX_SMM_GUEST_MODE     0x00000001
  #define KVM_STATE_NESTED_VMX_SMM_VMXON          0x00000002

  #define KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE 0x00000001

  struct kvm_vmx_nested_state_hdr {
        __u64 vmxon_pa;
        __u64 vmcs12_pa;

        struct {
                __u16 flags;
        } smm;

        __u32 flags;
        __u64 preemption_timer_deadline;
  };

  struct kvm_vmx_nested_state_data {
        __u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
        __u8 shadow_vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
  };

This ioctl copies the vcpu's nested virtualization state from the kernel to
userspace.

The maximum size of the state can be retrieved by passing KVM_CAP_NESTED_STATE
to the KVM_CHECK_EXTENSION ioctl().

4.115 KVM_SET_NESTED_STATE
--------------------------

:Capability: KVM_CAP_NESTED_STATE
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_nested_state (in)
:Returns: 0 on success, -1 on error

This copies the vcpu's kvm_nested_state struct from userspace to the kernel.
For the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.

4.116 KVM_(UN)REGISTER_COALESCED_MMIO
-------------------------------------

:Capability: KVM_CAP_COALESCED_MMIO (for coalesced mmio)
             KVM_CAP_COALESCED_PIO (for coalesced pio)
:Architectures: all
:Type: vm ioctl
:Parameters: struct kvm_coalesced_mmio_zone
:Returns: 0 on success, < 0 on error

Coalesced I/O is a performance optimization that defers hardware
register write emulation so that userspace exits are avoided. It is
typically used to reduce the overhead of emulating frequently accessed
hardware registers.

When a hardware register is configured for coalesced I/O, write accesses
do not exit to userspace and their values are recorded in a ring buffer
that is shared between kernel and userspace.

Coalesced I/O is used if one or more write accesses to a hardware
register can be deferred until a read or a write to another hardware
register on the same device. This last access will cause a vmexit and
userspace will process accesses from the ring buffer before emulating
it. That will avoid exiting to userspace on repeated writes.

Coalesced pio is based on coalesced mmio. There is little difference
between coalesced mmio and pio except that coalesced pio records accesses
to I/O ports.

4.117 KVM_CLEAR_DIRTY_LOG (vm ioctl)
------------------------------------

:Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
:Architectures: x86, arm, arm64, mips
:Type: vm ioctl
:Parameters: struct kvm_clear_dirty_log (in)
:Returns: 0 on success, -1 on error

::

  /* for KVM_CLEAR_DIRTY_LOG */
  struct kvm_clear_dirty_log {
        __u32 slot;
        __u32 num_pages;
        __u64 first_page;
        union {
                void __user *dirty_bitmap; /* one bit per page */
                __u64 padding;
        };
  };

The ioctl clears the dirty status of pages in a memory slot, according to
the bitmap that is passed in struct kvm_clear_dirty_log's dirty_bitmap
field. Bit 0 of the bitmap corresponds to page "first_page" in the
memory slot, and num_pages is the size in bits of the input bitmap.
first_page must be a multiple of 64; num_pages must also be a multiple of
64 unless first_page + num_pages is the size of the memory slot. For each
bit that is set in the input bitmap, the corresponding page is marked "clean"
in KVM's dirty bitmap, and dirty tracking is re-enabled for that page
(for example via write-protection, or by clearing the dirty bit in
a page table entry).
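
The alignment rules above can be checked in userspace before the ioctl is
issued. The following is a minimal sketch, not part of the KVM API: the
helper name and the slot_pages parameter (the size of the memory slot in
pages) are hypothetical, and the check that the range lies inside the slot
is an assumption rather than something this document states::

  #include <stdbool.h>
  #include <stdint.h>

  /*
   * Hypothetical helper: returns true if first_page/num_pages satisfy
   * the rules described for KVM_CLEAR_DIRTY_LOG.
   */
  static bool clear_dirty_log_range_ok(uint64_t first_page,
                                       uint32_t num_pages,
                                       uint64_t slot_pages)
  {
        /* first_page must be a multiple of 64 */
        if (first_page % 64 != 0)
                return false;
        /* assumption: the range must not extend past the slot */
        if (first_page > slot_pages || num_pages > slot_pages - first_page)
                return false;
        if (num_pages % 64 == 0)
                return true;
        /* a partial final word is only allowed at the end of the slot */
        return first_page + num_pages == slot_pages;
  }

For a slot of 1000 pages, (first_page = 960, num_pages = 40) passes because
the range ends exactly at the end of the slot, while (first_page = 0,
num_pages = 100) is rejected.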

If KVM_CAP_MULTI_ADDRESS_SPACE is available, bits 16-31 of the slot field
specify the address space in which the dirty status is to be cleared.
The value in these bits must be less than the value that
KVM_CHECK_EXTENSION returns for the KVM_CAP_MULTI_ADDRESS_SPACE capability.

This ioctl is mostly useful when KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
is enabled; for more information, see the description of the capability.
However, it can always be used as long as KVM_CHECK_EXTENSION confirms
that KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is present.

4.118 KVM_GET_SUPPORTED_HV_CPUID
--------------------------------

:Capability: KVM_CAP_HYPERV_CPUID
:Architectures: x86
:Type: vcpu ioctl
:Parameters: struct kvm_cpuid2 (in/out)
:Returns: 0 on success, -1 on error

::

  struct kvm_cpuid2 {
        __u32 nent;
        __u32 padding;
        struct kvm_cpuid_entry2 entries[0];
  };

  struct kvm_cpuid_entry2 {
        __u32 function;
        __u32 index;
        __u32 flags;
        __u32 eax;
        __u32 ebx;
        __u32 ecx;
        __u32 edx;
        __u32 padding[3];
  };

This ioctl returns the x86 cpuid feature leaves related to Hyper-V emulation
in KVM. Userspace can use the information returned by this ioctl to construct
cpuid information presented to guests consuming Hyper-V enlightenments (e.g.
Windows or Hyper-V guests).

CPUID feature leaves returned by this ioctl are defined by the Hyper-V Top
Level Functional Specification (TLFS). These leaves can't be obtained with
the KVM_GET_SUPPORTED_CPUID ioctl because some of them intersect with KVM
feature leaves (0x40000000, 0x40000001).

Currently, the following CPUID leaves are returned:

 - HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
 - HYPERV_CPUID_INTERFACE
 - HYPERV_CPUID_VERSION
 - HYPERV_CPUID_FEATURES
 - HYPERV_CPUID_ENLIGHTMENT_INFO
 - HYPERV_CPUID_IMPLEMENT_LIMITS
 - HYPERV_CPUID_NESTED_FEATURES
 - HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS
 - HYPERV_CPUID_SYNDBG_INTERFACE
 - HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES

The HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS
was enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).

Userspace invokes KVM_GET_SUPPORTED_HV_CPUID by passing a kvm_cpuid2 structure
with the 'nent' field indicating the number of entries in the variable-size
array 'entries'. If the number of entries is too low to describe all Hyper-V
feature leaves, an error (E2BIG) is returned. If the number is greater than
or equal to the number of Hyper-V feature leaves, the 'nent' field is adjusted
to the number of valid entries in the 'entries' array, which is then filled.

The 'index' and 'flags' fields in 'struct kvm_cpuid_entry2' are currently
reserved; userspace should not expect to get any particular value there.

4.119 KVM_ARM_VCPU_FINALIZE
---------------------------

:Architectures: arm, arm64
:Type: vcpu ioctl
:Parameters: int feature (in)
:Returns: 0 on success, -1 on error

Errors:

  ======  ==============================================================
  EPERM   feature not enabled, needs configuration, or already finalized
  EINVAL  feature unknown or not present
  ======  ==============================================================

Recognised values for feature:

  =====  ===========================================
  arm64  KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
  =====  ===========================================

Finalizes the configuration of the specified vcpu feature.

The vcpu must already have been initialised, enabling the affected feature, by
means of a successful KVM_ARM_VCPU_INIT call with the appropriate flag set in
features[].

For affected vcpu features, this is a mandatory step that must be performed
before the vcpu is fully usable.

Between KVM_ARM_VCPU_INIT and KVM_ARM_VCPU_FINALIZE, the feature may be
configured by use of ioctls such as KVM_SET_ONE_REG. The exact configuration
that should be performed and how to do it are feature-dependent.

Other calls that depend on a particular feature being finalized, such as
KVM_RUN, KVM_GET_REG_LIST, KVM_GET_ONE_REG and KVM_SET_ONE_REG, will fail with
-EPERM unless the feature has already been finalized by means of a
KVM_ARM_VCPU_FINALIZE call.

See KVM_ARM_VCPU_INIT for details of vcpu features that require finalization
using this ioctl.

4.120 KVM_SET_PMU_EVENT_FILTER
------------------------------

:Capability: KVM_CAP_PMU_EVENT_FILTER
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_pmu_event_filter (in)
:Returns: 0 on success, -1 on error

::

  struct kvm_pmu_event_filter {
        __u32 action;
        __u32 nevents;
        __u32 fixed_counter_bitmap;
        __u32 flags;
        __u32 pad[4];
        __u64 events[0];
  };

This ioctl restricts the set of PMU events that the guest can program.
The argument holds a list of events which will be allowed or denied.
The eventsel+umask of each event the guest attempts to program is compared
against the events field to determine whether the guest should have access.
The events field only controls general purpose counters; fixed purpose
counters are controlled by the fixed_counter_bitmap.

No flags are defined yet; the field must be zero.

Valid values for 'action'::

  #define KVM_PMU_EVENT_ALLOW 0
  #define KVM_PMU_EVENT_DENY 1

4.121 KVM_PPC_SVM_OFF
---------------------

:Capability: basic
:Architectures: powerpc
:Type: vm ioctl
:Parameters: none
:Returns: 0 on successful completion

Errors:

  ======  ================================================================
  EINVAL  if ultravisor failed to terminate the secure guest
  ENOMEM  if hypervisor failed to allocate new radix page tables for guest
  ======  ================================================================

This ioctl is used to turn off the secure mode of the guest or transition
the guest from secure mode to normal mode. This is invoked when the guest
is reset. This has no effect if called for a normal guest.

This ioctl issues an ultravisor call to terminate the secure guest,
unpins the VPA pages and releases all the device pages that the
hypervisor uses to track the secure pages.

4.122 KVM_S390_NORMAL_RESET
---------------------------

:Capability: KVM_CAP_S390_VCPU_RESETS
:Architectures: s390
:Type: vcpu ioctl
:Parameters: none
:Returns: 0

This ioctl resets VCPU registers and control structures according to
the cpu reset definition in the POP (Principles Of Operation).

4.123 KVM_S390_INITIAL_RESET
----------------------------

:Capability: none
:Architectures: s390
:Type: vcpu ioctl
:Parameters: none
:Returns: 0

This ioctl resets VCPU registers and control structures according to
the initial cpu reset definition in the POP. However, the cpu is not
put into ESA mode. This reset is a superset of the normal reset.

4.124 KVM_S390_CLEAR_RESET
--------------------------

:Capability: KVM_CAP_S390_VCPU_RESETS
:Architectures: s390
:Type: vcpu ioctl
:Parameters: none
:Returns: 0

This ioctl resets VCPU registers and control structures according to
the clear cpu reset definition in the POP. However, the cpu is not put
into ESA mode. This reset is a superset of the initial reset.


4.125 KVM_S390_PV_COMMAND
-------------------------

:Capability: KVM_CAP_S390_PROTECTED
:Architectures: s390
:Type: vm ioctl
:Parameters: struct kvm_pv_cmd
:Returns: 0 on success, < 0 on error

::

  struct kvm_pv_cmd {
        __u32 cmd;      /* Command to be executed */
        __u16 rc;       /* Ultravisor return code */
        __u16 rrc;      /* Ultravisor return reason code */
        __u64 data;     /* Data or address */
        __u32 flags;    /* flags for future extensions. Must be 0 for now */
        __u32 reserved[3];
  };

cmd values:

KVM_PV_ENABLE
  Allocate memory and register the VM with the Ultravisor, thereby
  donating memory to the Ultravisor that will become inaccessible to
  KVM. All existing CPUs are converted to protected ones. After this
  command has succeeded, any CPU added via hotplug will become
  protected during its creation as well.

  Errors:

  =====  =============================
  EINTR  an unmasked signal is pending
  =====  =============================

KVM_PV_DISABLE
  Deregister the VM from the Ultravisor and reclaim the memory that
  had been donated to the Ultravisor, making it usable by the kernel
  again. All registered VCPUs are converted back to non-protected
  ones.

KVM_PV_VM_SET_SEC_PARMS
  Pass the image header from VM memory to the Ultravisor in
  preparation for image unpacking and verification.

KVM_PV_VM_UNPACK
  Unpack (protect and decrypt) a page of the encrypted boot image.

KVM_PV_VM_VERIFY
  Verify the integrity of the unpacked image. KVM is allowed to start
  protected VCPUs only if this succeeds.

4.126 KVM_X86_SET_MSR_FILTER
----------------------------

:Capability: KVM_CAP_X86_MSR_FILTER
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_msr_filter
:Returns: 0 on success, < 0 on error

::

  struct kvm_msr_filter_range {
  #define KVM_MSR_FILTER_READ  (1 << 0)
  #define KVM_MSR_FILTER_WRITE (1 << 1)
        __u32 flags;
        __u32 nmsrs; /* number of msrs in bitmap */
        __u32 base;  /* MSR index the bitmap starts at */
        __u8 *bitmap; /* a 1 bit allows the operations in flags, 0 denies */
  };

  #define KVM_MSR_FILTER_MAX_RANGES 16
  struct kvm_msr_filter {
  #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
  #define KVM_MSR_FILTER_DEFAULT_DENY  (1 << 0)
        __u32 flags;
        struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
  };

flags values for ``struct kvm_msr_filter_range``:

``KVM_MSR_FILTER_READ``

  Filter read accesses to MSRs using the given bitmap. A 0 in the bitmap
  indicates that a read should immediately fail, while a 1 indicates that
  a read for a particular MSR should be handled regardless of the default
  filter action.

``KVM_MSR_FILTER_WRITE``

  Filter write accesses to MSRs using the given bitmap. A 0 in the bitmap
  indicates that a write should immediately fail, while a 1 indicates that
  a write for a particular MSR should be handled regardless of the default
  filter action.

``KVM_MSR_FILTER_READ | KVM_MSR_FILTER_WRITE``

  Filter both read and write accesses to MSRs using the given bitmap. A 0
  in the bitmap indicates that both reads and writes should immediately fail,
  while a 1 indicates that reads and writes for a particular MSR are not
  filtered by this range.

flags values for ``struct kvm_msr_filter``:

``KVM_MSR_FILTER_DEFAULT_ALLOW``

  If no filter range matches an MSR index that is getting accessed, KVM will
  fall back to allowing access to the MSR.

``KVM_MSR_FILTER_DEFAULT_DENY``

  If no filter range matches an MSR index that is getting accessed, KVM will
  fall back to rejecting access to the MSR. In this mode, all MSRs that should
  be processed by KVM need to explicitly be marked as allowed in the bitmaps.

This ioctl allows user space to define up to 16 bitmaps of MSR ranges to
specify whether a given MSR access should be filtered.

If this ioctl has never been invoked, MSR accesses are not guarded and the
default KVM in-kernel emulation behavior is fully preserved.

Calling this ioctl with an empty set of ranges (all nmsrs == 0) disables MSR
filtering. In that mode, ``KVM_MSR_FILTER_DEFAULT_DENY`` is invalid and causes
an error.

As soon as the filtering is in place, every MSR access is processed through
the filtering except for accesses to the x2APIC MSRs (from 0x800 to 0x8ff);
x2APIC MSRs are always allowed, independent of the default action selected
in ``flags``, and their behavior depends on the ``X2APIC_ENABLE`` bit of the
APIC base register.

If an MSR index falls within one of the defined ranges, read and write
accesses are guarded by the bitmap's value for that MSR index if the kind of
access is included in the ``struct kvm_msr_filter_range`` flags. If no range
covers this particular access, the behavior is determined by the flags
field in the kvm_msr_filter struct: ``KVM_MSR_FILTER_DEFAULT_ALLOW``
and ``KVM_MSR_FILTER_DEFAULT_DENY``.

Each bitmap range specifies a range of MSRs to potentially allow access on.
The range covers nmsrs consecutive MSR indices starting at base, i.e.
[base .. base+nmsrs). The flags field indicates whether reads, writes or
both reads and writes are filtered by setting a 1 bit in the bitmap for the
corresponding MSR index.
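
The mapping between an MSR index and its bit in a range's bitmap can be
sketched as follows. This is an illustration only, not part of the KVM API:
the helper names are hypothetical, a range is assumed to cover nmsrs
consecutive indices starting at base, and bit i of the bitmap is assumed to
live in byte i / 8 at bit position i % 8 (the little-endian layout used on
x86)::

  #include <stdbool.h>
  #include <stdint.h>

  /*
   * Hypothetical helper: mark msr as allowed in the bitmap of a range
   * described by base and nmsrs.  Returns false if the range does not
   * cover msr.
   */
  static bool msr_bitmap_set(uint8_t *bitmap, uint32_t base,
                             uint32_t nmsrs, uint32_t msr)
  {
        uint32_t off;

        if (msr < base || msr - base >= nmsrs)
                return false;
        off = msr - base;
        bitmap[off / 8] |= (uint8_t)(1u << (off % 8));
        return true;
  }

  /* Hypothetical helper: test whether msr is marked allowed. */
  static bool msr_bitmap_test(const uint8_t *bitmap, uint32_t base,
                              uint32_t nmsrs, uint32_t msr)
  {
        uint32_t off;

        if (msr < base || msr - base >= nmsrs)
                return false;
        off = msr - base;
        return (bitmap[off / 8] >> (off % 8)) & 1;
  }

Userspace would fill one such bitmap per range and point the bitmap field of
the corresponding ``struct kvm_msr_filter_range`` at it before invoking the
ioctl.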
4812*4882a593Smuzhiyun 4813*4882a593SmuzhiyunIf an MSR access is not permitted through the filtering, it generates a 4814*4882a593Smuzhiyun#GP inside the guest. When combined with KVM_CAP_X86_USER_SPACE_MSR, that 4815*4882a593Smuzhiyunallows user space to deflect and potentially handle various MSR accesses 4816*4882a593Smuzhiyuninto user space. 4817*4882a593Smuzhiyun 4818*4882a593SmuzhiyunNote, invoking this ioctl with a vCPU is running is inherently racy. However, 4819*4882a593SmuzhiyunKVM does guarantee that vCPUs will see either the previous filter or the new 4820*4882a593Smuzhiyunfilter, e.g. MSRs with identical settings in both the old and new filter will 4821*4882a593Smuzhiyunhave deterministic behavior. 4822*4882a593Smuzhiyun 4823*4882a593Smuzhiyun 4824*4882a593Smuzhiyun5. The kvm_run structure 4825*4882a593Smuzhiyun======================== 4826*4882a593Smuzhiyun 4827*4882a593SmuzhiyunApplication code obtains a pointer to the kvm_run structure by 4828*4882a593Smuzhiyunmmap()ing a vcpu fd. From that point, application code can control 4829*4882a593Smuzhiyunexecution by changing fields in kvm_run prior to calling the KVM_RUN 4830*4882a593Smuzhiyunioctl, and obtain information about the reason KVM_RUN returned by 4831*4882a593Smuzhiyunlooking up structure members. 4832*4882a593Smuzhiyun 4833*4882a593Smuzhiyun:: 4834*4882a593Smuzhiyun 4835*4882a593Smuzhiyun struct kvm_run { 4836*4882a593Smuzhiyun /* in */ 4837*4882a593Smuzhiyun __u8 request_interrupt_window; 4838*4882a593Smuzhiyun 4839*4882a593SmuzhiyunRequest that KVM_RUN return when it becomes possible to inject external 4840*4882a593Smuzhiyuninterrupts into the guest. Useful in conjunction with KVM_INTERRUPT. 4841*4882a593Smuzhiyun 4842*4882a593Smuzhiyun:: 4843*4882a593Smuzhiyun 4844*4882a593Smuzhiyun __u8 immediate_exit; 4845*4882a593Smuzhiyun 4846*4882a593SmuzhiyunThis field is polled once when KVM_RUN starts; if non-zero, KVM_RUN 4847*4882a593Smuzhiyunexits immediately, returning -EINTR. 
In the common scenario where a 4848*4882a593Smuzhiyunsignal is used to "kick" a VCPU out of KVM_RUN, this field can be used 4849*4882a593Smuzhiyunto avoid usage of KVM_SET_SIGNAL_MASK, which has worse scalability. 4850*4882a593SmuzhiyunRather than blocking the signal outside KVM_RUN, userspace can set up 4851*4882a593Smuzhiyuna signal handler that sets run->immediate_exit to a non-zero value. 4852*4882a593Smuzhiyun 4853*4882a593SmuzhiyunThis field is ignored if KVM_CAP_IMMEDIATE_EXIT is not available. 4854*4882a593Smuzhiyun 4855*4882a593Smuzhiyun:: 4856*4882a593Smuzhiyun 4857*4882a593Smuzhiyun __u8 padding1[6]; 4858*4882a593Smuzhiyun 4859*4882a593Smuzhiyun /* out */ 4860*4882a593Smuzhiyun __u32 exit_reason; 4861*4882a593Smuzhiyun 4862*4882a593SmuzhiyunWhen KVM_RUN has returned successfully (return value 0), this informs 4863*4882a593Smuzhiyunapplication code why KVM_RUN has returned. Allowable values for this 4864*4882a593Smuzhiyunfield are detailed below. 4865*4882a593Smuzhiyun 4866*4882a593Smuzhiyun:: 4867*4882a593Smuzhiyun 4868*4882a593Smuzhiyun __u8 ready_for_interrupt_injection; 4869*4882a593Smuzhiyun 4870*4882a593SmuzhiyunIf request_interrupt_window has been specified, this field indicates 4871*4882a593Smuzhiyunan interrupt can be injected now with KVM_INTERRUPT. 4872*4882a593Smuzhiyun 4873*4882a593Smuzhiyun:: 4874*4882a593Smuzhiyun 4875*4882a593Smuzhiyun __u8 if_flag; 4876*4882a593Smuzhiyun 4877*4882a593SmuzhiyunThe value of the current interrupt flag. Only valid if in-kernel 4878*4882a593Smuzhiyunlocal APIC is not used. 4879*4882a593Smuzhiyun 4880*4882a593Smuzhiyun:: 4881*4882a593Smuzhiyun 4882*4882a593Smuzhiyun __u16 flags; 4883*4882a593Smuzhiyun 4884*4882a593SmuzhiyunMore architecture-specific flags detailing state of the VCPU that may 4885*4882a593Smuzhiyunaffect the device's behavior. 
The only currently defined flag is
KVM_RUN_X86_SMM, which is valid on x86 machines and is set if the
VCPU is in system management mode.

::

        /* in (pre_kvm_run), out (post_kvm_run) */
        __u64 cr8;

The value of the cr8 register. Only valid if in-kernel local APIC is
not used. Both input and output.

::

        __u64 apic_base;

The value of the APIC BASE msr. Only valid if in-kernel local
APIC is not used. Both input and output.

::

        union {
                /* KVM_EXIT_UNKNOWN */
                struct {
                        __u64 hardware_exit_reason;
                } hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons. Further architecture-specific information is available in
hardware_exit_reason.

::

                /* KVM_EXIT_FAIL_ENTRY */
                struct {
                        __u64 hardware_entry_failure_reason;
                        __u32 cpu; /* if KVM_LAST_CPU */
                } fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons. Further architecture-specific information is
available in hardware_entry_failure_reason.

::

                /* KVM_EXIT_EXCEPTION */
                struct {
                        __u32 exception;
                        __u32 error_code;
                } ex;

Unused.

::

                /* KVM_EXIT_IO */
                struct {
  #define KVM_EXIT_IO_IN  0
  #define KVM_EXIT_IO_OUT 1
                        __u8 direction;
                        __u8 size; /* bytes */
                        __u16 port;
                        __u32 count;
                        __u64 data_offset; /* relative to kvm_run start */
                } io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a packed array.

::

                /* KVM_EXIT_DEBUG */
                struct {
                        struct kvm_debug_exit_arch arch;
                } debug;

If the exit_reason is KVM_EXIT_DEBUG, then a vcpu is processing a debug event
for which architecture specific information is returned.

::

                /* KVM_EXIT_MMIO */
                struct {
                        __u64 phys_addr;
                        __u8 data[8];
                        __u32 len;
                        __u8 is_write;
                } mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

The 'data' member contains, in its first 'len' bytes, the value as it would
appear if the VCPU performed a load or store of the appropriate width directly
to the byte array.

.. note::

      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR,
      KVM_EXIT_EPR, KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR the corresponding
      operations are complete (and guest state is consistent) only after userspace
      has re-entered the kernel with KVM_RUN. The kernel side will first finish
      incomplete operations and then check for pending signals. Userspace
      can re-enter the guest with an unmasked signal pending to complete
      pending operations.
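
As a minimal sketch of how userspace might service the KVM_EXIT_MMIO and
KVM_EXIT_IO exits described above, the helpers below operate only on the
mmap()ed kvm_run structure. The demo_* names and the single 8-byte demo_reg
"device" are hypothetical and exist purely for illustration; a real VMM would
dispatch on phys_addr or port to its own device models.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical device model: one 8-byte register (illustration only). */
static uint8_t demo_reg[8];

/* Service a KVM_EXIT_MMIO exit in place; the caller then re-enters KVM_RUN. */
static void demo_handle_mmio(struct kvm_run *run)
{
	assert(run->exit_reason == KVM_EXIT_MMIO);
	assert(run->mmio.len <= sizeof(run->mmio.data));

	if (run->mmio.is_write)
		/* Guest store: the first 'len' bytes of 'data' hold the value. */
		memcpy(demo_reg, run->mmio.data, run->mmio.len);
	else
		/* Guest load: userspace fills the first 'len' bytes of 'data'. */
		memcpy(run->mmio.data, demo_reg, run->mmio.len);
}

/*
 * Locate the packed port I/O buffer of a KVM_EXIT_IO exit. It holds
 * io.count items of io.size bytes each, and data_offset is relative
 * to the start of kvm_run.
 */
static uint8_t *demo_io_data(struct kvm_run *run)
{
	assert(run->exit_reason == KVM_EXIT_IO);
	return (uint8_t *)run + run->io.data_offset;
}
```

For KVM_EXIT_IO_OUT the buffer returned by demo_io_data() is read; for
KVM_EXIT_IO_IN it is filled before the next KVM_RUN. In both cases the
operation only completes once KVM_RUN is called again.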

::

                /* KVM_EXIT_HYPERCALL */
                struct {
                        __u64 nr;
                        __u64 args[6];
                        __u64 ret;
                        __u32 longmode;
                        __u32 pad;
                } hypercall;

Unused. This was once used for 'hypercall to userspace'. To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).

.. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

::

                /* KVM_EXIT_TPR_ACCESS */
                struct {
                        __u64 rip;
                        __u32 is_write;
                        __u32 pad;
                } tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

::

                /* KVM_EXIT_S390_SIEIC */
                struct {
                        __u8 icptcode;
                        __u64 mask; /* psw upper half */
                        __u64 addr; /* psw lower half */
                        __u16 ipa;
                        __u32 ipb;
                } s390_sieic;

s390 specific.

::

                /* KVM_EXIT_S390_RESET */
  #define KVM_S390_RESET_POR       1
  #define KVM_S390_RESET_CLEAR     2
  #define KVM_S390_RESET_SUBSYSTEM 4
  #define KVM_S390_RESET_CPU_INIT  8
  #define KVM_S390_RESET_IPL       16
                __u64 s390_reset_flags;

s390 specific.

::

                /* KVM_EXIT_S390_UCONTROL */
                struct {
                        __u64 trans_exc_code;
                        __u32 pgm_code;
                } s390_ucontrol;

s390 specific. A page fault has occurred for a user controlled virtual
machine (KVM_VM_S390_UCONTROL) on its host page table that cannot be
resolved by the kernel.
The program code and the translation exception code that were placed
in the cpu's lowcore are presented here as defined by the z Architecture
Principles of Operation Book in the Chapter for Dynamic Address Translation
(DAT).

::

                /* KVM_EXIT_DCR */
                struct {
                        __u32 dcrn;
                        __u32 data;
                        __u8 is_write;
                } dcr;

Deprecated - was used for 440 KVM.

::

                /* KVM_EXIT_OSI */
                struct {
                        __u64 gprs[32];
                } osi;

MOL uses a special hypercall interface it calls 'OSI'. To enable it, we catch
hypercalls and exit with this exit struct that contains all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and when it's done modify the gprs as
necessary. Upon guest entry all guest GPRs will then be replaced by the values
in this struct.

::

                /* KVM_EXIT_PAPR_HCALL */
                struct {
                        __u64 nr;
                        __u64 ret;
                        __u64 args[9];
                } papr_hcall;

This is used on 64-bit PowerPC when emulating a pSeries partition,
e.g. with the 'pseries' machine type in qemu. It occurs when the
guest does a hypercall using the 'sc 1' instruction. The 'nr' field
contains the hypercall number (from the guest R3), and 'args' contains
the arguments (from the guest R4 - R12). Userspace should put the
return code in 'ret' and any extra returned values in args[].
The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).
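
The papr_hcall protocol described above (read 'nr' and 'args', write 'ret'
and any extra results back into args[]) can be sketched as follows. The
hypercall number and the status codes are placeholders invented for this
example and are not taken from PAPR; only the kvm_run field names come from
the structure above.

```c
#include <assert.h>
#include <linux/kvm.h>

/* Placeholder values for illustration; real ones are defined by PAPR. */
#define DEMO_HCALL_ADD     0x1000
#define DEMO_RET_OK        0
#define DEMO_RET_UNHANDLED 1

/* Service a KVM_EXIT_PAPR_HCALL exit; the caller then re-enters KVM_RUN. */
static void demo_handle_papr_hcall(struct kvm_run *run)
{
	assert(run->exit_reason == KVM_EXIT_PAPR_HCALL);

	switch (run->papr_hcall.nr) {
	case DEMO_HCALL_ADD:
		/* Extra return values travel back to the guest in args[]. */
		run->papr_hcall.args[0] += run->papr_hcall.args[1];
		run->papr_hcall.ret = DEMO_RET_OK;
		break;
	default:
		/* Unknown hypercall: report failure through 'ret'. */
		run->papr_hcall.ret = DEMO_RET_UNHANDLED;
		break;
	}
}
```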

::

                /* KVM_EXIT_S390_TSCH */
                struct {
                        __u16 subchannel_id;
                        __u16 subchannel_nr;
                        __u32 io_int_parm;
                        __u32 io_int_word;
                        __u32 ipb;
                        __u8 dequeued;
                } s390_tsch;

s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
and TEST SUBCHANNEL was intercepted. If dequeued is set, a pending I/O
interrupt for the target subchannel has been dequeued and subchannel_id,
subchannel_nr, io_int_parm and io_int_word contain the parameters for that
interrupt. ipb is needed for instruction parameter decoding.

::

                /* KVM_EXIT_EPR */
                struct {
                        __u32 epr;
                } epr;

On FSL BookE PowerPC chips, the interrupt controller has a fast
interrupt acknowledge path to the core. When the core successfully
delivers an interrupt, it automatically populates the EPR register with
the interrupt vector number and acknowledges the interrupt inside
the interrupt controller.

In case the interrupt controller lives in user space, we need to do
the interrupt acknowledge cycle through it to fetch the next to be
delivered interrupt vector using this exit.

It gets triggered whenever KVM_CAP_PPC_EPR is enabled and an
external interrupt has just been delivered into the guest. User space
should put the acknowledged interrupt vector into the 'epr' field.

::

                /* KVM_EXIT_SYSTEM_EVENT */
                struct {
  #define KVM_SYSTEM_EVENT_SHUTDOWN       1
  #define KVM_SYSTEM_EVENT_RESET          2
  #define KVM_SYSTEM_EVENT_CRASH          3
                        __u32 type;
                        __u64 flags;
                } system_event;

If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
a system-level event using some architecture specific mechanism (hypercall
or some special instruction). In case of ARM/ARM64, this is triggered using
HVC instruction based PSCI call from the vcpu. The 'type' field describes
the system-level event type. The 'flags' field describes architecture
specific flags for the system-level event.

Valid values for 'type' are:

 - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
   VM. Userspace is not obliged to honour this, and if it does honour
   this does not need to destroy the VM synchronously (i.e. it may call
   KVM_RUN again before shutdown finally occurs).
 - KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
   As with SHUTDOWN, userspace can choose to ignore the request, or
   to schedule the reset to occur in the future and may call KVM_RUN again.
 - KVM_SYSTEM_EVENT_CRASH -- a guest crash occurred and the guest
   has requested crash condition maintenance. Userspace can choose
   to ignore the request, or to gather a VM memory core dump and/or
   reset/shut down the VM.

::

                /* KVM_EXIT_IOAPIC_EOI */
                struct {
                        __u8 vector;
                } eoi;

Indicates that the VCPU's in-kernel local APIC received an EOI for a
level-triggered IOAPIC interrupt. This exit only triggers when the
IOAPIC is implemented in userspace (i.e. KVM_CAP_SPLIT_IRQCHIP is enabled);
the userspace IOAPIC should process the EOI and retrigger the interrupt if
it is still asserted. Vector is the LAPIC interrupt vector for which the
EOI was received.
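
For the KVM_EXIT_SYSTEM_EVENT exit documented earlier in this section, one
possible way to turn the 'type' field into a VMM decision is sketched below.
The demo_action enum and the chosen policy are assumptions made for this
example; as noted above, userspace is free to ignore or defer any of these
requests.

```c
#include <assert.h>
#include <linux/kvm.h>

/* Hypothetical VMM-side policy outcomes (not part of the KVM API). */
enum demo_action {
	DEMO_CONTINUE,		/* ignore the event and call KVM_RUN again */
	DEMO_POWER_OFF,
	DEMO_REBOOT,
	DEMO_DUMP_AND_STOP,
};

/* Map a system event type to an action under one example policy. */
static enum demo_action demo_system_event_action(const struct kvm_run *run)
{
	assert(run->exit_reason == KVM_EXIT_SYSTEM_EVENT);

	switch (run->system_event.type) {
	case KVM_SYSTEM_EVENT_SHUTDOWN:
		return DEMO_POWER_OFF;
	case KVM_SYSTEM_EVENT_RESET:
		return DEMO_REBOOT;
	case KVM_SYSTEM_EVENT_CRASH:
		return DEMO_DUMP_AND_STOP;
	default:
		/* Unknown type: this sketch chooses to keep running. */
		return DEMO_CONTINUE;
	}
}
```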

::

                struct kvm_hyperv_exit {
  #define KVM_EXIT_HYPERV_SYNIC          1
  #define KVM_EXIT_HYPERV_HCALL          2
  #define KVM_EXIT_HYPERV_SYNDBG         3
                        __u32 type;
                        __u32 pad1;
                        union {
                                struct {
                                        __u32 msr;
                                        __u32 pad2;
                                        __u64 control;
                                        __u64 evt_page;
                                        __u64 msg_page;
                                } synic;
                                struct {
                                        __u64 input;
                                        __u64 result;
                                        __u64 params[2];
                                } hcall;
                                struct {
                                        __u32 msr;
                                        __u32 pad2;
                                        __u64 control;
                                        __u64 status;
                                        __u64 send_page;
                                        __u64 recv_page;
                                        __u64 pending_page;
                                } syndbg;
                        } u;
                };
                /* KVM_EXIT_HYPERV */
                struct kvm_hyperv_exit hyperv;

Indicates that the VCPU exits into userspace to process some tasks
related to Hyper-V emulation.

Valid values for 'type' are:

 - KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
   Hyper-V SynIC state change. Notification is used to remap SynIC
   event/message pages and to enable/disable SynIC messages/events processing
   in userspace.

 - KVM_EXIT_HYPERV_SYNDBG -- synchronously notify user-space about
   Hyper-V Synthetic debugger state change. Notification is used to either
   update the pending_page location or to send a control command (send the
   buffer located in send_page or receive a buffer into recv_page).

::

                /* KVM_EXIT_ARM_NISV */
                struct {
                        __u64 esr_iss;
                        __u64 fault_ipa;
                } arm_nisv;

Used on arm and arm64 systems. If a guest accesses memory not in a memslot,
KVM will typically return to userspace and ask it to do MMIO emulation on its
behalf. However, for certain classes of instructions, no instruction decode
(direction, length of memory access) is provided, and fetching and decoding
the instruction from the VM is overly complicated to live in the kernel.

Historically, when this situation occurred, KVM would print a warning and kill
the VM. KVM assumed that if the guest accessed non-memslot memory, it was
trying to do I/O, which just couldn't be emulated, and the warning message was
phrased accordingly. However, what happened more often was that a guest bug
caused access outside the guest memory areas which should lead to a more
meaningful warning message and an external abort in the guest, if the access
did not fall within an I/O window.

Userspace implementations can query for KVM_CAP_ARM_NISV_TO_USER, and enable
this capability at VM creation. Once this is done, these types of errors will
instead return to userspace with KVM_EXIT_ARM_NISV, with the valid bits from
the HSR (arm) and ESR_EL2 (arm64) in the esr_iss field, and the faulting IPA
in the fault_ipa field. Userspace can either fix up the access if it's
actually an I/O access by decoding the instruction from guest memory (if it's
very brave) and continue executing the guest, or it can decide to suspend,
dump, or restart the guest.

Note that KVM does not skip the faulting instruction as it does for
KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
if it decides to decode and emulate the instruction.

::

                /* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
                struct {
                        __u8 error; /* user -> kernel */
                        __u8 pad[7];
                        __u32 reason; /* kernel -> user */
                        __u32 index; /* kernel -> user */
                        __u64 data; /* kernel <-> user */
                } msr;

Used on x86 systems. When the VM capability KVM_CAP_X86_USER_SPACE_MSR is
enabled, MSR accesses to registers that would invoke a #GP by KVM kernel code
will instead trigger a KVM_EXIT_X86_RDMSR exit for reads and KVM_EXIT_X86_WRMSR
exit for writes.

The "reason" field specifies why the MSR trap occurred.
User space will only
receive MSR exit traps when a particular reason was requested through
ENABLE_CAP. Currently valid exit reasons are:

 - KVM_MSR_EXIT_REASON_UNKNOWN -- access to MSR that is unknown to KVM
 - KVM_MSR_EXIT_REASON_INVAL -- access to invalid MSRs or reserved bits
 - KVM_MSR_EXIT_REASON_FILTER -- access blocked by KVM_X86_SET_MSR_FILTER

For KVM_EXIT_X86_RDMSR, the "index" field tells user space which MSR the guest
wants to read. To respond to this request with a successful read, user space
writes the respective data into the "data" field and must continue guest
execution to ensure the read data is transferred into guest register state.

If the RDMSR request was unsuccessful, user space indicates that with a "1" in
the "error" field. This will inject a #GP into the guest when the VCPU is
executed again.

For KVM_EXIT_X86_WRMSR, the "index" field tells user space which MSR the guest
wants to write. Once finished processing the event, user space must continue
vCPU execution. If the MSR write was unsuccessful, user space also sets the
"error" field to "1".

::

                /* Fix the size of the union. */
                char padding[256];
        };

        /*
         * shared registers between kvm and userspace.
         * kvm_valid_regs specifies the register classes set by the host
         * kvm_dirty_regs specifies the register classes dirtied by userspace
         * struct kvm_sync_regs is architecture specific, as well as the
         * bits for kvm_valid_regs and kvm_dirty_regs
         */
        __u64 kvm_valid_regs;
        __u64 kvm_dirty_regs;
        union {
                struct kvm_sync_regs regs;
                char padding[SYNC_REGS_SIZE_BYTES];
        } s;

If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access
certain guest registers without having to call SET/GET_*REGS. Thus we can
avoid some system call overhead if userspace has to handle the exit.
Userspace can query the validity of the structure by checking
kvm_valid_regs for specific bits. These bits are architecture specific
and usually define the validity of a group of registers (e.g. one bit
for general purpose registers).

Please note that the kernel is allowed to use the kvm_run structure as the
primary storage for certain register types. Therefore, the kernel may use the
values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.

::

  };



6. Capabilities that can be enabled on vCPUs
============================================

There are certain capabilities that change the behavior of the virtual CPU or
the virtual machine when enabled. To enable them, please see section 4.37.
Below you can find a list of capabilities and what their effect on the vCPU or
the virtual machine is when enabling them.

The following information is provided along with the description:

  Architectures:
      which instruction set architectures provide this capability.
      x86 includes both i386 and x86_64.

  Target:
      whether this is a per-vcpu or per-vm capability.

  Parameters:
      what parameters are accepted by the capability.

  Returns:
      the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


6.1 KVM_CAP_PPC_OSI
-------------------

:Architectures: ppc
:Target: vcpu
:Parameters: none
:Returns: 0 on success; -1 on error

This capability enables interception of OSI hypercalls that otherwise would
be treated as normal system calls to be injected into the guest. OSI hypercalls
were invented by Mac-on-Linux to have a standardized communication mechanism
between the guest and the host.

When this capability is enabled, KVM_EXIT_OSI can occur.


6.2 KVM_CAP_PPC_PAPR
--------------------

:Architectures: ppc
:Target: vcpu
:Parameters: none
:Returns: 0 on success; -1 on error

This capability enables interception of PAPR hypercalls. PAPR hypercalls are
done using the hypercall instruction "sc 1".

It also sets the guest privilege level to "supervisor" mode. Usually the guest
runs in "hypervisor" privilege mode with a few missing features.

In addition to the above, it changes the semantics of SDR1. In this mode, the
HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the
HTAB invisible to the guest.

When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.
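
Capabilities in this section are switched on with the KVM_ENABLE_CAP ioctl
(section 4.37). The following is a minimal sketch of that call for a
capability that takes at most one argument, such as KVM_CAP_PPC_PAPR (no
argument) or KVM_CAP_PPC_EPR (args[0]). The demo_* helper names are
hypothetical, and the fd is assumed to be a vcpu or VM fd obtained elsewhere.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Build a struct kvm_enable_cap with 'flags' and unused fields zeroed. */
static struct kvm_enable_cap demo_make_cap(__u32 cap, __u64 arg0)
{
	struct kvm_enable_cap ec;

	memset(&ec, 0, sizeof(ec));
	ec.cap = cap;
	ec.args[0] = arg0;
	return ec;
}

/* Enable 'cap' on the given fd; returns 0 on success, -1 on error. */
static int demo_enable_cap(int fd, __u32 cap, __u64 arg0)
{
	struct kvm_enable_cap ec = demo_make_cap(cap, arg0);

	return ioctl(fd, KVM_ENABLE_CAP, &ec);
}
```

A caller would typically check KVM_CHECK_EXTENSION first and then invoke, for
example, demo_enable_cap(vcpu_fd, KVM_CAP_PPC_PAPR, 0), where vcpu_fd is the
caller's own descriptor.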


6.3 KVM_CAP_SW_TLB
------------------

:Architectures: ppc
:Target: vcpu
:Parameters: args[0] is the address of a struct kvm_config_tlb
:Returns: 0 on success; -1 on error

::

  struct kvm_config_tlb {
        __u64 params;
        __u64 array;
        __u32 mmu_type;
        __u32 array_len;
  };

Configures the virtual CPU's TLB array, establishing a shared memory area
between userspace and KVM. The "params" and "array" fields are userspace
addresses of mmu-type-specific data structures. The "array_len" field is a
safety mechanism, and should be set to the size in bytes of the memory that
userspace has reserved for the array. It must be at least the size dictated
by "mmu_type" and "params".

While KVM_RUN is active, the shared region is under control of KVM. Its
contents are undefined, and any modification by userspace results in
boundedly undefined behavior.

On return from KVM_RUN, the shared region will reflect the current state of
the guest's TLB. If userspace makes any changes, it must call KVM_DIRTY_TLB
to tell KVM which entries have been changed, prior to calling KVM_RUN again
on this vcpu.

For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:

 - The "params" field is of type "struct kvm_book3e_206_tlb_params".
 - The "array" field points to an array of type "struct
   kvm_book3e_206_tlb_entry".
 - The array consists of all entries in the first TLB, followed by all
   entries in the second TLB.
 - Within a TLB, entries are ordered first by increasing set number. Within a
   set, entries are ordered by way (increasing ESEL).
 - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1)
   where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
 - The tsize field of mas1 shall be set to 4K on TLB0, even though the
   hardware ignores this value for TLB0.

6.4 KVM_CAP_S390_CSS_SUPPORT
----------------------------

:Architectures: s390
:Target: vcpu
:Parameters: none
:Returns: 0 on success; -1 on error

This capability enables support for handling of channel I/O instructions.

TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
handled in-kernel, while the other I/O instructions are passed to userspace.

When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
SUBCHANNEL intercepts.

Note that even though this capability is enabled per-vcpu, the complete
virtual machine is affected.

6.5 KVM_CAP_PPC_EPR
-------------------

:Architectures: ppc
:Target: vcpu
:Parameters: args[0] defines whether the proxy facility is active
:Returns: 0 on success; -1 on error

This capability enables or disables the delivery of interrupts through the
external proxy facility.

When enabled (args[0] != 0), every time the guest gets an external interrupt
delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
to receive the topmost interrupt vector.

When disabled (args[0] == 0), behavior is as if this facility is unsupported.

When this capability is enabled, KVM_EXIT_EPR can occur.

6.6 KVM_CAP_IRQ_MPIC
--------------------

:Architectures: ppc
:Parameters: args[0] is the MPIC device fd;
             args[1] is the MPIC CPU number for this vcpu

This capability connects the vcpu to an in-kernel MPIC device.

6.7 KVM_CAP_IRQ_XICS
--------------------

:Architectures: ppc
:Target: vcpu
:Parameters: args[0] is the XICS device fd;
             args[1] is the XICS CPU number (server ID) for this vcpu

This capability connects the vcpu to an in-kernel XICS device.

6.8 KVM_CAP_S390_IRQCHIP
------------------------

:Architectures: s390
:Target: vm
:Parameters: none

This capability enables the in-kernel irqchip for s390. Please refer to
"4.24 KVM_CREATE_IRQCHIP" for details.

6.9 KVM_CAP_MIPS_FPU
--------------------

:Architectures: mips
:Target: vcpu
:Parameters: args[0] is reserved for future use (should be 0).

This capability allows the use of the host Floating Point Unit by the guest. It
allows the Config1.FP bit to be set to enable the FPU in the guest. Once this is
done the ``KVM_REG_MIPS_FPR_*`` and ``KVM_REG_MIPS_FCR_*`` registers can be
accessed (depending on the current guest FPU register mode), and the Status.FR,
Config5.FRE bits are accessible via the KVM API and also from the guest,
depending on them being supported by the FPU.

6.10 KVM_CAP_MIPS_MSA
---------------------

:Architectures: mips
:Target: vcpu
:Parameters: args[0] is reserved for future use (should be 0).

This capability allows the use of the MIPS SIMD Architecture (MSA) by the guest.
It allows the Config3.MSAP bit to be set to enable the use of MSA by the guest.
Once this is done the ``KVM_REG_MIPS_VEC_*`` and ``KVM_REG_MIPS_MSA_*``
registers can be accessed, and the Config5.MSAEn bit is accessible via the
KVM API and also from the guest.

6.74 KVM_CAP_SYNC_REGS
----------------------

:Architectures: s390, x86
:Target: s390: always enabled, x86: vcpu
:Parameters: none
:Returns: x86: KVM_CHECK_EXTENSION returns a bit-array indicating which register
          sets are supported
          (bitfields defined in arch/x86/include/uapi/asm/kvm.h).

As described above in the kvm_sync_regs struct info in section 5 (kvm_run):
KVM_CAP_SYNC_REGS "allow[s] userspace to access certain guest registers
without having to call SET/GET_*REGS". This reduces overhead by eliminating
repeated ioctl calls for setting and/or getting register values. This is
particularly important when userspace is making synchronous guest state
modifications, e.g. when emulating and/or intercepting instructions in
userspace.

For s390 specifics, please refer to the source code.

For x86:

- the register sets to be copied out to kvm_run are selectable
  by userspace (rather than all sets being copied out for every exit).
- vcpu_events are available in addition to regs and sregs.

For x86, the 'kvm_valid_regs' field of struct kvm_run is overloaded to
function as an input bit-array field set by userspace to indicate the
specific register sets to be copied out on the next exit.

To indicate when userspace has modified values that should be copied into
the vCPU, the all-architecture bit-array field 'kvm_dirty_regs' must be set.
This is done using the same bitflags as for the 'kvm_valid_regs' field.
If the dirty bit is not set, then the register set values will not be copied
into the vCPU even if they've been modified.

Unused bitfields in the bitarrays must be set to zero.
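
As a rough illustration of the two bit-arrays described above, the following
C sketch builds a dirty mask and checks a requested mask against the
supported register sets. The ``SYNC_X86_*`` names and both helper functions
are hypothetical; the bit positions are assumed to match the
``KVM_SYNC_X86_*`` definitions in arch/x86/include/uapi/asm/kvm.h and should
be taken from that header in real code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror KVM_SYNC_X86_REGS/SREGS/EVENTS from the x86 uapi
 * header; redefined locally so the sketch builds without kernel headers. */
#define SYNC_X86_REGS   (1ULL << 0)
#define SYNC_X86_SREGS  (1ULL << 1)
#define SYNC_X86_EVENTS (1ULL << 2)

/* Hypothetical helper: true if 'requested' uses only bits present in
 * 'supported' (the value KVM_CHECK_EXTENSION reported), so that all
 * unused bitfields are zero. */
static bool sync_mask_is_valid(uint64_t requested, uint64_t supported)
{
    return (requested & ~supported) == 0;
}

/* Hypothetical helper: build the dirty mask for the register sets that
 * userspace modified since the last exit. */
static uint64_t sync_dirty_mask(bool regs, bool sregs, bool events)
{
    uint64_t mask = 0;

    if (regs)
        mask |= SYNC_X86_REGS;
    if (sregs)
        mask |= SYNC_X86_SREGS;
    if (events)
        mask |= SYNC_X86_EVENTS;
    return mask;
}
```

In a VMM, ``supported`` would be the KVM_CHECK_EXTENSION result for
KVM_CAP_SYNC_REGS, and the resulting masks would be written to the
kvm_valid_regs and kvm_dirty_regs fields of struct kvm_run before the next
KVM_RUN.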

::

  struct kvm_sync_regs {
          struct kvm_regs regs;
          struct kvm_sregs sregs;
          struct kvm_vcpu_events events;
  };

6.75 KVM_CAP_PPC_IRQ_XIVE
-------------------------

:Architectures: ppc
:Target: vcpu
:Parameters: args[0] is the XIVE device fd;
             args[1] is the XIVE CPU number (server ID) for this vcpu

This capability connects the vcpu to an in-kernel XIVE device.

7. Capabilities that can be enabled on VMs
==========================================

There are certain capabilities that change the behavior of the virtual
machine when enabled. To enable them, please see section 4.37. Below
you can find a list of capabilities and what their effect on the VM
is when enabling them.

The following information is provided along with the description:

  Architectures:
      which instruction set architectures provide this capability.
      x86 includes both i386 and x86_64.

  Parameters:
      what parameters are accepted by the capability.

  Returns:
      the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


7.1 KVM_CAP_PPC_ENABLE_HCALL
----------------------------

:Architectures: ppc
:Parameters: args[0] is the sPAPR hcall number;
             args[1] is 0 to disable, 1 to enable in-kernel handling

This capability controls whether individual sPAPR hypercalls (hcalls)
get handled by the kernel or not. Enabling or disabling in-kernel
handling of an hcall is effective across the VM. On creation, an
initial set of hcalls is enabled for in-kernel handling, which
consists of those hcalls for which in-kernel handlers were implemented
before this capability was implemented. If disabled, the kernel will
not attempt to handle the hcall, but will always exit to userspace
to handle it. Note that it may not make sense to enable some and
disable others of a group of related hcalls, but KVM does not prevent
userspace from doing that.

If the hcall number specified is not one that has an in-kernel
implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
error.

7.2 KVM_CAP_S390_USER_SIGP
--------------------------

:Architectures: s390
:Parameters: none

This capability controls which SIGP orders will be handled completely in user
space.
With this capability enabled, all fast orders will be handled completely
in the kernel:

- SENSE
- SENSE RUNNING
- EXTERNAL CALL
- EMERGENCY SIGNAL
- CONDITIONAL EMERGENCY SIGNAL

All other orders will be handled completely in user space.

Only privileged operation exceptions will be checked for in the kernel (or even
in the hardware prior to interception). If this capability is not enabled, the
old way of handling SIGP orders is used (partially in kernel and user space).

7.3 KVM_CAP_S390_VECTOR_REGISTERS
---------------------------------

:Architectures: s390
:Parameters: none
:Returns: 0 on success, negative value on error

Allows use of the vector registers introduced with the z13 processor, and
provides for the synchronization between host and user space. Will
return -EINVAL if the machine does not support vectors.

7.4 KVM_CAP_S390_USER_STSI
--------------------------

:Architectures: s390
:Parameters: none

This capability allows post-handlers for the STSI instruction. After
initial handling in the kernel, KVM exits to user space with
KVM_EXIT_S390_STSI to allow user space to insert further data.

Before exiting to userspace, kvm handlers should fill in the s390_stsi field
of vcpu->run::

  struct {
          __u64 addr;
          __u8 ar;
          __u8 reserved;
          __u8 fc;
          __u8 sel1;
          __u16 sel2;
  } s390_stsi;

  @addr - guest address of STSI SYSIB
  @fc   - function code
  @sel1 - selector 1
  @sel2 - selector 2
  @ar   - access register number

KVM handlers should exit to userspace with rc = -EREMOTE.

7.5 KVM_CAP_SPLIT_IRQCHIP
-------------------------

:Architectures: x86
:Parameters: args[0] - number of routes reserved for userspace IOAPICs
:Returns: 0 on success, -1 on error

Create a local apic for each processor in the kernel. This can be used
instead of KVM_CREATE_IRQCHIP if the userspace VMM wishes to emulate the
IOAPIC and PIC (and also the PIT, even though this has to be enabled
separately).

This capability also enables in-kernel routing of interrupt requests;
when KVM_CAP_SPLIT_IRQCHIP is enabled, only routes of KVM_IRQ_ROUTING_MSI
type are used in the IRQ routing table. The first args[0] MSI routes are
reserved for the IOAPIC pins. Whenever the LAPIC receives an EOI for these
routes, a KVM_EXIT_IOAPIC_EOI vmexit will be reported to userspace.

Fails if a VCPU has already been created, or if the irqchip is already in the
kernel (i.e. KVM_CREATE_IRQCHIP has already been called).

7.6 KVM_CAP_S390_RI
-------------------

:Architectures: s390
:Parameters: none

Allows use of runtime-instrumentation introduced with the zEC12 processor.
Will return -EINVAL if the machine does not support runtime-instrumentation.
Will return -EBUSY if a VCPU has already been created.

7.7 KVM_CAP_X2APIC_API
----------------------

:Architectures: x86
:Parameters: args[0] - features that should be enabled
:Returns: 0 on success, -EINVAL when args[0] contains invalid features

Valid feature flags in args[0] are::

  #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
  #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)

Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of
KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC,
allowing the use of 32-bit APIC IDs. See KVM_CAP_X2APIC_API in their
respective sections.

KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK must be enabled for x2APIC to work
in logical mode or with more than 255 VCPUs.
Otherwise, KVM treats 0xff
as a broadcast even in x2APIC mode in order to support physical x2APIC
without interrupt remapping. This is undesirable in logical mode,
where 0xff represents CPUs 0-7 in cluster 0.

7.8 KVM_CAP_S390_USER_INSTR0
----------------------------

:Architectures: s390
:Parameters: none

With this capability enabled, all illegal instructions 0x0000 (2 bytes) will
be intercepted and forwarded to user space. User space can use this
mechanism e.g. to realize 2-byte software breakpoints. The kernel will
not inject an operating exception for these instructions; user space has
to take care of that.

This capability can be enabled dynamically even if VCPUs were already
created and are running.

7.9 KVM_CAP_S390_GS
-------------------

:Architectures: s390
:Parameters: none
:Returns: 0 on success; -EINVAL if the machine does not support
          guarded storage; -EBUSY if a VCPU has already been created.

Allows use of guarded storage for the KVM guest.

7.10 KVM_CAP_S390_AIS
---------------------

:Architectures: s390
:Parameters: none
:Returns: 0 on success; -EBUSY if a VCPU has already been created.

Allows use of adapter-interruption suppression.

7.11 KVM_CAP_PPC_SMT
--------------------

:Architectures: ppc
:Parameters: vsmt_mode, flags

Enabling this capability on a VM provides userspace with a way to set
the desired virtual SMT mode (i.e. the number of virtual CPUs per
virtual core). The virtual SMT mode, vsmt_mode, must be a power of 2
between 1 and 8. On POWER8, vsmt_mode must also be no greater than
the number of threads per subcore for the host. Currently flags must
be 0. A successful call to enable this capability will result in
vsmt_mode being returned when the KVM_CAP_PPC_SMT capability is
subsequently queried for the VM. This capability is only supported by
HV KVM, and can only be set before any VCPUs have been created.
The KVM_CAP_PPC_SMT_POSSIBLE capability indicates which virtual SMT
modes are available.

7.12 KVM_CAP_PPC_FWNMI
----------------------

:Architectures: ppc
:Parameters: none

With this capability, a machine check exception in the guest address
space will cause KVM to exit the guest with an NMI exit reason. This
enables QEMU to build an error log and branch to the machine check
handling routine registered by the guest kernel. Without this capability,
KVM will branch to the guest's 0x200 interrupt vector.
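
The vsmt_mode constraints listed in section 7.11 can be checked in userspace
before KVM_ENABLE_CAP is issued. The following C sketch shows one way to do
that. The function name, its parameters, and the convention that a
``host_threads_per_subcore`` of 0 means "no POWER8 limit applies" are
assumptions made for illustration; they are not part of the KVM API, and the
kernel remains the authority on which modes it accepts.

```c
#include <stdbool.h>

/* Hypothetical pre-check for the KVM_CAP_PPC_SMT arguments.
 * host_threads_per_subcore == 0 is this sketch's own convention for
 * "host is not POWER8, so no per-subcore limit is checked". */
static bool vsmt_mode_ok(unsigned int vsmt_mode, unsigned long flags,
                         unsigned int host_threads_per_subcore)
{
    if (flags != 0)                      /* flags must currently be 0 */
        return false;
    if (vsmt_mode < 1 || vsmt_mode > 8)  /* allowed range is 1..8 */
        return false;
    if (vsmt_mode & (vsmt_mode - 1))     /* must be a power of 2 */
        return false;
    if (host_threads_per_subcore != 0 &&
        vsmt_mode > host_threads_per_subcore)
        return false;                    /* POWER8 subcore limit */
    return true;
}
```

A VMM would still query KVM_CAP_PPC_SMT_POSSIBLE to learn which modes the
host actually offers; this check only rejects arguments that the text above
already rules out.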

7.13 KVM_CAP_X86_DISABLE_EXITS
------------------------------

:Architectures: x86
:Parameters: args[0] defines which exits are disabled
:Returns: 0 on success, -EINVAL when args[0] contains invalid exits

Valid bits in args[0] are::

  #define KVM_X86_DISABLE_EXITS_MWAIT            (1 << 0)
  #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
  #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
  #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)

Enabling this capability on a VM provides userspace with a way to no
longer intercept some instructions for improved latency in some
workloads, and is suggested when vCPUs are associated with dedicated
physical CPUs. More bits can be added in the future; userspace can
just pass the KVM_CHECK_EXTENSION result to KVM_ENABLE_CAP to disable
all such vmexits.

Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.

7.14 KVM_CAP_S390_HPAGE_1M
--------------------------

:Architectures: s390
:Parameters: none
:Returns: 0 on success, -EINVAL if hpage module parameter was not set
          or cmma is enabled, or the VM has the KVM_VM_S390_UCONTROL
          flag set

With this capability the KVM support for memory backing with 1M pages
through hugetlbfs can be enabled for a VM.
After the capability is
enabled, cmma can't be enabled anymore and pfmfi and the storage key
interpretation are disabled. If cmma has already been enabled or the
hpage module parameter is not set to 1, -EINVAL is returned.

While it is generally possible to create a huge page backed VM without
this capability, the VM will not be able to run.

7.15 KVM_CAP_MSR_PLATFORM_INFO
------------------------------

:Architectures: x86
:Parameters: args[0] whether feature should be enabled or not

With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise,
a #GP would be raised when the guest tries to access it. Currently, this
capability does not enable write permissions of this MSR for the guest.

7.16 KVM_CAP_PPC_NESTED_HV
--------------------------

:Architectures: ppc
:Parameters: none
:Returns: 0 on success, -EINVAL when the implementation doesn't support
          nested-HV virtualization.

HV-KVM on POWER9 and later systems allows for "nested-HV"
virtualization, which provides a way for a guest VM to run guests that
can run using the CPU's supervisor mode (privileged non-hypervisor
state). Enabling this capability on a VM depends on the CPU having
the necessary functionality and on the facility being enabled with a
kvm-hv module parameter.

7.17 KVM_CAP_EXCEPTION_PAYLOAD
------------------------------

:Architectures: x86
:Parameters: args[0] whether feature should be enabled or not

With this capability enabled, CR2 will not be modified prior to the
emulated VM-exit when L1 intercepts a #PF exception that occurs in
L2. Similarly, for kvm-intel only, DR6 will not be modified prior to
the emulated VM-exit when L1 intercepts a #DB exception that occurs in
L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or
#DB) exception for L2, exception.has_payload will be set and the
faulting address (or the new DR6 bits*) will be reported in the
exception_payload field. Similarly, when userspace injects a #PF (or
#DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set
exception.has_payload and to put the faulting address - or the new DR6
bits\ [#]_ - in the exception_payload field.

This capability also enables exception.pending in struct
kvm_vcpu_events, which allows userspace to distinguish between pending
and injected exceptions.


.. [#] For the new DR6 bits, note that bit 16 is set iff the #DB exception
       will clear DR6.RTM.

7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
--------------------------------------

:Architectures: x86, arm, arm64, mips
:Parameters: args[0] whether feature should be enabled or not

Valid flags are::

  #define KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE   (1 << 0)
  #define KVM_DIRTY_LOG_INITIALLY_SET           (1 << 1)

When KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE is set, KVM_GET_DIRTY_LOG will not
automatically clear and write-protect all pages that are returned as dirty.
Rather, userspace will have to do this operation separately using
KVM_CLEAR_DIRTY_LOG.

At the cost of a slightly more complicated operation, this provides better
scalability and responsiveness for two reasons. First, the
KVM_CLEAR_DIRTY_LOG ioctl can operate on a 64-page granularity rather
than requiring a full memslot to be synced; this ensures that KVM does not
take spinlocks for an extended period of time. Second, in some cases a
large amount of time can pass between a call to KVM_GET_DIRTY_LOG and
userspace actually using the data in the page. Pages can be modified
during this time, which is inefficient for both the guest and userspace:
the guest will incur a higher penalty due to write protection faults,
while userspace can see false reports of dirty pages. Manual reprotection
helps reduce this time, improving guest performance and reducing the
number of dirty log false positives.
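
To make the 64-page granularity concrete, the following C sketch widens an
arbitrary page range to 64-page boundaries, clamped to the end of the
memslot. The struct, the macro, and the function name are hypothetical
helpers rather than KVM API; the sketch assumes ``count`` is nonzero and
that the range lies inside the slot.

```c
#include <stdint.h>

#define CLEAR_GRANULE 64u   /* pages; the granularity named in the text */

/* Hypothetical description of a range to hand to KVM_CLEAR_DIRTY_LOG. */
struct page_range {
    uint64_t first_page;
    uint64_t num_pages;
};

/* Widen [first, first + count) so that it starts on a 64-page boundary
 * and ends either on a 64-page boundary or at the end of the slot.
 * Assumes count > 0 and first + count <= slot_pages. */
static struct page_range align_clear_range(uint64_t first, uint64_t count,
                                           uint64_t slot_pages)
{
    struct page_range r;
    uint64_t end = first + count;

    r.first_page = first - (first % CLEAR_GRANULE);
    end = (end + CLEAR_GRANULE - 1) / CLEAR_GRANULE * CLEAR_GRANULE;
    if (end > slot_pages)
        end = slot_pages;
    r.num_pages = end - r.first_page;
    return r;
}
```

Widening the range does not by itself reprotect extra pages: the bitmap
supplied with KVM_CLEAR_DIRTY_LOG still selects which pages inside the range
are cleared, so userspace would leave the bits for unprocessed pages at zero.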

With KVM_DIRTY_LOG_INITIALLY_SET set, all the bits of the dirty bitmap
will be initialized to 1 when created. This also improves performance because
dirty logging can be enabled gradually in small chunks on the first call
to KVM_CLEAR_DIRTY_LOG. KVM_DIRTY_LOG_INITIALLY_SET depends on
KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE (it is also only available on
x86 and arm64 for now).

KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was previously available under the name
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, but the implementation had bugs that made
it hard or impossible to use correctly. The availability of
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 signals that those bugs are fixed.
Userspace should not try to use KVM_CAP_MANUAL_DIRTY_LOG_PROTECT.

7.19 KVM_CAP_PPC_SECURE_GUEST
------------------------------

:Architectures: ppc

This capability indicates that KVM is running on a host that has
ultravisor firmware and thus can support a secure guest. On such a
system, a guest can ask the ultravisor to make it a secure guest,
one whose memory is inaccessible to the host except for pages which
are explicitly requested to be shared with the host. The ultravisor
notifies KVM when a guest requests to become a secure guest, and KVM
has the opportunity to veto the transition.

If present, this capability can be enabled for a VM, meaning that KVM
will allow the transition to secure guest mode. Otherwise KVM will
veto the transition.

7.20 KVM_CAP_HALT_POLL
----------------------

:Architectures: all
:Target: VM
:Parameters: args[0] is the maximum poll time in nanoseconds
:Returns: 0 on success; -1 on error

This capability overrides the kvm module parameter halt_poll_ns for the
target VM.

VCPU polling allows a VCPU to poll for wakeup events instead of immediately
scheduling during guest halts. The maximum time a VCPU can spend polling is
controlled by the kvm module parameter halt_poll_ns. This capability allows
the maximum halt time to be specified on a per-VM basis, effectively
overriding the module parameter for the target VM.

7.21 KVM_CAP_X86_USER_SPACE_MSR
-------------------------------

:Architectures: x86
:Target: VM
:Parameters: args[0] contains the mask of KVM_MSR_EXIT_REASON_* events to report
:Returns: 0 on success; -1 on error

This capability enables trapping of #GP-invoking RDMSR and WRMSR instructions
into user space.

When a guest requests to read or write an MSR, KVM may not implement all MSRs
that are relevant to a respective system.
It also does not differentiate by
CPU type.

To allow more fine-grained control over MSR handling, user space may enable
this capability. With it enabled, MSR accesses that match the mask specified in
args[0] and would cause KVM to inject a #GP into the guest will instead trigger
KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications, which user space
can then handle to implement model-specific MSR handling and/or user
notifications to inform a user that an MSR was not handled.

8. Other capabilities.
======================

This section lists capabilities that give information about other
features of the KVM implementation.

8.1 KVM_CAP_PPC_HWRNG
---------------------

:Architectures: ppc

This capability, if KVM_CHECK_EXTENSION indicates that it is
available, means that the kernel has an implementation of the
H_RANDOM hypercall backed by a hardware random-number generator.
If present, the kernel H_RANDOM handler can be enabled for guest use
with the KVM_CAP_PPC_ENABLE_HCALL capability.

8.2 KVM_CAP_HYPERV_SYNIC
------------------------

:Architectures: x86

This capability, if KVM_CHECK_EXTENSION indicates that it is
available, means that the kernel has an implementation of the
Hyper-V Synthetic interrupt controller (SynIC).
Hyper-V SynIC is
used to support Windows Hyper-V based guest paravirt drivers (VMBus).

In order to use SynIC, it has to be activated by setting this
capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
will disable the use of APIC hardware virtualization even if supported
by the CPU, as it's incompatible with SynIC auto-EOI behavior.

8.3 KVM_CAP_PPC_RADIX_MMU
-------------------------

:Architectures: ppc

This capability, if KVM_CHECK_EXTENSION indicates that it is
available, means that the kernel can support guests using the
radix MMU defined in Power ISA V3.00 (as implemented in the POWER9
processor).

8.4 KVM_CAP_PPC_HASH_MMU_V3
---------------------------

:Architectures: ppc

This capability, if KVM_CHECK_EXTENSION indicates that it is
available, means that the kernel can support guests using the
hashed page table MMU defined in Power ISA V3.00 (as implemented in
the POWER9 processor), including in-memory segment tables.

8.5 KVM_CAP_MIPS_VZ
-------------------

:Architectures: mips

This capability, if KVM_CHECK_EXTENSION on the main kvm handle indicates that
it is available, means that full hardware assisted virtualization capabilities
of the hardware are available for use through KVM.
An appropriate
KVM_VM_MIPS_* type must be passed to KVM_CREATE_VM to create a VM which
utilises it.

If KVM_CHECK_EXTENSION on a kvm VM handle indicates that this capability is
available, it means that the VM is using full hardware assisted virtualization
capabilities of the hardware. This is useful to check after creating a VM with
KVM_VM_MIPS_DEFAULT.

The value returned by KVM_CHECK_EXTENSION should be compared against known
values (see below). All other values are reserved. This is to allow for the
possibility of other hardware assisted virtualization implementations which
may be incompatible with the MIPS VZ ASE.

== ==========================================================================
 0 The trap & emulate implementation is in use to run guest code in user
   mode. Guest virtual memory segments are rearranged to fit the guest in the
   user mode address space.

 1 The MIPS VZ ASE is in use, providing full hardware assisted
   virtualization, including standard guest virtual memory segments.
== ==========================================================================

8.6 KVM_CAP_MIPS_TE
-------------------

:Architectures: mips

This capability, if KVM_CHECK_EXTENSION on the main kvm handle indicates that
it is available, means that the trap & emulate implementation is available to
run guest code in user mode, even if KVM_CAP_MIPS_VZ indicates that hardware
assisted virtualisation is also available. KVM_VM_MIPS_TE (0) must be passed
to KVM_CREATE_VM to create a VM which utilises it.

If KVM_CHECK_EXTENSION on a kvm VM handle indicates that this capability is
available, it means that the VM is using trap & emulate.

8.7 KVM_CAP_MIPS_64BIT
----------------------

:Architectures: mips

This capability indicates the supported architecture type of the guest, i.e. the
supported register and address width.

The values returned when this capability is checked by KVM_CHECK_EXTENSION on a
kvm VM handle correspond roughly to the CP0_Config.AT register field, and should
be checked specifically against known values (see below). All other values are
reserved.

== ========================================================================
 0 MIPS32 or microMIPS32.
   Both registers and addresses are 32-bits wide.
   It will only be possible to run 32-bit guest code.

 1 MIPS64 or microMIPS64 with access only to 32-bit compatibility segments.
   Registers are 64-bits wide, but addresses are 32-bits wide.
   64-bit guest code may run but cannot access MIPS64 memory segments.
   It will also be possible to run 32-bit guest code.

 2 MIPS64 or microMIPS64 with access to all address segments.
   Both registers and addresses are 64-bits wide.
   It will be possible to run 64-bit or 32-bit guest code.
== ========================================================================

8.9 KVM_CAP_ARM_USER_IRQ
------------------------

:Architectures: arm, arm64

This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
that if userspace creates a VM without an in-kernel interrupt controller, it
will be notified of changes to the output level of in-kernel emulated devices,
which can generate virtual interrupts, presented to the VM.
For such VMs, on every return to userspace, the kernel
updates the vcpu's run->s.regs.device_irq_level field to represent the actual
output level of the device.

Whenever kvm detects a change in the device output level, kvm guarantees at
least one return to userspace before running the VM. This exit could either
be a KVM_EXIT_INTR or any other exit event, like KVM_EXIT_MMIO.
This way,
userspace can always sample the device output level and re-compute the state of
the userspace interrupt controller. Userspace should always check the state
of run->s.regs.device_irq_level on every kvm exit.
The value in run->s.regs.device_irq_level can represent both level and edge
triggered interrupt signals, depending on the device. Edge triggered interrupt
signals will exit to userspace with the bit in run->s.regs.device_irq_level
set exactly once per edge signal.

The field run->s.regs.device_irq_level is available independent of
run->kvm_valid_regs or run->kvm_dirty_regs bits.

If KVM_CAP_ARM_USER_IRQ is supported, the KVM_CHECK_EXTENSION ioctl returns a
number larger than 0 indicating which version of this capability is implemented
and thereby which bits in run->s.regs.device_irq_level can signal values.

Currently the following bits are defined for the device_irq_level bitmap::

  KVM_CAP_ARM_USER_IRQ >= 1:

    KVM_ARM_DEV_EL1_VTIMER -  EL1 virtual timer
    KVM_ARM_DEV_EL1_PTIMER -  EL1 physical timer
    KVM_ARM_DEV_PMU        -  ARM PMU overflow interrupt signal

Future versions of kvm may implement additional events. These will get
indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
listed above.
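
As an illustration of sampling device_irq_level on every exit, the following
is a minimal sketch of how a userspace interrupt controller might work out
which device lines changed since the previous exit. The helper name
``irq_lines_changed`` is hypothetical. The bit values are repeated from the
arm64 uapi header only so that the sketch builds on any host; real code should
take them from <linux/kvm.h>.

```c
#include <stdint.h>

/*
 * Bit positions as defined in the arm64 uapi <asm/kvm.h>. They are repeated
 * here (an assumption for portability of this sketch) so it compiles on
 * hosts whose headers do not provide them.
 */
#ifndef KVM_ARM_DEV_EL1_VTIMER
#define KVM_ARM_DEV_EL1_VTIMER (1U << 0)
#define KVM_ARM_DEV_EL1_PTIMER (1U << 1)
#define KVM_ARM_DEV_PMU        (1U << 2)
#endif

/*
 * Hypothetical helper: given the device_irq_level value seen on the previous
 * exit and the one seen now, return the mask of lines whose level changed.
 * cap_version is the value returned by KVM_CHECK_EXTENSION for
 * KVM_CAP_ARM_USER_IRQ; bits not defined for that version are ignored.
 */
uint64_t irq_lines_changed(uint64_t prev, uint64_t cur, int cap_version)
{
    uint64_t known = 0;

    if (cap_version >= 1)
        known = KVM_ARM_DEV_EL1_VTIMER | KVM_ARM_DEV_EL1_PTIMER |
                KVM_ARM_DEV_PMU;

    return (prev ^ cur) & known;
}
```

A VMM would typically call such a helper after each KVM_RUN returns, passing
run->s.regs.device_irq_level as ``cur``, and then update its own interrupt
controller model for every bit set in the result.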

8.10 KVM_CAP_PPC_SMT_POSSIBLE
-----------------------------

:Architectures: ppc

Querying this capability returns a bitmap indicating the possible
virtual SMT modes that can be set using KVM_CAP_PPC_SMT. If bit N
(counting from the right) is set, then a virtual SMT mode of 2^N is
available.

8.11 KVM_CAP_HYPERV_SYNIC2
--------------------------

:Architectures: x86

This capability enables a newer version of Hyper-V Synthetic interrupt
controller (SynIC). The only difference from KVM_CAP_HYPERV_SYNIC is that KVM
doesn't clear SynIC message and event flags pages when they are enabled by
writing to the respective MSRs.

8.12 KVM_CAP_HYPERV_VP_INDEX
----------------------------

:Architectures: x86

This capability indicates that userspace can load HV_X64_MSR_VP_INDEX msr. Its
value is used to denote the target vcpu for a SynIC interrupt. For
compatibility, KVM initializes this msr to KVM's internal vcpu index. When this
capability is absent, userspace can still query this msr's value.

8.13 KVM_CAP_S390_AIS_MIGRATION
-------------------------------

:Architectures: s390
:Parameters: none

This capability indicates if the flic device will be able to get/set the
AIS states for migration via the KVM_DEV_FLIC_AISM_ALL attribute and allows
userspace to discover this without having to create a flic device.

8.14 KVM_CAP_S390_PSW
---------------------

:Architectures: s390

This capability indicates that the PSW is exposed via the kvm_run structure.

8.15 KVM_CAP_S390_GMAP
----------------------

:Architectures: s390

This capability indicates that the user space memory used as guest mapping can
be anywhere in the user memory address space, as long as the memory slots are
aligned and sized to a segment (1MB) boundary.

8.16 KVM_CAP_S390_COW
---------------------

:Architectures: s390

This capability indicates that the user space memory used as guest mapping can
use copy-on-write semantics as well as dirty pages tracking via read-only page
tables.

8.17 KVM_CAP_S390_BPB
---------------------

:Architectures: s390

This capability indicates that kvm will implement the interfaces to handle
reset, migration and nested KVM for branch prediction blocking. The stfle
facility 82 should not be provided to the guest without this capability.

8.18 KVM_CAP_HYPERV_TLBFLUSH
----------------------------

:Architectures: x86

This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
hypercalls:
HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.

8.19 KVM_CAP_ARM_INJECT_SERROR_ESR
----------------------------------

:Architectures: arm, arm64

This capability indicates that userspace can specify (via the
KVM_SET_VCPU_EVENTS ioctl) the syndrome value reported to the guest when it
takes a virtual SError interrupt exception.
If KVM advertises this capability, userspace can only specify the ISS field for
the ESR syndrome. Other parts of the ESR, such as the EC are generated by the
CPU when the exception is taken. If this virtual SError is taken to EL1 using
AArch64, this value will be reported in the ISS field of ESR_ELx.

See KVM_CAP_VCPU_EVENTS for more details.
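
Because only the ISS field may be specified, a VMM may want to validate a
syndrome value before handing it to KVM_SET_VCPU_EVENTS. The following is a
minimal sketch of such a check. The helper name ``serror_esr_is_iss_only`` and
the macro ``ESR_ISS_MASK`` are hypothetical; the mask relies on the ISS field
occupying bits [24:0] of ESR_ELx in the Arm architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* ISS occupies bits [24:0] of ESR_ELx; IL is bit 25 and EC is bits [31:26]. */
#define ESR_ISS_MASK ((UINT64_C(1) << 25) - 1)

/*
 * Hypothetical helper: return true if the syndrome value only uses ISS bits,
 * i.e. it does not try to set EC, IL or any higher bit.
 */
bool serror_esr_is_iss_only(uint64_t esr)
{
    return (esr & ~ESR_ISS_MASK) == 0;
}
```

A value that passes this check would then be placed in the serror_esr member
of the exception part of struct kvm_vcpu_events, with serror_pending and
serror_has_esr set, before issuing KVM_SET_VCPU_EVENTS on the vcpu fd.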

8.20 KVM_CAP_HYPERV_SEND_IPI
----------------------------

:Architectures: x86

This capability indicates that KVM supports paravirtualized Hyper-V IPI send
hypercalls:
HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.

8.21 KVM_CAP_HYPERV_DIRECT_TLBFLUSH
-----------------------------------

:Architectures: x86

This capability indicates that KVM running on top of Hyper-V hypervisor
enables Direct TLB flush for its guests meaning that TLB flush
hypercalls are handled by Level 0 hypervisor (Hyper-V) bypassing KVM.
Due to the different ABI for hypercall parameters between Hyper-V and
KVM, enabling this capability effectively disables all hypercall
handling by KVM (as some KVM hypercalls may be mistakenly treated as TLB
flush hypercalls by Hyper-V) so userspace should disable KVM identification
in CPUID and only expose Hyper-V identification. In this case, the guest
thinks it is running on Hyper-V and only uses Hyper-V hypercalls.

8.22 KVM_CAP_S390_VCPU_RESETS
-----------------------------

:Architectures: s390

This capability indicates that the KVM_S390_NORMAL_RESET and
KVM_S390_CLEAR_RESET ioctls are available.

8.23 KVM_CAP_S390_PROTECTED
---------------------------

:Architectures: s390

This capability indicates that the Ultravisor has been initialized and
KVM can therefore start protected VMs.
This capability governs the KVM_S390_PV_COMMAND ioctl and the
KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected
guests when the state change is invalid.

8.24 KVM_CAP_STEAL_TIME
-----------------------

:Architectures: arm64, x86

This capability indicates that KVM supports steal time accounting.
When steal time accounting is supported it may be enabled with
architecture-specific interfaces. This capability and the
architecture-specific interfaces must be consistent, i.e. if one says the
feature is supported, then the other should as well and vice versa. For arm64
see Documentation/virt/kvm/devices/vcpu.rst "KVM_ARM_VCPU_PVTIME_CTRL".
For x86 see Documentation/virt/kvm/msr.rst "MSR_KVM_STEAL_TIME".

8.25 KVM_CAP_S390_DIAG318
-------------------------

:Architectures: s390

This capability enables a guest to set information about its control program
(i.e. guest kernel type and version).
The information is helpful during
system/firmware service events, providing additional data about the guest
environments running on the machine.

The information is associated with the DIAGNOSE 0x318 instruction, which sets
an 8-byte value consisting of a one-byte Control Program Name Code (CPNC) and
a 7-byte Control Program Version Code (CPVC). The CPNC determines what
environment the control program is running in (e.g. Linux, z/VM...), and the
CPVC is used for information specific to the OS (e.g. Linux version, Linux
distribution...).

If this capability is available, then the CPNC and CPVC can be synchronized
between KVM and userspace via the sync regs mechanism (KVM_SYNC_DIAG318).

8.26 KVM_CAP_X86_USER_SPACE_MSR
-------------------------------

:Architectures: x86

This capability indicates that KVM supports deflection of MSR reads and
writes to user space. It can be enabled on a VM level. If enabled, MSR
accesses that would usually cause KVM to inject a #GP into the guest will
instead get bounced to user space through the KVM_EXIT_X86_RDMSR and
KVM_EXIT_X86_WRMSR exit notifications.

8.27 KVM_X86_SET_MSR_FILTER
---------------------------

:Architectures: x86

This capability indicates that KVM supports rejecting accesses to
user-defined MSRs.
With this capability exposed, KVM exports the new VM ioctl
KVM_X86_SET_MSR_FILTER which user space can call to specify bitmaps of MSR
ranges that KVM should reject access to.

In combination with KVM_CAP_X86_USER_SPACE_MSR, this allows user space to
trap and emulate MSRs that are outside of the scope of KVM as well as
limit the attack surface on KVM's MSR emulation code.

8.28 KVM_CAP_ENFORCE_PV_CPUID
-----------------------------

:Architectures: x86

When enabled, KVM will disable paravirtual features provided to the
guest according to the bits in the KVM_CPUID_FEATURES CPUID leaf
(0x40000001). Otherwise, a guest may use the paravirtual features
regardless of what has actually been exposed through the CPUID leaf.