MDS - Microarchitectural Data Sampling
======================================

Microarchitectural Data Sampling is a hardware vulnerability which allows
unprivileged speculative access to data which is available in various CPU
internal buffers.

Affected processors
-------------------

This vulnerability affects a wide range of Intel processors. The
vulnerability is not present on:

   - Processors from AMD, Centaur and other non Intel vendors

   - Older processor models, where the CPU family is < 6

   - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)

   - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
     IA32_ARCH_CAPABILITIES MSR.

Whether a processor is affected or not can be read out from the MDS
vulnerability file in sysfs. See :ref:`mds_sys_info`.

Not all processors are affected by all variants of MDS, but the mitigation
is identical for all of them so the kernel treats them as a single
vulnerability.
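
The ARCH_CAP_MDS_NO check from the list above can be illustrated with a
minimal sketch. It assumes that MDS_NO is bit 5 of IA32_ARCH_CAPABILITIES
(MSR 0x10a), which matches the kernel's ``ARCH_CAP_MDS_NO`` definition; the
function name ``mds_no_set`` is hypothetical and exists only for this
illustration.

```python
# Assumption: IA32_ARCH_CAPABILITIES is MSR 0x10a and MDS_NO is bit 5,
# as in the kernel's ARCH_CAP_MDS_NO definition.
IA32_ARCH_CAPABILITIES = 0x10A
ARCH_CAP_MDS_NO = 1 << 5


def mds_no_set(arch_capabilities: int) -> bool:
    """Return True if the MDS_NO bit is set in a raw MSR value.

    Hypothetical helper: it only tests the bit and does not read the MSR.
    """
    return bool(arch_capabilities & ARCH_CAP_MDS_NO)
```

The sketch only evaluates a raw value that was obtained elsewhere. For
administrators the sysfs file remains the simpler interface, because it also
accounts for the vendor, family and model exclusions listed above.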

Related CVEs
------------

The following CVE entries are related to the MDS vulnerability:

   ==============  =====  ===================================================
   CVE-2018-12126  MSBDS  Microarchitectural Store Buffer Data Sampling
   CVE-2018-12130  MFBDS  Microarchitectural Fill Buffer Data Sampling
   CVE-2018-12127  MLPDS  Microarchitectural Load Port Data Sampling
   CVE-2019-11091  MDSUM  Microarchitectural Data Sampling Uncacheable Memory
   ==============  =====  ===================================================

Problem
-------

When performing store, load, or L1 refill operations, processors write data
into temporary microarchitectural structures (buffers). The data in the
buffer can be forwarded to load operations as an optimization.

Under certain conditions, usually a fault/assist caused by a load
operation, data unrelated to the load memory address can be speculatively
forwarded from the buffers. Because the load operation causes a fault or
assist and its result will be discarded, the forwarded data will not cause
incorrect program execution or state changes. But a malicious operation
may be able to forward this speculative data to a disclosure gadget, which
in turn allows the value to be inferred via a cache side channel attack.

Because the buffers are potentially shared between Hyper-Threads, cross
Hyper-Thread attacks are possible.

Deeper technical information is available in the MDS specific x86
architecture section: :ref:`Documentation/x86/mds.rst <mds>`.


Attack scenarios
----------------

Attacks against the MDS vulnerabilities can be mounted from malicious
unprivileged user space applications running on hosts or guests. Malicious
guest OSes can obviously mount attacks as well.

Contrary to other speculation based vulnerabilities, the MDS vulnerability
does not allow the attacker to control the memory target address. As a
consequence the attacks are purely sampling based, but as demonstrated with
the TLBleed attack, samples can be postprocessed successfully.

Web-Browsers
^^^^^^^^^^^^

  It's unclear whether attacks through Web-Browsers are possible at
  all. The exploitation through JavaScript is considered very unlikely,
  but other widely used web technologies like WebAssembly could possibly be
  abused.


.. _mds_sys_info:

MDS system information
-----------------------

The Linux kernel provides a sysfs interface to enumerate the current MDS
status of the system: whether the system is vulnerable, and which
mitigations are active.
The relevant sysfs file is:

/sys/devices/system/cpu/vulnerabilities/mds

The possible values in this file are:

  .. list-table::

     * - 'Not affected'
       - The processor is not vulnerable
     * - 'Vulnerable'
       - The processor is vulnerable, but no mitigation is enabled
     * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
       - The processor is vulnerable but microcode is not updated.

         The mitigation is enabled on a best effort basis. See :ref:`vmwerv`
     * - 'Mitigation: Clear CPU buffers'
       - The processor is vulnerable and the CPU buffer clearing mitigation is
         enabled.

If the processor is vulnerable then the following information is appended
to the above information:

    ========================  ============================================
    'SMT vulnerable'          SMT is enabled
    'SMT mitigated'           SMT is enabled and mitigated
    'SMT disabled'            SMT is disabled
    'SMT Host state unknown'  Kernel runs in a VM, Host SMT state unknown
    ========================  ============================================

.. _vmwerv:

Best effort mitigation mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^

  If the processor is vulnerable, but the availability of the microcode based
  mitigation mechanism is not advertised via CPUID, the kernel selects a best
  effort mitigation mode. This mode invokes the mitigation instructions
  without a guarantee that they clear the CPU buffers.

  This is done to address virtualization scenarios where the host has the
  microcode update applied, but the hypervisor is not yet updated to expose
  the CPUID to the guest. If the host has updated microcode the protection
  takes effect; otherwise a few CPU cycles are wasted pointlessly.

  The state in the mds sysfs file reflects this situation accordingly.


Mitigation mechanism
--------------------

The kernel detects the affected CPUs and the presence of the microcode
which is required.

If a CPU is affected and the microcode is available, then the kernel
enables the mitigation by default. The mitigation can be controlled at boot
time via a kernel command line option. See
:ref:`mds_mitigation_control_command_line`.

.. _cpu_buffer_clear:

CPU buffer clearing
^^^^^^^^^^^^^^^^^^^

  The mitigation for MDS clears the affected CPU buffers on return to user
  space and when entering a guest.

  If SMT is enabled it also clears the buffers on idle entry when the CPU
  is only affected by MSBDS and not any other MDS variant, because the
  other variants cannot be protected against cross Hyper-Thread attacks.

  For CPUs which are only affected by MSBDS the user space, guest and idle
  transition mitigations are sufficient and SMT is not affected.

.. _virt_mechanism:

Virtualization mitigation
^^^^^^^^^^^^^^^^^^^^^^^^^

  The protection for host to guest transition depends on the L1TF
  vulnerability of the CPU:

  - CPU is affected by L1TF:

    If the L1D flush mitigation is enabled and up to date microcode is
    available, the L1D flush mitigation is automatically protecting the
    guest transition.

    If the L1D flush mitigation is disabled then the MDS mitigation is
    invoked explicitly when the host MDS mitigation is enabled.

    For details on L1TF and virtualization see:
    :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <mitigation_control_kvm>`.

  - CPU is not affected by L1TF:

    CPU buffers are flushed before entering the guest when the host MDS
    mitigation is enabled.
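
The two cases above can be sketched as a small decision function. This is an
illustration of the documented rules only, not kernel code; the function and
parameter names are hypothetical, and the L1D flush case assumes that up to
date microcode is available.

```python
def guest_entry_mds_state(mds_affected: bool,
                          l1tf_affected: bool,
                          vmx_l1d_flush_enabled: bool,
                          host_mds_enabled: bool) -> str:
    """Return the MDS state for the host to guest transition.

    Hypothetical helper that mirrors the rules described in the text.
    """
    if not mds_affected:
        return "Not affected"
    # On L1TF affected CPUs an enabled L1D flush already protects the
    # guest transition (assumes up to date microcode).
    if l1tf_affected and vmx_l1d_flush_enabled:
        return "Mitigated"
    # Otherwise the protection depends on the host MDS mitigation.
    return "Mitigated" if host_mds_enabled else "Vulnerable"
```

For example, an L1TF affected CPU with the L1D flush disabled and the host
MDS mitigation off evaluates to ``Vulnerable``, while the same CPU with the
L1D flush enabled evaluates to ``Mitigated`` regardless of the host MDS
setting.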

  The resulting MDS protection matrix for the host to guest transition:

  ============  =====  =============  ============  =================
  L1TF          MDS    VMX-L1FLUSH    Host MDS      MDS-State
  ============  =====  =============  ============  =================
  Don't care    No     Don't care     N/A           Not affected
  Yes           Yes    Disabled       Off           Vulnerable
  Yes           Yes    Disabled       Full          Mitigated
  Yes           Yes    Enabled        Don't care    Mitigated
  No            Yes    N/A            Off           Vulnerable
  No            Yes    N/A            Full          Mitigated
  ============  =====  =============  ============  =================

  This only covers the host to guest transition, i.e. prevents leakage from
  host to guest, but does not protect the guest internally. Guests need to
  have their own protections.

.. _xeon_phi:

XEON PHI specific considerations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  The XEON PHI processor family is affected by MSBDS which can be exploited
  cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
  the use of MWAIT in user space (Ring 3), which opens a potential attack
  vector for malicious user space. The exposure can be disabled on the kernel
  command line with the 'ring3mwait=disable' command line option.

  XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
  before the CPU enters an idle state.
  As XEON PHI is not affected by L1TF either, disabling SMT is not required
  for full protection.

.. _mds_smt_control:

SMT control
^^^^^^^^^^^

  All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
  means on CPUs which are affected by MFBDS or MLPDS it is necessary to
  disable SMT for full protection. These are most of the affected CPUs; the
  exception is XEON PHI, see :ref:`xeon_phi`.

  Disabling SMT can have a significant performance impact, but the impact
  depends on the type of workloads.

  See the relevant chapter in the L1TF mitigation documentation for details:
  :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.


.. _mds_mitigation_control_command_line:

Mitigation control on the kernel command line
---------------------------------------------

The kernel command line allows the MDS mitigations to be controlled at boot
time with the option "mds=". The valid arguments for this option are:

  ============  =============================================================
  full          If the CPU is vulnerable, enable all available mitigations
                for the MDS vulnerability: CPU buffer clearing on exit to
                userspace and when entering a VM. Idle transitions are
                protected as well if SMT is enabled.

                It does not automatically disable SMT.

  full,nosmt    The same as mds=full, with SMT disabled on vulnerable
                CPUs. This is the complete mitigation.

  off           Disables MDS mitigations completely.

  ============  =============================================================

Not specifying this option is equivalent to "mds=full". For processors
that are affected by both TAA (TSX Asynchronous Abort) and MDS,
specifying just "mds=off" without an accompanying "tsx_async_abort=off"
will have no effect as the same mitigation is used for both
vulnerabilities.

Mitigation selection guide
--------------------------

1. Trusted userspace
^^^^^^^^^^^^^^^^^^^^

   If all userspace applications are from a trusted source and do not
   execute untrusted code which is supplied externally, then the mitigation
   can be disabled.


2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The same considerations as for trusted user space above apply.

3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

   The protection depends on the state of the L1TF mitigations.
   See :ref:`virt_mechanism`.

   If the MDS mitigation is enabled and SMT is disabled, guest to host and
   guest to guest attacks are prevented.
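
When choosing command line options, the interaction between "mds=" and
"tsx_async_abort=" described in the command line section is easy to
overlook. The following sketch models it for a CPU that is affected by MDS.
It is a simplified illustration and not the kernel's option parser: the
function name is hypothetical, the sketch assumes that the last occurrence of
an option wins and that an unspecified "tsx_async_abort=" behaves like
"full", and it ignores any other options that may influence the mitigation.

```python
def buffer_clearing_requested(cmdline: str, affected_by_taa: bool) -> bool:
    """Model whether CPU buffer clearing stays enabled on an MDS affected CPU.

    Hypothetical helper; assumes defaults of "full" for both options.
    """
    opts = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ("mds", "tsx_async_abort"):
            opts[key] = value  # assumption: the last occurrence wins
    mds_on = opts.get("mds", "full") != "off"
    # TAA shares the mitigation, so it keeps buffer clearing enabled
    # unless it is switched off as well.
    taa_on = affected_by_taa and opts.get("tsx_async_abort", "full") != "off"
    return mds_on or taa_on
```

With these assumptions, "mds=off" alone leaves buffer clearing enabled on a
CPU that is also affected by TAA, and only the combination of "mds=off" and
"tsx_async_abort=off" disables it.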

.. _mds_default_mitigations:

Default mitigations
-------------------

  The kernel default mitigations for vulnerable processors are:

  - Enable CPU buffer clearing

  The kernel does not by default enforce the disabling of SMT, which leaves
  SMT systems vulnerable when running untrusted code. The same rationale as
  for L1TF applies.
  See :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <default_mitigations>`.
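
To check which mitigation state is actually in effect, the sysfs file
described in :ref:`mds_sys_info` can be read with a short script. The sketch
below is one possible way to do this. The helper names are hypothetical, and
the sketch assumes that the appended SMT information is separated from the
main state by a semicolon.

```python
from pathlib import Path

# Path taken from the "MDS system information" section.
MDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/mds")


def parse_mds_status(text: str):
    """Split a status line into (state, smt_info).

    Assumption: the SMT information follows the state after a ';'.
    smt_info is None when nothing is appended.
    """
    state, _, smt = text.strip().partition(";")
    return state.strip(), (smt.strip() or None)


def read_mds_status(path: Path = MDS_SYSFS):
    """Read and parse the MDS status file (hypothetical helper)."""
    return parse_mds_status(path.read_text())
```

On a system that exposes the file, ``read_mds_status()`` returns the state
string together with the SMT information, so a monitoring script can flag
hosts that report 'Vulnerable' or 'SMT vulnerable'.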