| 7455cd17 | 29-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for accessing ICH_VMCR_EL2
When ICH_VMCR_EL2.VBPR1 is written in Secure state (SCR_EL3.NS==0) and then subsequently read in Non-secure state (SCR_EL3.NS==1), a wrong value migh
fix(cpus): workaround for accessing ICH_VMCR_EL2
When ICH_VMCR_EL2.VBPR1 is written in Secure state (SCR_EL3.NS==0) and then subsequently read in Non-secure state (SCR_EL3.NS==1), a wrong value might be returned. The same issue exists in the opposite way.
Adding workaround in EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored. For example, EL3 software should set SCR_EL3.NS to 1 when saving or restoring the value ICH_VMCR_EL2 for Non-secure(or Realm) state. EL3 software should clear SCR_EL3.NS to 0 when saving or restoring the value ICH_VMCR_EL2 for Secure state.
SDEN documentation: https://developer.arm.com/documentation/SDEN-1775101/latest/
Change-Id: I9f0403601c6346276e925f02eab55908b009d957 Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| 45c7328c | 20-Sep-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
fix(cpus): avoid SME related loss of context on powerdown
Travis' and Gelas' TRMs tell us to disable SME (set PSTATE.{ZA, SM} to 0) when we're attempting to power down. What they don't tell us is th
fix(cpus): avoid SME related loss of context on powerdown
Travis' and Gelas' TRMs tell us to disable SME (set PSTATE.{ZA, SM} to 0) when we're attempting to power down. What they don't tell us is that if this isn't done, the powerdown request will be rejected. On the CPU_OFF path that's not a problem - we can force SVCR to 0 and be certain the core will power off.
On the suspend to powerdown path, however, we cannot do this. The TRM also tells us that the sequence could also be aborted on eg. GIC interrupts. If this were to happen when we have overwritten SVCR to 0, upon a return to the caller they would experience a loss of context. We know that at least Linux may call into PSCI with SVCR != 0. One option is to save the entire SME context which would be quite expensive just to work around. Another option is to downgrade the request to a normal suspend when SME was left on. This option is better as this is expected to happen rarely enough to ignore the wasted power and we don't want to burden the generic (correct) path with needless context management.
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com> Change-Id: I698fa8490ebf51461f6aa8bba84f9827c5c46ad4
show more ...
|
| 2b5e00d4 | 19-Dec-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
feat(psci): allow cores to wake up from powerdown
The simplistic view of a core's powerdown sequence is that power is atomically cut upon calling `wfi`. However, it turns out that it has lots to do
feat(psci): allow cores to wake up from powerdown
The simplistic view of a core's powerdown sequence is that power is atomically cut upon calling `wfi`. However, it turns out that it has lots to do - it has to talk to the interconnect to exit coherency, clean caches, check for RAS errors, etc. These take significant amounts of time and are certainly not atomic. As such there is a significant window of opportunity for external events to happen. Many of these steps are not destructive to context, so theoretically, the core can just "give up" half way (or roll certain actions back) and carry on running. The point in this sequence after which roll back is not possible is called the point of no return.
One of these actions is the checking for RAS errors. It is possible for one to happen during this lengthy sequence, or at least remain undiscovered until that point. If the core were to continue powerdown when that happens, there would be no (easy) way to inform anyone about it. Rejecting the powerdown and letting software handle the error is the best way to implement this.
Arm cores since at least the a510 have included this exact feature. So far it hasn't been deemed necessary to account for it in firmware due to the low likelihood of this happening. However, events like GIC wakeup requests are much more probable. Older cores will powerdown and immediately power back up when this happens. Travis and Gelas include a feature similar to the RAS case above, called powerdown abandon. The idea is that this will improve the latency to service the interrupt by saving on work which the core and software need to do.
So far firmware has relied on the `wfi` being the point of no return and if it doesn't explicitly detect a pending interrupt quite early on, it will embark onto a sequence that it expects to end with shutdown. To accommodate for it not being a point of no return, we must undo all of the system management we did, just like in the warm boot entrypoint.
To achieve that, the pwr_domain_pwr_down_wfi hook must not be terminal. Most recent platforms do some platform management and finish on the standard `wfi`, followed by a panic or an endless loop as this is expected to not return. To make this generic, any platform that wishes to support wakeups must instead let common code call `psci_power_down_wfi()` right after. Besides wakeups, this lets common code handle powerdown errata better as well.
Then, the CPU_OFF case is simple - PSCI does not allow it to return. So the best that can be done is to attempt the `wfi` a few times (the choice of 32 is arbitrary) in the hope that the wakeup is transient. If it isn't, the only choice is to panic, as the system is likely to be in a bad state, eg. interrupts weren't routed away. The same applies for SYSTEM_OFF, SYSTEM_RESET, and SYSTEM_RESET2. There the panic won't matter as the system is going offline one way or another. The RAS case will be considered in a separate patch.
Now, the CPU_SUSPEND case is more involved. First, to powerdown it must wipe its context as it is not written on warm boot. But it cannot be overwritten in case of a wakeup. To avoid the catch 22, save a copy that will only be used if powerdown fails. That is about 500 bytes on the stack so it hopefully doesn't tip anyone over any limits. In future that can be avoided by having a core manage its own context.
Second, when the core wakes up, it must undo anything it did to prepare for poweroff, which for the cores we care about, is writing CPUPWRCTLR_EL1.CORE_PWRDN_EN. The least intrusive for the cpu library way of doing this is to simply call the power off hook again and have the hook toggle the bit. If in the future there need to be more complex sequences, their direction can be advised on the value of this bit.
Third, do the actual "resume". Most of the logic is already there for the retention suspend, so that only needs a small touch up to apply to the powerdown case as well. The missing bit is the powerdown specific state management. Luckily, the warmboot entrypoint does exactly that already too, so steal that and we're done.
All of this is hidden behind a FEAT_PABANDON flag since it has a large memory and runtime cost that we don't want to burden non pabandon cores with.
Finally, do some function renaming to better reflect their purpose and make names a little bit more consistent.
Change-Id: I2405b59300c2e24ce02e266f91b7c51474c1145f Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| cc94e71b | 26-Sep-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(cpus): undo errata mitigations
The workarounds introduced in the three patches starting at 888eafa00b99aa06b4ff688407336811a7ff439a assumed that any powerdown request will be (forced to be)
refactor(cpus): undo errata mitigations
The workarounds introduced in the three patches starting at 888eafa00b99aa06b4ff688407336811a7ff439a assumed that any powerdown request will be (forced to be) terminal. This assumption can no longer be the case for new CPUs so there is a need to revisit these older cores. Since we may wake up, we now need to respect the workaround's recommendation that the workaround needs to be reverted on wakeup. So do exactly that.
Introduce a new helper to toggle bits in assembly. This allows us to call the workaround twice, with the first call setting the workaround and second undoing it. This is also used for gelas' an travis' powerdown routines. This is so the same function can be called again
Also fix the condition in the cpu helper macro as it was subtly wrong
Change-Id: Iff9e5251dc9d8670d085d88c070f78991955e7c3 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 8ae6b1ad | 28-Jan-2025 |
Arvind Ram Prakash <arvind.ramprakash@arm.com> |
fix(security): apply SMCCC_ARCH_WORKAROUND_4 to affected cpus
This patch implements SMCCC_ARCH_WORKAROUND_4 and allows discovery through SMCCC_ARCH_FEATURES. This mechanism is enabled if CVE_2024_78
fix(security): apply SMCCC_ARCH_WORKAROUND_4 to affected cpus
This patch implements SMCCC_ARCH_WORKAROUND_4 and allows discovery through SMCCC_ARCH_FEATURES. This mechanism is enabled if CVE_2024_7881 [1] is enabled by the platform. If CVE_2024_7881 mitigation is implemented, the discovery call returns 0, if not -1 (SMC_ARCH_CALL_NOT_SUPPORTED).
For more information about SMCCC_ARCH_WORKAROUND_4 [2], please refer to the SMCCC Specification reference provided below.
[1]: https://developer.arm.com/Arm%20Security%20Center/Arm%20CPU%20Vulnerability%20CVE-2024-7881 [2]: https://developer.arm.com/documentation/den0028/latest
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com> Change-Id: I1b1ffaa1f806f07472fd79d5525f81764d99bc79
show more ...
|