| 45c7328c | 20-Sep-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
fix(cpus): avoid SME related loss of context on powerdown
Travis' and Gelas' TRMs tell us to disable SME (set PSTATE.{ZA, SM} to 0) when we're attempting to power down. What they don't tell us is th
fix(cpus): avoid SME related loss of context on powerdown
Travis' and Gelas' TRMs tell us to disable SME (set PSTATE.{ZA, SM} to 0) when we're attempting to power down. What they don't tell us is that if this isn't done, the powerdown request will be rejected. On the CPU_OFF path that's not a problem - we can force SVCR to 0 and be certain the core will power off.
On the suspend to powerdown path, however, we cannot do this. The TRM also tells us that the sequence could also be aborted on eg. GIC interrupts. If this were to happen when we have overwritten SVCR to 0, upon a return to the caller they would experience a loss of context. We know that at least Linux may call into PSCI with SVCR != 0. One option is to save the entire SME context which would be quite expensive just to work around. Another option is to downgrade the request to a normal suspend when SME was left on. This option is better as this is expected to happen rarely enough to ignore the wasted power and we don't want to burden the generic (correct) path with needless context management.
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com> Change-Id: I698fa8490ebf51461f6aa8bba84f9827c5c46ad4
show more ...
|
| 2b5e00d4 | 19-Dec-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
feat(psci): allow cores to wake up from powerdown
The simplistic view of a core's powerdown sequence is that power is atomically cut upon calling `wfi`. However, it turns out that it has lots to do
feat(psci): allow cores to wake up from powerdown
The simplistic view of a core's powerdown sequence is that power is atomically cut upon calling `wfi`. However, it turns out that it has lots to do - it has to talk to the interconnect to exit coherency, clean caches, check for RAS errors, etc. These take significant amounts of time and are certainly not atomic. As such there is a significant window of opportunity for external events to happen. Many of these steps are not destructive to context, so theoretically, the core can just "give up" half way (or roll certain actions back) and carry on running. The point in this sequence after which roll back is not possible is called the point of no return.
One of these actions is the checking for RAS errors. It is possible for one to happen during this lengthy sequence, or at least remain undiscovered until that point. If the core were to continue powerdown when that happens, there would be no (easy) way to inform anyone about it. Rejecting the powerdown and letting software handle the error is the best way to implement this.
Arm cores since at least the a510 have included this exact feature. So far it hasn't been deemed necessary to account for it in firmware due to the low likelihood of this happening. However, events like GIC wakeup requests are much more probable. Older cores will powerdown and immediately power back up when this happens. Travis and Gelas include a feature similar to the RAS case above, called powerdown abandon. The idea is that this will improve the latency to service the interrupt by saving on work which the core and software need to do.
So far firmware has relied on the `wfi` being the point of no return and if it doesn't explicitly detect a pending interrupt quite early on, it will embark onto a sequence that it expects to end with shutdown. To accommodate for it not being a point of no return, we must undo all of the system management we did, just like in the warm boot entrypoint.
To achieve that, the pwr_domain_pwr_down_wfi hook must not be terminal. Most recent platforms do some platform management and finish on the standard `wfi`, followed by a panic or an endless loop as this is expected to not return. To make this generic, any platform that wishes to support wakeups must instead let common code call `psci_power_down_wfi()` right after. Besides wakeups, this lets common code handle powerdown errata better as well.
Then, the CPU_OFF case is simple - PSCI does not allow it to return. So the best that can be done is to attempt the `wfi` a few times (the choice of 32 is arbitrary) in the hope that the wakeup is transient. If it isn't, the only choice is to panic, as the system is likely to be in a bad state, eg. interrupts weren't routed away. The same applies for SYSTEM_OFF, SYSTEM_RESET, and SYSTEM_RESET2. There the panic won't matter as the system is going offline one way or another. The RAS case will be considered in a separate patch.
Now, the CPU_SUSPEND case is more involved. First, to powerdown it must wipe its context as it is not written on warm boot. But it cannot be overwritten in case of a wakeup. To avoid the catch 22, save a copy that will only be used if powerdown fails. That is about 500 bytes on the stack so it hopefully doesn't tip anyone over any limits. In future that can be avoided by having a core manage its own context.
Second, when the core wakes up, it must undo anything it did to prepare for poweroff, which for the cores we care about, is writing CPUPWRCTLR_EL1.CORE_PWRDN_EN. The least intrusive for the cpu library way of doing this is to simply call the power off hook again and have the hook toggle the bit. If in the future there need to be more complex sequences, their direction can be advised on the value of this bit.
Third, do the actual "resume". Most of the logic is already there for the retention suspend, so that only needs a small touch up to apply to the powerdown case as well. The missing bit is the powerdown specific state management. Luckily, the warmboot entrypoint does exactly that already too, so steal that and we're done.
All of this is hidden behind a FEAT_PABANDON flag since it has a large memory and runtime cost that we don't want to burden non pabandon cores with.
Finally, do some function renaming to better reflect their purpose and make names a little bit more consistent.
Change-Id: I2405b59300c2e24ce02e266f91b7c51474c1145f Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| cc94e71b | 26-Sep-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(cpus): undo errata mitigations
The workarounds introduced in the three patches starting at 888eafa00b99aa06b4ff688407336811a7ff439a assumed that any powerdown request will be (forced to be)
refactor(cpus): undo errata mitigations
The workarounds introduced in the three patches starting at 888eafa00b99aa06b4ff688407336811a7ff439a assumed that any powerdown request will be (forced to be) terminal. This assumption can no longer be the case for new CPUs so there is a need to revisit these older cores. Since we may wake up, we now need to respect the workaround's recommendation that the workaround needs to be reverted on wakeup. So do exactly that.
Introduce a new helper to toggle bits in assembly. This allows us to call the workaround twice, with the first call setting the workaround and second undoing it. This is also used for gelas' an travis' powerdown routines. This is so the same function can be called again
Also fix the condition in the cpu helper macro as it was subtly wrong
Change-Id: Iff9e5251dc9d8670d085d88c070f78991955e7c3 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 8ae6b1ad | 28-Jan-2025 |
Arvind Ram Prakash <arvind.ramprakash@arm.com> |
fix(security): apply SMCCC_ARCH_WORKAROUND_4 to affected cpus
This patch implements SMCCC_ARCH_WORKAROUND_4 and allows discovery through SMCCC_ARCH_FEATURES. This mechanism is enabled if CVE_2024_78
fix(security): apply SMCCC_ARCH_WORKAROUND_4 to affected cpus
This patch implements SMCCC_ARCH_WORKAROUND_4 and allows discovery through SMCCC_ARCH_FEATURES. This mechanism is enabled if CVE_2024_7881 [1] is enabled by the platform. If CVE_2024_7881 mitigation is implemented, the discovery call returns 0, if not -1 (SMC_ARCH_CALL_NOT_SUPPORTED).
For more information about SMCCC_ARCH_WORKAROUND_4 [2], please refer to the SMCCC Specification reference provided below.
[1]: https://developer.arm.com/Arm%20Security%20Center/Arm%20CPU%20Vulnerability%20CVE-2024-7881 [2]: https://developer.arm.com/documentation/den0028/latest
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com> Change-Id: I1b1ffaa1f806f07472fd79d5525f81764d99bc79
show more ...
|
| b0521a16 | 06-Sep-2024 |
Arvind Ram Prakash <arvind.ramprakash@arm.com> |
fix(security): add CVE-2024-7881 mitigation to Cortex-X3
This patch mitigates CVE-2024-7881 [1] by setting CPUACTLR6_EL1[41] to 1 for Cortex-X3 CPU.
[1]: https://developer.arm.com/Arm%20Security%20
fix(security): add CVE-2024-7881 mitigation to Cortex-X3
This patch mitigates CVE-2024-7881 [1] by setting CPUACTLR6_EL1[41] to 1 for Cortex-X3 CPU.
[1]: https://developer.arm.com/Arm%20Security%20Center/Arm%20CPU%20Vulnerability%20CVE-2024-7881
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com> Change-Id: I410517d175a80fc6f459fa6ce5c30c0a38db9eaf
show more ...
|
| f532cd30 | 15-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
Merge changes I137f69be,Ia2e7168f,I0e569d12,I614272ec,Ib68293f2 into integration
* changes: perf(psci): pass my_core_pos around instead of calling it repeatedly refactor(psci): move timestamp co
Merge changes I137f69be,Ia2e7168f,I0e569d12,I614272ec,Ib68293f2 into integration
* changes: perf(psci): pass my_core_pos around instead of calling it repeatedly refactor(psci): move timestamp collection to psci_pwrdown_cpu refactor(psci): factor common code out of the standby finisher refactor(psci): don't use PSCI_INVALID_PWR_LVL to signal OFF state docs(psci): drop outdated cache maintenance comment
show more ...
|
| 3b802105 | 06-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(psci): pass my_core_pos around instead of calling it repeatedly
On some platforms plat_my_core_pos is a nontrivial function that takes a bit of time and the compiler really doesn't like to inli
perf(psci): pass my_core_pos around instead of calling it repeatedly
On some platforms plat_my_core_pos is a nontrivial function that takes a bit of time and the compiler really doesn't like to inline. In the PSCI library, at least, we have no need to keep repeatedly calling it and we can instead pass it around as an argument. This saves on a lot of redundant calls, speeding the library up a bit.
Change-Id: I137f69bea80d7cac90d7a20ffe98e1ba8d77246f Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 9b1e800e | 10-Oct-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(psci): move timestamp collection to psci_pwrdown_cpu
psci_pwrdown_cpu has two callers, both of which save timestamps meant to measure how much time the cache maintenance operations take. Mo
refactor(psci): move timestamp collection to psci_pwrdown_cpu
psci_pwrdown_cpu has two callers, both of which save timestamps meant to measure how much time the cache maintenance operations take. Move the timestamp collection inside to save on a bit of code duplication.
Change-Id: Ia2e7168faf7773d99b696cbdb6c98db7b58e31cf Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 44ee7714 | 30-Sep-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(psci): factor common code out of the standby finisher
psci_suspend_to_standby_finisher and psci_cpu_suspend_finish do mostly the same stuff, besides the system management associated with th
refactor(psci): factor common code out of the standby finisher
psci_suspend_to_standby_finisher and psci_cpu_suspend_finish do mostly the same stuff, besides the system management associated with their respective wakeup paths. So bring the wake from standby path in line with the wake from reset path - caller acquires locks and manages context. This way both behave in vaguely the same way. We can also bring their names in line so it's more apparent how they are different.
This is in preparation for cores waking from sleep, coming in another patch. No functional change is expected.
Change-Id: I0e569d12f65d231606080faa0149d22efddc386d Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 0c836554 | 30-Sep-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(psci): don't use PSCI_INVALID_PWR_LVL to signal OFF state
The target_pwrlvl field in the psci cpu data struct only stores the highest power domain that a CPU_SUSPEND call affected, and is u
refactor(psci): don't use PSCI_INVALID_PWR_LVL to signal OFF state
The target_pwrlvl field in the psci cpu data struct only stores the highest power domain that a CPU_SUSPEND call affected, and is used to resume those same domains on warm reset. If the cpu is otherwise OFF (never turned on or CPU_OFF), then this needs to be the highest power level because we don't know the highest level that will be off.
So skip the invalidation and always keep the field to the maximum value. During suspend the field will be lowered to the appropriate value and then put back after wakeup.
Also, do that in the suspend to standby path as well as it will have been written before the sleep and it might end up incorrect.
Change-Id: I614272ec387e1d83023c94700780a0f538a9a6b6 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|