| 0f57a388 | 03-Jul-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
fix(cpufeat): do feature detection before feature enablement
Situations where feature configuration does not reflect hardware's features can cause unhandled exceptions at EL3. Feature detection is m
fix(cpufeat): do feature detection before feature enablement
Situations where feature configuration does not reflect hardware's features can cause unhandled exceptions at EL3. Feature detection is meant to guard against these errors by checking hardware against the configuration. For this to happen though, feature detection has to happen before these unhandled exceptions have had a chance to happen.
Change-Id: I47f05a9f01321e011623083afb638552311ed013 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 5d893410 | 07-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(gic): promote most of the GIC driver to common code
More often than not, Arm based systems include some revision of a GIC. There are two ways of adding support for them in platform code - c
refactor(gic): promote most of the GIC driver to common code
More often than not, Arm based systems include some revision of a GIC. There are two ways of adding support for them in platform code - calling the top-level helpers from plat/arm/common/arm_gicvX.c or by using the driver directly. Both of these methods allow for a high degree of customisation - most functions are defined to be weak and there are no calls to any of them in generic code.
As it turns out, requirements around those GICs are largely the same. Platforms that use arm_gicvX.c use the helpers identically among each other. Platforms that use the driver directly tend to end up with calls that look a lot like the arm_gicvX.c helpers and the weakness of the functions are never exercised.
All of this results in a lot of code duplication to do what is essentially the same thing. Even though it's not a lot of code, when multiplied among many platforms it becomes significant and makes refactoring it quite difficult. It's also bug prone since the steps are a little convoluted and things are likely to work even with subtle errors (see 50009f61177421118f42d6a000611ba0e613d54b).
So promote as much of the GIC to be called from common code. Do the setup in bl31_main() and have every PSCI method do the state management directly instead of delegating it to the platform hooks. We can base this implementation on arm_gicvX.c since they already offer logical names and have worked quite well so far with minimal changes.
The main benefit of doing this is reduced code duplication. If we assume that, outside of some platform setup, GIC management is identical, then a platform can add support by telling the build system, regardless of GIC revision. The other benefit is performance - BL31 and PSCI already know the core_pos and they can pass it as an argument instead of having to call plat_my_core_pos(). Now, the only platform specific GIC actions necessary are the saving and restoring of context on entering and exiting a power domain. The PSCI library does not keep track of this so it is unable perform it itself. The routines themselves are also provided.
For compatibility all of this is hidden behind a build flag. Platforms are encouraged to adopt this driver, but it would not be practical to convert and validate every GIC based platform.
This patch renames the functions in question to follow the gic_<function>() convention. This allows the names to be version agnostic.
Finally, drop the weak definitions - they are unused, likely to remain so, and can be added back if the need arises.
Change-Id: I5b5267f4b72f633fb1096400ec8e4b208694135f Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 51997e3d | 02-Apr-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpufeat): centralise PAuth key saving
prepare_el3_entry() is meant to be the one-stop shop for all the context we must fiddle with to enter EL3 proper. However, PAuth is the one exception, happ
perf(cpufeat): centralise PAuth key saving
prepare_el3_entry() is meant to be the one-stop shop for all the context we must fiddle with to enter EL3 proper. However, PAuth is the one exception, happening right after. Absorb it into prepare_el3_entry(), handling the BL1/BL31 difference.
This is a good time to also move the key saving into the enable function, also to centralise. With this it becomes apparent that saving keys just before CPU_SUSPEND is redundant as they will be reinitialised when the core wakes up.
Note that the key loading, now in save_gp_pmcr_pauth_regs, does not end in an isb. The effects of the key change are not needed until the isb in the caller, so this isb is not needed.
Change-Id: Idd286bea91140c106ab4c933c5c44b0bc2050ca2 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 83ec7e45 | 06-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and has worked quite well so far. However, recent implementations expose a weakness
perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and has worked quite well so far. However, recent implementations expose a weakness in that this is rather slow. A large part of it is written in assembly, making it opaque to the compiler for optimisations. The future proofness requires reading registers that are effectively `volatile`, making it even harder for the compiler, as well as adding lots of implicit barriers, making it hard for the microarchitecutre to optimise as well.
We can make a few assumptions, checked by a few well placed asserts, and remove a lot of this burden. For a start, at the moment there are 4 group 0 counters with static assignments. Contexting them is a trivial affair that doesn't need a loop. Similarly, there can only be up to 16 group 1 counters. Contexting them is a bit harder, but we can do with a single branch with a falling through switch. If/when both of these change, we have a pair of asserts and the feature detection mechanism to guard us against pretending that we support something we don't.
We can drop contexting of the offset registers. They are fully accessible by EL2 and as such are its responsibility to preserve on powerdown.
Another small thing we can do, is pass the core_pos into the hook. The caller already knows which core we're running on, we don't need to call this non-trivial function again.
Finally, knowing this, we don't really need the auxiliary AMUs to be described by the device tree. Linux doesn't care at the moment, and any information we need for EL3 can be neatly placed in a simple array.
All of this, combined with lifting the actual saving out of assembly, reduces the instructions to save the context from 180 to 40, including a lot fewer branches. The code is also much shorter and easier to read.
Also propagate to aarch32 so that the two don't diverge too much.
Change-Id: Ib62e6e9ba5be7fb9fb8965c8eee148d5598a5361 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|