| 51997e3d | 02-Apr-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpufeat): centralise PAuth key saving
prepare_el3_entry() is meant to be the one-stop shop for all the context we must fiddle with to enter EL3 proper. However, PAuth is the one exception, happening right after. Absorb it into prepare_el3_entry(), handling the BL1/BL31 difference.
This is also a good time to move the key saving into the enable function, for the same centralisation. With this it becomes apparent that saving the keys just before CPU_SUSPEND is redundant, as they will be reinitialised when the core wakes up.
Note that the key loading, now in save_gp_pmcr_pauth_regs(), does not end in an isb. The effects of the key change are not needed until the isb in the caller, so an extra one here is unnecessary.
Change-Id: Idd286bea91140c106ab4c933c5c44b0bc2050ca2
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| f8138056 | 02-Apr-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(cpufeat): convert FEAT_PAuth setup to C
An oversimplified view of FEAT_PAuth is that it's a symmetric encryption of the LR. PAC instructions execute as NOPs until explicitly turned on. So in a function that turns PAuth on, the signing would have executed as a NOP while the authentication would then corrupt the never-signed return address, leading to a failure. That's why enablement is in assembly - we have full control over when pointer authentication happens.
However, assembly is hard to read, is opaque to the compiler for optimisations, and we need to call into C anyway for the platform hook that provides the key. So convert it to C. We can instruct the compiler not to generate branch protection for the enable function alone, and as long as the caller doesn't do branch protection either (and all callers are entrypoints written in assembly), everything will work.
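As a minimal sketch of the shape this takes - plat_init_apkey() is TF-A's platform key hook, but the function name, accessors and bit constant here are illustrative, and a PAuth-aware toolchain (GCC 9+/recent Clang on AArch64) is assumed:

    #include <stdint.h>

    typedef __uint128_t uint128_t;

    /* Platform hook returning the 128-bit APIA key. */
    uint128_t plat_init_apkey(void);

    #define SCTLR_EnIA_BIT (1ULL << 31) /* APIA-key authentication enable */

    /* Built without branch protection so the compiler emits no
     * paciasp/autiasp pair around the window where PAuth turns on:
     * the sign would have been a NOP, the authenticate would then
     * corrupt the return address. */
    __attribute__((target("branch-protection=none")))
    void pauth_enable_el3(void)
    {
        uint128_t key = plat_init_apkey();
        uint64_t sctlr;

        __asm__ volatile("msr apiakeylo_el1, %0" :: "r"((uint64_t)key));
        __asm__ volatile("msr apiakeyhi_el1, %0" :: "r"((uint64_t)(key >> 64)));

        __asm__ volatile("mrs %0, sctlr_el3" : "=r"(sctlr));
        sctlr |= SCTLR_EnIA_BIT;
        __asm__ volatile("msr sctlr_el3, %0" :: "r"(sctlr));
        /* No trailing isb: an isb in the caller makes the change visible,
         * as noted in the key-saving patch above. */
    }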
Change-Id: I8917a26e1293033c910e3058664e3ca9207359b7
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 0a580b51 | 15-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cm): drop ZCR_EL3 saving and some ISBs and replace them with root context
SVE and SME aren't enabled symmetrically for all worlds, but EL3 needs to context switch them nonetheless. Previously, this had to happen by writing the enable bits just before reading/writing the relevant context. But since the introduction of root context, this need not be the case. We can have these enables always be present for EL3 and save on some work (and ISBs!) on every context switch.
We can also hoist ZCR_EL3 into the never-changing registers, as we set its value identically for every world, which happens to be the value we want for EL3 too.
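As a rough illustration of the per-switch pattern that disappears - the helper names and bit positions here are stand-ins for TF-A's sysreg accessors and arch.h constants, not the patch itself:

    #include <stdint.h>

    /* Hypothetical accessors standing in for TF-A's generated helpers. */
    static inline uint64_t read_cptr_el3(void)
    {
        uint64_t v;
        __asm__ volatile("mrs %0, cptr_el3" : "=r"(v));
        return v;
    }
    static inline void write_cptr_el3(uint64_t v)
    {
        __asm__ volatile("msr cptr_el3, %0" :: "r"(v));
    }

    #define CPTR_EZ_BIT (1ULL << 8) /* SVE enable for EL3 (assumed value) */

    /* Old shape: ungate, isb, touch the context, regate, isb - on every
     * single context switch. */
    void sve_context_switch_old(void)
    {
        uint64_t old = read_cptr_el3();

        write_cptr_el3(old | CPTR_EZ_BIT);
        __asm__ volatile("isb");
        /* ... save/restore Z/P/FFR registers ... */
        write_cptr_el3(old);
        __asm__ volatile("isb");
    }

    /* New shape: the enables live permanently in EL3's root context,
     * restored on every entry to EL3, so only the moves remain. */
    void sve_context_switch_new(void)
    {
        /* ... save/restore Z/P/FFR registers ... */
    }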
Change-Id: I3d950e72049a298008205ba32f230d5a5c02f8b0
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 83ec7e45 | 06-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and has worked quite well so far. However, recent implementations expose a weakness: it is rather slow. A large part of it is written in assembly, making it opaque to the compiler for optimisations. The future-proofing requires reading registers that are effectively `volatile`, making things even harder for the compiler, as well as adding lots of implicit barriers, making it hard for the microarchitecture to optimise too.
We can make a few assumptions, checked by a few well-placed asserts, and remove a lot of this burden. For a start, at the moment there are 4 group 0 counters with static assignments, so contexting them is a trivial affair that doesn't need a loop. Similarly, there can only be up to 16 group 1 counters. Contexting them is a bit harder, but a single branch into a falling-through switch does the job (see the sketch below). If/when either of these changes, we have a pair of asserts and the feature detection mechanism to guard us against pretending that we support something we don't.
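The falling-through switch looks roughly like this sketch. The reader stub is an assumption: in real code each case needs its own mrs of AMEVCNTR1<n>_EL0, because the system register number must be an immediate - which is exactly why a loop with a runtime index can't be used and a fall-through switch can:

    #include <stdint.h>

    /* Stub standing in for a fixed-encoding mrs of AMEVCNTR1<n>_EL0. */
    static uint64_t read_amevcntr1(unsigned int n) { (void)n; return 0; }

    /* Save up to 16 group-1 counters with a single branch: enter the
     * switch at the highest implemented counter, fall through to 0. */
    void amu_save_group1(uint64_t *ctx, unsigned int nr_counters)
    {
        switch (nr_counters) {
        case 16: ctx[15] = read_amevcntr1(15); /* fallthrough */
        case 15: ctx[14] = read_amevcntr1(14); /* fallthrough */
        case 14: ctx[13] = read_amevcntr1(13); /* fallthrough */
        case 13: ctx[12] = read_amevcntr1(12); /* fallthrough */
        case 12: ctx[11] = read_amevcntr1(11); /* fallthrough */
        case 11: ctx[10] = read_amevcntr1(10); /* fallthrough */
        case 10: ctx[9]  = read_amevcntr1(9);  /* fallthrough */
        case 9:  ctx[8]  = read_amevcntr1(8);  /* fallthrough */
        case 8:  ctx[7]  = read_amevcntr1(7);  /* fallthrough */
        case 7:  ctx[6]  = read_amevcntr1(6);  /* fallthrough */
        case 6:  ctx[5]  = read_amevcntr1(5);  /* fallthrough */
        case 5:  ctx[4]  = read_amevcntr1(4);  /* fallthrough */
        case 4:  ctx[3]  = read_amevcntr1(3);  /* fallthrough */
        case 3:  ctx[2]  = read_amevcntr1(2);  /* fallthrough */
        case 2:  ctx[1]  = read_amevcntr1(1);  /* fallthrough */
        case 1:  ctx[0]  = read_amevcntr1(0);  /* fallthrough */
        case 0:  break;
        }
    }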
We can drop contexting of the offset registers. They are fully accessible by EL2, so it is EL2's responsibility to preserve them across powerdown.
Another small thing we can do is pass the core_pos into the hook. The caller already knows which core we're running on; we don't need to call this non-trivial function again.
Finally, knowing this, we don't really need the auxiliary AMUs to be described by the device tree. Linux doesn't care at the moment, and any information we need for EL3 can be neatly placed in a simple array.
All of this, combined with lifting the actual saving out of assembly, reduces the instructions to save the context from 180 to 40, including a lot fewer branches. The code is also much shorter and easier to read.
Also propagate to aarch32 so that the two don't diverge too much.
Change-Id: Ib62e6e9ba5be7fb9fb8965c8eee148d5598a5361
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 73d98e37 | 02-Dec-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
fix(trbe): add a tsb before context switching
Just like for SPE, we need to synchronize TRBE samples before we change the context, to ensure everything goes where it was intended to. If that is not done, the in-flight entries might use any piece of now-incorrect context, as there are no implicit ordering requirements.
Prior to root context, the buffer drain hooks would have done that. But now that must happen much earlier. So add a tsb to prepare_el3_entry as well.
Annoyingly, the barrier can be reordered relative to other instructions by default (rule RCKVWP). So add an isb after the psb/tsb to ensure they are ordered, at least as far as the context is concerned.
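In C with inline assembly, the added sequence amounts to something like this sketch (the helper name is illustrative, and an assembler that knows the psb/tsb csync mnemonics is assumed):

    /* Synchronise in-flight SPE/TRBE writes before touching their
     * context. psb/tsb alone may be reordered (rule RCKVWP), so the
     * isb pins them ahead of the context accesses that follow. */
    static inline void trace_buffers_sync(void)
    {
        __asm__ volatile("psb csync");
        __asm__ volatile("tsb csync");
        __asm__ volatile("isb");
        /* No dsb: EL3 doesn't read the buffers, so it need not wait
         * for the writes themselves to complete. */
    }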
Then, drop the buffer draining hooks. Everything they need to do is already done by now. There's a notable difference in that there are no dsb-s now. Since EL3 does not access the buffers or the feature specific context, we don't need to wait for them to finish.
Finally, drop a stray isb in the context saving macro. It is now absorbed into root context, but was missed.
Change-Id: I30797a40ac7f91d0bb71ad271a1597e85092ccd5
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| b36e975e | 19-Jul-2024 |
Arvind Ram Prakash <arvind.ramprakash@arm.com> |
feat(trbe): introduce trbe_disable() function
This patch adds trbe_disable(), which disables trace buffer access from lower ELs in all security states. This function makes Secure state the owner of the trace buffer; accesses from EL2/EL1 generate trap exceptions to EL3.
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com>
Change-Id: If3e3bd621684b3c28f44c3ed2fe3df30b143f8cd
|
| 651fe507 | 18-Jul-2024 |
Manish Pandey <manish.pandey2@arm.com> |
feat(spe): introduce spe_disable() function
Introduce a function to disable the SPE feature for the Non-secure state and apply the default setting: make Secure state the owner of the profiling buffers, and trap accesses to the profiling and profiling buffer control registers from lower ELs to EL3.
This functionality is required to handle asymmetric cores, where SPE has to be disabled at runtime.
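A hedged usage sketch - everything except spe_disable() is a stand-in name, and the real call site and signature live in the per-core power-up path:

    #include <stdbool.h>

    void spe_disable(void); /* from this patch; signature assumed */

    /* Stand-in probe: must SPE be kept off on this core, e.g. because
     * not every core in an asymmetric system implements it? */
    bool plat_spe_must_be_off(void);

    /* Illustrative warm-boot hook: apply the safe default so Secure
     * state owns the profiling buffers and lower-EL accesses trap. */
    void cpu_on_finish_feature_fixup(void)
    {
        if (plat_spe_must_be_off()) {
            spe_disable();
        }
    }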
Signed-off-by: Manish Pandey <manish.pandey2@arm.com>
Change-Id: I2f99e922e8df06bfc900c153137aef7c9dcfd759
|