| f2bd3528 | 19-Feb-2025 |
John Powell <john.powell@arm.com> |
fix(errata): workaround for Cortex-A510 erratum 2971420
Cortex-A510 erratum 2971420 applies to revisions r0p1, r0p2, r0p3, r1p0, r1p1, r1p2 and r1p3, and is still open.
Under some conditions, data might be corrupted if Trace Buffer Extension (TRBE) is enabled. The workaround is to disable trace collection via TRBE by programming MDCR_EL3.NSTB[1] to the opposite value of SCR_EL3.NS on a security state switch. Since we only enable TRBE for non-secure world, the workaround is to disable TRBE by setting the NSTB field to 00 so accesses are trapped to EL3 and secure state owns the buffer.
SDEN: https://developer.arm.com/documentation/SDEN-1873361/latest/
Signed-off-by: John Powell <john.powell@arm.com> Change-Id: Ia77051f6b64c726a8c50596c78f220d323ab7d97
|
| fcf2ab71 | 11-Feb-2025 |
John Powell <john.powell@arm.com> |
fix(cpus): workaround for Cortex-A715 erratum 2804830
Cortex-A715 erratum 2804830 applies to r0p0, r1p0, r1p1 and r1p2, and is fixed in r1p3.
Under some conditions, writes of a 64B-aligned, 64B granule of memory might cause data corruption without this workaround. See SDEN for details.
Since this workaround disables write streaming, it is expected to have a significant performance impact for code that is heavily reliant on write streaming, such as memcpy or memset.
SDEN: https://developer.arm.com/documentation/SDEN-2148827/latest/
Change-Id: Ia12f6c7de7c92f6ea4aec3057b228b828d48724c Signed-off-by: John Powell <john.powell@arm.com>
|
| 8001247c | 16-Dec-2024 |
Harrison Mutai <harrison.mutai@arm.com> |
feat(handoff): add 32-bit variant of SRAM layout
Introduce the 32-bit variant of the SRAM layout used by BL1 to communicate available free SRAM to BL2. This layout was added to the specification in: https://github.com/FirmwareHandoff/firmware_handoff/pull/54.
Change-Id: I559fb8a00725eaedf01856af42d73029802aa095 Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
|
| 7ffc1d6c | 16-Dec-2024 |
Harrison Mutai <harrison.mutai@arm.com> |
feat(handoff): add 32-bit variant of ep info
Add the 32-bit version of the entry_point_info structure used to pass the boot arguments for future executables, added to the spec under the PR: https://github.com/FirmwareHandoff/firmware_handoff/pull/54.
Change-Id: Id98e0f98db6ffd4790193e201f24e62101450e20 Signed-off-by: Harrison Mutai <harrison.mutai@arm.com>
|
| af1dd6e1 | 09-Mar-2025 |
Manish V Badarkhe <Manish.Badarkhe@arm.com> |
feat(lib): add EXTRACT_FIELD macro for field extraction
Introduce a new EXTRACT_FIELD macro to simplify the extraction of specific fields from a value by shifting the value right and applying the mask.
Change-Id: Iae9573d6d23067bbde13253e264e4f6f18b806c2 Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
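The shift-then-mask extraction described above can be sketched as follows. This is a minimal illustration, not the macro as merged: the argument order, the use of an unshifted mask, and every `DEMO_` name are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical definition for illustration only: shift the value right so
 * that the field starts at bit 0, then apply an unshifted mask. The real
 * macro's argument order and mask convention may differ.
 */
#define EXTRACT_FIELD(val, mask, shift)	(((val) >> (shift)) & (mask))

/* Hypothetical 4-bit field occupying bits [11:8] of a 32-bit value. */
#define DEMO_FIELD_SHIFT	8U
#define DEMO_FIELD_MASK		0xFU

/* Compile-time sanity check of the sketch itself. */
static_assert(EXTRACT_FIELD(0xA00U, DEMO_FIELD_MASK, DEMO_FIELD_SHIFT) == 0xAU,
	      "bits [11:8] of 0xA00 should be 0xA");

/* Return the hypothetical field from a register-like value. */
static inline uint32_t demo_get_field(uint32_t reg)
{
	return EXTRACT_FIELD(reg, DEMO_FIELD_MASK, DEMO_FIELD_SHIFT);
}
```

Shifting before masking lets the mask be written as the field width at bit 0, which is one common convention; masking first with a shifted mask would work equally well.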
|
| bbff267b | 24-Feb-2025 |
Arvind Ram Prakash <arvind.ramprakash@arm.com> |
fix(errata-abi): add support for handling split workarounds
Certain erratum workarounds, such as Neoverse N1 1542419, need part of their mitigation done in EL3 and the rest in a lower EL. Currently such workarounds return HIGHER_EL_MITIGATION, which indicates that the erratum has already been mitigated by a higher EL (EL3 in this case), so the lower EL does not apply its part of the mitigation.
This patch fixes the issue by adding support for split workarounds, so that for certain errata we return AFFECTED even though EL3 has applied its workaround. This is done by turning the chosen field of the erratum_entry structure into a bitfield with two flags: bit 0 indicates that the erratum has been enabled in the build, and bit 1 indicates that the erratum is a split workaround and should return AFFECTED instead of HIGHER_EL_MITIGATION.
SDEN documentation: https://developer.arm.com/documentation/SDEN885747/latest
Signed-off-by: Arvind Ram Prakash <arvind.ramprakash@arm.com> Change-Id: Iec94d665b5f55609507a219a7d1771eb75e7f4a7
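The two-flag encoding of the chosen field can be sketched as below. Only the bit meanings (bit 0: enabled in the build, bit 1: split workaround) come from the commit message; all `DEMO_`/`demo_` names, the status enum, and the handling of the "not enabled" case are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit meanings come from the commit message. */
#define DEMO_CHOSEN_ENABLED	(1U << 0)	/* erratum enabled in the build */
#define DEMO_CHOSEN_SPLIT_WA	(1U << 1)	/* lower EL must apply its part too */

enum demo_errata_status {
	DEMO_NOT_AFFECTED,
	DEMO_AFFECTED,
	DEMO_HIGHER_EL_MITIGATION,
};

/* Decide what a lower EL would be told about one erratum entry. */
static enum demo_errata_status demo_query(bool cpu_affected, uint8_t chosen)
{
	/* No other bits are defined in this sketch. */
	assert((chosen & ~(DEMO_CHOSEN_ENABLED | DEMO_CHOSEN_SPLIT_WA)) == 0U);

	if (!cpu_affected) {
		return DEMO_NOT_AFFECTED;
	}
	/* Assumption: a workaround that is not built in leaves the CPU affected. */
	if ((chosen & DEMO_CHOSEN_ENABLED) == 0U) {
		return DEMO_AFFECTED;
	}
	/* Split workaround: EL3 did its part, the lower EL still has work to do. */
	if ((chosen & DEMO_CHOSEN_SPLIT_WA) != 0U) {
		return DEMO_AFFECTED;
	}
	return DEMO_HIGHER_EL_MITIGATION;
}
```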
|
| ec6f49c2 | 01-Aug-2024 |
Vinoj Soundararajan <vinojs@google.com> |
feat(ras): add eabort get helper function
Add an EABORT get-field helper function to obtain the SET and AET (UET) values from esr_el3/disr_el1, based on the PE error state recording in the exception syndrome. Refer to the RAS PE architecture in https://developer.arm.com/documentation/ddi0487/latest/
Change-Id: I0011f041a3089c9bbf670275687ad7c3362a07f9 Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
|
| daeae495 | 01-Aug-2024 |
Vinoj Soundararajan <vinojs@google.com> |
feat(ras): add asynchronous error type corrected
Add the asynchronous error type Corrected (CE) to the error status AET, based on the PE error state recording in the exception syndrome. Refer to the RAS PE architecture in https://developer.arm.com/documentation/ddi0487/latest/.
Change-Id: I9f2525411b94c8fd397b4a0b8cf5dc47457a2771 Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
|
| e5cd3e81 | 01-Aug-2024 |
Vinoj Soundararajan <vinojs@google.com> |
fix(ras): fix typo in uncorrectable error type UEO
Fix the spelling for UEO from "restable" to "restartable", based on the PE error state recording in the exception syndrome. Refer to the RAS PE architecture in https://developer.arm.com/documentation/ddi0487/latest/.
Change-Id: I4da419f2120a7385853d4da78b409c675cdfe1c8 Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
|
| 9c17687a | 01-Aug-2024 |
Vinoj Soundararajan <vinojs@google.com> |
fix(ras): fix status synchronous error type fields
Based on the SET bits of the ISS encoding for an exception from a Data or Instruction Abort (refer to ESR_EL3):
1. Fix the Synchronous error type restartable value from 1 to 3.
2. Remove the corrected (CE) field, which is not applicable to SET.
Change-Id: If357da9881bee962825bc3b9423ba7fc107f9b1d Signed-off-by: Vinoj Soundararajan <vinojs@google.com>
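A sketch of the corrected SET encoding, for orientation only. The restartable value of 3 and the absence of a CE value come from the commit message; the field position (bits [12:11]) and the other two values are recalled from the Arm Architecture Reference Manual and should be checked against DDI 0487. All `DEMO_`/`demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of SET within the Data/Instruction Abort ISS. */
#define DEMO_ESR_SET_SHIFT	11U
#define DEMO_ESR_SET_MASK	0x3U

/* Synchronous error types; note there is no "corrected" (CE) value here. */
enum demo_sync_error_type {
	DEMO_SET_UER = 0x0,	/* recoverable (assumed from the Arm ARM) */
	DEMO_SET_UC  = 0x2,	/* uncontainable (assumed from the Arm ARM) */
	DEMO_SET_UEO = 0x3,	/* restartable: 3, per the commit message */
};

static_assert(DEMO_SET_UEO == 3, "restartable must encode as 3, not 1");

/* Extract the SET field from an ESR-like syndrome value. */
static inline uint32_t demo_esr_get_set(uint64_t esr)
{
	return (uint32_t)((esr >> DEMO_ESR_SET_SHIFT) & DEMO_ESR_SET_MASK);
}

/* True when the syndrome reports a restartable (UEO) synchronous error. */
static inline bool demo_set_is_restartable(uint64_t esr)
{
	return demo_esr_get_set(esr) == (uint32_t)DEMO_SET_UEO;
}
```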
|
| 7990cc80 | 28-Feb-2025 |
Manish V Badarkhe <manish.badarkhe@arm.com> |
Merge "feat(handoff): add transfer entry printer" into integration |
| c7220035 | 03-Feb-2025 |
Manish Pandey <manish.pandey2@arm.com> |
fix(el3-runtime): replace CTX_ESR_EL3 with CTX_DOUBLE_FAULT_ESR
The ESR_EL3 value is updated when an exception is taken to EL3, and it does not change until a new exception is taken to EL3. We need to save ESR in context memory only when we expect a nested exception in EL3.
The scenarios where we would expect nested EL3 execution are related to FFH_SUPPORT, namely:
1. Handling pending async EAs at the EL3 boundary: it uses CTX_SAVED_ESR_EL3 to preserve the original esr_el3.
2. Double fault handling: introduce explicit storage (CTX_DOUBLE_FAULT_ESR) for esr_el3 to take care of double faults.
As the ESR context has been removed, read the register directly instead of its context value in the RD platform.
Signed-off-by: Manish Pandey <manish.pandey2@arm.com> Change-Id: I7720c5f03903f894a77413a235e3cc05c86f9c17
|
| 98c65165 | 26-Feb-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
chore: rename arcadia to Cortex-A320
Cortex-A320 has been announced, rename arcadia to Cortex-A320.
Ref: https://newsroom.arm.com/blog/introducing-arm-cortex-a320-cpu https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a320
Change-Id: Ifb3743d43dca3d8caaf1e7416715ccca4fdf195f Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
|
| 937c513d | 13-Dec-2024 |
Harrison Mutai <harrison.mutai@arm.com> |
feat(handoff): add transfer entry printer
Change-Id: Ib7d370b023f92f2fffbd341bcf874914fcc1bac2 Signed-off-by: Harrison Mutai <harrison.mutai@arm.com> |
| 0a580b51 | 15-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cm): drop ZCR_EL3 saving and some ISBs and replace them with root context
SVE and SME aren't enabled symmetrically for all worlds, but EL3 needs to context switch them nonetheless. Previously, this had to happen by writing the enable bits just before reading/writing the relevant context. But since the introduction of root context, this need not be the case. We can have these enables always be present for EL3 and save on some work (and ISBs!) on every context switch.
We can also hoist ZCR_EL3 to a never changing register, as we set its value to be identical for every world, which happens to be the one we want for EL3 too.
Change-Id: I3d950e72049a298008205ba32f230d5a5c02f8b0 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 83ec7e45 | 06-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(amu): greatly simplify AMU context management
The current code is incredibly resilient to updates to the spec and has worked quite well so far. However, recent implementations expose a weakness in that this is rather slow. A large part of it is written in assembly, making it opaque to the compiler for optimisations. The future-proofing requires reading registers that are effectively `volatile`, making it even harder for the compiler, as well as adding lots of implicit barriers, making it hard for the microarchitecture to optimise as well.
We can make a few assumptions, checked by a few well placed asserts, and remove a lot of this burden. For a start, at the moment there are 4 group 0 counters with static assignments. Contexting them is a trivial affair that doesn't need a loop. Similarly, there can only be up to 16 group 1 counters. Contexting them is a bit harder, but we can do with a single branch with a falling through switch. If/when both of these change, we have a pair of asserts and the feature detection mechanism to guard us against pretending that we support something we don't.
We can drop contexting of the offset registers. They are fully accessible by EL2 and as such are its responsibility to preserve on powerdown.
Another small thing we can do, is pass the core_pos into the hook. The caller already knows which core we're running on, we don't need to call this non-trivial function again.
Finally, knowing this, we don't really need the auxiliary AMUs to be described by the device tree. Linux doesn't care at the moment, and any information we need for EL3 can be neatly placed in a simple array.
All of this, combined with lifting the actual saving out of assembly, reduces the instructions to save the context from 180 to 40, including a lot fewer branches. The code is also much shorter and easier to read.
Also propagate to aarch32 so that the two don't diverge too much.
Change-Id: Ib62e6e9ba5be7fb9fb8965c8eee148d5598a5361 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
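The "single branch with a falling through switch" idea for the group 1 counters might look roughly like the sketch below. It is not the code from the patch: the counters are modelled as a plain array rather than system registers, and every `DEMO_`/`demo_` name is hypothetical. Only the limit of 16 group 1 counters and the guarding assert come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_AMU_GROUP1_MAX	16U

/* Stand-in for the hardware counters; real code would read system registers. */
static uint64_t demo_hw_group1[DEMO_AMU_GROUP1_MAX];

static inline uint64_t demo_read_group1(unsigned int idx)
{
	return demo_hw_group1[idx];
}

/*
 * Save the first num_counters group 1 counters into ctx. One branch selects
 * the entry point; every case then falls through to the ones below it, so
 * there is no loop counter or per-iteration branch.
 */
static void demo_save_group1(uint64_t *ctx, unsigned int num_counters)
{
	assert(num_counters <= DEMO_AMU_GROUP1_MAX);

	switch (num_counters) {
	case 16: ctx[15] = demo_read_group1(15); /* fallthrough */
	case 15: ctx[14] = demo_read_group1(14); /* fallthrough */
	case 14: ctx[13] = demo_read_group1(13); /* fallthrough */
	case 13: ctx[12] = demo_read_group1(12); /* fallthrough */
	case 12: ctx[11] = demo_read_group1(11); /* fallthrough */
	case 11: ctx[10] = demo_read_group1(10); /* fallthrough */
	case 10: ctx[9] = demo_read_group1(9); /* fallthrough */
	case 9: ctx[8] = demo_read_group1(8); /* fallthrough */
	case 8: ctx[7] = demo_read_group1(7); /* fallthrough */
	case 7: ctx[6] = demo_read_group1(6); /* fallthrough */
	case 6: ctx[5] = demo_read_group1(5); /* fallthrough */
	case 5: ctx[4] = demo_read_group1(4); /* fallthrough */
	case 4: ctx[3] = demo_read_group1(3); /* fallthrough */
	case 3: ctx[2] = demo_read_group1(2); /* fallthrough */
	case 2: ctx[1] = demo_read_group1(1); /* fallthrough */
	case 1: ctx[0] = demo_read_group1(0); /* fallthrough */
	case 0:
	default:
		break;
	}
}
```

Slots above num_counters are left untouched, so the caller is expected to zero or ignore them.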
|
| 2590e819 | 25-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(mpmm): greatly simplify MPMM enablement
MPMM is a core-specific microarchitectural feature. It has been present in every Arm core since the Cortex-A510 and has been implemented in exactly the same way. Despite that, it is enabled more like an architectural feature with a top level enable flag. This utilised the identical implementation.
This duality has left MPMM in an awkward place, where its enablement should be generic, like an architectural feature, but since it is not, it should also be core-specific if it ever changes. One choice to do this has been through the device tree.
This has worked just fine so far. However, recent implementations expose a weakness in that this is rather slow: the device tree has to be read, there's a long call stack of functions with many branches, and system registers are read. In the hot path of PSCI CPU powerdown, this has a significant and measurable impact. It is also a rather large amount of code that is difficult to understand.
Since MPMM is a microarchitectural feature, its correct placement is in the reset function. The essence of the current enablement is to write CPUPPMCR_EL3.MPMM_EN if CPUPPMCR_EL3.MPMMPINCTL == 0. Replacing the C enablement with an assembly macro in each CPU's reset function achieves the same effect with just a single close branch and a grand total of 6 instructions (versus the old 2 branches and 32 instructions).
Having done this, the device tree entry becomes redundant. Should a core that doesn't support MPMM arise, this can cleanly be handled in the reset function. As such, the whole ENABLE_MPMM_FCONF and platform hooks mechanisms become obsolete and are removed.
Change-Id: I1d0475b21a1625bb3519f513ba109284f973ffdf Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
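The essence of the enablement ("write MPMM_EN if MPMMPINCTL == 0") can be modelled in C as below. The patch itself uses an assembly macro in each CPU's reset function; this sketch only illustrates the condition. The register is modelled as a plain value, and the bit positions and all `DEMO_`/`demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only so the sketch is self-contained. */
#define DEMO_MPMMPINCTL_BIT	(1ULL << 0)
#define DEMO_MPMM_EN_BIT	(1ULL << 1)

static_assert((DEMO_MPMMPINCTL_BIT & DEMO_MPMM_EN_BIT) == 0ULL,
	      "the two flags must not overlap in this model");

/*
 * Given the current register value, return the value to write back at reset:
 * enable MPMM only when the pin control bit is clear, i.e. when software is
 * the one in charge of the enable.
 */
static inline uint64_t demo_mpmm_reset_value(uint64_t reg)
{
	if ((reg & DEMO_MPMMPINCTL_BIT) == 0ULL) {
		reg |= DEMO_MPMM_EN_BIT;
	}
	return reg;
}
```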
|
| a8a5d39d | 24-Feb-2025 |
Manish V Badarkhe <manish.badarkhe@arm.com> |
Merge changes from topic "bk/errata_speed" into integration
* changes:
  refactor(cpus): declare runtime errata correctly
  perf(cpus): make reset errata do fewer branches
  perf(cpus): inline the init_cpu_data_ptr function
  perf(cpus): inline the reset function
  perf(cpus): inline the cpu_get_rev_var call
  perf(cpus): inline cpu_rev_var checks
  refactor(cpus): register DSU errata with the errata framework's wrappers
  refactor(cpus): convert checker functions to standard helpers
  refactor(cpus): convert the Cortex-A65 to use the errata framework
  fix(cpus): declare reset errata correctly
|
| 89dba82d | 22-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done when the core has just come out of reset, which means branch predictors and caches will be empty so a branch to a workaround function must be fetched from memory and that round trip is very slow. Then it also runs with the I-cache off, which means that the loop to iterate over the workarounds must also be fetched from memory on each iteration.
We can remove both branches. First, we can simply apply every erratum directly instead of defining a workaround function and jumping to it. Currently, no errata that need to be applied at both reset and runtime, with the same workaround function, exist. If the need arose in future, this should be achievable with a reset + runtime wrapper combo.
Then, we can construct a function that applies each erratum linearly instead of looping over the list. If this function is part of the reset function, then the only "far" branches at reset will be for the checker functions. Importantly, this mitigates the slowdown even when an erratum is disabled.
The result is ~50% speedup on N1SDP and ~20% on AArch64 Juno on wakeup from PSCI calls that end in powerdown. This is roughly back to the baseline of v2.9, before the errata framework regressed on performance (or a little better). It is important to note that there are other slowdowns since then that remain unknown.
Change-Id: Ie4d5288a331b11fd648e5c4a0b652b74160b07b9 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| b07c317f | 19-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the init_cpu_data_ptr function
Similar to the reset function, inline this too, avoiding a costly branch at no extra cost.
Change-Id: I54cc399e570e9d0f373ae13c7224d32dbdfae1e5 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 0d020822 | 19-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the reset function
Similar to the cpu_rev_var and cpu_get_rev_var functions, inline the call_reset_handler handler. This way we skip the costly branch at no extra cost, as this is the only place where it is called.
While we're at it, drop the options for CPU_NO_RESET_FUNC. The only cpus that need that are virtual cpus which can spare the tiny bit of performance lost. The rest are real cores which can save on the check for zero.
Now is a good time to put the assert for a missing cpu in the get_cpu_ops_ptr function so that it's a bit better encapsulated.
Change-Id: Ia7c3dcd13b75e5d7c8bafad4698994ea65f42406 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 41ae0473 | 03-Feb-2025 |
Sona Mathew <sonarebecca.mathew@arm.com> |
fix(rmm): add support for BRBCR_EL2 register for feat_brbe
Currently BRBE is being disabled for Realm world in EL3 by switching the SBRBE bit in mdcr_el3 register to 0b00. The patch removes the switching of SBRBE bits, and adds context switch of BRBCR_EL2 register.
Change-Id: I66ca13edefc37e40fa265fd438b0b66f7d09b4bb Signed-off-by: Sona Mathew <sonarebecca.mathew@arm.com>
|
| 36eeb59f | 04-Dec-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the cpu_get_rev_var call
Similar to the cpu_rev_var_xy functions, branching far away so early in the reset sequence incurs significant slowdowns. Inline the function.
Change-Id: Ifc349015902cd803e11a1946208141bfe7606b89 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|
| 7791ce21 | 21-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline cpu_rev_var checks
We strive to apply errata as close to reset as possible with as few things enabled as possible. Importantly, the I-cache will not be enabled. This means that repeated branches to these tiny functions must be re-fetched all the way from memory each time which has glacial speed. Cores are allowed to fetch things ahead of time though as long as execution is fairly linear. So we can trade a little bit of space (3 to 7 instructions per erratum) to keep things linear and not have to go to memory.
While we're at it, optimise the cpu_rev_var_{ls, hs, range} functions to take up less space. Dropping the moves allows for a bit of assembly magic that produces the same result in 2 and 3 instructions respectively.
Change-Id: I51608352f23b2244ea7a99e76c10892d257f12bf Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
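For readers unfamiliar with the revision checks being inlined, the comparisons can be modelled in C as below. The real helpers are assembly; this sketch assumes the usual packing of the MIDR_EL1 variant (bits [23:20]) and revision (bits [3:0]) into one byte, so that rXpY becomes 0xXY. All `DEMO_`/`demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack rXpY as 0xXY so that revisions compare in release order. */
#define DEMO_CPU_REV(r, p)	((uint8_t)(((r) << 4) | (p)))

static_assert(DEMO_CPU_REV(1, 2) == 0x12, "r1p2 should pack to 0x12");

/* Combine MIDR variant [23:20] and revision [3:0] into a single byte. */
static inline uint8_t demo_get_rev_var(uint32_t midr)
{
	uint32_t variant = (midr >> 20) & 0xFU;
	uint32_t revision = midr & 0xFU;

	return (uint8_t)((variant << 4) | revision);
}

/* Erratum applies to this revision and everything older. */
static inline bool demo_rev_var_ls(uint8_t rev_var, uint8_t limit)
{
	return rev_var <= limit;
}

/* Erratum applies to this revision and everything newer. */
static inline bool demo_rev_var_hs(uint8_t rev_var, uint8_t limit)
{
	return rev_var >= limit;
}

/* Erratum applies to an inclusive range of revisions. */
static inline bool demo_rev_var_range(uint8_t rev_var, uint8_t lo, uint8_t hi)
{
	return (rev_var >= lo) && (rev_var <= hi);
}
```

As a usage illustration, an erratum that applies to r0p0 through r1p2 and is fixed in r1p3 would be checked with `demo_rev_var_range(rev, DEMO_CPU_REV(0, 0), DEMO_CPU_REV(1, 2))`.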
|
| b62673c6 | 23-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(cpus): register DSU errata with the errata framework's wrappers
The existing DSU errata workarounds hijack the errata framework's inner workings to register with it. However, that is undesirable as any change to the framework may end up missing these workarounds. So convert the checks and workarounds to macros and have them included with the standard wrappers.
The only problem with this is the is_scu_present_in_dsu weak function. Fortunately, it is only needed for 2 of the errata and only on 3 cores. So drop it, assuming the default behaviour and have the callers handle the exception.
Change-Id: Iefa36325804ea093e938f867b9a6f49a6984b8ae Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
|