| 89dba82d | 22-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done when the core has just come out of reset, which means branch predictors and cach
perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done when the core has just come out of reset, which means branch predictors and caches will be empty so a branch to a workaround function must be fetched from memory and that round trip is very slow. Then it also runs with the I-cache off, which means that the loop to iterate over the workarounds must also be fetched from memory on each iteration.
We can remove both branches. First, we can simply apply every erratum directly instead of defining a workaround function and jumping to it. Currently, no errata that need to be applied at both reset and runtime, with the same workaround function, exist. If the need arose in future, this should be achievable with a reset + runtime wrapper combo.
Then, we can construct a function that applies each erratum linearly instead of looping over the list. If this function is part of the reset function, then the only "far" branches at reset will be for the checker functions. Importantly, this mitigates the slowdown even when an erratum is disabled.
The result is ~50% speedup on N1SDP and ~20% on AArch64 Juno on wakeup from PSCI calls that end in powerdown. This is roughly back to the baseline of v2.9, before the errata framework regressed on performance (or a little better). It is important to note that there are other slowdowns since then that remain unknown.
Change-Id: Ie4d5288a331b11fd648e5c4a0b652b74160b07b9 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| b07c317f | 19-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the init_cpu_data_ptr function
Similar to the reset function inline, inline this too to not do a costly branch with no extra cost.
Change-Id: I54cc399e570e9d0f373ae13c7224d32dbdf
perf(cpus): inline the init_cpu_data_ptr function
Similar to the reset function inline, inline this too to not do a costly branch with no extra cost.
Change-Id: I54cc399e570e9d0f373ae13c7224d32dbdfae1e5 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 0d020822 | 19-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the reset function
Similar to the cpu_rev_var and cpu_ger_rev_var functions, inline the call_reset_handler handler. This way we skip the costly branch at no extra cost as this is
perf(cpus): inline the reset function
Similar to the cpu_rev_var and cpu_ger_rev_var functions, inline the call_reset_handler handler. This way we skip the costly branch at no extra cost as this is the only place where this is called.
While we're at it, drop the options for CPU_NO_RESET_FUNC. The only cpus that need that are virtual cpus which can spare the tiny bit of performance lost. The rest are real cores which can save on the check for zero.
Now is a good time to put the assert for a missing cpu in the get_cpu_ops_ptr function so that it's a bit better encapsulated.
Change-Id: Ia7c3dcd13b75e5d7c8bafad4698994ea65f42406 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 41ae0473 | 03-Feb-2025 |
Sona Mathew <sonarebecca.mathew@arm.com> |
fix(rmm): add support for BRBCR_EL2 register for feat_brbe
Currently BRBE is being disabled for Realm world in EL3 by switching the SBRBE bit in mdcr_el3 register to 0b00. The patch removes the swit
fix(rmm): add support for BRBCR_EL2 register for feat_brbe
Currently BRBE is being disabled for Realm world in EL3 by switching the SBRBE bit in mdcr_el3 register to 0b00. The patch removes the switching of SBRBE bits, and adds context switch of BRBCR_EL2 register.
Change-Id: I66ca13edefc37e40fa265fd438b0b66f7d09b4bb Signed-off-by: Sona Mathew <sonarebecca.mathew@arm.com>
show more ...
|
| 36eeb59f | 04-Dec-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the cpu_get_rev_var call
Similar to the cpu_rev_var_xy functions, branching far away so early in the reset sequence incurs significant slowdowns. Inline the function.
Change-Id:
perf(cpus): inline the cpu_get_rev_var call
Similar to the cpu_rev_var_xy functions, branching far away so early in the reset sequence incurs significant slowdowns. Inline the function.
Change-Id: Ifc349015902cd803e11a1946208141bfe7606b89 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 7791ce21 | 21-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline cpu_rev_var checks
We strive to apply errata as close to reset as possible with as few things enabled as possible. Importantly, the I-cache will not be enabled. This means that re
perf(cpus): inline cpu_rev_var checks
We strive to apply errata as close to reset as possible with as few things enabled as possible. Importantly, the I-cache will not be enabled. This means that repeated branches to these tiny functions must be re-fetched all the way from memory each time which has glacial speed. Cores are allowed to fetch things ahead of time though as long as execution is fairly linear. So we can trade a little bit of space (3 to 7 instructions per erratum) to keep things linear and not have to go to memory.
While we're at it, optimise the the cpu_rev_var_{ls, hs, range} functions to take up less space. Dropping the moves allows for a bit of assembly magic that produces the same result in 2 and 3 instructions respectively.
Change-Id: I51608352f23b2244ea7a99e76c10892d257f12bf Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| b62673c6 | 23-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(cpus): register DSU errata with the errata framework's wrappers
The existing DSU errata workarounds hijack the errata framework's inner workings to register with it. However, that is undesi
refactor(cpus): register DSU errata with the errata framework's wrappers
The existing DSU errata workarounds hijack the errata framework's inner workings to register with it. However, that is undesirable as any change to the framework may end up missing these workarounds. So convert the checks and workarounds to macros and have them included with the standard wrappers.
The only problem with this is the is_scu_present_in_dsu weak function. Fortunately, it is only needed for 2 of the errata and only on 3 cores. So drop it, assuming the default behaviour and have the callers handle the exception.
Change-Id: Iefa36325804ea093e938f867b9a6f49a6984b8ae Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| 7ce483e1 | 16-Feb-2025 |
Manish V Badarkhe <Manish.Badarkhe@arm.com> |
fix(libc): remove __Nonnull type specifier
Clang's nullability completeness checks were triggered after adding the _Nonnull specifier to one function. Removing it prevents Clang from flagging atexit
fix(libc): remove __Nonnull type specifier
Clang's nullability completeness checks were triggered after adding the _Nonnull specifier to one function. Removing it prevents Clang from flagging atexit() for missing a nullability specifier.
include/lib/libc/stdlib.h:25:25: error: pointer is missing a nullability type specifier (_Nonnull, _Nullable, or _Null_unspecified) [-Werror,-Wnullability-completeness] 25 | extern int atexit(void (*func)(void));
This change ensures compliance with the C standard while preventing unexpected build errors.
Change-Id: I2f881c55b36b692d22c3db22149c6402c32e8c3e Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
show more ...
|
| 6db9aac6 | 12-Feb-2025 |
Lauren Wehrmeister <lauren.wehrmeister@arm.com> |
Merge changes from topic "mb/drtm" into integration
* changes: fix(drtm): fix DLME data size check fix(drtm): sort the address-map in ascending order feat(libc): import qsort implementation |
| e1362231 | 12-Feb-2025 |
Soby Mathew <soby.mathew@arm.com> |
Merge changes from topic "memory_bank" into integration
* changes: fix(qemu): statically allocate bitlocks array feat(qemu): update for renamed struct memory_bank feat(fvp): increase GPT PPS t
Merge changes from topic "memory_bank" into integration
* changes: fix(qemu): statically allocate bitlocks array feat(qemu): update for renamed struct memory_bank feat(fvp): increase GPT PPS to 1TB feat(gpt): statically allocate bitlocks array chore(gpt): define PPS in platform header files feat(fvp): allocate L0 GPT at the top of SRAM feat(fvp): change size of PCIe memory region 2 feat(rmm): add PCIe IO info to Boot manifest feat(fvp): define single Root region
show more ...
|
| fcb80d7d | 11-Feb-2025 |
Manish Pandey <manish.pandey2@arm.com> |
Merge changes I765a7fa0,Ic33f0b6d,I8d1a88c7,I381f96be,I698fa849, ... into integration
* changes: fix(cpus): clear CPUPWRCTLR_EL1.CORE_PWRDN_EN_BIT on reset chore(docs): drop the "wfi" from `pwr_
Merge changes I765a7fa0,Ic33f0b6d,I8d1a88c7,I381f96be,I698fa849, ... into integration
* changes: fix(cpus): clear CPUPWRCTLR_EL1.CORE_PWRDN_EN_BIT on reset chore(docs): drop the "wfi" from `pwr_domain_pwr_down_wfi` chore(psci): drop skip_wfi variable feat(arm): convert arm platforms to expect a wakeup fix(cpus): avoid SME related loss of context on powerdown feat(psci): allow cores to wake up from powerdown refactor: panic after calling psci_power_down_wfi() refactor(cpus): undo errata mitigations feat(cpus): add sysreg_bit_toggle
show more ...
|
| b0f1c840 | 24-Jan-2025 |
AlexeiFedorov <Alexei.Fedorov@arm.com> |
feat(gpt): statically allocate bitlocks array
Statically allocate 'gpt_bitlock' array of fine-grained 'bitlock_t' data structures in arm_bl31_setup.c. The amount of memory needed for this array is c
feat(gpt): statically allocate bitlocks array
Statically allocate 'gpt_bitlock' array of fine-grained 'bitlock_t' data structures in arm_bl31_setup.c. The amount of memory needed for this array is controlled by 'RME_GPT_BITLOCK_BLOCK' build option and 'PLAT_ARM_PPS' macro defined in platform_def.h which specifies the size of protected physical address space in bytes. 'PLAT_ARM_PPS' takes values from 4GB to 4PB supported by Arm architecture.
Change-Id: Icf620b5039e45df6828d58fca089cad83b0bc669 Signed-off-by: AlexeiFedorov <Alexei.Fedorov@arm.com>
show more ...
|
| 277713e0 | 21-Jan-2025 |
Manish V Badarkhe <Manish.Badarkhe@arm.com> |
feat(libc): import qsort implementation
Import qsort implementation from FreeBSD[1] to libc.
[1]: https://cgit.freebsd.org/src/tree/lib/libc/stdlib/qsort.c
Change-Id: Ia0d8e2d1c40c679844c0746db1b6
feat(libc): import qsort implementation
Import qsort implementation from FreeBSD[1] to libc.
[1]: https://cgit.freebsd.org/src/tree/lib/libc/stdlib/qsort.c
Change-Id: Ia0d8e2d1c40c679844c0746db1b669cda672a482 Signed-off-by: Manish V Badarkhe <Manish.Badarkhe@arm.com>
show more ...
|
| 697290a9 | 04-Feb-2025 |
Manish V Badarkhe <manish.badarkhe@arm.com> |
Merge changes from topic "us_tc_trng" into integration
* changes: feat(tc): get entropy with PSA Crypto API feat(psa): add interface with RSE for retrieving entropy fix(psa): guard Crypto APIs
Merge changes from topic "us_tc_trng" into integration
* changes: feat(tc): get entropy with PSA Crypto API feat(psa): add interface with RSE for retrieving entropy fix(psa): guard Crypto APIs with CRYPTO_SUPPORT feat(tc): enable trng feat(tc): initialize the RSE communication in earlier phase
show more ...
|
| aacdfdfe | 04-Feb-2025 |
Olivier Deprez <olivier.deprez@arm.com> |
Merge "fix(tc): enable Last-level cache (LLC) for tc4" into integration |
| 1147a470 | 31-Jan-2025 |
Leo Yan <leo.yan@arm.com> |
feat(psa): add interface with RSE for retrieving entropy
Add the AP/RSS interface for reading the entropy. And update the document for the API.
Change-Id: I61492d6b5d824a01ffeadc92f9d41ca841ba3367
feat(psa): add interface with RSE for retrieving entropy
Add the AP/RSS interface for reading the entropy. And update the document for the API.
Change-Id: I61492d6b5d824a01ffeadc92f9d41ca841ba3367 Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Icen Zeyada <Icen.Zeyada2@arm.com>
show more ...
|
| 8a41106c | 31-Jan-2025 |
Leo Yan <leo.yan@arm.com> |
fix(psa): guard Crypto APIs with CRYPTO_SUPPORT
When building Crypto APIs, it requires dependency on external headers, e.g., Mbedtls headers. Without the CRYPTO_SUPPORT configuration, external depe
fix(psa): guard Crypto APIs with CRYPTO_SUPPORT
When building Crypto APIs, it requires dependency on external headers, e.g., Mbedtls headers. Without the CRYPTO_SUPPORT configuration, external dependencies are not set up, building Crypto APIs will fail.
Guard Crypto APIs with the CRYPTO_SUPPORT configuration, to make sure the code is built only for Crypto enabled case.
Change-Id: Iffe1220b0e6272586c46432b4f8d0512cb39b0b5 Signed-off-by: Leo Yan <leo.yan@arm.com>
show more ...
|
| e25fc9df | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Neoverse-V3 erratum 3701767
Neoverse-V3 erratum 3701767 that applies to r0p0, r0p1, r0p2 is still Open.
The workaround is for EL3 software that performs context save/resto
fix(cpus): workaround for Neoverse-V3 erratum 3701767
Neoverse-V3 erratum 3701767 that applies to r0p0, r0p1, r0p2 is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/SDEN-2891958/latest/
Change-Id: I5be0de881f408a9e82a07b8459d79490e9065f94 Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| fded8392 | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Neoverse-N3 erratum 3699563
Neoverse-N3 erratum 3699563 that applies to r0p0 is still Open.
The workaround is for EL3 software that performs context save/restore on a chan
fix(cpus): workaround for Neoverse-N3 erratum 3699563
Neoverse-N3 erratum 3699563 that applies to r0p0 is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/SDEN-3050973/latest/
Change-Id: I77aaf8ae0afff3adde9a85f4a1a13ac9d1daf0af Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| adea6e52 | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Neoverse-N2 erratum 3701773
Neoverse-N2 erratum 3701773 that applies to r0p0, r0p1, r0p2 and r0p3 is still Open.
The workaround is for EL3 software that performs context s
fix(cpus): workaround for Neoverse-N2 erratum 3701773
Neoverse-N2 erratum 3701773 that applies to r0p0, r0p1, r0p2 and r0p3 is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/SDEN-1982442/latest/
Change-Id: If95bd67363228c8083724b31f630636fb27f3b61 Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| 511148ef | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Cortex-X925 erratum 3701747
Cortex-X925 erratum 3701747 that applies to r0p0, r0p1 and is still Open.
The workaround is for EL3 software that performs context save/restore
fix(cpus): workaround for Cortex-X925 erratum 3701747
Cortex-X925 erratum 3701747 that applies to r0p0, r0p1 and is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/109180/latest/
Change-Id: I080296666f89276b3260686c2bdb8de63fc174c1 Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| 38401c53 | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Cortex-X4 erratum 3701758
Cortex-X4 erratum 3701758 that applies to r0p0, r0p1, r0p2 and r0p3 is still Open.
The workaround is for EL3 software that performs context save/
fix(cpus): workaround for Cortex-X4 erratum 3701758
Cortex-X4 erratum 3701758 that applies to r0p0, r0p1, r0p2 and r0p3 is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/109148/latest/
Change-Id: I4ee941d1e7653de7a12d69f538ca05f7f9f9961d Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| 77feb745 | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Cortex-X3 erratum 3701769
Cortex-X3 erratum 3701769 that applies to r0p0, r1p0, r1p1 and r1p2 is still Open.
The workaround is for EL3 software that performs context save/
fix(cpus): workaround for Cortex-X3 erratum 3701769
Cortex-X3 erratum 3701769 that applies to r0p0, r1p0, r1p1 and r1p2 is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/SDEN-2055130/latest/
Change-Id: Ifd722e1bb8616ada2ad158297a7ca80b19a3370b Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| ae6c7c97 | 22-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Cortex-X2 erratum 3701772
Cortex-X2 erratum 3701772 that applies to r0p0, r1p0, r2p0, r2p1 is still Open.
The workaround is for EL3 software that performs context save/res
fix(cpus): workaround for Cortex-X2 erratum 3701772
Cortex-X2 erratum 3701772 that applies to r0p0, r1p0, r2p0, r2p1 is still Open.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/SDEN-1775100/latest/
Change-Id: I2ffc5e7d7467f1bcff8b895fea52a1daa7d14495 Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|
| d732300b | 21-Jan-2025 |
Govindraj Raja <govindraj.raja@arm.com> |
fix(cpus): workaround for Cortex-A725 erratum 3699564
Cortex-A725 erratum 3699564 that applies to r0p0, r0p1 and is fixed in r0p2.
The workaround is for EL3 software that performs context save/rest
fix(cpus): workaround for Cortex-A725 erratum 3699564
Cortex-A725 erratum 3699564 that applies to r0p0, r0p1 and is fixed in r0p2.
The workaround is for EL3 software that performs context save/restore on a change of Security state to use a value of SCR_EL3.NS when accessing ICH_VMCR_EL2 that reflects the Security state that owns the data being saved or restored.
SDEN documentation: https://developer.arm.com/documentation/SDEN-2832921/latest
Change-Id: Ifad1f6c3f5b74060273f897eb5e4b79dd9f088f7 Signed-off-by: Govindraj Raja <govindraj.raja@arm.com>
show more ...
|