| #
35b2bbf4 |
| 28-Jul-2025 |
Manish Pandey <manish.pandey2@arm.com> |
Merge changes from topic "bk/pabandon_cleanup" into integration
* changes: feat(cpus): add pabandon support to the Alto cpu feat(psci): optimise clock init on a pabandon feat(psci): check that
Merge changes from topic "bk/pabandon_cleanup" into integration
* changes: feat(cpus): add pabandon support to the Alto cpu feat(psci): optimise clock init on a pabandon feat(psci): check that CPUs handled a pabandon feat(psci): make pabandon support generic refactor(psci): unify coherency exit between AArch64 and AArch32 refactor(psci): absorb psci_power_down_wfi() into common code refactor(platforms): remove usage of psci_power_down_wfi fix(cm): disable SPE/TRBE correctly
show more ...
|
| #
aadb4b56 |
| 12-Mar-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
refactor(psci): unify coherency exit between AArch64 and AArch32
The procedure is fairly simple: if we have hardware assisted coherency, call into the cpu driver and let it do its thing. If we don't
refactor(psci): unify coherency exit between AArch64 and AArch32
The procedure is fairly simple: if we have hardware assisted coherency, call into the cpu driver and let it do its thing. If we don't, then we must turn data caches off, handle the confusion that causes with the stack, and call into the cpu driver which will flush the caches that need flushing.
On AArch32 the above happens in common code. On AArch64, however, the turning off of the caches happens in the cpu driver. Since we're dealing with the stack, we must exercise control over it and implement this in assembly. But as the two implementations are nominally different (in the ordering of operations), the part that is in assembly is quite large as jumping back to C to handle the difference might involve the stack.
Presumably, the AArch difference was introduced in order to cater for a possible implementation where turning off the caches requires an IMP DEF sequence. Well, Arm no longer makes cores without hardware assisted coherency, so this eventually is not possible.
So take this part out of the cpu driver and put it into common code, just like in AArch32. With this, there is no longer a need call prepare_cpu_pwr_dwn() in a different order either - we can delay it a bit to happen after the stack management. So the two AArch-s flows become identical. We can convert prepare_cpu_pwr_dwn() to C and leave psci_do_pwrdown_cache_maintenance() only to exercise control over stack.
Change-Id: Ie4759ebe20bb74b60533c6a47dbc2b101875900f Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| #
a8a5d39d |
| 24-Feb-2025 |
Manish V Badarkhe <manish.badarkhe@arm.com> |
Merge changes from topic "bk/errata_speed" into integration
* changes: refactor(cpus): declare runtime errata correctly perf(cpus): make reset errata do fewer branches perf(cpus): inline the i
Merge changes from topic "bk/errata_speed" into integration
* changes: refactor(cpus): declare runtime errata correctly perf(cpus): make reset errata do fewer branches perf(cpus): inline the init_cpu_data_ptr function perf(cpus): inline the reset function perf(cpus): inline the cpu_get_rev_var call perf(cpus): inline cpu_rev_var checks refactor(cpus): register DSU errata with the errata framework's wrappers refactor(cpus): convert checker functions to standard helpers refactor(cpus): convert the Cortex-A65 to use the errata framework fix(cpus): declare reset errata correctly
show more ...
|
| #
89dba82d |
| 22-Jan-2025 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done when the core has just come out of reset, which means branch predictors and cach
perf(cpus): make reset errata do fewer branches
Errata application is painful for performance. For a start, it's done when the core has just come out of reset, which means branch predictors and caches will be empty so a branch to a workaround function must be fetched from memory and that round trip is very slow. Then it also runs with the I-cache off, which means that the loop to iterate over the workarounds must also be fetched from memory on each iteration.
We can remove both branches. First, we can simply apply every erratum directly instead of defining a workaround function and jumping to it. Currently, no errata that need to be applied at both reset and runtime, with the same workaround function, exist. If the need arose in future, this should be achievable with a reset + runtime wrapper combo.
Then, we can construct a function that applies each erratum linearly instead of looping over the list. If this function is part of the reset function, then the only "far" branches at reset will be for the checker functions. Importantly, this mitigates the slowdown even when an erratum is disabled.
The result is ~50% speedup on N1SDP and ~20% on AArch64 Juno on wakeup from PSCI calls that end in powerdown. This is roughly back to the baseline of v2.9, before the errata framework regressed on performance (or a little better). It is important to note that there are other slowdowns since then that remain unknown.
Change-Id: Ie4d5288a331b11fd648e5c4a0b652b74160b07b9 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| #
0d020822 |
| 19-Nov-2024 |
Boyan Karatotev <boyan.karatotev@arm.com> |
perf(cpus): inline the reset function
Similar to the cpu_rev_var and cpu_ger_rev_var functions, inline the call_reset_handler handler. This way we skip the costly branch at no extra cost as this is
perf(cpus): inline the reset function
Similar to the cpu_rev_var and cpu_ger_rev_var functions, inline the call_reset_handler handler. This way we skip the costly branch at no extra cost as this is the only place where this is called.
While we're at it, drop the options for CPU_NO_RESET_FUNC. The only cpus that need that are virtual cpus which can spare the tiny bit of performance lost. The rest are real cores which can save on the check for zero.
Now is a good time to put the assert for a missing cpu in the get_cpu_ops_ptr function so that it's a bit better encapsulated.
Change-Id: Ia7c3dcd13b75e5d7c8bafad4698994ea65f42406 Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
show more ...
|
| #
cc4f3838 |
| 27-Aug-2024 |
Manish V Badarkhe <manish.badarkhe@arm.com> |
Merge changes from topic "clean-up-errata-compatibility" into integration
* changes: refactor(cpus): remove cpu specific errata funcs refactor(cpus): directly invoke errata reporter
|
| #
3fb52e41 |
| 14-May-2024 |
Ryan Everett <ryan.everett@arm.com> |
refactor(cpus): remove cpu specific errata funcs
Errata printing is done directly via generic_errata_report. This commit removes the unused \_cpu\()_errata_report functions for all cores, and remove
refactor(cpus): remove cpu specific errata funcs
Errata printing is done directly via generic_errata_report. This commit removes the unused \_cpu\()_errata_report functions for all cores, and removes errata_func from cpu_ops.
Change-Id: I04fefbde5f0ff63b1f1cd17c864557a14070d68c Signed-off-by: Ryan Everett <ryan.everett@arm.com>
show more ...
|
| #
478fc4f2 |
| 28-Sep-2020 |
André Przywara <andre.przywara@arm.com> |
Merge "arm_fpga: Add support for unknown MPIDs" into integration
|
| #
1994e562 |
| 20-Aug-2020 |
Javier Almansa Sobrino <javier.almansasobrino@arm.com> |
arm_fpga: Add support for unknown MPIDs
This patch allows the system to fallback to a default CPU library in case the MPID does not match with any of the supported ones.
This feature can be enabled
arm_fpga: Add support for unknown MPIDs
This patch allows the system to fallback to a default CPU library in case the MPID does not match with any of the supported ones.
This feature can be enabled by setting SUPPORT_UNKNOWN_MPID build option to 1 (enabled by default only on arm_fpga platform).
This feature can be very dangerous on a production image and therefore it MUST be disabled for Release images.
Signed-off-by: Javier Almansa Sobrino <javier.almansasobrino@arm.com> Change-Id: I0df7ef2b012d7d60a4fd5de44dea1fbbb46881ba
show more ...
|