NVIDIA Tegra
============

- .. rubric:: T186
     :name: t186

The NVIDIA® Parker (T186) series system-on-chip (SoC) delivers a heterogeneous
multi-processing (HMP) solution designed to optimize performance and
efficiency.

T186 has dual NVIDIA Denver 2 ARM® CPU cores plus quad ARM Cortex®-A57 cores
in a coherent multiprocessor configuration. The Denver 2 and Cortex-A57 cores
support ARMv8, executing both 64-bit AArch64 code and 32-bit AArch32 code,
including legacy ARMv7 applications. The Denver 2 processors each have 128 KB
instruction and 64 KB data Level 1 caches, and share a 2 MB unified Level 2
cache. The Cortex-A57 processors each have 48 KB instruction and 32 KB data
Level 1 caches, and also share a 2 MB unified Level 2 cache. A high-speed
coherency fabric connects these two processor complexes and allows
heterogeneous multi-processing with all six cores if required.

- .. rubric:: T210
     :name: t210

T210 has quad Arm® Cortex®-A57 cores in a switched configuration with a
companion set of quad Arm Cortex-A53 cores. The Cortex-A57 and A53 cores
support Armv8-A, executing both 64-bit AArch64 code and 32-bit AArch32 code,
including legacy Armv7-A applications. The Cortex-A57 processors each have
48 KB instruction and 32 KB data Level 1 caches, and share a 2 MB unified
Level 2 cache. The Cortex-A53 processors each have 32 KB instruction and
32 KB data Level 1 caches, and share a 512 KB unified Level 2 cache.

- .. rubric:: T132
     :name: t132

Denver is NVIDIA's own custom-designed, 64-bit, dual-core CPU which is
fully Armv8-A architecture compatible. Each of the two Denver cores
implements a 7-way superscalar microarchitecture (up to 7 concurrent
micro-ops can be executed per clock), and includes a 128 KB 4-way L1
instruction cache and a 64 KB 4-way L1 data cache. A 2 MB 16-way L2
cache services both cores.

Denver implements an innovative process called Dynamic Code Optimization,
which optimizes frequently used software routines at runtime into dense,
highly tuned microcode-equivalent routines. These are stored in a
dedicated, 128 MB main-memory-based optimization cache. After being read
into the instruction cache, the optimized micro-ops are executed, then
re-fetched and executed from the instruction cache as long as needed and
capacity allows.

Effectively, this reduces the need to re-optimize the software routines.
Instead of using hardware to extract the instruction-level parallelism
(ILP) inherent in the code, Denver extracts the ILP once via software
techniques, and then executes those routines repeatedly, thus amortizing
the cost of ILP extraction over the many execution instances.

Denver also features new low-latency power-state transitions, in addition
to extensive power-gating and dynamic voltage and clock scaling based on
workloads.

Directory structure
-------------------

- plat/nvidia/tegra/common - Common code for all Tegra SoCs
- plat/nvidia/tegra/soc/txxx - Chip specific code

Trusted OS dispatcher
---------------------

Tegra supports multiple Trusted OSes.

- Trusted Little Kernel (TLK): To include the 'tlkd' dispatcher in the image,
  pass 'SPD=tlkd' on the command line while preparing a bl31 image.
- Trusty: To include the 'trusty' dispatcher in the image, pass
  'SPD=trusty' on the command line while preparing a bl31 image.

This allows other Trusted OS vendors to use the upstream code and include
their dispatchers in the image without changing any makefiles.

The Trusted OSes supported on each Tegra platform are:

- Tegra132: TLK
- Tegra210: TLK and Trusty
- Tegra186: Trusty

Scatter files
-------------

Tegra platforms currently support scatter files and ld.S scripts. The scatter
files allow the ARMLINK linker to generate BL31 binaries.
For now, there is one common scatter file,
plat/nvidia/tegra/scat/bl31.scat, for all Tegra SoCs. The ``LINKER`` build
variable must point to the ARMLINK binary for the scatter file to be used.
BL31 image generation with ARMCLANG (compilation) and ARMLINK (linking) has
been verified on the Tegra186 platforms.

Preparing the BL31 image to run on Tegra SoCs
---------------------------------------------

.. code:: shell

    CROSS_COMPILE=<path-to-aarch64-gcc>/bin/aarch64-none-elf- make PLAT=tegra \
    TARGET_SOC=<target-soc e.g. t186|t210|t132> SPD=<dispatcher e.g. trusty|tlkd> \
    bl31

Platforms that need a different ``TZDRAM_BASE`` can add ``TZDRAM_BASE=<value>``
to the build command line.

The Tegra platform code expects the BL2 layer to pass a pointer to the
following platform-specific structure in the 'x1' register. The
``bl31_early_platform_setup()`` handler uses this structure to extract the
TZDRAM carveout base and size for loading the Trusted OS, and the UART port ID
to be used. The Tegra memory controller driver programs this base/size in
order to restrict NS accesses.

.. code:: c

    typedef struct plat_params_from_bl2 {
        /* TZ memory size */
        uint64_t tzdram_size;
        /* TZ memory base */
        uint64_t tzdram_base;
        /* UART port ID */
        int uart_id;
        /* L2 ECC parity protection disable flag */
        int l2_ecc_parity_prot_dis;
        /* SHMEM base address for storing the boot logs */
        uint64_t boot_profiler_shmem_base;
    } plat_params_from_bl2_t;

Power Management
----------------

The PSCI implementation expects each platform to expose the 'power state'
parameter to be used during the 'SYSTEM SUSPEND' call. The state-id field
is implementation defined on Tegra SoCs and is preferably defined by
``tegra_def.h``.
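
As a rough illustration of where an implementation-defined state-id sits
inside a PSCI 'power state' parameter, the sketch below packs and unpacks the
fields, assuming the original (non-extended) PSCI format: bits [15:0] hold the
state-id, bit 16 the state type and bits [25:24] the power level. The macro
and function names are local to this example, and ``EXAMPLE_STATE_ID`` is a
placeholder, not a value taken from ``tegra_def.h``.

.. code:: c

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Field layout of the original-format PSCI power_state parameter. */
    #define PSTATE_ID_SHIFT        0U
    #define PSTATE_ID_MASK         0xFFFFU
    #define PSTATE_TYPE_SHIFT      16U
    #define PSTATE_TYPE_MASK       0x1U
    #define PSTATE_PWR_LVL_SHIFT   24U
    #define PSTATE_PWR_LVL_MASK    0x3U

    #define PSTATE_TYPE_STANDBY    0U
    #define PSTATE_TYPE_POWERDOWN  1U

    /* Hypothetical state-id, for illustration only. */
    #define EXAMPLE_STATE_ID       0x0DU

    /* Pack the three fields into a single 32-bit power_state value. */
    static uint32_t make_power_state(uint32_t state_id, uint32_t type,
                                     uint32_t pwr_lvl)
    {
        return ((state_id & PSTATE_ID_MASK) << PSTATE_ID_SHIFT) |
               ((type & PSTATE_TYPE_MASK) << PSTATE_TYPE_SHIFT) |
               ((pwr_lvl & PSTATE_PWR_LVL_MASK) << PSTATE_PWR_LVL_SHIFT);
    }

    /* Extract the implementation-defined state-id field. */
    static uint32_t power_state_id(uint32_t power_state)
    {
        return (power_state >> PSTATE_ID_SHIFT) & PSTATE_ID_MASK;
    }

    /* Extract the state type (standby or powerdown). */
    static uint32_t power_state_type(uint32_t power_state)
    {
        return (power_state >> PSTATE_TYPE_SHIFT) & PSTATE_TYPE_MASK;
    }

    /* Extract the deepest power level affected by the request. */
    static uint32_t power_state_level(uint32_t power_state)
    {
        return (power_state >> PSTATE_PWR_LVL_SHIFT) & PSTATE_PWR_LVL_MASK;
    }

    int main(void)
    {
        uint32_t ps = make_power_state(EXAMPLE_STATE_ID,
                                       PSTATE_TYPE_POWERDOWN, 2U);

        /* Round-trip check: every field decodes to what was packed. */
        assert(power_state_id(ps) == EXAMPLE_STATE_ID);
        assert(power_state_type(ps) == PSTATE_TYPE_POWERDOWN);
        assert(power_state_level(ps) == 2U);

        printf("power_state = 0x%08x\n", (unsigned int)ps);
        return 0;
    }

Keeping the encoding in small helpers like these means only the state-id
constant changes per SoC, while the field layout stays in one place.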

Tegra configs
-------------

- ``tegra_enable_l2_ecc_parity_prot``: This flag enables the L2 ECC and Parity
  Protection bit, for Arm Cortex-A57 CPUs, during CPU boot. Tegra SoCs enable
  this flag during 'Cluster power up' or 'System Suspend' exit.