==================================================
ARM TCM (Tightly-Coupled Memory) handling in Linux
==================================================

Written by Linus Walleij <linus.walleij@stericsson.com>

Some ARM SoCs have a so-called TCM (Tightly-Coupled Memory).
This is usually just a few (4-64) KiB of RAM inside the ARM
processor.

Due to being embedded inside the CPU, the TCM has a
Harvard-architecture, so there is an ITCM (instruction TCM)
and a DTCM (data TCM). The DTCM cannot contain any
instructions, but the ITCM can actually contain data.
The size of DTCM or ITCM is minimum 4KiB so the typical
minimum configuration is 4KiB ITCM and 4KiB DTCM.

ARM CPUs have special registers to read out status, physical
location and size of TCM memories. arch/arm/include/asm/cputype.h
defines a CPUID_TCM register that you can read out from the
system control coprocessor. Documentation from ARM can be found
at http://infocenter.arm.com, search for "TCM Status Register"
to see documents for all CPUs. Reading this register you can
determine if ITCM (bits 1-0) and/or DTCM (bits 17-16) is present
in the machine.

There is further a TCM region register (search for "TCM Region
Registers" at the ARM site) that is used to read out and modify
the location and size of TCM memories at runtime.
Notice that this is not an MMU table: you
actually move the physical location of the TCM around. At the
place you put it, it will mask any underlying RAM from the
CPU so it is usually wise not to overlap any physical RAM with
the TCM.

The TCM memory can then be remapped to another address again using
the MMU, but notice that the TCM is often used in situations where
the MMU is turned off. To avoid confusion the current Linux
implementation will map the TCM 1 to 1 from physical to virtual
memory in the location specified by the kernel. Currently Linux
will map ITCM to 0xfffe0000 and on, and DTCM to 0xfffe8000 and
on, supporting a maximum of 32KiB of ITCM and 32KiB of DTCM.

Newer versions of the region registers also support dividing these
TCMs in two separate banks, so for example an 8KiB ITCM is divided
into two 4KiB banks, each with its own control registers. The idea
is to be able to lock and hide one of the banks for use by the
secure world (TrustZone).

TCM is used for a few things:

- FIQ and other interrupt handlers that need deterministic
  timing and cannot wait for cache misses.

- Idle loops where all external RAM is set to self-refresh
  retention mode, so only on-chip RAM is accessible by
  the CPU and then we hang inside ITCM waiting for an
  interrupt.

- Other operations which imply shutting off or reconfiguring
  the external RAM controller.
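
The status register fields and the fixed mapping windows described
earlier can be illustrated with a small user-space sketch. It only
encodes the numbers quoted in this document (ITCM in bits 1-0, DTCM
in bits 17-16, ITCM at 0xfffe0000, DTCM at 0xfffe8000, 32KiB each).
All macro and function names are hypothetical and are not taken from
the kernel sources; actually reading the register requires a
privileged coprocessor access that is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout as quoted in the text (hypothetical macro names). */
#define SKETCH_TCM_ITCM_SHIFT	0
#define SKETCH_TCM_DTCM_SHIFT	16
#define SKETCH_TCM_FIELD_MASK	0x3u

/* Mapping windows as quoted in the text: 32KiB each. */
#define SKETCH_ITCM_BASE	0xfffe0000u
#define SKETCH_DTCM_BASE	0xfffe8000u
#define SKETCH_TCM_WINDOW	(32u * 1024u)

/* Extract the ITCM field (bits 1-0) from a raw status value. */
unsigned int sketch_tcm_itcm_field(uint32_t status)
{
	return (status >> SKETCH_TCM_ITCM_SHIFT) & SKETCH_TCM_FIELD_MASK;
}

/* Extract the DTCM field (bits 17-16) from a raw status value. */
unsigned int sketch_tcm_dtcm_field(uint32_t status)
{
	return (status >> SKETCH_TCM_DTCM_SHIFT) & SKETCH_TCM_FIELD_MASK;
}

/* True if addr lies inside the 32KiB ITCM window. */
bool sketch_addr_in_itcm(uint32_t addr)
{
	return addr >= SKETCH_ITCM_BASE &&
	       addr - SKETCH_ITCM_BASE < SKETCH_TCM_WINDOW;
}

/* True if addr lies inside the 32KiB DTCM window. */
bool sketch_addr_in_dtcm(uint32_t addr)
{
	return addr >= SKETCH_DTCM_BASE &&
	       addr - SKETCH_DTCM_BASE < SKETCH_TCM_WINDOW;
}
```

A non-zero field value means the corresponding TCM is present. The
two windows are adjacent, so the ITCM window ends exactly where the
DTCM window begins.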

There is an interface for using TCM on the ARM architecture
in <asm/tcm.h>. Using this interface it is possible to:

- Define the physical address and size of ITCM and DTCM.

- Tag functions to be compiled into ITCM.

- Tag data and constants to be allocated to DTCM and ITCM.

- Have the remaining TCM RAM added to a special
  allocation pool with gen_pool_create() and gen_pool_add()
  and provide tcm_alloc() and tcm_free() for this
  memory. Such a heap is great for things like saving
  device state when shutting off device power domains.

A machine that has TCM memory shall select HAVE_TCM from
arch/arm/Kconfig for itself. Code that needs to use TCM shall
#include <asm/tcm.h>.

Functions to go into itcm can be tagged like this::

  int __tcmfunc foo(int bar);

Since these are marked to become long_calls and you may want
to have functions called locally inside the TCM without
wasting space, there is also the __tcmlocalfunc prefix that
will make the call relative.
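
As a rough illustration of how such tags can work, the sketch below
builds similar markers from GCC section attributes. This is an
assumption about the general technique, not a copy of <asm/tcm.h>:
the SKETCH_* names are hypothetical, the long_call attribute is only
applied when compiling for ARM, and the section names are the ones
this document gives for assembler code. In a normal user-space build
these sections are ordinary ELF sections; it is the kernel's linker
script that would place them in TCM.

```c
#include <stdint.h>

/* long_call is an ARM-specific GCC attribute, so guard it. */
#if defined(__arm__)
#define SKETCH_LONG_CALL	__attribute__((long_call))
#else
#define SKETCH_LONG_CALL
#endif

/* Hypothetical tags modelled on the prefixes described in the text. */
#define SKETCH_TCMDATA		__attribute__((section(".tcm.data")))
#define SKETCH_TCMFUNC		SKETCH_LONG_CALL \
				__attribute__((section(".tcm.text"), noinline))
#define SKETCH_TCMLOCALFUNC	__attribute__((section(".tcm.text")))

/* Data object placed in the TCM data section. */
SKETCH_TCMDATA uint32_t sketch_counter = 0;

/* Only called from other code in the same section: no long call needed. */
SKETCH_TCMLOCALFUNC void sketch_step(void)
{
	sketch_counter++;
}

/* Entry point called from outside the section: uses the long-call tag. */
SKETCH_TCMFUNC uint32_t sketch_run(uint32_t n)
{
	uint32_t i;

	for (i = 0; i < n; i++)
		sketch_step();
	return sketch_counter;
}
```

The split between the two function tags mirrors the reasoning above:
the entry point may be far away from its callers, while helpers that
stay inside the same section can be reached with a cheaper relative
call.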

Variables to go into dtcm can be tagged like this::

  int __tcmdata foo;

Constants can be tagged like this::

  int __tcmconst foo;

To put assembler into TCM just use::

  .section ".tcm.text" or .section ".tcm.data"

respectively.

Example code::

  #include <asm/tcm.h>

  /* Uninitialized data */
  static u32 __tcmdata tcmvar;
  /* Initialized data */
  static u32 __tcmdata tcmassigned = 0x2BADBABEU;
  /* Constant */
  static const u32 __tcmconst tcmconst = 0xCAFEBABEU;

  static void __tcmlocalfunc tcm_to_tcm(void)
  {
	int i;
	for (i = 0; i < 100; i++)
		tcmvar++;
  }

  static void __tcmfunc hello_tcm(void)
  {
	/* Some abstract code that runs in ITCM */
	int i;
	for (i = 0; i < 100; i++) {
		tcmvar++;
	}
	tcm_to_tcm();
  }

  static void __init test_tcm(void)
  {
	u32 *tcmem;
	int i;

	hello_tcm();
	printk("Hello TCM executed from ITCM RAM\n");

	printk("TCM variable from testrun: %u @ %p\n", tcmvar, &tcmvar);
	tcmvar = 0xDEADBEEFU;
	printk("TCM variable: 0x%x @ %p\n", tcmvar, &tcmvar);

	printk("TCM assigned variable: 0x%x @ %p\n", tcmassigned, &tcmassigned);

	printk("TCM constant: 0x%x @ %p\n", tcmconst, &tcmconst);

	/* Allocate some TCM memory from the pool */
	tcmem = tcm_alloc(20);
	if (tcmem) {
		printk("TCM Allocated 20 bytes of TCM @ %p\n", tcmem);
		tcmem[0] = 0xDEADBEEFU;
		tcmem[1] = 0x2BADBABEU;
		tcmem[2] = 0xCAFEBABEU;
		tcmem[3] = 0xDEADBEEFU;
		tcmem[4] = 0x2BADBABEU;
		for (i = 0; i < 5; i++)
			printk("TCM tcmem[%d] = %08x\n", i, tcmem[i]);
		tcm_free(tcmem, 20);
	}
  }
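
The example passes the allocation size to both tcm_alloc() and
tcm_free(). The user-space mock below tries to show why a pool-style
allocator wants the size back on free: it keeps no per-allocation
header, only a map of used granules. This is a simplified stand-in
and not the kernel's gen_pool implementation; the pool size, the
4-byte granule and all sketch_* names are assumptions made for
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_POOL_SIZE	256u	/* hypothetical pool size in bytes */
#define SKETCH_GRANULE		4u	/* hypothetical allocation granule */
#define SKETCH_SLOTS		(SKETCH_POOL_SIZE / SKETCH_GRANULE)

static uint32_t sketch_pool[SKETCH_POOL_SIZE / sizeof(uint32_t)];
static unsigned char sketch_used[SKETCH_SLOTS];	/* 1 = granule in use */

/* Round a byte count up to whole granules. */
static size_t sketch_slots_for(size_t len)
{
	return (len + SKETCH_GRANULE - 1) / SKETCH_GRANULE;
}

/* First-fit search for a run of free granules; NULL if none fits. */
void *sketch_tcm_alloc(size_t len)
{
	size_t need = sketch_slots_for(len);
	size_t run = 0;
	size_t i;

	if (need == 0 || need > SKETCH_SLOTS)
		return NULL;

	for (i = 0; i < SKETCH_SLOTS; i++) {
		run = sketch_used[i] ? 0 : run + 1;
		if (run == need) {
			size_t first = i + 1 - need;

			memset(&sketch_used[first], 1, need);
			return (unsigned char *)sketch_pool +
			       first * SKETCH_GRANULE;
		}
	}
	return NULL;
}

/* The caller must pass the same length that was used for the allocation. */
void sketch_tcm_free(void *addr, size_t len)
{
	size_t offset = (size_t)((unsigned char *)addr -
				 (unsigned char *)sketch_pool);

	memset(&sketch_used[offset / SKETCH_GRANULE], 0,
	       sketch_slots_for(len));
}

/* Number of bytes currently free in the pool. */
size_t sketch_tcm_avail(void)
{
	size_t free_slots = 0;
	size_t i;

	for (i = 0; i < SKETCH_SLOTS; i++)
		if (!sketch_used[i])
			free_slots++;
	return free_slots * SKETCH_GRANULE;
}
```

With this layout a 20-byte request occupies five granules, and
freeing it with a different length would leave the map inconsistent,
which is why the example code passes 20 to both calls.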