==================================
Cache and TLB Flushing Under Linux
==================================

:Author: David S. Miller <davem@redhat.com>

This document describes the cache/tlb flushing interfaces called
by the Linux VM subsystem.  It enumerates each interface, describes
its intended purpose, and states what side effects are expected
after the interface is invoked.

The side effects described below are stated for a uniprocessor
implementation, and describe what is to happen on that single
processor.  The SMP cases are a simple extension, in that you just
extend the definition such that the side effect for a particular
interface occurs on all processors in the system.  Don't let this
scare you into thinking SMP cache/tlb flushing must be inefficient;
this is in fact an area where many optimizations are possible.  For
example, if it can be proven that a user address space has never
executed on a cpu (see mm_cpumask()), one need not perform a flush
for this address space on that cpu.

First, the TLB flushing interfaces, since they are the simplest.  The
"TLB" is abstracted under Linux as something the cpu uses to cache
virtual-->physical address translations obtained from the software
page tables.  This means that if the software page tables change, it
is possible for stale translations to exist in this "TLB" cache.
Therefore when software page table changes occur, the kernel will
invoke one of the following flush methods _after_ the page table
changes occur:

1) ``void flush_tlb_all(void)``

    The most severe flush of all.  After this interface runs,
    any previous page table modification whatsoever will be
    visible to the cpu.

    This is usually invoked when the kernel page tables are
    changed, since such translations are "global" in nature.

2) ``void flush_tlb_mm(struct mm_struct *mm)``

    This interface flushes an entire user address space from
    the TLB.  After running, this interface must make sure that
    any previous page table modifications for the address space
    'mm' will be visible to the cpu.  That is, after running,
    there will be no entries in the TLB for 'mm'.

    This interface is used to handle whole address space
    page table operations such as what happens during
    fork and exec.

3) ``void flush_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)``

    Here we are flushing a specific range of (user) virtual
    address translations from the TLB.  After running, this
    interface must make sure that any previous page table
    modifications for the address space 'vma->vm_mm' in the range
    'start' to 'end-1' will be visible to the cpu.  That is, after
    running, there will be no entries in the TLB for 'mm' for
    virtual addresses in the range 'start' to 'end-1'.

    The "vma" is the backing store being used for the region.
    Primarily, this is used for munmap() type operations.

    The interface is provided in the hope that the port can find
    a suitably efficient method for removing multiple page
    sized translations from the TLB, instead of having the kernel
    call flush_tlb_page (see below) for each entry which may be
    modified.

4) ``void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)``

    This time we need to remove the PAGE_SIZE sized translation
    from the TLB.  The 'vma' is the backing structure used by
    Linux to keep track of mmap'd regions for a process; the
    address space is available via vma->vm_mm.  Also, one may
    test (vma->vm_flags & VM_EXEC) to see if this region is
    executable (and thus could be in the 'instruction TLB' in
    split-tlb type setups).

    After running, this interface must make sure that any previous
    page table modification for address space 'vma->vm_mm' for
    user virtual address 'addr' will be visible to the cpu.  That
    is, after running, there will be no entries in the TLB for
    'vma->vm_mm' for virtual address 'addr'.

    This is used primarily during fault processing.

5) ``void update_mmu_cache(struct vm_area_struct *vma,
   unsigned long address, pte_t *ptep)``

    At the end of every page fault, this routine is invoked to
    tell the architecture specific code that a translation
    now exists at virtual address "address" for address space
    "vma->vm_mm", in the software page tables.

    A port may use this information in any way it so chooses.
    For example, it could use this event to pre-load TLB
    translations for software managed TLB configurations.
    The sparc64 port currently does this; a sketch of the
    general idea follows the list.
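
For illustration only, here is a minimal sketch of how such a pre-load
might look on a hypothetical software-managed TLB port.  The helper
tlb_load_entry() is made up for this example; it stands in for whatever
instruction sequence the port uses to write a single entry into its TLB::

    /* Hypothetical sketch -- tlb_load_entry() is not a real kernel or
     * architecture interface; it represents the port-specific way of
     * installing one translation into a software-managed TLB.
     */
    #include <linux/mm.h>

    void update_mmu_cache(struct vm_area_struct *vma,
                          unsigned long address, pte_t *ptep)
    {
            pte_t pte = *ptep;

            /* Only translations that are actually present can be loaded. */
            if (!pte_present(pte))
                    return;

            /* Install the new virtual-->physical translation now, so the
             * first user access does not have to take a TLB miss trap.
             */
            tlb_load_entry(vma->vm_mm, address & PAGE_MASK, pte);
    }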

Next, we have the cache flushing interfaces.  In general, when Linux
is changing an existing virtual-->physical mapping to a new value,
the sequence will be in one of the following forms::

    1) flush_cache_mm(mm);
       change_all_page_tables_of(mm);
       flush_tlb_mm(mm);

    2) flush_cache_range(vma, start, end);
       change_range_of_page_tables(mm, start, end);
       flush_tlb_range(vma, start, end);

    3) flush_cache_page(vma, addr, pfn);
       set_pte(pte_pointer, new_pte_val);
       flush_tlb_page(vma, addr);

The cache level flush will always be first, because this allows
us to properly handle systems whose caches are strict and require
a virtual-->physical translation to exist for a virtual address
when that virtual address is flushed from the cache.  The HyperSparc
cpu is one such cpu with this attribute.
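
As a concrete illustration of form 2) above, a simplified, hypothetical
helper that rewrites the protections of a user address range would look
roughly like this; change_prot_of_range() is a made-up stand-in for
whatever actually rewrites the page table entries::

    /* Sketch of ordering form 2): cache flush while the old translation
     * still exists, then the page table change, then the TLB flush.
     * change_prot_of_range() is a placeholder, not a real kernel API.
     */
    static void update_range_protection(struct vm_area_struct *vma,
                                        unsigned long start,
                                        unsigned long end,
                                        pgprot_t newprot)
    {
            struct mm_struct *mm = vma->vm_mm;

            flush_cache_range(vma, start, end);              /* step 1 */
            change_prot_of_range(mm, start, end, newprot);   /* step 2 */
            flush_tlb_range(vma, start, end);                /* step 3 */
    }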

The cache flushing routines below need only deal with cache flushing
to the extent that it is necessary for a particular cpu.  Mostly,
these routines must be implemented for cpus which have virtually
indexed caches which must be flushed when virtual-->physical
translations are changed or removed.  So, for example, the physically
indexed, physically tagged caches of IA32 processors have no need to
implement these interfaces since the caches are fully synchronized
and have no dependency on translation information.

Here are the routines, one by one:

1) ``void flush_cache_mm(struct mm_struct *mm)``

    This interface flushes an entire user address space from
    the caches.  That is, after running, there will be no cache
    lines associated with 'mm'.

    This interface is used to handle whole address space
    page table operations such as what happens during exit and exec.

2) ``void flush_cache_dup_mm(struct mm_struct *mm)``

    This interface flushes an entire user address space from
    the caches.  That is, after running, there will be no cache
    lines associated with 'mm'.

    This interface is used to handle whole address space
    page table operations such as what happens during fork.

    This option is separate from flush_cache_mm to allow some
    optimizations for VIPT caches.

3) ``void flush_cache_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)``

    Here we are flushing a specific range of (user) virtual
    addresses from the cache.  After running, there will be no
    entries in the cache for 'vma->vm_mm' for virtual addresses in
    the range 'start' to 'end-1'.

    The "vma" is the backing store being used for the region.
    Primarily, this is used for munmap() type operations.

    The interface is provided in the hope that the port can find
    a suitably efficient method for removing multiple page
    sized regions from the cache, instead of having the kernel
    call flush_cache_page (see below) for each entry which may be
    modified.

4) ``void flush_cache_page(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn)``

    This time we need to remove a PAGE_SIZE sized range
    from the cache.  The 'vma' is the backing structure used by
    Linux to keep track of mmap'd regions for a process; the
    address space is available via vma->vm_mm.  Also, one may
    test (vma->vm_flags & VM_EXEC) to see if this region is
    executable (and thus could be in the 'instruction cache' in
    "Harvard" type cache layouts).

    The 'pfn' indicates the physical page frame (shift this value
    left by PAGE_SHIFT to get the physical address) that 'addr'
    translates to.  It is this mapping which should be removed from
    the cache.

    After running, there will be no entries in the cache for
    'vma->vm_mm' for virtual address 'addr' which translates
    to 'pfn'.

    This is used primarily during fault processing.

5) ``void flush_cache_kmaps(void)``

    This routine need only be implemented if the platform utilizes
    highmem.  It will be called right before all of the kmaps
    are invalidated.

    After running, there will be no entries in the cache for
    the kernel virtual address range PKMAP_ADDR(0) to
    PKMAP_ADDR(LAST_PKMAP).

    This routine should be implemented in asm/highmem.h

6) ``void flush_cache_vmap(unsigned long start, unsigned long end)``
   ``void flush_cache_vunmap(unsigned long start, unsigned long end)``

    Here in these two interfaces we are flushing a specific range
    of (kernel) virtual addresses from the cache.  After running,
    there will be no entries in the cache for the kernel address
    space for virtual addresses in the range 'start' to 'end-1'.

    The first of these two routines is invoked after map_kernel_range()
    has installed the page table entries.  The second is invoked
    before unmap_kernel_range() deletes the page table entries.
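
On a fully coherent, physically indexed/physically tagged architecture,
all of the routines above can legitimately collapse to nothing.  A
sketch of what such a port might place in its asm/cacheflush.h follows;
this is illustrative only, since the exact set of required definitions
is per-architecture::

    /* Sketch only: caches that need no software maintenance when
     * virtual-->physical mappings change can make these all no-ops.
     */
    #define flush_cache_mm(mm)                  do { } while (0)
    #define flush_cache_dup_mm(mm)              do { } while (0)
    #define flush_cache_range(vma, start, end)  do { } while (0)
    #define flush_cache_page(vma, vmaddr, pfn)  do { } while (0)
    #define flush_cache_vmap(start, end)        do { } while (0)
    #define flush_cache_vunmap(start, end)      do { } while (0)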

There exists another whole class of cpu cache issues which currently
require a whole different set of interfaces to handle properly.
The biggest problem is that of virtual aliasing in the data cache
of a processor.

Is your port susceptible to virtual aliasing in its D-cache?
Well, if your D-cache is virtually indexed, is larger in size than
PAGE_SIZE, and does not prevent multiple cache lines for the same
physical address from existing at once, you have this problem.

If your D-cache has this problem, first define asm/shmparam.h SHMLBA
properly; it should essentially be the size of your virtually
addressed D-cache (or if the size is variable, the largest possible
size).  This setting will force the SYSv IPC layer to only allow user
processes to mmap shared memory at addresses which are a multiple of
this value.

.. note::

  This does not fix shared mmaps, check out the sparc64 port for
  one way to solve this (in particular SPARC_FLAG_MMAPSHARED).

Next, you have to solve the D-cache aliasing issue for all
other cases.  Please keep in mind the fact that, for a given page
mapped into some user address space, there is always at least one more
mapping, that of the kernel in its linear mapping starting at
PAGE_OFFSET.  So immediately, once the first user maps a given
physical page into its address space, by implication the D-cache
aliasing problem has the potential to exist since the kernel already
maps this page at its virtual address.

  ``void copy_user_page(void *to, void *from, unsigned long addr, struct page *page)``
  ``void clear_user_page(void *to, unsigned long addr, struct page *page)``

    These two routines store data in user anonymous or COW
    pages.  They allow a port to efficiently avoid D-cache alias
    issues between userspace and the kernel.

    For example, a port may temporarily map 'from' and 'to' to
    kernel virtual addresses during the copy.  The virtual address
    for these two pages is chosen in such a way that the kernel
    load/store instructions happen to virtual addresses which are
    of the same "color" as the user mapping of the page.  Sparc64,
    for example, uses this technique.

    The 'addr' parameter tells the virtual address where the
    user will ultimately have this page mapped, and the 'page'
    parameter gives a pointer to the struct page of the target.

    If D-cache aliasing is not an issue, these two routines may
    simply call memcpy/memset directly and do nothing more.
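
    A sketch of that trivial, non-aliasing case is shown below; an
    aliasing-aware version is necessarily architecture specific::

        /* Sketch for a port with no D-cache aliasing: plain operations
         * on the kernel mapping are all that is needed.  'addr' and
         * 'page' are intentionally unused in this trivial case.
         */
        void copy_user_page(void *to, void *from, unsigned long addr,
                            struct page *page)
        {
                memcpy(to, from, PAGE_SIZE);
        }

        void clear_user_page(void *to, unsigned long addr,
                             struct page *page)
        {
                memset(to, 0, PAGE_SIZE);
        }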

  ``void flush_dcache_page(struct page *page)``

    Any time the kernel writes to a page cache page, _OR_
    the kernel is about to read from a page cache page and
    user space shared/writable mappings of this page potentially
    exist, this routine is called.

    .. note::

      This routine need only be called for page cache pages
      which can potentially ever be mapped into the address
      space of a user process.  So for example, VFS layer code
      handling vfs symlinks in the page cache need not call
      this interface at all.

    The phrase "kernel writes to a page cache page" means,
    specifically, that the kernel executes store instructions
    that dirty data in that page at the page->virtual mapping
    of that page.  It is important to flush here to handle
    D-cache aliasing, to make sure these kernel stores are
    visible to user space mappings of that page.

    The corollary case is just as important: if there are users
    which have shared+writable mappings of this file, we must make
    sure that kernel reads of these pages will see the most recent
    stores done by the user.

    If D-cache aliasing is not an issue, this routine may
    simply be defined as a nop on that architecture.

    There is a bit set aside in page->flags (PG_arch_1) as
    "architecture private".  The kernel guarantees that,
    for pagecache pages, it will clear this bit when such
    a page first enters the pagecache.

    This allows these interfaces to be implemented much more
    efficiently.  It allows one to "defer" (perhaps indefinitely)
    the actual flush if there are currently no user processes
    mapping this page.  See sparc64's flush_dcache_page and
    update_mmu_cache implementations for an example of how to go
    about doing this.

    The idea is, first at flush_dcache_page() time, if
    page->mapping->i_mmap is an empty tree, just mark the architecture
    private page flag bit.  Later, in update_mmu_cache(), a check is
    made of this flag bit, and if set the flush is done and the flag
    bit is cleared.  A sketch of this pattern is given after this
    entry.

    .. important::

      It is often important, if you defer the flush,
      that the actual flush occurs on the same CPU
      as did the cpu stores into the page to make it
      dirty.  Again, see sparc64 for examples of how
      to deal with this.
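
    Roughly, the deferral pattern described above looks like the
    sketch below.  This is only an illustration of the idea, not
    sparc64's actual code; mapping_mapped() is used here as the
    "no user mappings yet" test, and __local_flush_dcache_page()
    is a made-up name standing in for the port's real flush
    primitive::

        void flush_dcache_page(struct page *page)
        {
                struct address_space *mapping = page_mapping(page);

                /* No user mappings yet: just remember that the kernel
                 * dirtied this page, and defer the actual flush.
                 */
                if (mapping && !mapping_mapped(mapping)) {
                        set_bit(PG_arch_1, &page->flags);
                        return;
                }

                __local_flush_dcache_page(page);
        }

        void update_mmu_cache(struct vm_area_struct *vma,
                              unsigned long address, pte_t *ptep)
        {
                struct page *page;

                if (!pte_present(*ptep))
                        return;
                page = pte_page(*ptep);

                /* A flush was deferred earlier: do it now, before user
                 * space can see stale data through its new mapping.
                 */
                if (test_and_clear_bit(PG_arch_1, &page->flags))
                        __local_flush_dcache_page(page);
        }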

  ``void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
  unsigned long user_vaddr, void *dst, void *src, int len)``
  ``void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
  unsigned long user_vaddr, void *dst, void *src, int len)``

    When the kernel needs to copy arbitrary data in and out
    of arbitrary user pages (e.g. for ptrace()) it will use
    these two routines.

    Any necessary cache flushing or other coherency operations
    that need to occur should happen here.  If the processor's
    instruction cache does not snoop cpu stores, it is very
    likely that you will need to flush the instruction cache
    for copy_to_user_page().
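
    One common shape for these routines, sketched for a port whose
    I-cache does not snoop stores; the exact flushing required
    varies per architecture, so treat this as an assumption-laden
    example rather than a reference implementation::

        void copy_to_user_page(struct vm_area_struct *vma,
                               struct page *page,
                               unsigned long user_vaddr,
                               void *dst, void *src, int len)
        {
                memcpy(dst, src, len);

                /* The kernel just wrote data the user will see; push it
                 * out of the D-cache and, if the region is executable,
                 * drop any stale I-cache contents as well.
                 */
                flush_dcache_page(page);
                if (vma->vm_flags & VM_EXEC)
                        flush_icache_range((unsigned long)dst,
                                           (unsigned long)dst + len);
        }

        void copy_from_user_page(struct vm_area_struct *vma,
                                 struct page *page,
                                 unsigned long user_vaddr,
                                 void *dst, void *src, int len)
        {
                /* Make sure the user's most recent stores are visible
                 * before reading through the kernel mapping.
                 */
                flush_cache_page(vma, user_vaddr, page_to_pfn(page));
                memcpy(dst, src, len);
        }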

  ``void flush_anon_page(struct vm_area_struct *vma, struct page *page,
  unsigned long vmaddr)``

    When the kernel needs to access the contents of an anonymous
    page, it calls this function (currently only
    get_user_pages()).  Note: flush_dcache_page() deliberately
    doesn't work for an anonymous page.  The default
    implementation is a nop (and should remain so for all coherent
    architectures).  For incoherent architectures, it should flush
    the cache of the page at vmaddr.

  ``void flush_kernel_dcache_page(struct page *page)``

    When the kernel needs to modify a user page it has obtained
    with kmap, it calls this function after all modifications are
    complete (but before kunmapping it) to bring the underlying
    page up to date.  It is assumed here that the user has no
    incoherent cached copies (i.e. the original page was obtained
    from a mechanism like get_user_pages()).  The default
    implementation is a nop and should remain so on all coherent
    architectures.  On incoherent architectures, this should flush
    the kernel cache for page (using page_address(page)).

  ``void flush_icache_range(unsigned long start, unsigned long end)``

    When the kernel stores into addresses that it will execute
    out of (e.g. when loading modules), this function is called.

    If the icache does not snoop stores then this routine will need
    to flush it.
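
    A typical call site, sketched: after writing instructions into
    memory that will later be executed.  The install_insns() helper
    and its destination buffer are hypothetical::

        /* Hypothetical example of keeping the I-cache coherent after
         * writing code into a buffer that will later be jumped to.
         */
        static void install_insns(void *dst, const void *insns, size_t len)
        {
                memcpy(dst, insns, len);
                flush_icache_range((unsigned long)dst,
                                   (unsigned long)dst + len);
        }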

  ``void flush_icache_page(struct vm_area_struct *vma, struct page *page)``

    All the functionality of flush_icache_page can be implemented in
    flush_dcache_page and update_mmu_cache.  In the future, the hope
    is to remove this interface completely.

The final category of APIs is for I/O to deliberately aliased address
ranges inside the kernel.  Such aliases are set up by use of the
vmap/vmalloc API.  Since kernel I/O goes via physical pages, the I/O
subsystem assumes that the user mapping and kernel offset mapping are
the only aliases.  This isn't true for vmap aliases, so anything in
the kernel trying to do I/O to vmap areas must manually manage
coherency.  It must do this by flushing the vmap range before doing
I/O and invalidating it after the I/O returns.

  ``void flush_kernel_vmap_range(void *vaddr, int size)``

    flushes the kernel cache for a given virtual address range in
    the vmap area.  This is to make sure that any data the kernel
    modified in the vmap range is made visible to the physical
    page.  The design is to make this area safe to perform I/O on.
    Note that this API does *not* also flush the offset map alias
    of the area.

  ``void invalidate_kernel_vmap_range(void *vaddr, int size)``

    invalidates the cache for a given virtual address range in
    the vmap area, which prevents the processor from making the
    cache stale by speculatively reading data while the I/O was
    occurring to the physical pages.  This is only necessary for
    data reads into the vmap area.
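
Putting the two together, a caller doing a read into a vmap alias would
bracket the operation roughly as follows.  The do_read_io() call is a
placeholder for whatever actually transfers data to the underlying
physical pages, not a real kernel API::

    /* Sketch of the required bracketing for I/O through a vmap alias. */
    static int read_into_vmap_buffer(void *vmap_buf, int size)
    {
            int ret;

            /* Write back anything the kernel dirtied through the alias,
             * so the device sees current data in the physical pages.
             */
            flush_kernel_vmap_range(vmap_buf, size);

            ret = do_read_io(vmap_buf, size);

            /* Discard anything speculatively pulled into the cache via
             * the alias while the I/O was in flight.
             */
            invalidate_kernel_vmap_range(vmap_buf, size);

            return ret;
    }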