Assembler Annotations
=====================

Copyright (c) 2017-2019 Jiri Slaby

This document describes the new macros for annotation of data and code in
assembly. In particular, it contains information about ``SYM_FUNC_START``,
``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.

Rationale
---------
Some code like entries, trampolines, or boot code needs to be written in
assembly. The same as in C, such code is grouped into functions and
accompanied by data. Standard assemblers do not force users into precisely
marking these pieces as code, data, or even specifying their length.
Nevertheless, assemblers provide developers with such annotations to aid
debuggers throughout assembly. On top of that, developers also want to mark
some functions as *global* in order to be visible outside of their translation
units.

Over time, the Linux kernel has adopted macros from various projects (like
``binutils``) to facilitate such annotations. So for historic reasons,
developers have been using ``ENTRY``, ``END``, ``ENDPROC``, and other
annotations in assembly. Due to the lack of documentation, the macros
are used in rather wrong contexts at some locations. Clearly, ``ENTRY`` was
intended to denote the beginning of global symbols (be it data or code).
``END`` was used to mark the end of data or the end of special functions with
a *non-standard* calling convention.
In contrast, ``ENDPROC`` should annotate
only ends of *standard* functions.

When these macros are used correctly, they help assemblers generate a nice
object with both sizes and types set correctly. For example, the result of
``arch/x86/lib/putuser.S``::

   Num:    Value          Size Type    Bind   Vis      Ndx Name
    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8

This is not only important for debugging purposes. When there are properly
annotated objects like this, tools can be run on them to generate more useful
information. In particular, on properly annotated objects, ``objtool`` can be
run to check and fix the object if needed. Currently, ``objtool`` can report
missing frame pointer setup/destruction in functions. It can also
automatically generate annotations for :doc:`ORC unwinder <x86/orc-unwinder>`
for most code. Both of these are especially important to support reliable
stack traces which are in turn necessary for :doc:`Kernel live patching
<livepatch/livepatch>`.

Caveat and Discussion
---------------------
As one might realize, there were only three macros previously.
That is indeed
insufficient to cover all the combinations of cases:

* standard/non-standard function
* code/data
* global/local symbol

There was a discussion_ and, rather than extending the current ``ENTRY/END*``
macros, it was decided that brand new macros should be introduced::

    So how about using macro names that actually show the purpose, instead
    of importing all the crappy, historic, essentially randomly chosen
    debug symbol macro names from the binutils and older kernels?

.. _discussion: https://lkml.kernel.org/r/20170217104757.28588-1-jslaby@suse.cz

Macros Description
------------------

The new macros are prefixed with the ``SYM_`` prefix and can be divided into
three main groups:

1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
   standard C calling conventions. For example, on x86, this means that the
   stack contains a return address at the predefined place and a return from
   the function can happen in a standard way. When frame pointers are enabled,
   save/restore of the frame pointer shall happen at the start/end of a
   function, respectively, too.

   Checking tools like ``objtool`` should ensure such marked functions conform
   to these rules. The tools can also easily annotate these functions with
   debugging information (like *ORC data*) automatically.

2. ``SYM_CODE_*`` -- special functions called with a special stack, be it
   interrupt handlers with special stack content, trampolines, or startup
   functions.

   Checking tools mostly ignore checking of these functions. But some debug
   information still can be generated automatically. For correct debug data,
   this code needs hints like ``UNWIND_HINT_REGS`` provided by developers.

3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
   ``.text``. Data do not contain instructions, so they have to be treated
   specially by the tools: they should not treat the bytes as instructions,
   nor assign any debug information to them.

Instruction Macros
~~~~~~~~~~~~~~~~~~
This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.

``objtool`` requires that all code must be contained in an ELF symbol. Symbol
names that have a ``.L`` prefix do not emit symbol table entries. ``.L``
prefixed symbols can be used within a code region, but should be avoided for
denoting a range of code via ``SYM_*_START/END`` annotations.

* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
  most frequent markings**. They are used for functions with standard calling
  conventions -- global and local. Like in C, they both align the functions to
  architecture specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` variants
  for special cases where developers do not want this implicit alignment.

  ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN`` markings are
  also offered as an assembler counterpart to the *weak* attribute known from
  C.

  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
  the sequence of instructions as a function and records its size in the
  generated object file. Second, it also eases checking and processing such
  object files as the tools can trivially find exact function boundaries.

  So in most cases, developers should write something like the following
  example, having some asm instructions in between the macros, of course::

    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)

  In fact, this kind of annotation corresponds to the now deprecated ``ENTRY``
  and ``ENDPROC`` macros.

* ``SYM_FUNC_START_ALIAS`` and ``SYM_FUNC_START_LOCAL_ALIAS`` serve those
  who decided to have two or more names for one function. The typical use is::

    SYM_FUNC_START_ALIAS(__memset)
    SYM_FUNC_START(memset)
        ... asm insns ...
    SYM_FUNC_END(memset)
    SYM_FUNC_END_ALIAS(__memset)

  In this example, one can call ``__memset`` or ``memset`` with the same
  result, except the debug information for the instructions is generated to
  the object file only once -- for the non-``ALIAS`` case.

* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
  special cases -- if you know what you are doing. This is used exclusively
  for interrupt handlers and similar where the calling convention is not the C
  one. ``_NOALIGN`` variants exist too. The use is the same as for the ``FUNC``
  category above::

    SYM_CODE_START_LOCAL(bad_put_user)
        ... asm insns ...
    SYM_CODE_END(bad_put_user)

  Again, every ``SYM_CODE_START*`` **shall** be coupled with ``SYM_CODE_END``.

  To some extent, this category corresponds to deprecated ``ENTRY`` and
  ``END``. Except ``END`` had several other meanings too.

* ``SYM_INNER_LABEL*`` is used to denote a label inside some
  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``. They are very similar
  to C labels, except they can be made global. An example of use::

    SYM_CODE_START(ftrace_caller)
        /* save_mcount_regs fills in first two parameters */
        ...

    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
        /* Load the ftrace_ops into the 3rd parameter */
        ...

    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
        call ftrace_stub
        ...
        retq
    SYM_CODE_END(ftrace_caller)

Data Macros
~~~~~~~~~~~
Similar to instructions, there are a couple of macros to describe data in the
assembly.

* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
  and shall be used in conjunction with either ``SYM_DATA_END``, or
  ``SYM_DATA_END_LABEL``. The latter also adds a label to the end, so that
  people can use ``lstack`` and (local) ``lstack_end`` in the following
  example::

    SYM_DATA_START_LOCAL(lstack)
        .skip 4096
    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)

* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly one-line
  data::

    SYM_DATA(HEAP,     .long rm_heap)
    SYM_DATA(heap_end, .long rm_stack)

  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
  internally.

Support Macros
~~~~~~~~~~~~~~
All of the above ultimately reduce to some invocation of ``SYM_START``,
``SYM_END``, or ``SYM_ENTRY``. Normally, developers should avoid using
these.

Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
symbol marked by them.
They are used either in ``_LABEL`` variants of the
earlier macros, or in ``SYM_START``.


Overriding Macros
~~~~~~~~~~~~~~~~~
Architectures can also override any of the macros in their own
``asm/linkage.h``, including macros specifying the type of a symbol
(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``). As every macro
described in this file is surrounded by ``#ifndef`` + ``#endif``, it is enough
to define the macros differently in the aforementioned architecture-dependent
header.
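
The override mechanism can be sketched in plain C preprocessor terms. The
following is a minimal illustration rather than kernel code: the first
definition stands in for an architecture's ``asm/linkage.h``, and the guarded
definitions stand in for the generic header, assuming each generic definition
is wrapped in a preprocessor guard. The value ``ARCH_SPECIFIC_FUNC`` and the
helpers ``effective_func_type()`` and ``effective_object_type()`` are
hypothetical names introduced only for this example.

```c
/*
 * Hypothetical stand-in for an architecture's asm/linkage.h. It is seen
 * first and supplies its own definition of one macro. The value
 * ARCH_SPECIFIC_FUNC is an illustrative assumption, not a real type name.
 */
#define SYM_T_FUNC ARCH_SPECIFIC_FUNC

/*
 * Stand-in for the generic header. Each definition is guarded, so it only
 * takes effect when the architecture has not already provided one.
 */
#ifndef SYM_T_FUNC
#define SYM_T_FUNC STT_FUNC
#endif

#ifndef SYM_T_OBJECT
#define SYM_T_OBJECT STT_OBJECT
#endif

/* Two-step stringification so that the macro is expanded before quoting. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Returns the text SYM_T_FUNC expands to after the override. */
const char *effective_func_type(void)
{
	return STR(SYM_T_FUNC);
}

/* Returns the text SYM_T_OBJECT expands to (the generic default here). */
const char *effective_object_type(void)
{
	return STR(SYM_T_OBJECT);
}
```

In this sketch, only the macro that the stand-in architecture header defined
takes the overridden value. ``SYM_T_OBJECT`` was left alone, so it falls
through to the guarded generic default.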