=========
Livepatch
=========

This document outlines basic information about kernel livepatching.

.. Table of Contents:

1. Motivation
2. Kprobes, Ftrace, Livepatching
3. Consistency model
4. Livepatch module
   4.1. New functions
   4.2. Metadata
5. Livepatch life-cycle
   5.1. Loading
   5.2. Enabling
   5.3. Replacing
   5.4. Disabling
   5.5. Removing
6. Sysfs
7. Limitations


1. Motivation
=============

There are many situations where users are reluctant to reboot a system. It may
be because their system is performing complex scientific computations or under
heavy load during peak usage. In addition to keeping systems up and running,
users also want to have a stable and secure system. Livepatching gives users
both by allowing function calls to be redirected, thus fixing critical
functions without a system reboot.


2. Kprobes, Ftrace, Livepatching
================================

There are multiple mechanisms in the Linux kernel that are directly related
to redirection of code execution; namely: kernel probes, function tracing,
and livepatching:

 - Kernel probes are the most generic. The code can be redirected by
   placing a breakpoint instruction in place of nearly any instruction.

 - The function tracer calls the code from a predefined location that is
   close to the function entry point. This location is generated by the
   compiler using the '-pg' gcc option.

 - Livepatching typically needs to redirect the code at the very beginning
   of the function entry, before the function parameters or the stack
   are modified in any way.

All three approaches need to modify the existing code at runtime. Therefore
they need to be aware of each other and not step on each other's toes.
Most of these problems are solved by using the dynamic ftrace framework as
a base. A kprobe is registered as an ftrace handler when the function entry
is probed, see CONFIG_KPROBES_ON_FTRACE. Also, an alternative function from
a live patch is called with the help of a custom ftrace handler. But there are
some limitations, see below.


3. Consistency model
====================

Functions are there for a reason. They take some input parameters, acquire or
release locks, read, process, and even write some data in a defined way, and
have return values. In other words, each function has defined semantics.

Many fixes do not change the semantics of the modified functions. For
example, they add a NULL pointer or a boundary check, fix a race by adding
a missing memory barrier, or add some locking around a critical section.
Most of these changes are self-contained and the function presents itself
the same way to the rest of the system. In this case, the functions might
be updated independently one by one.

But there are more complex fixes. For example, a patch might change
the ordering of locking in multiple functions at the same time. Or a patch
might exchange the meaning of some temporary structures and update
all the relevant functions. In this case, the affected unit
(thread, whole kernel) needs to start using all the new versions of
the functions at the same time. Also, the switch must happen only
when it is safe to do so, e.g. when the affected locks are released
or no data are stored in the modified structures at the moment.

The theory about how to apply such changes in a safe way is rather complex.
The aim is to define a so-called consistency model. It attempts to define
the conditions under which the new implementation could be used so that
the system stays consistent.

Livepatch has a consistency model which is a hybrid of kGraft and
kpatch: it uses kGraft's per-task consistency and syscall barrier
switching combined with kpatch's stack trace switching. There are also
a number of fallback options which make it quite flexible.

Patches are applied on a per-task basis, when the task is deemed safe to
switch over. When a patch is enabled, livepatch enters into a
transition state where tasks are converging to the patched state.
Usually this transition completes in a few seconds. The same
sequence occurs when a patch is disabled, except the tasks converge from
the patched state to the unpatched state.

An interrupt handler inherits the patched state of the task it
interrupts. The same is true for forked tasks: the child inherits the
patched state of the parent.

Livepatch uses several complementary approaches to determine when it's
safe to patch tasks:

1. The first and most effective approach is stack checking of sleeping
   tasks. If no affected functions are on the stack of a given task,
   the task is patched. In most cases this will patch most or all of
   the tasks on the first try. Otherwise it will keep trying
   periodically.
   This option is only available if the architecture has
   reliable stacks (HAVE_RELIABLE_STACKTRACE).

2. The second approach, if needed, is kernel exit switching. A
   task is switched when it returns to user space from a system call, a
   user space IRQ, or a signal. It's useful in the following cases:

   a) Patching I/O-bound user tasks which are sleeping on an affected
      function. In this case you have to send SIGSTOP and SIGCONT to
      force them to exit the kernel and be patched.
   b) Patching CPU-bound user tasks. If the task is highly CPU-bound
      then it will get patched the next time it gets interrupted by an
      IRQ.

3. Idle "swapper" tasks don't ever exit the kernel, so they instead
   have a klp_update_patch_state() call in the idle loop which
   allows them to be patched before the CPU enters the idle state.

   (Note there's not yet such an approach for kthreads.)

Architectures which don't have HAVE_RELIABLE_STACKTRACE rely solely on
the second approach. It's highly likely that some tasks will still be
running with an old version of a function until that function
returns. In this case you would have to signal the tasks. This
especially applies to kthreads. They may not be woken up and would need
to be forced. See below for more information.

Unless we can come up with another way to patch kthreads, architectures
without HAVE_RELIABLE_STACKTRACE are not considered fully supported by
kernel livepatching.

The /sys/kernel/livepatch/<patch>/transition file shows whether a patch
is in transition. Only a single patch can be in transition at a given
time. A patch can remain in transition indefinitely if any of the tasks
are stuck in the initial patch state.

A transition can be reversed and effectively canceled by writing the
opposite value to the /sys/kernel/livepatch/<patch>/enabled file while
the transition is in progress. Then all the tasks will attempt to
converge back to the original patch state.

There's also a /proc/<pid>/patch_state file which can be used to
determine which tasks are blocking completion of a patching operation.
If a patch is in transition, this file shows 0 to indicate the task is
unpatched and 1 to indicate it's patched. Otherwise, if no patch is in
transition, it shows -1. Any tasks which are blocking the transition
can be signaled with SIGSTOP and SIGCONT to force them to change their
patched state. This may be harmful to the system though. Sending a fake signal
to all remaining blocking tasks is a better alternative.
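
For illustration only, the following user-space sketch finds tasks that
block an enable transition. It is not part of the kernel sources and relies
purely on the patch_state semantics described above (0 means the task is
still unpatched)::

  #include <ctype.h>
  #include <dirent.h>
  #include <stdio.h>

  int main(void)
  {
          DIR *proc = opendir("/proc");
          struct dirent *de;
          char path[64];
          int state;

          if (!proc)
                  return 1;

          while ((de = readdir(proc)) != NULL) {
                  FILE *f;

                  if (!isdigit((unsigned char)de->d_name[0]))
                          continue;       /* not a <pid> entry */

                  snprintf(path, sizeof(path), "/proc/%s/patch_state",
                           de->d_name);
                  f = fopen(path, "r");
                  if (!f)
                          continue;
                  /* 0 = unpatched, i.e. blocking an enable transition */
                  if (fscanf(f, "%d", &state) == 1 && state == 0)
                          printf("task %s is still unpatched\n", de->d_name);
                  fclose(f);
          }
          closedir(proc);
          return 0;
  }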
No proper signal is actually delivered (there is no data in signal pending
structures). Tasks are interrupted or woken up, and forced to change their
patched state. The fake signal is automatically sent every 15 seconds.

The administrator can also affect a transition through the
/sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears the
TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to the patched
state. Important note! The force attribute is intended for cases when the
transition gets stuck for a long time because of a blocking task. The
administrator is expected to collect all necessary data (namely stack traces
of such blocking tasks) and request a clearance from a patch distributor to
force the transition. Unauthorized usage may cause harm to the system. It
depends on the nature of the patch, which functions are (un)patched, and
which functions the blocking tasks are sleeping in (/proc/<pid>/stack may
help here). Removal (rmmod) of patch modules is permanently disabled when
the force feature is used. It cannot be guaranteed that there is no task
sleeping in such a module. This implies an unbounded reference count if a
patch module is disabled and enabled in a loop.

Moreover, the usage of force may also affect future applications of live
patches and cause even more harm to the system. The administrator should
first consider simply cancelling a transition (see above). If force is used,
a reboot should be planned and no more live patches applied.

3.1 Adding consistency model support to new architectures
---------------------------------------------------------

For adding consistency model support to new architectures, there are a
few options:

1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and
   for non-DWARF unwinders, also making sure there's a way for the stack
   tracing code to detect interrupts on the stack.

2) Alternatively, ensure that every kthread has a call to
   klp_update_patch_state() in a safe location. Kthreads are typically
   in an infinite loop which does some action repeatedly. The safe
   location to switch the kthread's patch state would be at a designated
   point in the loop where there are no locks taken and all data
   structures are in a well-defined state.

   The location is clear when using workqueues or the kthread worker
   API. These kthreads process independent actions in a generic loop.

   It's much more complicated with kthreads which have a custom loop.
   There the safe location must be carefully selected on a case-by-case
   basis; a sketch follows after this list.

   In that case, arches without HAVE_RELIABLE_STACKTRACE would still be
   able to use the non-stack-checking parts of the consistency model:

   a) patching user tasks when they cross the kernel/user space
      boundary; and

   b) patching kthreads and idle tasks at their designated patch points.

   This option isn't as good as option 1 because it requires signaling
   user tasks and waking kthreads to patch them. But it could still be
   a good backup option for those architectures which don't have
   reliable stack traces yet.
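
A minimal sketch of such a designated patch point in a kthread with a
custom loop follows. The kthread function and its helper are hypothetical;
klp_update_patch_state() is the only real livepatch call::

  #include <linux/kthread.h>
  #include <linux/livepatch.h>
  #include <linux/sched.h>

  /* Hypothetical unit of work; stands in for the kthread's real job. */
  static void do_one_unit_of_work(void *data)
  {
  }

  /* A kthread with a custom loop. The designated patch point is placed
   * where no locks are held and all data structures are in a
   * well-defined state.
   */
  static int hypothetical_kthread_fn(void *data)
  {
          while (!kthread_should_stop()) {
                  do_one_unit_of_work(data);

                  /* Safe location: let the consistency model switch
                   * this kthread to the target patch state.
                   */
                  klp_update_patch_state(current);

                  schedule_timeout_interruptible(HZ);
          }
          return 0;
  }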


4. Livepatch module
===================

Livepatches are distributed using kernel modules, see
samples/livepatch/livepatch-sample.c.

The module includes a new implementation of the functions that we want
to replace. In addition, it defines some structures describing the
relation between the original and the new implementation. Then there
is code that makes the kernel start using the new code when the livepatch
module is loaded. Also, there is code that cleans up before the
livepatch module is removed. All this is explained in more detail in
the next sections.


4.1. New functions
------------------

New versions of functions are typically just copied from the original
sources. A good practice is to add a prefix to their names so that they
can be distinguished from the original ones, e.g. in a backtrace. Also,
they can be declared as static because they are not called directly
and do not need global visibility.

The patch contains only functions that are really modified. But they
might need to access functions or data from the original source file
that may only be locally accessible. This can be solved by a special
relocation section in the generated livepatch module, see
Documentation/livepatch/module-elf-format.rst for more details.
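
For example, samples/livepatch/livepatch-sample.c provides a prefixed,
static copy of cmdline_proc_show() which prints a fixed message instead of
the real kernel command line::

  #include <linux/seq_file.h>

  /* New implementation; note the livepatch_ prefix and static linkage. */
  static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
  {
          seq_printf(m, "%s\n", "this has been live patched");
          return 0;
  }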


4.2. Metadata
-------------

The patch is described by several structures that split the information
into three levels (a complete example follows this list):

 - struct klp_func is defined for each patched function. It describes
   the relation between the original and the new implementation of a
   particular function.

   The structure includes the name, as a string, of the original function.
   The function address is found via kallsyms at runtime.

   Then it includes the address of the new function. It is defined
   directly by assigning the function pointer. Note that the new
   function is typically defined in the same source file.

   As an optional parameter, the symbol position in the kallsyms database can
   be used to disambiguate functions of the same name. This is not the
   absolute position in the database, but rather the order in which the
   symbol is found within a particular object (vmlinux or a kernel module).
   Note that kallsyms allows for searching symbols according to the object
   name.

 - struct klp_object defines an array of patched functions (struct
   klp_func) in the same object, where the object is either vmlinux
   (NULL) or a module name.

   The structure helps to group and handle the functions for each object
   together. Note that patched modules might be loaded later than
   the patch itself and the relevant functions might be patched
   only when they are available.

 - struct klp_patch defines an array of patched objects (struct
   klp_object).

   This structure handles all the patched functions consistently and,
   eventually, synchronously. The whole patch is applied only when all the
   patched symbols are found. The only exceptions are symbols from objects
   (kernel modules) that have not been loaded yet.

   For more details on how the patch is applied on a per-task basis,
   see the "Consistency model" section.
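
Putting the three levels together, the metadata in
samples/livepatch/livepatch-sample.c describes a single patched function
in vmlinux::

  #include <linux/module.h>
  #include <linux/livepatch.h>

  static struct klp_func funcs[] = {
          {
                  .old_name = "cmdline_proc_show",
                  .new_func = livepatch_cmdline_proc_show,
          }, { }
  };

  static struct klp_object objs[] = {
          {
                  /* name being NULL means vmlinux */
                  .funcs = funcs,
          }, { }
  };

  static struct klp_patch patch = {
          .mod = THIS_MODULE,
          .objs = objs,
  };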


5. Livepatch life-cycle
=======================

Livepatching can be described by five basic operations:
loading, enabling, replacing, disabling, removing.

The replacing and the disabling operations are mutually
exclusive. They have the same result for the given patch but
not for the system.


5.1. Loading
------------

The only reasonable way is to enable the patch when the livepatch kernel
module is being loaded. For this, klp_enable_patch() has to be called
in the module_init() callback. There are two main reasons:

First, only the module has easy access to the related struct klp_patch.

Second, the error code might be used to refuse loading the module when
the patch cannot be enabled.
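
In samples/livepatch/livepatch-sample.c, this boils down to::

  static int livepatch_init(void)
  {
          /* Refuse to load the module when the patch cannot be enabled. */
          return klp_enable_patch(&patch);
  }

  static void livepatch_exit(void)
  {
  }

  module_init(livepatch_init);
  module_exit(livepatch_exit);
  MODULE_LICENSE("GPL");
  MODULE_INFO(livepatch, "Y");

The MODULE_INFO(livepatch, "Y") annotation marks the module as a livepatch
so that the kernel handles it accordingly.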


5.2. Enabling
-------------

The livepatch gets enabled by calling klp_enable_patch() from
the module_init() callback. The system will start using the new
implementation of the patched functions at this stage.

First, the addresses of the patched functions are found according to their
names. The special relocations, mentioned in the section "New functions",
are applied. The relevant entries are created under
/sys/kernel/livepatch/<name>. The patch is rejected when any of the above
operations fails.

Second, livepatch enters into a transition state where tasks are converging
to the patched state. If an original function is patched for the first
time, a function-specific struct klp_ops is created and a universal
ftrace handler is registered\ [#]_. This stage is indicated by a value of '1'
in /sys/kernel/livepatch/<name>/transition. For more information about
this process, see the "Consistency model" section.

Finally, once all tasks have been patched, the 'transition' value changes
to '0'.

.. [#]

   Note that functions might be patched multiple times. The ftrace handler
   is registered only once for a given function. Further patches just add
   an entry to the list (see field `func_stack`) of the struct klp_ops.
   The right implementation is selected by the ftrace handler, see
   the "Consistency model" section.

   That said, it is highly recommended to use cumulative livepatches
   because they help keep the consistency of all changes. In this case,
   functions might be patched two times only during the transition period.


5.3. Replacing
--------------

All enabled patches might get replaced by a cumulative patch that
has the .replace flag set (see the example below).
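
Building on the metadata example above, a cumulative replacement patch
differs only in the .replace flag of struct klp_patch::

  static struct klp_patch patch = {
          .mod = THIS_MODULE,
          .objs = objs,
          /* Atomically replace all already installed livepatches. */
          .replace = true,
  };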

Once the new patch is enabled and the 'transition' finishes, all
the functions (struct klp_func) associated with the replaced
patches are removed from the corresponding struct klp_ops. Also,
the ftrace handler is unregistered and the struct klp_ops is
freed when the related function is not modified by the new patch
and the func_stack list becomes empty.

See Documentation/livepatch/cumulative-patches.rst for more details.


5.4. Disabling
--------------

Enabled patches might get disabled by writing '0' to
/sys/kernel/livepatch/<name>/enabled.

First, livepatch enters into a transition state where tasks are converging
to the unpatched state. The system starts using either the code from
the previously enabled patch or even the original one. This stage is
indicated by a value of '1' in /sys/kernel/livepatch/<name>/transition.
For more information about this process, see the "Consistency model"
section.

Second, once all tasks have been unpatched, the 'transition' value changes
to '0'. All the functions (struct klp_func) associated with the to-be-disabled
patch are removed from the corresponding struct klp_ops. The ftrace handler
is unregistered and the struct klp_ops is freed when the func_stack list
becomes empty.

Third, the sysfs interface is destroyed.


5.5. Removing
-------------

Module removal is only safe when there are no users of the functions provided
by the module. This is the reason why the force feature permanently
disables removal. Only when the system is successfully transitioned
to a new patch state (patched/unpatched) without being forced is it
guaranteed that no task sleeps or runs in the old code.


6. Sysfs
========

Information about the registered patches can be found under
/sys/kernel/livepatch. The patches can be enabled and disabled
by writing there.

The /sys/kernel/livepatch/<patch>/force attribute allows the administrator
to affect a patching operation.

See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.


7. Limitations
==============

The current Livepatch implementation has several limitations:

 - Only functions that can be traced can be patched.

   Livepatch is based on dynamic ftrace. In particular, functions
   implementing ftrace or the livepatch ftrace handler cannot be
   patched. Otherwise, the code would end up in an infinite loop. A
   potential mistake is prevented by marking the problematic functions
   with "notrace".


 - Livepatch works reliably only when the dynamic ftrace location is at
   the very beginning of the function.

   The function needs to be redirected before the stack or the function
   parameters are modified in any way. For example, livepatch requires
   using the -mfentry gcc option on x86_64.

   One exception is the PPC port. It uses relative addressing and TOC.
   Each function has to handle TOC and save LR before it can call
   the ftrace handler. This operation has to be reverted on return.
   Fortunately, the generic ftrace code has the same problem and all
   this is handled on the ftrace level.


 - Kretprobes using the ftrace framework conflict with the patched
   functions.

   Both kretprobes and livepatches use an ftrace handler that modifies
   the return address. The first user wins. Either the probe or the patch
   is rejected when the handler is already in use by the other.


 - Kprobes in the original function are ignored when the code is
   redirected to the new implementation.

   There is work in progress to add warnings about this situation.