.. _rcu_dereference_doc:

PROPER CARE AND FEEDING OF RETURN VALUES FROM rcu_dereference()
===============================================================

Most of the time, you can use values from rcu_dereference() or one of
the similar primitives without worries. Dereferencing (prefix "*"),
field selection ("->"), assignment ("="), address-of ("&"), addition and
subtraction of constants, and casts all work quite naturally and safely.

It is nevertheless possible to get into trouble with other operations.
Follow these rules to keep your RCU code working properly:

- You must use one of the rcu_dereference() family of primitives
  to load an RCU-protected pointer, otherwise CONFIG_PROVE_RCU
  will complain. Worse yet, your code can see random memory-corruption
  bugs due to games that compilers and DEC Alpha can play.
  Without one of the rcu_dereference() primitives, compilers
  can reload the value, and won't your code have fun with two
  different values for a single pointer! Without rcu_dereference(),
  DEC Alpha can load a pointer, dereference that pointer, and
  return pre-initialization data, even though the initialization
  preceded the store of the pointer.

  In addition, the volatile cast in rcu_dereference() prevents the
  compiler from deducing the resulting pointer value. Please see
  the section entitled "EXAMPLE WHERE THE COMPILER KNOWS TOO MUCH"
  for an example where the compiler can in fact deduce the exact
  value of the pointer, and thus cause misordering.

- You are only permitted to use rcu_dereference() on pointer values.
  The compiler simply knows too much about integral values to
  trust it to carry dependencies through integer operations.
  There are very few exceptions, namely that you can temporarily
  cast the pointer to uintptr_t in order to:

  - Set bits and clear bits down in the must-be-zero low-order
    bits of that pointer. This clearly means that the pointer
    must have alignment constraints; for example, this does
    -not- work in general for char* pointers.

  - XOR bits to translate pointers, as is done in some
    classic buddy-allocator algorithms.

  It is important to cast the value back to pointer before
  doing much of anything else with it.

- Avoid cancellation when using the "+" and "-" infix arithmetic
  operators. For example, for a given variable "x", avoid
  "(x-(uintptr_t)x)" for char* pointers. The compiler is within its
  rights to substitute zero for this sort of expression, so that
  subsequent accesses no longer depend on the rcu_dereference(),
  again possibly resulting in bugs due to misordering.

  Of course, if "p" is a pointer from rcu_dereference(), and "a"
  and "b" are integers that happen to be equal, the expression
  "p+a-b" is safe because its value still necessarily depends on
  the rcu_dereference(), thus maintaining proper ordering.

- If you are using RCU to protect JITed functions, so that the
  "()" function-invocation operator is applied to a value obtained
  (directly or indirectly) from rcu_dereference(), you may need to
  interact directly with the hardware to flush instruction caches.
  This issue arises on some systems when a newly JITed function is
  using the same memory that was used by an earlier JITed function.

- Do not use the results from relational operators ("==", "!=",
  ">", ">=", "<", or "<=") when dereferencing. For example,
  the following (quite strange) code is buggy::

      int *p;
      int *q;

      ...

      p = rcu_dereference(gp);
      q = &global_q;
      q += p > &oom_p;
      r1 = *q;  /* BUGGY!!! */

  The reason this is buggy is that relational operators are often
  compiled using branches. Although weak-memory machines such as
  ARM or PowerPC do order stores after such branches, they can
  speculate loads, which can result in misordering bugs.

- Be very careful about comparing pointers obtained from
  rcu_dereference() against non-NULL values. As Linus Torvalds
  explained, if the two pointers are equal, the compiler could
  substitute the pointer you are comparing against for the pointer
  obtained from rcu_dereference(). For example::

      p = rcu_dereference(gp);
      if (p == &default_struct)
          do_default(p->a);

  Because the compiler now knows that the value of "p" is exactly
  the address of the variable "default_struct", it is free to
  transform this code into the following::

      p = rcu_dereference(gp);
      if (p == &default_struct)
          do_default(default_struct.a);

  On ARM and Power hardware, the load from "default_struct.a"
  can now be speculated, such that it might happen before the
  rcu_dereference(). This could result in bugs due to misordering.

  However, comparisons are OK in the following cases:

  - The comparison was against the NULL pointer. If the
    compiler knows that the pointer is NULL, you had better
    not be dereferencing it anyway. If the comparison is
    non-equal, the compiler is none the wiser. Therefore,
    it is safe to compare pointers from rcu_dereference()
    against NULL pointers.

  - The pointer is never dereferenced after being compared.
    Since there are no subsequent dereferences, the compiler
    cannot use anything it learned from the comparison
    to reorder the non-existent subsequent dereferences.
    This sort of comparison occurs frequently when scanning
    RCU-protected circular linked lists.

    Note that if checks for being within an RCU read-side
    critical section are not required and the pointer is never
    dereferenced, rcu_access_pointer() should be used in place
    of rcu_dereference().

  - The comparison is against a pointer that references memory
    that was initialized "a long time ago." The reason
    this is safe is that even if misordering occurs, the
    misordering will not affect the accesses that follow
    the comparison. So exactly how long ago is "a long
    time ago"? Here are some possibilities:

    - Compile time.

    - Boot time.

    - Module-init time for module code.

    - Prior to kthread creation for kthread code.

    - During some prior acquisition of the lock that
      we now hold.

    - Before mod_timer() time for a timer handler.

    There are many other possibilities involving the Linux
    kernel's wide array of primitives that cause code to
    be invoked at a later time.

  - The pointer being compared against also came from
    rcu_dereference(). In this case, both pointers depend
    on one rcu_dereference() or another, so you get proper
    ordering either way.

    That said, this situation can make certain RCU usage
    bugs more likely to happen. Which can be a good thing,
    at least if they happen during testing. An example
    of such an RCU usage bug is shown in the section titled
    "EXAMPLE OF AMPLIFIED RCU-USAGE BUG".

  - All of the accesses following the comparison are stores,
    so that a control dependency preserves the needed ordering.
    That said, it is easy to get control dependencies wrong.
    Please see the "CONTROL DEPENDENCIES" section of
    Documentation/memory-barriers.txt for more details.

  - The pointers are not equal -and- the compiler does
    not have enough information to deduce the value of the
    pointer. Note that the volatile cast in rcu_dereference()
    will normally prevent the compiler from knowing too much.

    However, please note that if the compiler knows that the
    pointer takes on only one of two values, a not-equal
    comparison will provide exactly the information that the
    compiler needs to deduce the value of the pointer.

- Disable any value-speculation optimizations that your compiler
  might provide, especially if you are making use of feedback-based
  optimizations that take data collected from prior runs. Such
  value-speculation optimizations reorder operations by design.

  There is one exception to this rule: Value-speculation
  optimizations that leverage the branch-prediction hardware are
  safe on strongly ordered systems (such as x86), but not on weakly
  ordered systems (such as ARM or Power). Choose your compiler
  command-line options wisely!


EXAMPLE OF AMPLIFIED RCU-USAGE BUG
----------------------------------

Because updaters can run concurrently with RCU readers, RCU readers can
see stale and/or inconsistent values. If RCU readers need fresh or
consistent values, which they sometimes do, they need to take proper
precautions. To see this, consider the following code fragment::

    struct foo {
        int a;
        int b;
        int c;
    };
    struct foo *gp1;
    struct foo *gp2;

    void updater(void)
    {
        struct foo *p;

        p = kmalloc(...);
        if (p == NULL)
            deal_with_it();
        p->a = 42;  /* Each field in its own cache line. */
        p->b = 43;
        p->c = 44;
        rcu_assign_pointer(gp1, p);
        p->b = 143;
        p->c = 144;
        rcu_assign_pointer(gp2, p);
    }

    void reader(void)
    {
        struct foo *p;
        struct foo *q;
        int r1, r2;

        p = rcu_dereference(gp2);
        if (p == NULL)
            return;
        r1 = p->b;  /* Guaranteed to get 143. */
        q = rcu_dereference(gp1);  /* Guaranteed non-NULL. */
        if (p == q) {
            /* The compiler decides that q->c is same as p->c. */
            r2 = p->c;  /* Could get 44 on weakly ordered system. */
        }
        do_something_with(r1, r2);
    }

You might be surprised that the outcome (r1 == 143 && r2 == 44) is possible,
but you should not be. After all, the updater might have been invoked
a second time between the time reader() loaded into "r1" and the time
that it loaded into "r2". The fact that this same result can occur due
to some reordering from the compiler and CPUs is beside the point.

But suppose that the reader needs a consistent view?

Then one approach is to use locking, for example, as follows::

    struct foo {
        int a;
        int b;
        int c;
        spinlock_t lock;
    };
    struct foo *gp1;
    struct foo *gp2;

    void updater(void)
    {
        struct foo *p;

        p = kmalloc(...);
        if (p == NULL)
            deal_with_it();
        spin_lock(&p->lock);
        p->a = 42;  /* Each field in its own cache line. */
        p->b = 43;
        p->c = 44;
        spin_unlock(&p->lock);
        rcu_assign_pointer(gp1, p);
        spin_lock(&p->lock);
        p->b = 143;
        p->c = 144;
        spin_unlock(&p->lock);
        rcu_assign_pointer(gp2, p);
    }

    void reader(void)
    {
        struct foo *p;
        struct foo *q;
        int r1, r2;

        p = rcu_dereference(gp2);
        if (p == NULL)
            return;
        spin_lock(&p->lock);
        r1 = p->b;  /* Guaranteed to get 143. */
        q = rcu_dereference(gp1);  /* Guaranteed non-NULL. */
        if (p == q) {
            /* The compiler decides that q->c is same as p->c. */
            r2 = p->c;  /* Locking guarantees r2 == 144. */
        }
        spin_unlock(&p->lock);
        do_something_with(r1, r2);
    }

As always, use the right tool for the job!


EXAMPLE WHERE THE COMPILER KNOWS TOO MUCH
-----------------------------------------

If a pointer obtained from rcu_dereference() compares not-equal to some
other pointer, the compiler normally has no clue what the value of the
first pointer might be. This lack of knowledge prevents the compiler
from carrying out optimizations that otherwise might destroy the ordering
guarantees that RCU depends on. And the volatile cast in rcu_dereference()
should prevent the compiler from guessing the value.

But without rcu_dereference(), the compiler knows more than you might
expect. Consider the following code fragment::

    struct foo {
        int a;
        int b;
    };
    static struct foo variable1;
    static struct foo variable2;
    static struct foo *gp = &variable1;

    void updater(void)
    {
        initialize_foo(&variable2);
        rcu_assign_pointer(gp, &variable2);
        /*
         * The above is the only store to gp in this translation unit,
         * and the address of gp is not exported in any way.
         */
    }

    int reader(void)
    {
        struct foo *p;

        p = gp;
        barrier();
        if (p == &variable1)
            return p->a;  /* Must be variable1.a. */
        else
            return p->b;  /* Must be variable2.b. */
    }

Because the compiler can see all stores to "gp", it knows that the only
possible values of "gp" are "variable1" on the one hand and "variable2"
on the other. The comparison in reader() therefore tells the compiler
the exact value of "p" even in the not-equals case. This allows the
compiler to make the return values independent of the load from "gp",
in turn destroying the ordering between this load and the loads of the
return values. This can result in "p->b" returning pre-initialization
garbage values.

In short, rcu_dereference() is -not- optional when you are going to
dereference the resulting pointer.


WHICH MEMBER OF THE rcu_dereference() FAMILY SHOULD YOU USE?
------------------------------------------------------------

First, please avoid using rcu_dereference_raw() and also please avoid
using rcu_dereference_check() and rcu_dereference_protected() with a
second argument with a constant value of 1 (or true, for that matter).
With that caution out of the way, here is some guidance for which
member of the rcu_dereference() family to use in various situations:

1. If the access needs to be within an RCU read-side critical
   section, use rcu_dereference(). With the new consolidated
   RCU flavors, an RCU read-side critical section is entered
   using rcu_read_lock(), anything that disables bottom halves,
   anything that disables interrupts, or anything that disables
   preemption.

2. If the access might be within an RCU read-side critical section
   on the one hand, or protected by (say) my_lock on the other,
   use rcu_dereference_check(), for example::

       p1 = rcu_dereference_check(p->rcu_protected_pointer,
                                  lockdep_is_held(&my_lock));

3. If the access might be within an RCU read-side critical section
   on the one hand, or protected by either my_lock or your_lock on
   the other, again use rcu_dereference_check(), for example::

       p1 = rcu_dereference_check(p->rcu_protected_pointer,
                                  lockdep_is_held(&my_lock) ||
                                  lockdep_is_held(&your_lock));

4. If the access is on the update side, so that it is always protected
   by my_lock, use rcu_dereference_protected()::

       p1 = rcu_dereference_protected(p->rcu_protected_pointer,
                                      lockdep_is_held(&my_lock));

   This can be extended to handle multiple locks as in #3 above,
   and both can be extended to check other conditions as well.

5. If the protection is supplied by the caller, and is thus unknown
   to this code, that is the rare case when rcu_dereference_raw()
   is appropriate. In addition, rcu_dereference_raw() might be
   appropriate when the lockdep expression would be excessively
   complex, except that a better approach in that case might be to
   take a long hard look at your synchronization design. Still,
   there are data-locking cases where any one of a very large number
   of locks or reference counters suffices to protect the pointer,
   so rcu_dereference_raw() does have its place.

   However, its place is probably quite a bit smaller than one
   might expect given the number of uses in the current kernel.
   Ditto for its synonym, rcu_dereference_check( ... , 1), and
   its close relative, rcu_dereference_protected(... , 1).
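
The difference between cases 2 and 4 above can be illustrated with a small
userspace model. This is only a sketch and not kernel code: every model_*
name below is hypothetical, a pair of plain variables stands in for the
lockdep and RCU reader state, and a counter stands in for a lockdep
complaint. It assumes, as the cases above suggest, that the "check" variant
accepts either a read-side critical section or the supplied condition, while
the "protected" variant accepts only the supplied condition.

```c
#include <stdbool.h>
#include <stddef.h>

struct foo {
	int a;
};

/* Hypothetical stand-ins for lockdep and RCU reader state. */
static bool model_lock_held;	/* plays the role of lockdep_is_held(&my_lock) */
static int model_reader_depth;	/* > 0 models "inside rcu_read_lock()" */
static int model_violations;	/* counts what would be a lockdep complaint */

static struct foo *model_gp;

/* Models case 2: legal inside a reader OR when the condition holds. */
static struct foo *model_dereference_check(struct foo **pp, bool cond)
{
	if (!(cond || model_reader_depth > 0))
		model_violations++;
	/* Volatile load, modelling the volatile cast mentioned in the text. */
	return *(struct foo * volatile *)pp;
}

/* Models case 4: only the update-side condition counts. */
static struct foo *model_dereference_protected(struct foo **pp, bool cond)
{
	if (!cond)
		model_violations++;
	/* Plain load in this model: the lock is assumed to exclude updaters. */
	return *pp;
}

/* Walks through three situations and returns the number of complaints. */
int model_demo(void)
{
	static struct foo f = { .a = 42 };

	model_gp = &f;
	model_violations = 0;

	/* Reader without the lock: check is fine, protected complains. */
	model_reader_depth = 1;
	model_dereference_check(&model_gp, model_lock_held);
	model_dereference_protected(&model_gp, model_lock_held);
	model_reader_depth = 0;

	/* Updater holding the lock: both variants are fine. */
	model_lock_held = true;
	model_dereference_check(&model_gp, model_lock_held);
	model_dereference_protected(&model_gp, model_lock_held);
	model_lock_held = false;

	/* Neither reader nor lock holder: check complains. */
	model_dereference_check(&model_gp, model_lock_held);

	return model_violations;
}
```

In this model, passing a constant 1 as the condition would silence every
complaint, which is why the text above asks you to avoid doing so.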


SPARSE CHECKING OF RCU-PROTECTED POINTERS
-----------------------------------------

The sparse static-analysis tool checks for direct access to RCU-protected
pointers, which can result in "interesting" bugs due to compiler
optimizations involving invented loads and perhaps also load tearing.
For example, suppose someone mistakenly does something like this::

    p = q->rcu_protected_pointer;
    do_something_with(p->a);
    do_something_else_with(p->b);

If register pressure is high, the compiler might optimize "p" out
of existence, transforming the code to something like this::

    do_something_with(q->rcu_protected_pointer->a);
    do_something_else_with(q->rcu_protected_pointer->b);

This could fatally disappoint your code if q->rcu_protected_pointer
changed in the meantime. Nor is this a theoretical problem: Exactly
this sort of bug cost Paul E. McKenney (and several of his innocent
colleagues) a three-day weekend back in the early 1990s.

Load tearing could of course result in dereferencing a mashup of a pair
of pointers, which also might fatally disappoint your code.

These problems could have been avoided simply by making the code instead
read as follows::

    p = rcu_dereference(q->rcu_protected_pointer);
    do_something_with(p->a);
    do_something_else_with(p->b);

Unfortunately, these sorts of bugs can be extremely hard to spot during
review. This is where the sparse tool comes into play, along with the
"__rcu" marker. If you mark a pointer declaration, whether in a structure
or as a formal parameter, with "__rcu", sparse will complain if
this pointer is accessed directly. It will also cause sparse to complain
if a pointer not marked with "__rcu" is accessed using rcu_dereference()
and friends. For example, ->rcu_protected_pointer might be declared as
follows::

    struct foo __rcu *rcu_protected_pointer;

Use of "__rcu" is opt-in. If you choose not to use it, then you should
ignore the sparse warnings.
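
As a rough illustration of how an "__rcu"-marked declaration and its
accessor fit together, here is a userspace sketch. It makes several
assumptions that are not part of the kernel: "__rcu" is defined to nothing
(as a normal, non-sparse build would need), "struct bar" and sum_fields()
are invented for the example, and model_rcu_dereference() is a hypothetical
stand-in that performs only a single volatile load. The real
rcu_dereference() also provides the lockdep checking and ordering described
earlier, so this stand-in is strictly weaker.

```c
#include <stddef.h>

/* Assumption: outside of sparse, the marker expands to nothing. */
#ifndef __rcu
#define __rcu
#endif

struct foo {
	int a;
	int b;
};

/* Hypothetical enclosing structure holding the marked pointer. */
struct bar {
	struct foo __rcu *rcu_protected_pointer;
};

/*
 * Hypothetical stand-in for rcu_dereference(): exactly one volatile
 * load of the pointer, so the compiler cannot reload it later.
 */
#define model_rcu_dereference(p) (*(__typeof__(p) volatile *)&(p))

/* Reads both fields through one snapshot of the pointer. */
int sum_fields(struct bar *q)
{
	struct foo *p = model_rcu_dereference(q->rcu_protected_pointer);

	if (!p)
		return -1;
	return p->a + p->b;
}
```

Because "p" is loaded exactly once, both fields come from the same structure
even if q->rcu_protected_pointer is changed concurrently, which is the
property the buggy direct-access version above lacks.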