.. _rcu_dereference_doc:

PROPER CARE AND FEEDING OF RETURN VALUES FROM rcu_dereference()
===============================================================

Most of the time, you can use values from rcu_dereference() or one of
the similar primitives without worries.  Dereferencing (prefix "*"),
field selection ("->"), assignment ("="), address-of ("&"), addition and
subtraction of constants, and casts all work quite naturally and safely.

It is nevertheless possible to get into trouble with other operations.
Follow these rules to keep your RCU code working properly:
 
-	You must use one of the rcu_dereference() family of primitives
	to load an RCU-protected pointer, otherwise CONFIG_PROVE_RCU
	will complain.  Worse yet, your code can see random memory-corruption
	bugs due to games that compilers and DEC Alpha can play.
	Without one of the rcu_dereference() primitives, compilers
	can reload the value, and won't your code have fun with two
	different values for a single pointer!  Without rcu_dereference(),
	DEC Alpha can load a pointer, dereference that pointer, and
	return pre-initialization data, even though the initialization
	preceded the store of the pointer.

	In addition, the volatile cast in rcu_dereference() prevents the
	compiler from deducing the resulting pointer value.  Please see
	the section entitled "EXAMPLE WHERE THE COMPILER KNOWS TOO MUCH"
	for an example where the compiler can in fact deduce the exact
	value of the pointer, and thus cause misordering.
 
-	You are only permitted to use rcu_dereference() on pointer values.
	The compiler simply knows too much about integral values to
	trust it to carry dependencies through integer operations.
	There are a very few exceptions, namely that you can temporarily
	cast the pointer to uintptr_t in order to:

	-	Set bits and clear bits down in the must-be-zero low-order
		bits of that pointer.  This clearly means that the pointer
		must have alignment constraints, for example, this does
		-not- work in general for char* pointers.

	-	XOR bits to translate pointers, as is done in some
		classic buddy-allocator algorithms.

	It is important to cast the value back to a pointer before
	doing much of anything else with it.
 
-	Avoid cancellation when using the "+" and "-" infix arithmetic
	operators.  For example, for a given variable "x", avoid
	"(x-(uintptr_t)x)" for char* pointers.  The compiler is within its
	rights to substitute zero for this sort of expression, so that
	subsequent accesses no longer depend on the rcu_dereference(),
	again possibly resulting in bugs due to misordering.

	Of course, if "p" is a pointer from rcu_dereference(), and "a"
	and "b" are integers that happen to be equal, the expression
	"p+a-b" is safe because its value still necessarily depends on
	the rcu_dereference(), thus maintaining proper ordering.
 
-	If you are using RCU to protect JITed functions, so that the
	"()" function-invocation operator is applied to a value obtained
	(directly or indirectly) from rcu_dereference(), you may need to
	interact directly with the hardware to flush instruction caches.
	This issue arises on some systems when a newly JITed function is
	using the same memory that was used by an earlier JITed function.
 
-	Do not use the results from relational operators ("==", "!=",
	">", ">=", "<", or "<=") when dereferencing.  For example,
	the following (quite strange) code is buggy::

		int *p;
		int *q;

		...

		p = rcu_dereference(gp);
		q = &global_q;
		q += p > &oom_p;
		r1 = *q;  /* BUGGY!!! */

	The reason this is buggy is that relational operators are often
	compiled using branches.  Although weak-memory machines such as
	ARM or PowerPC do order stores after such branches, they can
	speculate loads, which can result in misordering bugs.
 
-	Be very careful about comparing pointers obtained from
	rcu_dereference() against non-NULL values.  As Linus Torvalds
	explained, if the two pointers are equal, the compiler could
	substitute the pointer you are comparing against for the pointer
	obtained from rcu_dereference().  For example::

		p = rcu_dereference(gp);
		if (p == &default_struct)
			do_default(p->a);

	Because the compiler now knows that the value of "p" is exactly
	the address of the variable "default_struct", it is free to
	transform this code into the following::

		p = rcu_dereference(gp);
		if (p == &default_struct)
			do_default(default_struct.a);

	On ARM and Power hardware, the load from "default_struct.a"
	can now be speculated, such that it might happen before the
	rcu_dereference().  This could result in bugs due to misordering.
 
	However, comparisons are OK in the following cases:

	-	The comparison was against the NULL pointer.  If the
		compiler knows that the pointer is NULL, you had better
		not be dereferencing it anyway.  If the comparison is
		non-equal, the compiler is none the wiser.  Therefore,
		it is safe to compare pointers from rcu_dereference()
		against NULL pointers.

	-	The pointer is never dereferenced after being compared.
		Since there are no subsequent dereferences, the compiler
		cannot use anything it learned from the comparison
		to reorder the non-existent subsequent dereferences.
		This sort of comparison occurs frequently when scanning
		RCU-protected circular linked lists.

		Note that if checks for being within an RCU read-side
		critical section are not required and the pointer is never
		dereferenced, rcu_access_pointer() should be used in place
		of rcu_dereference().
 
	-	The comparison is against a pointer that references memory
		that was initialized "a long time ago."  The reason
		this is safe is that even if misordering occurs, the
		misordering will not affect the accesses that follow
		the comparison.  So exactly how long ago is "a long
		time ago"?  Here are some possibilities:

		-	Compile time.

		-	Boot time.

		-	Module-init time for module code.

		-	Prior to kthread creation for kthread code.

		-	During some prior acquisition of the lock that
			we now hold.

		-	Before mod_timer() time for a timer handler.

		There are many other possibilities involving the Linux
		kernel's wide array of primitives that cause code to
		be invoked at a later time.
 
	-	The pointer being compared against also came from
		rcu_dereference().  In this case, both pointers depend
		on one rcu_dereference() or another, so you get proper
		ordering either way.

		That said, this situation can make certain RCU usage
		bugs more likely to happen.  Which can be a good thing,
		at least if they happen during testing.  An example
		of such an RCU usage bug is shown in the section titled
		"EXAMPLE OF AMPLIFIED RCU-USAGE BUG".

	-	All of the accesses following the comparison are stores,
		so that a control dependency preserves the needed ordering.
		That said, it is easy to get control dependencies wrong.
		Please see the "CONTROL DEPENDENCIES" section of
		Documentation/memory-barriers.txt for more details.

	-	The pointers are not equal -and- the compiler does
		not have enough information to deduce the value of the
		pointer.  Note that the volatile cast in rcu_dereference()
		will normally prevent the compiler from knowing too much.

		However, please note that if the compiler knows that the
		pointer takes on only one of two values, a not-equal
		comparison will provide exactly the information that the
		compiler needs to deduce the value of the pointer.
 
-	Disable any value-speculation optimizations that your compiler
	might provide, especially if you are making use of feedback-based
	optimizations that take data collected from prior runs.  Such
	value-speculation optimizations reorder operations by design.

	There is one exception to this rule:  Value-speculation
	optimizations that leverage the branch-prediction hardware are
	safe on strongly ordered systems (such as x86), but not on weakly
	ordered systems (such as ARM or Power).  Choose your compiler
	command-line options wisely!
 
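The uintptr_t exception in the rules above can be sketched in ordinary
user-space C.  This is only an illustration under stated assumptions:
the helper names tag_ptr(), ptr_tag(), and untag_ptr() and the TAG_MASK
constant are hypothetical, struct foo is assumed to be at least four-byte
aligned, and the pointer passed in merely stands in for a value returned
by rcu_dereference().  Each helper casts to uintptr_t only long enough to
touch the low-order bits, then casts straight back to a pointer:

```c
#include <assert.h>
#include <stdint.h>

struct foo {
	int a;
	int b;
};

/* Hypothetical mask covering the two must-be-zero low-order bits. */
#define TAG_MASK ((uintptr_t)0x3)

/* The trick only works if the pointed-to type is suitably aligned. */
_Static_assert(_Alignof(struct foo) >= 4,
	       "low-order tag bits need at least 4-byte alignment");

/* Set tag bits in the low-order bits, then cast back immediately. */
static inline struct foo *tag_ptr(struct foo *p, uintptr_t tag)
{
	return (struct foo *)((uintptr_t)p | (tag & TAG_MASK));
}

/* Extract the tag.  The result is a plain integer, never dereferenced. */
static inline uintptr_t ptr_tag(struct foo *p)
{
	return (uintptr_t)p & TAG_MASK;
}

/* Clear the tag bits, then cast back before doing anything else. */
static inline struct foo *untag_ptr(struct foo *p)
{
	return (struct foo *)((uintptr_t)p & ~TAG_MASK);
}
```

Note that none of the helpers keeps the integer form around: the value
is a pointer again by the time the helper returns, so later dereferences
still depend on the original load.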
 
EXAMPLE OF AMPLIFIED RCU-USAGE BUG
----------------------------------

Because updaters can run concurrently with RCU readers, RCU readers can
see stale and/or inconsistent values.  If RCU readers need fresh or
consistent values, which they sometimes do, they need to take proper
precautions.  To see this, consider the following code fragment::
 
	struct foo {
		int a;
		int b;
		int c;
	};
	struct foo *gp1;
	struct foo *gp2;

	void updater(void)
	{
		struct foo *p;

		p = kmalloc(...);
		if (p == NULL)
			deal_with_it();
		p->a = 42;  /* Each field in its own cache line. */
		p->b = 43;
		p->c = 44;
		rcu_assign_pointer(gp1, p);
		p->b = 143;
		p->c = 144;
		rcu_assign_pointer(gp2, p);
	}

	void reader(void)
	{
		struct foo *p;
		struct foo *q;
		int r1, r2;

		p = rcu_dereference(gp2);
		if (p == NULL)
			return;
		r1 = p->b;  /* Guaranteed to get 143. */
		q = rcu_dereference(gp1);  /* Guaranteed non-NULL. */
		if (p == q) {
			/* The compiler decides that q->c is same as p->c. */
			r2 = p->c; /* Could get 44 on weakly ordered system. */
		}
		do_something_with(r1, r2);
	}
 
You might be surprised that the outcome (r1 == 143 && r2 == 44) is possible,
but you should not be.  After all, the updater might have been invoked
a second time between the time reader() loaded into "r1" and the time
that it loaded into "r2".  The fact that this same result can occur due
to some reordering from the compiler and CPUs is beside the point.

But suppose that the reader needs a consistent view?

Then one approach is to use locking, for example, as follows::
 
	struct foo {
		int a;
		int b;
		int c;
		spinlock_t lock;
	};
	struct foo *gp1;
	struct foo *gp2;

	void updater(void)
	{
		struct foo *p;

		p = kmalloc(...);
		if (p == NULL)
			deal_with_it();
		spin_lock(&p->lock);
		p->a = 42;  /* Each field in its own cache line. */
		p->b = 43;
		p->c = 44;
		spin_unlock(&p->lock);
		rcu_assign_pointer(gp1, p);
		spin_lock(&p->lock);
		p->b = 143;
		p->c = 144;
		spin_unlock(&p->lock);
		rcu_assign_pointer(gp2, p);
	}

	void reader(void)
	{
		struct foo *p;
		struct foo *q;
		int r1, r2;

		p = rcu_dereference(gp2);
		if (p == NULL)
			return;
		spin_lock(&p->lock);
		r1 = p->b;  /* Guaranteed to get 143. */
		q = rcu_dereference(gp1);  /* Guaranteed non-NULL. */
		if (p == q) {
			/* The compiler decides that q->c is same as p->c. */
			r2 = p->c; /* Locking guarantees r2 == 144. */
		}
		spin_unlock(&p->lock);
		do_something_with(r1, r2);
	}

As always, use the right tool for the job!
 
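For readers who want to experiment outside the kernel, the locking
variant above can be modeled in user-space C11.  This is a rough sketch
rather than kernel code, and everything in it is an assumption made for
illustration: an atomic_flag stands in for spinlock_t, a static object
replaces kmalloc(), release stores and acquire loads (which are stronger
than strictly needed) stand in for rcu_assign_pointer() and
rcu_dereference(), and the names model_lock(), model_unlock(), and
the_foo are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	int a;
	int b;
	int c;
	atomic_flag lock;	/* Stand-in for spinlock_t. */
};

static struct foo the_foo = { .lock = ATOMIC_FLAG_INIT };
static _Atomic(struct foo *) gp1;
static _Atomic(struct foo *) gp2;

static void model_lock(struct foo *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;	/* Spin until the holder clears the flag. */
}

static void model_unlock(struct foo *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static void updater(void)
{
	struct foo *p = &the_foo;	/* Static storage replaces kmalloc(). */

	model_lock(p);
	p->a = 42;
	p->b = 43;
	p->c = 44;
	model_unlock(p);
	atomic_store_explicit(&gp1, p, memory_order_release);
	model_lock(p);
	p->b = 143;
	p->c = 144;
	model_unlock(p);
	atomic_store_explicit(&gp2, p, memory_order_release);
}

/* Returns 1 and fills *r1 and *r2 when gp1 and gp2 match, else 0. */
static int reader(int *r1, int *r2)
{
	struct foo *p = atomic_load_explicit(&gp2, memory_order_acquire);
	struct foo *q;
	int ok = 0;

	if (p == NULL)
		return 0;
	model_lock(p);
	*r1 = p->b;
	q = atomic_load_explicit(&gp1, memory_order_acquire);
	if (p == q) {
		*r2 = p->c;	/* Read under the same lock as *r1. */
		ok = 1;
	}
	model_unlock(p);
	return ok;
}
```

Because both fields are read while holding the lock that the updater
holds when changing them, the reader cannot observe the mixed
(143, 44) outcome.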
 
EXAMPLE WHERE THE COMPILER KNOWS TOO MUCH
-----------------------------------------

If a pointer obtained from rcu_dereference() compares not-equal to some
other pointer, the compiler normally has no clue what the value of the
first pointer might be.  This lack of knowledge prevents the compiler
from carrying out optimizations that otherwise might destroy the ordering
guarantees that RCU depends on.  And the volatile cast in rcu_dereference()
should prevent the compiler from guessing the value.

But without rcu_dereference(), the compiler knows more than you might
expect.  Consider the following code fragment::
 
	struct foo {
		int a;
		int b;
	};
	static struct foo variable1;
	static struct foo variable2;
	static struct foo *gp = &variable1;

	void updater(void)
	{
		initialize_foo(&variable2);
		rcu_assign_pointer(gp, &variable2);
		/*
		 * The above is the only store to gp in this translation unit,
		 * and the address of gp is not exported in any way.
		 */
	}

	int reader(void)
	{
		struct foo *p;

		p = gp;
		barrier();
		if (p == &variable1)
			return p->a; /* Must be variable1.a. */
		else
			return p->b; /* Must be variable2.b. */
	}
 
Because the compiler can see all stores to "gp", it knows that the only
possible values of "gp" are "variable1" on the one hand and "variable2"
on the other.  The comparison in reader() therefore tells the compiler
the exact value of "p" even in the not-equals case.  This allows the
compiler to make the return values independent of the load from "gp",
in turn destroying the ordering between this load and the loads of the
return values.  This can result in "p->b" returning pre-initialization
garbage values.

In short, rcu_dereference() is -not- optional when you are going to
dereference the resulting pointer.
 
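A user-space approximation of the repaired reader might look like the
following sketch.  It is not the kernel's implementation: the macros
model_rcu_assign_pointer() and model_rcu_dereference() are hypothetical
stand-ins built on C11 atomics, and the acquire load is deliberately
stronger than the dependency ordering that the real rcu_dereference()
relies on, so the loads of "p->a" and "p->b" stay ordered after the load
of "gp" even if the compiler works out which variable "p" points to.
The field values 1, 2, 10, and 20 are arbitrary:

```c
#include <assert.h>
#include <stdatomic.h>

struct foo {
	int a;
	int b;
};

static struct foo variable1 = { .a = 1, .b = 2 };
static struct foo variable2;
static _Atomic(struct foo *) gp = &variable1;

/* Hypothetical user-space stand-ins for the kernel primitives. */
#define model_rcu_assign_pointer(p, v) \
	atomic_store_explicit(&(p), (v), memory_order_release)
#define model_rcu_dereference(p) \
	atomic_load_explicit(&(p), memory_order_acquire)

static void updater(void)
{
	/* Initialize first, then publish with release semantics. */
	variable2.a = 10;
	variable2.b = 20;
	model_rcu_assign_pointer(gp, &variable2);
}

static int reader(void)
{
	struct foo *p = model_rcu_dereference(gp);

	if (p == &variable1)
		return p->a;	/* variable1.a */
	else
		return p->b;	/* variable2.b, after its initialization */
}
```

The only change relative to the buggy reader is that the plain load of
"gp" followed by barrier() has been replaced by an ordered load.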
 
WHICH MEMBER OF THE rcu_dereference() FAMILY SHOULD YOU USE?
------------------------------------------------------------

First, please avoid using rcu_dereference_raw() and also please avoid
using rcu_dereference_check() and rcu_dereference_protected() with a
second argument with a constant value of 1 (or true, for that matter).
With that caution out of the way, here is some guidance for which
member of the rcu_dereference() family to use in various situations:
 
1.	If the access needs to be within an RCU read-side critical
	section, use rcu_dereference().  With the new consolidated
	RCU flavors, an RCU read-side critical section is entered
	using rcu_read_lock(), anything that disables bottom halves,
	anything that disables interrupts, or anything that disables
	preemption.

2.	If the access might be within an RCU read-side critical section
	on the one hand, or protected by (say) my_lock on the other,
	use rcu_dereference_check(), for example::

		p1 = rcu_dereference_check(p->rcu_protected_pointer,
					   lockdep_is_held(&my_lock));

3.	If the access might be within an RCU read-side critical section
	on the one hand, or protected by either my_lock or your_lock on
	the other, again use rcu_dereference_check(), for example::

		p1 = rcu_dereference_check(p->rcu_protected_pointer,
					   lockdep_is_held(&my_lock) ||
					   lockdep_is_held(&your_lock));
 
4.	If the access is on the update side, so that it is always protected
	by my_lock, use rcu_dereference_protected()::

		p1 = rcu_dereference_protected(p->rcu_protected_pointer,
					       lockdep_is_held(&my_lock));

	This can be extended to handle multiple locks as in #3 above,
	and both can be extended to check other conditions as well.

5.	If the protection is supplied by the caller, and is thus unknown
	to this code, that is the rare case when rcu_dereference_raw()
	is appropriate.  In addition, rcu_dereference_raw() might be
	appropriate when the lockdep expression would be excessively
	complex, except that a better approach in that case might be to
	take a long hard look at your synchronization design.  Still,
	there are data-locking cases where any one of a very large number
	of locks or reference counters suffices to protect the pointer,
	so rcu_dereference_raw() does have its place.

	However, its place is probably quite a bit smaller than one
	might expect given the number of uses in the current kernel.
	Ditto for its synonym, rcu_dereference_check( ... , 1), and
	its close relative, rcu_dereference_protected(... , 1).
 
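The difference between the checking variants in items 2 through 4 can be
illustrated with a toy user-space model.  This is emphatically not how
the kernel implements these primitives.  It assumes that a simple counter
can stand in for "inside an RCU read-side critical section", a flag for
lockdep_is_held(&my_lock), and a warning counter for the lockdep splat,
and every name beginning with "model_" is hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int model_read_lock_depth;	/* >0 models rcu_read_lock() held. */
static bool model_my_lock_held;		/* Models lockdep_is_held(&my_lock). */
static int model_warnings;		/* Counts would-be lockdep complaints. */

static int value = 7;
static _Atomic(int *) protected_ptr = &value;

/*
 * Models rcu_dereference_check(): acceptable either inside a read-side
 * critical section or when the caller-supplied condition holds.
 */
static int *model_deref_check(bool cond)
{
	if (model_read_lock_depth == 0 && !cond)
		model_warnings++;
	return atomic_load_explicit(&protected_ptr, memory_order_acquire);
}

/*
 * Models rcu_dereference_protected(): only the caller-supplied condition
 * counts, so being in a read-side critical section is not enough.
 */
static int *model_deref_protected(bool cond)
{
	if (!cond)
		model_warnings++;
	return atomic_load_explicit(&protected_ptr, memory_order_relaxed);
}

/* Update-side access that is always protected by my_lock. */
static int *update_side_access(void)
{
	return model_deref_protected(model_my_lock_held);
}
```

In this model, passing a constant true as the condition silences every
complaint, which is exactly why the text above discourages doing so.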
 
SPARSE CHECKING OF RCU-PROTECTED POINTERS
-----------------------------------------

The sparse static-analysis tool checks for direct access to RCU-protected
pointers, which can result in "interesting" bugs due to compiler
optimizations involving invented loads and perhaps also load tearing.
For example, suppose someone mistakenly does something like this::
 
	p = q->rcu_protected_pointer;
	do_something_with(p->a);
	do_something_else_with(p->b);

If register pressure is high, the compiler might optimize "p" out
of existence, transforming the code to something like this::

	do_something_with(q->rcu_protected_pointer->a);
	do_something_else_with(q->rcu_protected_pointer->b);

This could fatally disappoint your code if q->rcu_protected_pointer
changed in the meantime.  Nor is this a theoretical problem:  Exactly
this sort of bug cost Paul E. McKenney (and several of his innocent
colleagues) a three-day weekend back in the early 1990s.

Load tearing could of course result in dereferencing a mashup of a pair
of pointers, which also might fatally disappoint your code.

These problems could have been avoided simply by making the code instead
read as follows::

	p = rcu_dereference(q->rcu_protected_pointer);
	do_something_with(p->a);
	do_something_else_with(p->b);
 
Unfortunately, these sorts of bugs can be extremely hard to spot during
review.  This is where the sparse tool comes into play, along with the
"__rcu" marker.  If you mark a pointer declaration, whether in a structure
or as a formal parameter, with "__rcu", sparse will complain if this
pointer is accessed directly.  It will also complain if a pointer not
marked with "__rcu" is accessed using rcu_dereference() and friends.
For example, ->rcu_protected_pointer might be declared as follows::

	struct foo __rcu *rcu_protected_pointer;

Use of "__rcu" is opt-in.  If you choose not to use it, then you should
ignore the sparse warnings.
 
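To show how the annotation and the accessor fit together, here is a
small user-space sketch.  It makes several assumptions: "__rcu" is
defined as an empty macro, which approximates what an ordinary compiler
build sees (only sparse gives the marker any meaning), the enclosing
struct bar and the function sum_fields() are hypothetical, and
model_rcu_dereference() is a simplified stand-in that performs just the
volatile load, without the lockdep checks or the sparse address-space
casts of the real primitive:

```c
#include <assert.h>

/* Assumption: outside of sparse, the marker expands to nothing. */
#define __rcu

/* Simplified stand-in: a single volatile load of the pointer. */
#define model_rcu_dereference(p) (*(volatile __typeof__(p) *)&(p))

struct foo {
	int a;
	int b;
};

struct bar {
	struct foo __rcu *rcu_protected_pointer;
};

static int sum_fields(struct bar *q)
{
	/* Load the pointer exactly once, then use the local copy. */
	struct foo *p = model_rcu_dereference(q->rcu_protected_pointer);

	return p->a + p->b;
}
```

Because the load is volatile, the compiler may neither repeat it nor
tear it, which rules out the refetch transformation shown earlier in
this section.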