xref: /OK3568_Linux_fs/kernel/tools/memory-model/Documentation/litmus-tests.txt (revision 4882a59341e53eb6f0b4789bf948001014eff981)
1*4882a593SmuzhiyunLinux-Kernel Memory Model Litmus Tests
2*4882a593Smuzhiyun======================================
3*4882a593Smuzhiyun
4*4882a593SmuzhiyunThis file describes the LKMM litmus-test format by example, describes
5*4882a593Smuzhiyunsome tricks and traps, and finally outlines LKMM's limitations.  Earlier
6*4882a593Smuzhiyunversions of this material appeared in a number of LWN articles, including:
7*4882a593Smuzhiyun
8*4882a593Smuzhiyunhttps://lwn.net/Articles/720550/
9*4882a593Smuzhiyun	A formal kernel memory-ordering model (part 2)
10*4882a593Smuzhiyunhttps://lwn.net/Articles/608550/
11*4882a593Smuzhiyun	Axiomatic validation of memory barriers and atomic instructions
12*4882a593Smuzhiyunhttps://lwn.net/Articles/470681/
13*4882a593Smuzhiyun	Validating Memory Barriers and Atomic Instructions
14*4882a593Smuzhiyun
15*4882a593SmuzhiyunThis document presents information in decreasing order of applicability,
16*4882a593Smuzhiyunso that, where possible, the information that has proven more commonly
17*4882a593Smuzhiyunuseful is shown near the beginning.
18*4882a593Smuzhiyun
19*4882a593SmuzhiyunFor information on installing LKMM, including the underlying "herd7"
20*4882a593Smuzhiyuntool, please see tools/memory-model/README.
21*4882a593Smuzhiyun
22*4882a593Smuzhiyun
23*4882a593SmuzhiyunCopy-Pasta
24*4882a593Smuzhiyun==========
25*4882a593Smuzhiyun
26*4882a593SmuzhiyunAs with other software, it is often better (if less macho) to adapt an
27*4882a593Smuzhiyunexisting litmus test than it is to create one from scratch.  A number
28*4882a593Smuzhiyunof litmus tests may be found in the kernel source tree:
29*4882a593Smuzhiyun
30*4882a593Smuzhiyun	tools/memory-model/litmus-tests/
31*4882a593Smuzhiyun	Documentation/litmus-tests/
32*4882a593Smuzhiyun
33*4882a593SmuzhiyunSeveral thousand more example litmus tests are available on github
34*4882a593Smuzhiyunand kernel.org:
35*4882a593Smuzhiyun
36*4882a593Smuzhiyun	https://github.com/paulmckrcu/litmus
37*4882a593Smuzhiyun	https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/herd
38*4882a593Smuzhiyun	https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/litmus
39*4882a593Smuzhiyun
40*4882a593SmuzhiyunThe -l and -L arguments to "git grep" can be quite helpful in identifying
41*4882a593Smuzhiyunexisting litmus tests that are similar to the one you need.  But even if
42*4882a593Smuzhiyunyou start with an existing litmus test, it is still helpful to have a
43*4882a593Smuzhiyungood understanding of the litmus-test format.
44*4882a593Smuzhiyun
45*4882a593Smuzhiyun
46*4882a593SmuzhiyunExamples and Format
47*4882a593Smuzhiyun===================
48*4882a593Smuzhiyun
49*4882a593SmuzhiyunThis section describes the overall format of litmus tests, starting
50*4882a593Smuzhiyunwith a small example of the message-passing pattern and moving on to
51*4882a593Smuzhiyunmore complex examples that illustrate explicit initialization and LKMM's
52*4882a593Smuzhiyunminimalistic set of flow-control statements.
53*4882a593Smuzhiyun
54*4882a593Smuzhiyun
55*4882a593SmuzhiyunMessage-Passing Example
56*4882a593Smuzhiyun-----------------------
57*4882a593Smuzhiyun
58*4882a593SmuzhiyunThis section gives an overview of the format of a litmus test using an
59*4882a593Smuzhiyunexample based on the common message-passing use case.  This use case
60*4882a593Smuzhiyunappears often in the Linux kernel.  For example, a flag (modeled by "y"
61*4882a593Smuzhiyunbelow) indicates that a buffer (modeled by "x" below) is now completely
62*4882a593Smuzhiyunfilled in and ready for use.  It would be very bad if the consumer saw the
63*4882a593Smuzhiyunflag set, but, due to memory misordering, saw old values in the buffer.
64*4882a593Smuzhiyun
65*4882a593SmuzhiyunThis example asks whether smp_store_release() and smp_load_acquire()
66*4882a593Smuzhiyunsuffices to avoid this bad outcome:
67*4882a593Smuzhiyun
68*4882a593Smuzhiyun 1 C MP+pooncerelease+poacquireonce
69*4882a593Smuzhiyun 2
70*4882a593Smuzhiyun 3 {}
71*4882a593Smuzhiyun 4
72*4882a593Smuzhiyun 5 P0(int *x, int *y)
73*4882a593Smuzhiyun 6 {
74*4882a593Smuzhiyun 7   WRITE_ONCE(*x, 1);
75*4882a593Smuzhiyun 8   smp_store_release(y, 1);
76*4882a593Smuzhiyun 9 }
77*4882a593Smuzhiyun10
78*4882a593Smuzhiyun11 P1(int *x, int *y)
79*4882a593Smuzhiyun12 {
80*4882a593Smuzhiyun13   int r0;
81*4882a593Smuzhiyun14   int r1;
82*4882a593Smuzhiyun15
83*4882a593Smuzhiyun16   r0 = smp_load_acquire(y);
84*4882a593Smuzhiyun17   r1 = READ_ONCE(*x);
85*4882a593Smuzhiyun18 }
86*4882a593Smuzhiyun19
87*4882a593Smuzhiyun20 exists (1:r0=1 /\ 1:r1=0)
88*4882a593Smuzhiyun
89*4882a593SmuzhiyunLine 1 starts with "C", which identifies this file as being in the
90*4882a593SmuzhiyunLKMM C-language format (which, as we will see, is a small fragment
91*4882a593Smuzhiyunof the full C language).  The remainder of line 1 is the name of
92*4882a593Smuzhiyunthe test, which by convention is the filename with the ".litmus"
93*4882a593Smuzhiyunsuffix stripped.  In this case, the actual test may be found in
94*4882a593Smuzhiyuntools/memory-model/litmus-tests/MP+pooncerelease+poacquireonce.litmus
95*4882a593Smuzhiyunin the Linux-kernel source tree.
96*4882a593Smuzhiyun
97*4882a593SmuzhiyunMechanically generated litmus tests will often have an optional
98*4882a593Smuzhiyundouble-quoted comment string on the second line.  Such strings are ignored
99*4882a593Smuzhiyunwhen running the test.  Yes, you can add your own comments to litmus
100*4882a593Smuzhiyuntests, but this is a bit involved due to the use of multiple parsers.
101*4882a593SmuzhiyunFor now, you can use C-language comments in the C code, and these comments
102*4882a593Smuzhiyunmay be in either the "/* */" or the "//" style.  A later section will
103*4882a593Smuzhiyuncover the full litmus-test commenting story.
104*4882a593Smuzhiyun
105*4882a593SmuzhiyunLine 3 is the initialization section.  Because the default initialization
106*4882a593Smuzhiyunto zero suffices for this test, the "{}" syntax is used, which mean the
107*4882a593Smuzhiyuninitialization section is empty.  Litmus tests requiring non-default
108*4882a593Smuzhiyuninitialization must have non-empty initialization sections, as in the
109*4882a593Smuzhiyunexample that will be presented later in this document.
110*4882a593Smuzhiyun
111*4882a593SmuzhiyunLines 5-9 show the first process and lines 11-18 the second process.  Each
112*4882a593Smuzhiyunprocess corresponds to a Linux-kernel task (or kthread, workqueue, thread,
113*4882a593Smuzhiyunand so on; LKMM discussions often use these terms interchangeably).
114*4882a593SmuzhiyunThe name of the first process is "P0" and that of the second "P1".
115*4882a593SmuzhiyunYou can name your processes anything you like as long as the names consist
116*4882a593Smuzhiyunof a single "P" followed by a number, and as long as the numbers are
117*4882a593Smuzhiyunconsecutive starting with zero.  This can actually be quite helpful,
118*4882a593Smuzhiyunfor example, a .litmus file matching "^P1(" but not matching "^P2("
119*4882a593Smuzhiyunmust contain a two-process litmus test.
120*4882a593Smuzhiyun
121*4882a593SmuzhiyunThe argument list for each function are pointers to the global variables
122*4882a593Smuzhiyunused by that function.  Unlike normal C-language function parameters, the
123*4882a593Smuzhiyunnames are significant.  The fact that both P0() and P1() have a formal
124*4882a593Smuzhiyunparameter named "x" means that these two processes are working with the
125*4882a593Smuzhiyunsame global variable, also named "x".  So the "int *x, int *y" on P0()
126*4882a593Smuzhiyunand P1() mean that both processes are working with two shared global
127*4882a593Smuzhiyunvariables, "x" and "y".  Global variables are always passed to processes
128*4882a593Smuzhiyunby reference, hence "P0(int *x, int *y)", but *never* "P0(int x, int y)".
129*4882a593Smuzhiyun
130*4882a593SmuzhiyunP0() has no local variables, but P1() has two of them named "r0" and "r1".
131*4882a593SmuzhiyunThese names may be freely chosen, but for historical reasons stemming from
132*4882a593Smuzhiyunother litmus-test formats, it is conventional to use names consisting of
133*4882a593Smuzhiyun"r" followed by a number as shown here.  A common bug in litmus tests
134*4882a593Smuzhiyunis forgetting to add a global variable to a process's parameter list.
135*4882a593SmuzhiyunThis will sometimes result in an error message, but can also cause the
136*4882a593Smuzhiyunintended global to instead be silently treated as an undeclared local
137*4882a593Smuzhiyunvariable.
138*4882a593Smuzhiyun
139*4882a593SmuzhiyunEach process's code is similar to Linux-kernel C, as can be seen on lines
140*4882a593Smuzhiyun7-8 and 13-17.  This code may use many of the Linux kernel's atomic
141*4882a593Smuzhiyunoperations, some of its exclusive-lock functions, and some of its RCU
142*4882a593Smuzhiyunand SRCU functions.  An approximate list of the currently supported
143*4882a593Smuzhiyunfunctions may be found in the linux-kernel.def file.
144*4882a593Smuzhiyun
145*4882a593SmuzhiyunThe P0() process does "WRITE_ONCE(*x, 1)" on line 7.  Because "x" is a
146*4882a593Smuzhiyunpointer in P0()'s parameter list, this does an unordered store to global
147*4882a593Smuzhiyunvariable "x".  Line 8 does "smp_store_release(y, 1)", and because "y"
148*4882a593Smuzhiyunis also in P0()'s parameter list, this does a release store to global
149*4882a593Smuzhiyunvariable "y".
150*4882a593Smuzhiyun
151*4882a593SmuzhiyunThe P1() process declares two local variables on lines 13 and 14.
152*4882a593SmuzhiyunLine 16 does "r0 = smp_load_acquire(y)" which does an acquire load
153*4882a593Smuzhiyunfrom global variable "y" into local variable "r0".  Line 17 does a
154*4882a593Smuzhiyun"r1 = READ_ONCE(*x)", which does an unordered load from "*x" into local
155*4882a593Smuzhiyunvariable "r1".  Both "x" and "y" are in P1()'s parameter list, so both
156*4882a593Smuzhiyunreference the same global variables that are used by P0().
157*4882a593Smuzhiyun
158*4882a593SmuzhiyunLine 20 is the "exists" assertion expression to evaluate the final state.
159*4882a593SmuzhiyunThis final state is evaluated after the dust has settled: both processes
160*4882a593Smuzhiyunhave completed and all of their memory references and memory barriers
161*4882a593Smuzhiyunhave propagated to all parts of the system.  The references to the local
162*4882a593Smuzhiyunvariables "r0" and "r1" in line 24 must be prefixed with "1:" to specify
163*4882a593Smuzhiyunwhich process they are local to.
164*4882a593Smuzhiyun
165*4882a593SmuzhiyunNote that the assertion expression is written in the litmus-test
166*4882a593Smuzhiyunlanguage rather than in C.  For example, single "=" is an equality
167*4882a593Smuzhiyunoperator rather than an assignment.  The "/\" character combination means
168*4882a593Smuzhiyun"and".  Similarly, "\/" stands for "or".  Both of these are ASCII-art
169*4882a593Smuzhiyunrepresentations of the corresponding mathematical symbols.  Finally,
170*4882a593Smuzhiyun"~" stands for "logical not", which is "!" in C, and not to be confused
171*4882a593Smuzhiyunwith the C-language "~" operator which instead stands for "bitwise not".
172*4882a593SmuzhiyunParentheses may be used to override precedence.
173*4882a593Smuzhiyun
174*4882a593SmuzhiyunThe "exists" assertion on line 20 is satisfied if the consumer sees the
175*4882a593Smuzhiyunflag ("y") set but the buffer ("x") as not yet filled in, that is, if P1()
176*4882a593Smuzhiyunloaded a value from "x" that was equal to 1 but loaded a value from "y"
177*4882a593Smuzhiyunthat was still equal to zero.
178*4882a593Smuzhiyun
179*4882a593SmuzhiyunThis example can be checked by running the following command, which
180*4882a593Smuzhiyunabsolutely must be run from the tools/memory-model directory and from
181*4882a593Smuzhiyunthis directory only:
182*4882a593Smuzhiyun
183*4882a593Smuzhiyunherd7 -conf linux-kernel.cfg litmus-tests/MP+pooncerelease+poacquireonce.litmus
184*4882a593Smuzhiyun
185*4882a593SmuzhiyunThe output is the result of something similar to a full state-space
186*4882a593Smuzhiyunsearch, and is as follows:
187*4882a593Smuzhiyun
188*4882a593Smuzhiyun 1 Test MP+pooncerelease+poacquireonce Allowed
189*4882a593Smuzhiyun 2 States 3
190*4882a593Smuzhiyun 3 1:r0=0; 1:r1=0;
191*4882a593Smuzhiyun 4 1:r0=0; 1:r1=1;
192*4882a593Smuzhiyun 5 1:r0=1; 1:r1=1;
193*4882a593Smuzhiyun 6 No
194*4882a593Smuzhiyun 7 Witnesses
195*4882a593Smuzhiyun 8 Positive: 0 Negative: 3
196*4882a593Smuzhiyun 9 Condition exists (1:r0=1 /\ 1:r1=0)
197*4882a593Smuzhiyun10 Observation MP+pooncerelease+poacquireonce Never 0 3
198*4882a593Smuzhiyun11 Time MP+pooncerelease+poacquireonce 0.00
199*4882a593Smuzhiyun12 Hash=579aaa14d8c35a39429b02e698241d09
200*4882a593Smuzhiyun
201*4882a593SmuzhiyunThe most pertinent line is line 10, which contains "Never 0 3", which
202*4882a593Smuzhiyunindicates that the bad result flagged by the "exists" clause never
203*4882a593Smuzhiyunhappens.  This line might instead say "Sometimes" to indicate that the
204*4882a593Smuzhiyunbad result happened in some but not all executions, or it might say
205*4882a593Smuzhiyun"Always" to indicate that the bad result happened in all executions.
206*4882a593Smuzhiyun(The herd7 tool doesn't judge, so it is only an LKMM convention that the
207*4882a593Smuzhiyun"exists" clause indicates a bad result.  To see this, invert the "exists"
208*4882a593Smuzhiyunclause's condition and run the test.)  The numbers ("0 3") at the end
209*4882a593Smuzhiyunof this line indicate the number of end states satisfying the "exists"
210*4882a593Smuzhiyunclause (0) and the number not not satisfying that clause (3).
211*4882a593Smuzhiyun
212*4882a593SmuzhiyunAnother important part of this output is shown in lines 2-5, repeated here:
213*4882a593Smuzhiyun
214*4882a593Smuzhiyun 2 States 3
215*4882a593Smuzhiyun 3 1:r0=0; 1:r1=0;
216*4882a593Smuzhiyun 4 1:r0=0; 1:r1=1;
217*4882a593Smuzhiyun 5 1:r0=1; 1:r1=1;
218*4882a593Smuzhiyun
219*4882a593SmuzhiyunLine 2 gives the total number of end states, and each of lines 3-5 list
220*4882a593Smuzhiyunone of these states, with the first ("1:r0=0; 1:r1=0;") indicating that
221*4882a593Smuzhiyunboth of P1()'s loads returned the value "0".  As expected, given the
222*4882a593Smuzhiyun"Never" on line 10, the state flagged by the "exists" clause is not
223*4882a593Smuzhiyunlisted.  This full list of states can be helpful when debugging a new
224*4882a593Smuzhiyunlitmus test.
225*4882a593Smuzhiyun
226*4882a593SmuzhiyunThe rest of the output is not normally needed, either due to irrelevance
227*4882a593Smuzhiyunor due to being redundant with the lines discussed above.  However, the
228*4882a593Smuzhiyunfollowing paragraph lists them for the benefit of readers possessed of
229*4882a593Smuzhiyunan insatiable curiosity.  Other readers should feel free to skip ahead.
230*4882a593Smuzhiyun
231*4882a593SmuzhiyunLine 1 echos the test name, along with the "Test" and "Allowed".  Line 6's
232*4882a593Smuzhiyun"No" says that the "exists" clause was not satisfied by any execution,
233*4882a593Smuzhiyunand as such it has the same meaning as line 10's "Never".  Line 7 is a
234*4882a593Smuzhiyunlead-in to line 8's "Positive: 0 Negative: 3", which lists the number
235*4882a593Smuzhiyunof end states satisfying and not satisfying the "exists" clause, just
236*4882a593Smuzhiyunlike the two numbers at the end of line 10.  Line 9 repeats the "exists"
237*4882a593Smuzhiyunclause so that you don't have to look it up in the litmus-test file.
238*4882a593SmuzhiyunThe number at the end of line 11 (which begins with "Time") gives the
239*4882a593Smuzhiyuntime in seconds required to analyze the litmus test.  Small tests such
240*4882a593Smuzhiyunas this one complete in a few milliseconds, so "0.00" is quite common.
241*4882a593SmuzhiyunLine 12 gives a hash of the contents for the litmus-test file, and is used
242*4882a593Smuzhiyunby tooling that manages litmus tests and their output.  This tooling is
243*4882a593Smuzhiyunused by people modifying LKMM itself, and among other things lets such
244*4882a593Smuzhiyunpeople know which of the several thousand relevant litmus tests were
245*4882a593Smuzhiyunaffected by a given change to LKMM.
246*4882a593Smuzhiyun
247*4882a593Smuzhiyun
248*4882a593SmuzhiyunInitialization
249*4882a593Smuzhiyun--------------
250*4882a593Smuzhiyun
251*4882a593SmuzhiyunThe previous example relied on the default zero initialization for
252*4882a593Smuzhiyun"x" and "y", but a similar litmus test could instead initialize them
253*4882a593Smuzhiyunto some other value:
254*4882a593Smuzhiyun
255*4882a593Smuzhiyun 1 C MP+pooncerelease+poacquireonce
256*4882a593Smuzhiyun 2
257*4882a593Smuzhiyun 3 {
258*4882a593Smuzhiyun 4   x=42;
259*4882a593Smuzhiyun 5   y=42;
260*4882a593Smuzhiyun 6 }
261*4882a593Smuzhiyun 7
262*4882a593Smuzhiyun 8 P0(int *x, int *y)
263*4882a593Smuzhiyun 9 {
264*4882a593Smuzhiyun10   WRITE_ONCE(*x, 1);
265*4882a593Smuzhiyun11   smp_store_release(y, 1);
266*4882a593Smuzhiyun12 }
267*4882a593Smuzhiyun13
268*4882a593Smuzhiyun14 P1(int *x, int *y)
269*4882a593Smuzhiyun15 {
270*4882a593Smuzhiyun16   int r0;
271*4882a593Smuzhiyun17   int r1;
272*4882a593Smuzhiyun18
273*4882a593Smuzhiyun19   r0 = smp_load_acquire(y);
274*4882a593Smuzhiyun20   r1 = READ_ONCE(*x);
275*4882a593Smuzhiyun21 }
276*4882a593Smuzhiyun22
277*4882a593Smuzhiyun23 exists (1:r0=1 /\ 1:r1=42)
278*4882a593Smuzhiyun
279*4882a593SmuzhiyunLines 3-6 now initialize both "x" and "y" to the value 42.  This also
280*4882a593Smuzhiyunmeans that the "exists" clause on line 23 must change "1:r1=0" to
281*4882a593Smuzhiyun"1:r1=42".
282*4882a593Smuzhiyun
283*4882a593SmuzhiyunRunning the test gives the same overall result as before, but with the
284*4882a593Smuzhiyunvalue 42 appearing in place of the value zero:
285*4882a593Smuzhiyun
286*4882a593Smuzhiyun 1 Test MP+pooncerelease+poacquireonce Allowed
287*4882a593Smuzhiyun 2 States 3
288*4882a593Smuzhiyun 3 1:r0=1; 1:r1=1;
289*4882a593Smuzhiyun 4 1:r0=42; 1:r1=1;
290*4882a593Smuzhiyun 5 1:r0=42; 1:r1=42;
291*4882a593Smuzhiyun 6 No
292*4882a593Smuzhiyun 7 Witnesses
293*4882a593Smuzhiyun 8 Positive: 0 Negative: 3
294*4882a593Smuzhiyun 9 Condition exists (1:r0=1 /\ 1:r1=42)
295*4882a593Smuzhiyun10 Observation MP+pooncerelease+poacquireonce Never 0 3
296*4882a593Smuzhiyun11 Time MP+pooncerelease+poacquireonce 0.02
297*4882a593Smuzhiyun12 Hash=ab9a9b7940a75a792266be279a980156
298*4882a593Smuzhiyun
299*4882a593SmuzhiyunIt is tempting to avoid the open-coded repetitions of the value "42"
300*4882a593Smuzhiyunby defining another global variable "initval=42" and replacing all
301*4882a593Smuzhiyunoccurrences of "42" with "initval".  This will not, repeat *not*,
302*4882a593Smuzhiyuninitialize "x" and "y" to 42, but instead to the address of "initval"
303*4882a593Smuzhiyun(try it!).  See the section below on linked lists to learn more about
304*4882a593Smuzhiyunwhy this approach to initialization can be useful.
305*4882a593Smuzhiyun
306*4882a593Smuzhiyun
307*4882a593SmuzhiyunControl Structures
308*4882a593Smuzhiyun------------------
309*4882a593Smuzhiyun
310*4882a593SmuzhiyunLKMM supports the C-language "if" statement, which allows modeling of
311*4882a593Smuzhiyunconditional branches.  In LKMM, conditional branches can affect ordering,
312*4882a593Smuzhiyunbut only if you are *very* careful (compilers are surprisingly able
313*4882a593Smuzhiyunto optimize away conditional branches).  The following example shows
314*4882a593Smuzhiyunthe "load buffering" (LB) use case that is used in the Linux kernel to
315*4882a593Smuzhiyunsynchronize between ring-buffer producers and consumers.  In the example
316*4882a593Smuzhiyunbelow, P0() is one side checking to see if an operation may proceed and
317*4882a593SmuzhiyunP1() is the other side completing its update.
318*4882a593Smuzhiyun
319*4882a593Smuzhiyun 1 C LB+fencembonceonce+ctrlonceonce
320*4882a593Smuzhiyun 2
321*4882a593Smuzhiyun 3 {}
322*4882a593Smuzhiyun 4
323*4882a593Smuzhiyun 5 P0(int *x, int *y)
324*4882a593Smuzhiyun 6 {
325*4882a593Smuzhiyun 7   int r0;
326*4882a593Smuzhiyun 8
327*4882a593Smuzhiyun 9   r0 = READ_ONCE(*x);
328*4882a593Smuzhiyun10   if (r0)
329*4882a593Smuzhiyun11     WRITE_ONCE(*y, 1);
330*4882a593Smuzhiyun12 }
331*4882a593Smuzhiyun13
332*4882a593Smuzhiyun14 P1(int *x, int *y)
333*4882a593Smuzhiyun15 {
334*4882a593Smuzhiyun16   int r0;
335*4882a593Smuzhiyun17
336*4882a593Smuzhiyun18   r0 = READ_ONCE(*y);
337*4882a593Smuzhiyun19   smp_mb();
338*4882a593Smuzhiyun20   WRITE_ONCE(*x, 1);
339*4882a593Smuzhiyun21 }
340*4882a593Smuzhiyun22
341*4882a593Smuzhiyun23 exists (0:r0=1 /\ 1:r0=1)
342*4882a593Smuzhiyun
343*4882a593SmuzhiyunP1()'s "if" statement on line 10 works as expected, so that line 11 is
344*4882a593Smuzhiyunexecuted only if line 9 loads a non-zero value from "x".  Because P1()'s
345*4882a593Smuzhiyunwrite of "1" to "x" happens only after P1()'s read from "y", one would
346*4882a593Smuzhiyunhope that the "exists" clause cannot be satisfied.  LKMM agrees:
347*4882a593Smuzhiyun
348*4882a593Smuzhiyun 1 Test LB+fencembonceonce+ctrlonceonce Allowed
349*4882a593Smuzhiyun 2 States 2
350*4882a593Smuzhiyun 3 0:r0=0; 1:r0=0;
351*4882a593Smuzhiyun 4 0:r0=1; 1:r0=0;
352*4882a593Smuzhiyun 5 No
353*4882a593Smuzhiyun 6 Witnesses
354*4882a593Smuzhiyun 7 Positive: 0 Negative: 2
355*4882a593Smuzhiyun 8 Condition exists (0:r0=1 /\ 1:r0=1)
356*4882a593Smuzhiyun 9 Observation LB+fencembonceonce+ctrlonceonce Never 0 2
357*4882a593Smuzhiyun10 Time LB+fencembonceonce+ctrlonceonce 0.00
358*4882a593Smuzhiyun11 Hash=e5260556f6de495fd39b556d1b831c3b
359*4882a593Smuzhiyun
360*4882a593SmuzhiyunHowever, there is no "while" statement due to the fact that full
361*4882a593Smuzhiyunstate-space search has some difficulty with iteration.  However, there
362*4882a593Smuzhiyunare tricks that may be used to handle some special cases, which are
363*4882a593Smuzhiyundiscussed below.  In addition, loop-unrolling tricks may be applied,
364*4882a593Smuzhiyunalbeit sparingly.
365*4882a593Smuzhiyun
366*4882a593Smuzhiyun
367*4882a593SmuzhiyunTricks and Traps
368*4882a593Smuzhiyun================
369*4882a593Smuzhiyun
370*4882a593SmuzhiyunThis section covers extracting debug output from herd7, emulating
371*4882a593Smuzhiyunspin loops, handling trivial linked lists, adding comments to litmus tests,
372*4882a593Smuzhiyunemulating call_rcu(), and finally tricks to improve herd7 performance
373*4882a593Smuzhiyunin order to better handle large litmus tests.
374*4882a593Smuzhiyun
375*4882a593Smuzhiyun
376*4882a593SmuzhiyunDebug Output
377*4882a593Smuzhiyun------------
378*4882a593Smuzhiyun
379*4882a593SmuzhiyunBy default, the herd7 state output includes all variables mentioned
380*4882a593Smuzhiyunin the "exists" clause.  But sometimes debugging efforts are greatly
381*4882a593Smuzhiyunaided by the values of other variables.  Consider this litmus test
382*4882a593Smuzhiyun(tools/memory-order/litmus-tests/SB+rfionceonce-poonceonces.litmus but
383*4882a593Smuzhiyunslightly modified), which probes an obscure corner of hardware memory
384*4882a593Smuzhiyunordering:
385*4882a593Smuzhiyun
386*4882a593Smuzhiyun 1 C SB+rfionceonce-poonceonces
387*4882a593Smuzhiyun 2
388*4882a593Smuzhiyun 3 {}
389*4882a593Smuzhiyun 4
390*4882a593Smuzhiyun 5 P0(int *x, int *y)
391*4882a593Smuzhiyun 6 {
392*4882a593Smuzhiyun 7   int r1;
393*4882a593Smuzhiyun 8   int r2;
394*4882a593Smuzhiyun 9
395*4882a593Smuzhiyun10   WRITE_ONCE(*x, 1);
396*4882a593Smuzhiyun11   r1 = READ_ONCE(*x);
397*4882a593Smuzhiyun12   r2 = READ_ONCE(*y);
398*4882a593Smuzhiyun13 }
399*4882a593Smuzhiyun14
400*4882a593Smuzhiyun15 P1(int *x, int *y)
401*4882a593Smuzhiyun16 {
402*4882a593Smuzhiyun17   int r3;
403*4882a593Smuzhiyun18   int r4;
404*4882a593Smuzhiyun19
405*4882a593Smuzhiyun20   WRITE_ONCE(*y, 1);
406*4882a593Smuzhiyun21   r3 = READ_ONCE(*y);
407*4882a593Smuzhiyun22   r4 = READ_ONCE(*x);
408*4882a593Smuzhiyun23 }
409*4882a593Smuzhiyun24
410*4882a593Smuzhiyun25 exists (0:r2=0 /\ 1:r4=0)
411*4882a593Smuzhiyun
412*4882a593SmuzhiyunThe herd7 output is as follows:
413*4882a593Smuzhiyun
414*4882a593Smuzhiyun 1 Test SB+rfionceonce-poonceonces Allowed
415*4882a593Smuzhiyun 2 States 4
416*4882a593Smuzhiyun 3 0:r2=0; 1:r4=0;
417*4882a593Smuzhiyun 4 0:r2=0; 1:r4=1;
418*4882a593Smuzhiyun 5 0:r2=1; 1:r4=0;
419*4882a593Smuzhiyun 6 0:r2=1; 1:r4=1;
420*4882a593Smuzhiyun 7 Ok
421*4882a593Smuzhiyun 8 Witnesses
422*4882a593Smuzhiyun 9 Positive: 1 Negative: 3
423*4882a593Smuzhiyun10 Condition exists (0:r2=0 /\ 1:r4=0)
424*4882a593Smuzhiyun11 Observation SB+rfionceonce-poonceonces Sometimes 1 3
425*4882a593Smuzhiyun12 Time SB+rfionceonce-poonceonces 0.01
426*4882a593Smuzhiyun13 Hash=c7f30fe0faebb7d565405d55b7318ada
427*4882a593Smuzhiyun
428*4882a593Smuzhiyun(This output indicates that CPUs are permitted to "snoop their own
429*4882a593Smuzhiyunstore buffers", which all of Linux's CPU families other than s390 will
430*4882a593Smuzhiyunhappily do.  Such snooping results in disagreement among CPUs on the
431*4882a593Smuzhiyunorder of stores from different CPUs, which is rarely an issue.)
432*4882a593Smuzhiyun
433*4882a593SmuzhiyunBut the herd7 output shows only the two variables mentioned in the
434*4882a593Smuzhiyun"exists" clause.  Someone modifying this test might wish to know the
435*4882a593Smuzhiyunvalues of "x", "y", "0:r1", and "0:r3" as well.  The "locations"
436*4882a593Smuzhiyunstatement on line 25 shows how to cause herd7 to display additional
437*4882a593Smuzhiyunvariables:
438*4882a593Smuzhiyun
439*4882a593Smuzhiyun 1 C SB+rfionceonce-poonceonces
440*4882a593Smuzhiyun 2
441*4882a593Smuzhiyun 3 {}
442*4882a593Smuzhiyun 4
443*4882a593Smuzhiyun 5 P0(int *x, int *y)
444*4882a593Smuzhiyun 6 {
445*4882a593Smuzhiyun 7   int r1;
446*4882a593Smuzhiyun 8   int r2;
447*4882a593Smuzhiyun 9
448*4882a593Smuzhiyun10   WRITE_ONCE(*x, 1);
449*4882a593Smuzhiyun11   r1 = READ_ONCE(*x);
450*4882a593Smuzhiyun12   r2 = READ_ONCE(*y);
451*4882a593Smuzhiyun13 }
452*4882a593Smuzhiyun14
453*4882a593Smuzhiyun15 P1(int *x, int *y)
454*4882a593Smuzhiyun16 {
455*4882a593Smuzhiyun17   int r3;
456*4882a593Smuzhiyun18   int r4;
457*4882a593Smuzhiyun19
458*4882a593Smuzhiyun20   WRITE_ONCE(*y, 1);
459*4882a593Smuzhiyun21   r3 = READ_ONCE(*y);
460*4882a593Smuzhiyun22   r4 = READ_ONCE(*x);
461*4882a593Smuzhiyun23 }
462*4882a593Smuzhiyun24
463*4882a593Smuzhiyun25 locations [0:r1; 1:r3; x; y]
464*4882a593Smuzhiyun26 exists (0:r2=0 /\ 1:r4=0)
465*4882a593Smuzhiyun
466*4882a593SmuzhiyunThe herd7 output then displays the values of all the variables:
467*4882a593Smuzhiyun
468*4882a593Smuzhiyun 1 Test SB+rfionceonce-poonceonces Allowed
469*4882a593Smuzhiyun 2 States 4
470*4882a593Smuzhiyun 3 0:r1=1; 0:r2=0; 1:r3=1; 1:r4=0; x=1; y=1;
471*4882a593Smuzhiyun 4 0:r1=1; 0:r2=0; 1:r3=1; 1:r4=1; x=1; y=1;
472*4882a593Smuzhiyun 5 0:r1=1; 0:r2=1; 1:r3=1; 1:r4=0; x=1; y=1;
473*4882a593Smuzhiyun 6 0:r1=1; 0:r2=1; 1:r3=1; 1:r4=1; x=1; y=1;
474*4882a593Smuzhiyun 7 Ok
475*4882a593Smuzhiyun 8 Witnesses
476*4882a593Smuzhiyun 9 Positive: 1 Negative: 3
477*4882a593Smuzhiyun10 Condition exists (0:r2=0 /\ 1:r4=0)
478*4882a593Smuzhiyun11 Observation SB+rfionceonce-poonceonces Sometimes 1 3
479*4882a593Smuzhiyun12 Time SB+rfionceonce-poonceonces 0.01
480*4882a593Smuzhiyun13 Hash=40de8418c4b395388f6501cafd1ed38d
481*4882a593Smuzhiyun
482*4882a593SmuzhiyunWhat if you would like to know the value of a particular global variable
483*4882a593Smuzhiyunat some particular point in a given process's execution?  One approach
484*4882a593Smuzhiyunis to use a READ_ONCE() to load that global variable into a new local
485*4882a593Smuzhiyunvariable, then add that local variable to the "locations" clause.
486*4882a593SmuzhiyunBut be careful:  In some litmus tests, adding a READ_ONCE() will change
487*4882a593Smuzhiyunthe outcome!  For one example, please see the C-READ_ONCE.litmus and
488*4882a593SmuzhiyunC-READ_ONCE-omitted.litmus tests located here:
489*4882a593Smuzhiyun
490*4882a593Smuzhiyun	https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/
491*4882a593Smuzhiyun
492*4882a593Smuzhiyun
493*4882a593SmuzhiyunSpin Loops
494*4882a593Smuzhiyun----------
495*4882a593Smuzhiyun
496*4882a593SmuzhiyunThe analysis carried out by herd7 explores full state space, which is
497*4882a593Smuzhiyunat best of exponential time complexity.  Adding processes and increasing
498*4882a593Smuzhiyunthe amount of code in a give process can greatly increase execution time.
499*4882a593SmuzhiyunPotentially infinite loops, such as those used to wait for locks to
500*4882a593Smuzhiyunbecome available, are clearly problematic.
501*4882a593Smuzhiyun
502*4882a593SmuzhiyunFortunately, it is possible to avoid state-space explosion by specially
503*4882a593Smuzhiyunmodeling such loops.  For example, the following litmus tests emulates
504*4882a593Smuzhiyunlocking using xchg_acquire(), but instead of enclosing xchg_acquire()
505*4882a593Smuzhiyunin a spin loop, it instead excludes executions that fail to acquire the
506*4882a593Smuzhiyunlock using a herd7 "filter" clause.  Note that for exclusive locking, you
507*4882a593Smuzhiyunare better off using the spin_lock() and spin_unlock() that LKMM directly
508*4882a593Smuzhiyunmodels, if for no other reason that these are much faster.  However, the
509*4882a593Smuzhiyuntechniques illustrated in this section can be used for other purposes,
510*4882a593Smuzhiyunsuch as emulating reader-writer locking, which LKMM does not yet model.
511*4882a593Smuzhiyun
512*4882a593Smuzhiyun 1 C C-SB+l-o-o-u+l-o-o-u-X
513*4882a593Smuzhiyun 2
514*4882a593Smuzhiyun 3 {
515*4882a593Smuzhiyun 4 }
516*4882a593Smuzhiyun 5
517*4882a593Smuzhiyun 6 P0(int *sl, int *x0, int *x1)
518*4882a593Smuzhiyun 7 {
519*4882a593Smuzhiyun 8   int r2;
520*4882a593Smuzhiyun 9   int r1;
521*4882a593Smuzhiyun10
522*4882a593Smuzhiyun11   r2 = xchg_acquire(sl, 1);
523*4882a593Smuzhiyun12   WRITE_ONCE(*x0, 1);
524*4882a593Smuzhiyun13   r1 = READ_ONCE(*x1);
525*4882a593Smuzhiyun14   smp_store_release(sl, 0);
526*4882a593Smuzhiyun15 }
527*4882a593Smuzhiyun16
528*4882a593Smuzhiyun17 P1(int *sl, int *x0, int *x1)
529*4882a593Smuzhiyun18 {
530*4882a593Smuzhiyun19   int r2;
531*4882a593Smuzhiyun20   int r1;
532*4882a593Smuzhiyun21
533*4882a593Smuzhiyun22   r2 = xchg_acquire(sl, 1);
534*4882a593Smuzhiyun23   WRITE_ONCE(*x1, 1);
535*4882a593Smuzhiyun24   r1 = READ_ONCE(*x0);
536*4882a593Smuzhiyun25   smp_store_release(sl, 0);
537*4882a593Smuzhiyun26 }
538*4882a593Smuzhiyun27
539*4882a593Smuzhiyun28 filter (0:r2=0 /\ 1:r2=0)
540*4882a593Smuzhiyun29 exists (0:r1=0 /\ 1:r1=0)
541*4882a593Smuzhiyun
542*4882a593SmuzhiyunThis litmus test may be found here:
543*4882a593Smuzhiyun
544*4882a593Smuzhiyunhttps://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/herd/C-SB+l-o-o-u+l-o-o-u-X.litmus
545*4882a593Smuzhiyun
546*4882a593SmuzhiyunThis test uses two global variables, "x1" and "x2", and also emulates a
547*4882a593Smuzhiyunsingle global spinlock named "sl".  This spinlock is held by whichever
548*4882a593Smuzhiyunprocess changes the value of "sl" from "0" to "1", and is released when
549*4882a593Smuzhiyunthat process sets "sl" back to "0".  P0()'s lock acquisition is emulated
550*4882a593Smuzhiyunon line 11 using xchg_acquire(), which unconditionally stores the value
551*4882a593Smuzhiyun"1" to "sl" and stores either "0" or "1" to "r2", depending on whether
552*4882a593Smuzhiyunthe lock acquisition was successful or unsuccessful (due to "sl" already
553*4882a593Smuzhiyunhaving the value "1"), respectively.  P1() operates in a similar manner.
554*4882a593Smuzhiyun
555*4882a593SmuzhiyunRather unconventionally, execution appears to proceed to the critical
556*4882a593Smuzhiyunsection on lines 12 and 13 in either case.  Line 14 then uses an
557*4882a593Smuzhiyunsmp_store_release() to store zero to "sl", thus emulating lock release.
558*4882a593Smuzhiyun
559*4882a593SmuzhiyunThe case where xchg_acquire() fails to acquire the lock is handled by
560*4882a593Smuzhiyunthe "filter" clause on line 28, which tells herd7 to keep only those
561*4882a593Smuzhiyunexecutions in which both "0:r2" and "1:r2" are zero, that is to pay
562*4882a593Smuzhiyunattention only to those executions in which both locks are actually
563*4882a593Smuzhiyunacquired.  Thus, the bogus executions that would execute the critical
564*4882a593Smuzhiyunsections are discarded and any effects that they might have had are
565*4882a593Smuzhiyunignored.  Note well that the "filter" clause keeps those executions
566*4882a593Smuzhiyunfor which its expression is satisfied, that is, for which the expression
567*4882a593Smuzhiyunevaluates to true.  In other words, the "filter" clause says what to
568*4882a593Smuzhiyunkeep, not what to discard.
569*4882a593Smuzhiyun
570*4882a593SmuzhiyunThe result of running this test is as follows:
571*4882a593Smuzhiyun
572*4882a593Smuzhiyun 1 Test C-SB+l-o-o-u+l-o-o-u-X Allowed
573*4882a593Smuzhiyun 2 States 2
574*4882a593Smuzhiyun 3 0:r1=0; 1:r1=1;
575*4882a593Smuzhiyun 4 0:r1=1; 1:r1=0;
576*4882a593Smuzhiyun 5 No
577*4882a593Smuzhiyun 6 Witnesses
578*4882a593Smuzhiyun 7 Positive: 0 Negative: 2
579*4882a593Smuzhiyun 8 Condition exists (0:r1=0 /\ 1:r1=0)
580*4882a593Smuzhiyun 9 Observation C-SB+l-o-o-u+l-o-o-u-X Never 0 2
581*4882a593Smuzhiyun10 Time C-SB+l-o-o-u+l-o-o-u-X 0.03
582*4882a593Smuzhiyun
583*4882a593SmuzhiyunThe "Never" on line 9 indicates that this use of xchg_acquire() and
584*4882a593Smuzhiyunsmp_store_release() really does correctly emulate locking.
585*4882a593Smuzhiyun
586*4882a593SmuzhiyunWhy doesn't the litmus test take the simpler approach of using a spin loop
587*4882a593Smuzhiyunto handle failed spinlock acquisitions, like the kernel does?  The key
588*4882a593Smuzhiyuninsight behind this litmus test is that spin loops have no effect on the
589*4882a593Smuzhiyunpossible "exists"-clause outcomes of program execution in the absence
590*4882a593Smuzhiyunof deadlock.  In other words, given a high-quality lock-acquisition
591*4882a593Smuzhiyunprimitive in a deadlock-free program running on high-quality hardware,
592*4882a593Smuzhiyuneach lock acquisition will eventually succeed.  Because herd7 already
593*4882a593Smuzhiyunexplores the full state space, the length of time required to actually
594*4882a593Smuzhiyunacquire the lock does not matter.  After all, herd7 already models all
595*4882a593Smuzhiyunpossible durations of the xchg_acquire() statements.
596*4882a593Smuzhiyun
597*4882a593SmuzhiyunWhy not just add the "filter" clause to the "exists" clause, thus
598*4882a593Smuzhiyunavoiding the "filter" clause entirely?  This does work, but is slower.
599*4882a593SmuzhiyunThe reason that the "filter" clause is faster is that (in the common case)
600*4882a593Smuzhiyunherd7 knows to abandon an execution as soon as the "filter" expression
601*4882a593Smuzhiyunfails to be satisfied.  In contrast, the "exists" clause is evaluated
602*4882a593Smuzhiyunonly at the end of time, thus requiring herd7 to waste time on bogus
603*4882a593Smuzhiyunexecutions in which both critical sections proceed concurrently.  In
604*4882a593Smuzhiyunaddition, some LKMM users like the separation of concerns provided by
605*4882a593Smuzhiyunusing the both the "filter" and "exists" clauses.
606*4882a593Smuzhiyun
607*4882a593SmuzhiyunReaders lacking a pathological interest in odd corner cases should feel
608*4882a593Smuzhiyunfree to skip the remainder of this section.
609*4882a593Smuzhiyun
610*4882a593SmuzhiyunBut what if the litmus test were to temporarily set "0:r2" to a non-zero
611*4882a593Smuzhiyunvalue?  Wouldn't that cause herd7 to abandon the execution prematurely
612*4882a593Smuzhiyundue to an early mismatch of the "filter" clause?
613*4882a593Smuzhiyun
614*4882a593SmuzhiyunWhy not just try it?  Line 4 of the following modified litmus test
615*4882a593Smuzhiyunintroduces a new global variable "x2" that is initialized to "1".  Line 23
616*4882a593Smuzhiyunof P1() reads that variable into "1:r2" to force an early mismatch with
617*4882a593Smuzhiyunthe "filter" clause.  Line 24 does a known-true "if" condition to avoid
618*4882a593Smuzhiyunand static analysis that herd7 might do.  Finally the "exists" clause
619*4882a593Smuzhiyunon line 32 is updated to a condition that is alway satisfied at the end
620*4882a593Smuzhiyunof the test.
621*4882a593Smuzhiyun
622*4882a593Smuzhiyun 1 C C-SB+l-o-o-u+l-o-o-u-X
623*4882a593Smuzhiyun 2
624*4882a593Smuzhiyun 3 {
625*4882a593Smuzhiyun 4   x2=1;
626*4882a593Smuzhiyun 5 }
627*4882a593Smuzhiyun 6
628*4882a593Smuzhiyun 7 P0(int *sl, int *x0, int *x1)
629*4882a593Smuzhiyun 8 {
630*4882a593Smuzhiyun 9   int r2;
631*4882a593Smuzhiyun10   int r1;
632*4882a593Smuzhiyun11
633*4882a593Smuzhiyun12   r2 = xchg_acquire(sl, 1);
634*4882a593Smuzhiyun13   WRITE_ONCE(*x0, 1);
635*4882a593Smuzhiyun14   r1 = READ_ONCE(*x1);
636*4882a593Smuzhiyun15   smp_store_release(sl, 0);
637*4882a593Smuzhiyun16 }
638*4882a593Smuzhiyun17
639*4882a593Smuzhiyun18 P1(int *sl, int *x0, int *x1, int *x2)
640*4882a593Smuzhiyun19 {
641*4882a593Smuzhiyun20   int r2;
642*4882a593Smuzhiyun21   int r1;
643*4882a593Smuzhiyun22
644*4882a593Smuzhiyun23   r2 = READ_ONCE(*x2);
645*4882a593Smuzhiyun24   if (r2)
646*4882a593Smuzhiyun25     r2 = xchg_acquire(sl, 1);
647*4882a593Smuzhiyun26   WRITE_ONCE(*x1, 1);
648*4882a593Smuzhiyun27   r1 = READ_ONCE(*x0);
649*4882a593Smuzhiyun28   smp_store_release(sl, 0);
650*4882a593Smuzhiyun29 }
651*4882a593Smuzhiyun30
652*4882a593Smuzhiyun31 filter (0:r2=0 /\ 1:r2=0)
653*4882a593Smuzhiyun32 exists (x1=1)
654*4882a593Smuzhiyun
655*4882a593SmuzhiyunIf the "filter" clause were to check each variable at each point in the
656*4882a593Smuzhiyunexecution, running this litmus test would display no executions because
657*4882a593Smuzhiyunall executions would be filtered out at line 23.  However, the output
658*4882a593Smuzhiyunis instead as follows:
659*4882a593Smuzhiyun
660*4882a593Smuzhiyun 1 Test C-SB+l-o-o-u+l-o-o-u-X Allowed
661*4882a593Smuzhiyun 2 States 1
662*4882a593Smuzhiyun 3 x1=1;
663*4882a593Smuzhiyun 4 Ok
664*4882a593Smuzhiyun 5 Witnesses
665*4882a593Smuzhiyun 6 Positive: 2 Negative: 0
666*4882a593Smuzhiyun 7 Condition exists (x1=1)
667*4882a593Smuzhiyun 8 Observation C-SB+l-o-o-u+l-o-o-u-X Always 2 0
668*4882a593Smuzhiyun 9 Time C-SB+l-o-o-u+l-o-o-u-X 0.04
669*4882a593Smuzhiyun10 Hash=080bc508da7f291e122c6de76c0088e3
670*4882a593Smuzhiyun
671*4882a593SmuzhiyunLine 3 shows that there is one execution that did not get filtered out,
672*4882a593Smuzhiyunso the "filter" clause is evaluated only on the last assignment to
673*4882a593Smuzhiyunthe variables that it checks.  In this case, the "filter" clause is a
674*4882a593Smuzhiyundisjunction, so it might be evaluated twice, once at the final (and only)
675*4882a593Smuzhiyunassignment to "0:r2" and once at the final assignment to "1:r2".
676*4882a593Smuzhiyun
677*4882a593Smuzhiyun
678*4882a593SmuzhiyunLinked Lists
679*4882a593Smuzhiyun------------
680*4882a593Smuzhiyun
681*4882a593SmuzhiyunLKMM can handle linked lists, but only linked lists in which each node
682*4882a593Smuzhiyuncontains nothing except a pointer to the next node in the list.  This is
683*4882a593Smuzhiyunof course quite restrictive, but there is nevertheless quite a bit that
684*4882a593Smuzhiyuncan be done within these confines, as can be seen in the litmus test
685*4882a593Smuzhiyunat tools/memory-model/litmus-tests/MP+onceassign+derefonce.litmus:
686*4882a593Smuzhiyun
687*4882a593Smuzhiyun 1 C MP+onceassign+derefonce
688*4882a593Smuzhiyun 2
689*4882a593Smuzhiyun 3 {
690*4882a593Smuzhiyun 4 y=z;
691*4882a593Smuzhiyun 5 z=0;
692*4882a593Smuzhiyun 6 }
693*4882a593Smuzhiyun 7
694*4882a593Smuzhiyun 8 P0(int *x, int **y)
695*4882a593Smuzhiyun 9 {
696*4882a593Smuzhiyun10   WRITE_ONCE(*x, 1);
697*4882a593Smuzhiyun11   rcu_assign_pointer(*y, x);
698*4882a593Smuzhiyun12 }
699*4882a593Smuzhiyun13
700*4882a593Smuzhiyun14 P1(int *x, int **y)
701*4882a593Smuzhiyun15 {
702*4882a593Smuzhiyun16   int *r0;
703*4882a593Smuzhiyun17   int r1;
704*4882a593Smuzhiyun18
705*4882a593Smuzhiyun19   rcu_read_lock();
706*4882a593Smuzhiyun20   r0 = rcu_dereference(*y);
707*4882a593Smuzhiyun21   r1 = READ_ONCE(*r0);
708*4882a593Smuzhiyun22   rcu_read_unlock();
709*4882a593Smuzhiyun23 }
710*4882a593Smuzhiyun24
711*4882a593Smuzhiyun25 exists (1:r0=x /\ 1:r1=0)
712*4882a593Smuzhiyun
713*4882a593SmuzhiyunLine 4's "y=z" may seem odd, given that "z" has not yet been initialized.
714*4882a593SmuzhiyunBut "y=z" does not set the value of "y" to that of "z", but instead
715*4882a593Smuzhiyunsets the value of "y" to the *address* of "z".  Lines 4 and 5 therefore
716*4882a593Smuzhiyuncreate a simple linked list, with "y" pointing to "z" and "z" having a
717*4882a593SmuzhiyunNULL pointer.  A much longer linked list could be created if desired,
718*4882a593Smuzhiyunand circular singly linked lists can also be created and manipulated.
719*4882a593Smuzhiyun
720*4882a593SmuzhiyunThe "exists" clause works the same way, with the "1:r0=x" comparing P1()'s
721*4882a593Smuzhiyun"r0" not to the value of "x", but again to its address.  This term of the
722*4882a593Smuzhiyun"exists" clause therefore tests whether line 20's load from "y" saw the
723*4882a593Smuzhiyunvalue stored by line 11, which is in fact what is required in this case.
724*4882a593Smuzhiyun
725*4882a593SmuzhiyunP0()'s line 10 initializes "x" to the value 1 then line 11 links to "x"
726*4882a593Smuzhiyunfrom "y", replacing "z".
727*4882a593Smuzhiyun
728*4882a593SmuzhiyunP1()'s line 20 loads a pointer from "y", and line 21 dereferences that
729*4882a593Smuzhiyunpointer.  The RCU read-side critical section spanning lines 19-22 is just
730*4882a593Smuzhiyunfor show in this example.  Note that the address used for line 21's load
731*4882a593Smuzhiyundepends on (in this case, "is exactly the same as") the value loaded by
732*4882a593Smuzhiyunline 20.  This is an example of what is called an "address dependency".
733*4882a593SmuzhiyunThis particular address dependency extends from the load on line 20 to the
734*4882a593Smuzhiyunload on line 21.  Address dependencies provide a weak form of ordering.
735*4882a593Smuzhiyun
736*4882a593SmuzhiyunRunning this test results in the following:
737*4882a593Smuzhiyun
738*4882a593Smuzhiyun 1 Test MP+onceassign+derefonce Allowed
739*4882a593Smuzhiyun 2 States 2
740*4882a593Smuzhiyun 3 1:r0=x; 1:r1=1;
741*4882a593Smuzhiyun 4 1:r0=z; 1:r1=0;
742*4882a593Smuzhiyun 5 No
743*4882a593Smuzhiyun 6 Witnesses
744*4882a593Smuzhiyun 7 Positive: 0 Negative: 2
745*4882a593Smuzhiyun 8 Condition exists (1:r0=x /\ 1:r1=0)
746*4882a593Smuzhiyun 9 Observation MP+onceassign+derefonce Never 0 2
747*4882a593Smuzhiyun10 Time MP+onceassign+derefonce 0.00
748*4882a593Smuzhiyun11 Hash=49ef7a741563570102448a256a0c8568
749*4882a593Smuzhiyun
750*4882a593SmuzhiyunThe only possible outcomes feature P1() loading a pointer to "z"
751*4882a593Smuzhiyun(which contains zero) on the one hand and P1() loading a pointer to "x"
752*4882a593Smuzhiyun(which contains the value one) on the other.  This should be reassuring
753*4882a593Smuzhiyunbecause it says that RCU readers cannot see the old preinitialization
754*4882a593Smuzhiyunvalues when accessing a newly inserted list node.  This undesirable
755*4882a593Smuzhiyunscenario is flagged by the "exists" clause, and would occur if P1()
756*4882a593Smuzhiyunloaded a pointer to "x", but obtained the pre-initialization value of
757*4882a593Smuzhiyunzero after dereferencing that pointer.
758*4882a593Smuzhiyun
759*4882a593Smuzhiyun
760*4882a593SmuzhiyunComments
761*4882a593Smuzhiyun--------
762*4882a593Smuzhiyun
763*4882a593SmuzhiyunDifferent portions of a litmus test are processed by different parsers,
764*4882a593Smuzhiyunwhich has the charming effect of requiring different comment syntax in
765*4882a593Smuzhiyundifferent portions of the litmus test.  The C-syntax portions use
766*4882a593SmuzhiyunC-language comments (either "/* */" or "//"), while the other portions
767*4882a593Smuzhiyunuse Ocaml comments "(* *)".
768*4882a593Smuzhiyun
769*4882a593SmuzhiyunThe following litmus test illustrates the comment style corresponding
770*4882a593Smuzhiyunto each syntactic unit of the test:
771*4882a593Smuzhiyun
772*4882a593Smuzhiyun 1 C MP+onceassign+derefonce (* A *)
773*4882a593Smuzhiyun 2
774*4882a593Smuzhiyun 3 (* B *)
775*4882a593Smuzhiyun 4
776*4882a593Smuzhiyun 5 {
777*4882a593Smuzhiyun 6 y=z; (* C *)
778*4882a593Smuzhiyun 7 z=0;
779*4882a593Smuzhiyun 8 } // D
780*4882a593Smuzhiyun 9
781*4882a593Smuzhiyun10 // E
782*4882a593Smuzhiyun11
783*4882a593Smuzhiyun12 P0(int *x, int **y) // F
784*4882a593Smuzhiyun13 {
785*4882a593Smuzhiyun14   WRITE_ONCE(*x, 1);  // G
786*4882a593Smuzhiyun15   rcu_assign_pointer(*y, x);
787*4882a593Smuzhiyun16 }
788*4882a593Smuzhiyun17
789*4882a593Smuzhiyun18 // H
790*4882a593Smuzhiyun19
791*4882a593Smuzhiyun20 P1(int *x, int **y)
792*4882a593Smuzhiyun21 {
793*4882a593Smuzhiyun22   int *r0;
794*4882a593Smuzhiyun23   int r1;
795*4882a593Smuzhiyun24
796*4882a593Smuzhiyun25   rcu_read_lock();
797*4882a593Smuzhiyun26   r0 = rcu_dereference(*y);
798*4882a593Smuzhiyun27   r1 = READ_ONCE(*r0);
799*4882a593Smuzhiyun28   rcu_read_unlock();
800*4882a593Smuzhiyun29 }
801*4882a593Smuzhiyun30
802*4882a593Smuzhiyun31 // I
803*4882a593Smuzhiyun32
804*4882a593Smuzhiyun33 exists (* J *) (1:r0=x /\ (* K *) 1:r1=0) (* L *)
805*4882a593Smuzhiyun
806*4882a593SmuzhiyunIn short, use C-language comments in the C code and Ocaml comments in
807*4882a593Smuzhiyunthe rest of the litmus test.
808*4882a593Smuzhiyun
809*4882a593SmuzhiyunOn the other hand, if you prefer C-style comments everywhere, the
810*4882a593SmuzhiyunC preprocessor is your friend.
811*4882a593Smuzhiyun
812*4882a593Smuzhiyun
813*4882a593SmuzhiyunAsynchronous RCU Grace Periods
814*4882a593Smuzhiyun------------------------------
815*4882a593Smuzhiyun
816*4882a593SmuzhiyunThe following litmus test is derived from the example show in
817*4882a593SmuzhiyunDocumentation/litmus-tests/rcu/RCU+sync+free.litmus, but converted to
818*4882a593Smuzhiyunemulate call_rcu():
819*4882a593Smuzhiyun
820*4882a593Smuzhiyun 1 C RCU+sync+free
821*4882a593Smuzhiyun 2
822*4882a593Smuzhiyun 3 {
823*4882a593Smuzhiyun 4 int x = 1;
824*4882a593Smuzhiyun 5 int *y = &x;
825*4882a593Smuzhiyun 6 int z = 1;
826*4882a593Smuzhiyun 7 }
827*4882a593Smuzhiyun 8
828*4882a593Smuzhiyun 9 P0(int *x, int *z, int **y)
829*4882a593Smuzhiyun10 {
830*4882a593Smuzhiyun11   int *r0;
831*4882a593Smuzhiyun12   int r1;
832*4882a593Smuzhiyun13
833*4882a593Smuzhiyun14   rcu_read_lock();
834*4882a593Smuzhiyun15   r0 = rcu_dereference(*y);
835*4882a593Smuzhiyun16   r1 = READ_ONCE(*r0);
836*4882a593Smuzhiyun17   rcu_read_unlock();
837*4882a593Smuzhiyun18 }
838*4882a593Smuzhiyun19
839*4882a593Smuzhiyun20 P1(int *z, int **y, int *c)
840*4882a593Smuzhiyun21 {
841*4882a593Smuzhiyun22   rcu_assign_pointer(*y, z);
842*4882a593Smuzhiyun23   smp_store_release(*c, 1); // Emulate call_rcu().
843*4882a593Smuzhiyun24 }
844*4882a593Smuzhiyun25
845*4882a593Smuzhiyun26 P2(int *x, int *z, int **y, int *c)
846*4882a593Smuzhiyun27 {
847*4882a593Smuzhiyun28   int r0;
848*4882a593Smuzhiyun29
849*4882a593Smuzhiyun30   r0 = smp_load_acquire(*c); // Note call_rcu() request.
850*4882a593Smuzhiyun31   synchronize_rcu(); // Wait one grace period.
851*4882a593Smuzhiyun32   WRITE_ONCE(*x, 0); // Emulate the RCU callback.
852*4882a593Smuzhiyun33 }
853*4882a593Smuzhiyun34
854*4882a593Smuzhiyun35 filter (2:r0=1) (* Reject too-early starts. *)
855*4882a593Smuzhiyun36 exists (0:r0=x /\ 0:r1=0)
856*4882a593Smuzhiyun
857*4882a593SmuzhiyunLines 4-6 initialize a linked list headed by "y" that initially contains
858*4882a593Smuzhiyun"x".  In addition, "z" is pre-initialized to prepare for P1(), which
859*4882a593Smuzhiyunwill replace "x" with "z" in this list.
860*4882a593Smuzhiyun
861*4882a593SmuzhiyunP0() on lines 9-18 enters an RCU read-side critical section, loads the
862*4882a593Smuzhiyunlist header "y" and dereferences it, leaving the node in "0:r0" and
863*4882a593Smuzhiyunthe node's value in "0:r1".
864*4882a593Smuzhiyun
865*4882a593SmuzhiyunP1() on lines 20-24 updates the list header to instead reference "z",
866*4882a593Smuzhiyunthen emulates call_rcu() by doing a release store into "c".
867*4882a593Smuzhiyun
868*4882a593SmuzhiyunP2() on lines 27-33 emulates the behind-the-scenes effect of doing a
869*4882a593Smuzhiyuncall_rcu().  Line 30 first does an acquire load from "c", then line 31
870*4882a593Smuzhiyunwaits for an RCU grace period to elapse, and finally line 32 emulates
871*4882a593Smuzhiyunthe RCU callback, which in turn emulates a call to kfree().
872*4882a593Smuzhiyun
873*4882a593SmuzhiyunOf course, it is possible for P2() to start too soon, so that the
874*4882a593Smuzhiyunvalue of "2:r0" is zero rather than the required value of "1".
875*4882a593SmuzhiyunThe "filter" clause on line 35 handles this possibility, rejecting
876*4882a593Smuzhiyunall executions in which "2:r0" is not equal to the value "1".
877*4882a593Smuzhiyun
878*4882a593Smuzhiyun
879*4882a593SmuzhiyunPerformance
880*4882a593Smuzhiyun-----------
881*4882a593Smuzhiyun
882*4882a593SmuzhiyunLKMM's exploration of the full state-space can be extremely helpful,
883*4882a593Smuzhiyunbut it does not come for free.  The price is exponential computational
884*4882a593Smuzhiyuncomplexity in terms of the number of processes, the average number
885*4882a593Smuzhiyunof statements in each process, and the total number of stores in the
886*4882a593Smuzhiyunlitmus test.
887*4882a593Smuzhiyun
888*4882a593SmuzhiyunSo it is best to start small and then work up.  Where possible, break
889*4882a593Smuzhiyunyour code down into small pieces each representing a core concurrency
890*4882a593Smuzhiyunrequirement.
891*4882a593Smuzhiyun
892*4882a593SmuzhiyunThat said, herd7 is quite fast.  On an unprepossessing x86 laptop, it
893*4882a593Smuzhiyunwas able to analyze the following 10-process RCU litmus test in about
894*4882a593Smuzhiyunsix seconds.
895*4882a593Smuzhiyun
896*4882a593Smuzhiyunhttps://github.com/paulmckrcu/litmus/blob/master/auto/C-RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R.litmus
897*4882a593Smuzhiyun
898*4882a593SmuzhiyunOne way to make herd7 run faster is to use the "-speedcheck true" option.
899*4882a593SmuzhiyunThis option prevents herd7 from generating all possible end states,
900*4882a593Smuzhiyuninstead causing it to focus solely on whether or not the "exists"
901*4882a593Smuzhiyunclause can be satisfied.  With this option, herd7 evaluates the above
902*4882a593Smuzhiyunlitmus test in about 300 milliseconds, for more than an order of magnitude
903*4882a593Smuzhiyunimprovement in performance.
904*4882a593Smuzhiyun
905*4882a593SmuzhiyunLarger 16-process litmus tests that would normally consume 15 minutes
906*4882a593Smuzhiyunof time complete in about 40 seconds with this option.  To be fair,
907*4882a593Smuzhiyunyou do get an extra 65,535 states when you leave off the "-speedcheck
908*4882a593Smuzhiyuntrue" option.
909*4882a593Smuzhiyun
910*4882a593Smuzhiyunhttps://github.com/paulmckrcu/litmus/blob/master/auto/C-RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R.litmus
911*4882a593Smuzhiyun
912*4882a593SmuzhiyunNevertheless, litmus-test analysis really is of exponential complexity,
913*4882a593Smuzhiyunwhether with or without "-speedcheck true".  Increasing by just three
914*4882a593Smuzhiyunprocesses to a 19-process litmus test requires 2 hours and 40 minutes
915*4882a593Smuzhiyunwithout, and about 8 minutes with "-speedcheck true".  Each of these
916*4882a593Smuzhiyunresults represent roughly an order of magnitude slowdown compared to the
917*4882a593Smuzhiyun16-process litmus test.  Again, to be fair, the multi-hour run explores
918*4882a593Smuzhiyunno fewer than 524,287 additional states compared to the shorter one.
919*4882a593Smuzhiyun
920*4882a593Smuzhiyunhttps://github.com/paulmckrcu/litmus/blob/master/auto/C-RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R.litmus
921*4882a593Smuzhiyun
922*4882a593SmuzhiyunIf you don't like command-line arguments, you can obtain a similar speedup
923*4882a593Smuzhiyunby adding a "filter" clause with exactly the same expression as your
924*4882a593Smuzhiyun"exists" clause.
925*4882a593Smuzhiyun
926*4882a593SmuzhiyunHowever, please note that seeing the full set of states can be extremely
927*4882a593Smuzhiyunhelpful when developing and debugging litmus tests.
928*4882a593Smuzhiyun
929*4882a593Smuzhiyun
930*4882a593SmuzhiyunLIMITATIONS
931*4882a593Smuzhiyun===========
932*4882a593Smuzhiyun
933*4882a593SmuzhiyunLimitations of the Linux-kernel memory model (LKMM) include:
934*4882a593Smuzhiyun
935*4882a593Smuzhiyun1.	Compiler optimizations are not accurately modeled.  Of course,
936*4882a593Smuzhiyun	the use of READ_ONCE() and WRITE_ONCE() limits the compiler's
937*4882a593Smuzhiyun	ability to optimize, but under some circumstances it is possible
938*4882a593Smuzhiyun	for the compiler to undermine the memory model.  For more
939*4882a593Smuzhiyun	information, see Documentation/explanation.txt (in particular,
940*4882a593Smuzhiyun	the "THE PROGRAM ORDER RELATION: po AND po-loc" and "A WARNING"
941*4882a593Smuzhiyun	sections).
942*4882a593Smuzhiyun
943*4882a593Smuzhiyun	Note that this limitation in turn limits LKMM's ability to
944*4882a593Smuzhiyun	accurately model address, control, and data dependencies.
945*4882a593Smuzhiyun	For example, if the compiler can deduce the value of some variable
946*4882a593Smuzhiyun	carrying a dependency, then the compiler can break that dependency
947*4882a593Smuzhiyun	by substituting a constant of that value.
948*4882a593Smuzhiyun
949*4882a593Smuzhiyun2.	Multiple access sizes for a single variable are not supported,
950*4882a593Smuzhiyun	and neither are misaligned or partially overlapping accesses.
951*4882a593Smuzhiyun
952*4882a593Smuzhiyun3.	Exceptions and interrupts are not modeled.  In some cases,
953*4882a593Smuzhiyun	this limitation can be overcome by modeling the interrupt or
954*4882a593Smuzhiyun	exception with an additional process.
955*4882a593Smuzhiyun
956*4882a593Smuzhiyun4.	I/O such as MMIO or DMA is not supported.
957*4882a593Smuzhiyun
958*4882a593Smuzhiyun5.	Self-modifying code (such as that found in the kernel's
959*4882a593Smuzhiyun	alternatives mechanism, function tracer, Berkeley Packet Filter
960*4882a593Smuzhiyun	JIT compiler, and module loader) is not supported.
961*4882a593Smuzhiyun
962*4882a593Smuzhiyun6.	Complete modeling of all variants of atomic read-modify-write
963*4882a593Smuzhiyun	operations, locking primitives, and RCU is not provided.
964*4882a593Smuzhiyun	For example, call_rcu() and rcu_barrier() are not supported.
965*4882a593Smuzhiyun	However, a substantial amount of support is provided for these
966*4882a593Smuzhiyun	operations, as shown in the linux-kernel.def file.
967*4882a593Smuzhiyun
968*4882a593Smuzhiyun	Here are specific limitations:
969*4882a593Smuzhiyun
970*4882a593Smuzhiyun	a.	When rcu_assign_pointer() is passed NULL, the Linux
971*4882a593Smuzhiyun		kernel provides no ordering, but LKMM models this
972*4882a593Smuzhiyun		case as a store release.
973*4882a593Smuzhiyun
974*4882a593Smuzhiyun	b.	The "unless" RMW operations are not currently modeled:
975*4882a593Smuzhiyun		atomic_long_add_unless(), atomic_inc_unless_negative(),
976*4882a593Smuzhiyun		and atomic_dec_unless_positive().  These can be emulated
977*4882a593Smuzhiyun		in litmus tests, for example, by using atomic_cmpxchg().
978*4882a593Smuzhiyun
979*4882a593Smuzhiyun		One exception of this limitation is atomic_add_unless(),
980*4882a593Smuzhiyun		which is provided directly by herd7 (so no corresponding
981*4882a593Smuzhiyun		definition in linux-kernel.def).  atomic_add_unless() is
982*4882a593Smuzhiyun		modeled by herd7 therefore it can be used in litmus tests.
983*4882a593Smuzhiyun
984*4882a593Smuzhiyun	c.	The call_rcu() function is not modeled.  As was shown above,
985*4882a593Smuzhiyun		it can be emulated in litmus tests by adding another
986*4882a593Smuzhiyun		process that invokes synchronize_rcu() and the body of the
987*4882a593Smuzhiyun		callback function, with (for example) a release-acquire
988*4882a593Smuzhiyun		from the site of the emulated call_rcu() to the beginning
989*4882a593Smuzhiyun		of the additional process.
990*4882a593Smuzhiyun
991*4882a593Smuzhiyun	d.	The rcu_barrier() function is not modeled.  It can be
992*4882a593Smuzhiyun		emulated in litmus tests emulating call_rcu() via
993*4882a593Smuzhiyun		(for example) a release-acquire from the end of each
994*4882a593Smuzhiyun		additional call_rcu() process to the site of the
995*4882a593Smuzhiyun		emulated rcu-barrier().
996*4882a593Smuzhiyun
997*4882a593Smuzhiyun	e.	Although sleepable RCU (SRCU) is now modeled, there
998*4882a593Smuzhiyun		are some subtle differences between its semantics and
999*4882a593Smuzhiyun		those in the Linux kernel.  For example, the kernel
1000*4882a593Smuzhiyun		might interpret the following sequence as two partially
1001*4882a593Smuzhiyun		overlapping SRCU read-side critical sections:
1002*4882a593Smuzhiyun
1003*4882a593Smuzhiyun			 1  r1 = srcu_read_lock(&my_srcu);
1004*4882a593Smuzhiyun			 2  do_something_1();
1005*4882a593Smuzhiyun			 3  r2 = srcu_read_lock(&my_srcu);
1006*4882a593Smuzhiyun			 4  do_something_2();
1007*4882a593Smuzhiyun			 5  srcu_read_unlock(&my_srcu, r1);
1008*4882a593Smuzhiyun			 6  do_something_3();
1009*4882a593Smuzhiyun			 7  srcu_read_unlock(&my_srcu, r2);
1010*4882a593Smuzhiyun
1011*4882a593Smuzhiyun		In contrast, LKMM will interpret this as a nested pair of
1012*4882a593Smuzhiyun		SRCU read-side critical sections, with the outer critical
1013*4882a593Smuzhiyun		section spanning lines 1-7 and the inner critical section
1014*4882a593Smuzhiyun		spanning lines 3-5.
1015*4882a593Smuzhiyun
1016*4882a593Smuzhiyun		This difference would be more of a concern had anyone
1017*4882a593Smuzhiyun		identified a reasonable use case for partially overlapping
1018*4882a593Smuzhiyun		SRCU read-side critical sections.  For more information
1019*4882a593Smuzhiyun		on the trickiness of such overlapping, please see:
1020*4882a593Smuzhiyun		https://paulmck.livejournal.com/40593.html
1021*4882a593Smuzhiyun
1022*4882a593Smuzhiyun	f.	Reader-writer locking is not modeled.  It can be
1023*4882a593Smuzhiyun		emulated in litmus tests using atomic read-modify-write
1024*4882a593Smuzhiyun		operations.
1025*4882a593Smuzhiyun
1026*4882a593SmuzhiyunThe fragment of the C language supported by these litmus tests is quite
1027*4882a593Smuzhiyunlimited and in some ways non-standard:
1028*4882a593Smuzhiyun
1029*4882a593Smuzhiyun1.	There is no automatic C-preprocessor pass.  You can of course
1030*4882a593Smuzhiyun	run it manually, if you choose.
1031*4882a593Smuzhiyun
1032*4882a593Smuzhiyun2.	There is no way to create functions other than the Pn() functions
1033*4882a593Smuzhiyun	that model the concurrent processes.
1034*4882a593Smuzhiyun
1035*4882a593Smuzhiyun3.	The Pn() functions' formal parameters must be pointers to the
1036*4882a593Smuzhiyun	global shared variables.  Nothing can be passed by value into
1037*4882a593Smuzhiyun	these functions.
1038*4882a593Smuzhiyun
1039*4882a593Smuzhiyun4.	The only functions that can be invoked are those built directly
1040*4882a593Smuzhiyun	into herd7 or that are defined in the linux-kernel.def file.
1041*4882a593Smuzhiyun
1042*4882a593Smuzhiyun5.	The "switch", "do", "for", "while", and "goto" C statements are
1043*4882a593Smuzhiyun	not supported.	The "switch" statement can be emulated by the
1044*4882a593Smuzhiyun	"if" statement.  The "do", "for", and "while" statements can
1045*4882a593Smuzhiyun	often be emulated by manually unrolling the loop, or perhaps by
1046*4882a593Smuzhiyun	enlisting the aid of the C preprocessor to minimize the resulting
1047*4882a593Smuzhiyun	code duplication.  Some uses of "goto" can be emulated by "if",
1048*4882a593Smuzhiyun	and some others by unrolling.
1049*4882a593Smuzhiyun
1050*4882a593Smuzhiyun6.	Although you can use a wide variety of types in litmus-test
1051*4882a593Smuzhiyun	variable declarations, and especially in global-variable
1052*4882a593Smuzhiyun	declarations, the "herd7" tool understands only int and
1053*4882a593Smuzhiyun	pointer types.	There is no support for floating-point types,
1054*4882a593Smuzhiyun	enumerations, characters, strings, arrays, or structures.
1055*4882a593Smuzhiyun
1056*4882a593Smuzhiyun7.	Parsing of variable declarations is very loose, with almost no
1057*4882a593Smuzhiyun	type checking.
1058*4882a593Smuzhiyun
1059*4882a593Smuzhiyun8.	Initializers differ from their C-language counterparts.
1060*4882a593Smuzhiyun	For example, when an initializer contains the name of a shared
1061*4882a593Smuzhiyun	variable, that name denotes a pointer to that variable, not
1062*4882a593Smuzhiyun	the current value of that variable.  For example, "int x = y"
1063*4882a593Smuzhiyun	is interpreted the way "int x = &y" would be in C.
1064*4882a593Smuzhiyun
1065*4882a593Smuzhiyun9.	Dynamic memory allocation is not supported, although this can
1066*4882a593Smuzhiyun	be worked around in some cases by supplying multiple statically
1067*4882a593Smuzhiyun	allocated variables.
1068*4882a593Smuzhiyun
1069*4882a593SmuzhiyunSome of these limitations may be overcome in the future, but others are
1070*4882a593Smuzhiyunmore likely to be addressed by incorporating the Linux-kernel memory model
1071*4882a593Smuzhiyuninto other tools.
1072*4882a593Smuzhiyun
1073*4882a593SmuzhiyunFinally, please note that LKMM is subject to change as hardware, use cases,
1074*4882a593Smuzhiyunand compilers evolve.
1075