1*4882a593SmuzhiyunLinux-Kernel Memory Model Litmus Tests 2*4882a593Smuzhiyun====================================== 3*4882a593Smuzhiyun 4*4882a593SmuzhiyunThis file describes the LKMM litmus-test format by example, describes 5*4882a593Smuzhiyunsome tricks and traps, and finally outlines LKMM's limitations. Earlier 6*4882a593Smuzhiyunversions of this material appeared in a number of LWN articles, including: 7*4882a593Smuzhiyun 8*4882a593Smuzhiyunhttps://lwn.net/Articles/720550/ 9*4882a593Smuzhiyun A formal kernel memory-ordering model (part 2) 10*4882a593Smuzhiyunhttps://lwn.net/Articles/608550/ 11*4882a593Smuzhiyun Axiomatic validation of memory barriers and atomic instructions 12*4882a593Smuzhiyunhttps://lwn.net/Articles/470681/ 13*4882a593Smuzhiyun Validating Memory Barriers and Atomic Instructions 14*4882a593Smuzhiyun 15*4882a593SmuzhiyunThis document presents information in decreasing order of applicability, 16*4882a593Smuzhiyunso that, where possible, the information that has proven more commonly 17*4882a593Smuzhiyunuseful is shown near the beginning. 18*4882a593Smuzhiyun 19*4882a593SmuzhiyunFor information on installing LKMM, including the underlying "herd7" 20*4882a593Smuzhiyuntool, please see tools/memory-model/README. 21*4882a593Smuzhiyun 22*4882a593Smuzhiyun 23*4882a593SmuzhiyunCopy-Pasta 24*4882a593Smuzhiyun========== 25*4882a593Smuzhiyun 26*4882a593SmuzhiyunAs with other software, it is often better (if less macho) to adapt an 27*4882a593Smuzhiyunexisting litmus test than it is to create one from scratch. A number 28*4882a593Smuzhiyunof litmus tests may be found in the kernel source tree: 29*4882a593Smuzhiyun 30*4882a593Smuzhiyun tools/memory-model/litmus-tests/ 31*4882a593Smuzhiyun Documentation/litmus-tests/ 32*4882a593Smuzhiyun 33*4882a593SmuzhiyunSeveral thousand more example litmus tests are available on github 34*4882a593Smuzhiyunand kernel.org: 35*4882a593Smuzhiyun 36*4882a593Smuzhiyun https://github.com/paulmckrcu/litmus 37*4882a593Smuzhiyun https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/herd 38*4882a593Smuzhiyun https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/litmus 39*4882a593Smuzhiyun 40*4882a593SmuzhiyunThe -l and -L arguments to "git grep" can be quite helpful in identifying 41*4882a593Smuzhiyunexisting litmus tests that are similar to the one you need. But even if 42*4882a593Smuzhiyunyou start with an existing litmus test, it is still helpful to have a 43*4882a593Smuzhiyungood understanding of the litmus-test format. 44*4882a593Smuzhiyun 45*4882a593Smuzhiyun 46*4882a593SmuzhiyunExamples and Format 47*4882a593Smuzhiyun=================== 48*4882a593Smuzhiyun 49*4882a593SmuzhiyunThis section describes the overall format of litmus tests, starting 50*4882a593Smuzhiyunwith a small example of the message-passing pattern and moving on to 51*4882a593Smuzhiyunmore complex examples that illustrate explicit initialization and LKMM's 52*4882a593Smuzhiyunminimalistic set of flow-control statements. 53*4882a593Smuzhiyun 54*4882a593Smuzhiyun 55*4882a593SmuzhiyunMessage-Passing Example 56*4882a593Smuzhiyun----------------------- 57*4882a593Smuzhiyun 58*4882a593SmuzhiyunThis section gives an overview of the format of a litmus test using an 59*4882a593Smuzhiyunexample based on the common message-passing use case. This use case 60*4882a593Smuzhiyunappears often in the Linux kernel. For example, a flag (modeled by "y" 61*4882a593Smuzhiyunbelow) indicates that a buffer (modeled by "x" below) is now completely 62*4882a593Smuzhiyunfilled in and ready for use. It would be very bad if the consumer saw the 63*4882a593Smuzhiyunflag set, but, due to memory misordering, saw old values in the buffer. 64*4882a593Smuzhiyun 65*4882a593SmuzhiyunThis example asks whether smp_store_release() and smp_load_acquire() 66*4882a593Smuzhiyunsuffices to avoid this bad outcome: 67*4882a593Smuzhiyun 68*4882a593Smuzhiyun 1 C MP+pooncerelease+poacquireonce 69*4882a593Smuzhiyun 2 70*4882a593Smuzhiyun 3 {} 71*4882a593Smuzhiyun 4 72*4882a593Smuzhiyun 5 P0(int *x, int *y) 73*4882a593Smuzhiyun 6 { 74*4882a593Smuzhiyun 7 WRITE_ONCE(*x, 1); 75*4882a593Smuzhiyun 8 smp_store_release(y, 1); 76*4882a593Smuzhiyun 9 } 77*4882a593Smuzhiyun10 78*4882a593Smuzhiyun11 P1(int *x, int *y) 79*4882a593Smuzhiyun12 { 80*4882a593Smuzhiyun13 int r0; 81*4882a593Smuzhiyun14 int r1; 82*4882a593Smuzhiyun15 83*4882a593Smuzhiyun16 r0 = smp_load_acquire(y); 84*4882a593Smuzhiyun17 r1 = READ_ONCE(*x); 85*4882a593Smuzhiyun18 } 86*4882a593Smuzhiyun19 87*4882a593Smuzhiyun20 exists (1:r0=1 /\ 1:r1=0) 88*4882a593Smuzhiyun 89*4882a593SmuzhiyunLine 1 starts with "C", which identifies this file as being in the 90*4882a593SmuzhiyunLKMM C-language format (which, as we will see, is a small fragment 91*4882a593Smuzhiyunof the full C language). The remainder of line 1 is the name of 92*4882a593Smuzhiyunthe test, which by convention is the filename with the ".litmus" 93*4882a593Smuzhiyunsuffix stripped. In this case, the actual test may be found in 94*4882a593Smuzhiyuntools/memory-model/litmus-tests/MP+pooncerelease+poacquireonce.litmus 95*4882a593Smuzhiyunin the Linux-kernel source tree. 96*4882a593Smuzhiyun 97*4882a593SmuzhiyunMechanically generated litmus tests will often have an optional 98*4882a593Smuzhiyundouble-quoted comment string on the second line. Such strings are ignored 99*4882a593Smuzhiyunwhen running the test. Yes, you can add your own comments to litmus 100*4882a593Smuzhiyuntests, but this is a bit involved due to the use of multiple parsers. 101*4882a593SmuzhiyunFor now, you can use C-language comments in the C code, and these comments 102*4882a593Smuzhiyunmay be in either the "/* */" or the "//" style. A later section will 103*4882a593Smuzhiyuncover the full litmus-test commenting story. 104*4882a593Smuzhiyun 105*4882a593SmuzhiyunLine 3 is the initialization section. Because the default initialization 106*4882a593Smuzhiyunto zero suffices for this test, the "{}" syntax is used, which mean the 107*4882a593Smuzhiyuninitialization section is empty. Litmus tests requiring non-default 108*4882a593Smuzhiyuninitialization must have non-empty initialization sections, as in the 109*4882a593Smuzhiyunexample that will be presented later in this document. 110*4882a593Smuzhiyun 111*4882a593SmuzhiyunLines 5-9 show the first process and lines 11-18 the second process. Each 112*4882a593Smuzhiyunprocess corresponds to a Linux-kernel task (or kthread, workqueue, thread, 113*4882a593Smuzhiyunand so on; LKMM discussions often use these terms interchangeably). 114*4882a593SmuzhiyunThe name of the first process is "P0" and that of the second "P1". 115*4882a593SmuzhiyunYou can name your processes anything you like as long as the names consist 116*4882a593Smuzhiyunof a single "P" followed by a number, and as long as the numbers are 117*4882a593Smuzhiyunconsecutive starting with zero. This can actually be quite helpful, 118*4882a593Smuzhiyunfor example, a .litmus file matching "^P1(" but not matching "^P2(" 119*4882a593Smuzhiyunmust contain a two-process litmus test. 120*4882a593Smuzhiyun 121*4882a593SmuzhiyunThe argument list for each function are pointers to the global variables 122*4882a593Smuzhiyunused by that function. Unlike normal C-language function parameters, the 123*4882a593Smuzhiyunnames are significant. The fact that both P0() and P1() have a formal 124*4882a593Smuzhiyunparameter named "x" means that these two processes are working with the 125*4882a593Smuzhiyunsame global variable, also named "x". So the "int *x, int *y" on P0() 126*4882a593Smuzhiyunand P1() mean that both processes are working with two shared global 127*4882a593Smuzhiyunvariables, "x" and "y". Global variables are always passed to processes 128*4882a593Smuzhiyunby reference, hence "P0(int *x, int *y)", but *never* "P0(int x, int y)". 129*4882a593Smuzhiyun 130*4882a593SmuzhiyunP0() has no local variables, but P1() has two of them named "r0" and "r1". 131*4882a593SmuzhiyunThese names may be freely chosen, but for historical reasons stemming from 132*4882a593Smuzhiyunother litmus-test formats, it is conventional to use names consisting of 133*4882a593Smuzhiyun"r" followed by a number as shown here. A common bug in litmus tests 134*4882a593Smuzhiyunis forgetting to add a global variable to a process's parameter list. 135*4882a593SmuzhiyunThis will sometimes result in an error message, but can also cause the 136*4882a593Smuzhiyunintended global to instead be silently treated as an undeclared local 137*4882a593Smuzhiyunvariable. 138*4882a593Smuzhiyun 139*4882a593SmuzhiyunEach process's code is similar to Linux-kernel C, as can be seen on lines 140*4882a593Smuzhiyun7-8 and 13-17. This code may use many of the Linux kernel's atomic 141*4882a593Smuzhiyunoperations, some of its exclusive-lock functions, and some of its RCU 142*4882a593Smuzhiyunand SRCU functions. An approximate list of the currently supported 143*4882a593Smuzhiyunfunctions may be found in the linux-kernel.def file. 144*4882a593Smuzhiyun 145*4882a593SmuzhiyunThe P0() process does "WRITE_ONCE(*x, 1)" on line 7. Because "x" is a 146*4882a593Smuzhiyunpointer in P0()'s parameter list, this does an unordered store to global 147*4882a593Smuzhiyunvariable "x". Line 8 does "smp_store_release(y, 1)", and because "y" 148*4882a593Smuzhiyunis also in P0()'s parameter list, this does a release store to global 149*4882a593Smuzhiyunvariable "y". 150*4882a593Smuzhiyun 151*4882a593SmuzhiyunThe P1() process declares two local variables on lines 13 and 14. 152*4882a593SmuzhiyunLine 16 does "r0 = smp_load_acquire(y)" which does an acquire load 153*4882a593Smuzhiyunfrom global variable "y" into local variable "r0". Line 17 does a 154*4882a593Smuzhiyun"r1 = READ_ONCE(*x)", which does an unordered load from "*x" into local 155*4882a593Smuzhiyunvariable "r1". Both "x" and "y" are in P1()'s parameter list, so both 156*4882a593Smuzhiyunreference the same global variables that are used by P0(). 157*4882a593Smuzhiyun 158*4882a593SmuzhiyunLine 20 is the "exists" assertion expression to evaluate the final state. 159*4882a593SmuzhiyunThis final state is evaluated after the dust has settled: both processes 160*4882a593Smuzhiyunhave completed and all of their memory references and memory barriers 161*4882a593Smuzhiyunhave propagated to all parts of the system. The references to the local 162*4882a593Smuzhiyunvariables "r0" and "r1" in line 24 must be prefixed with "1:" to specify 163*4882a593Smuzhiyunwhich process they are local to. 164*4882a593Smuzhiyun 165*4882a593SmuzhiyunNote that the assertion expression is written in the litmus-test 166*4882a593Smuzhiyunlanguage rather than in C. For example, single "=" is an equality 167*4882a593Smuzhiyunoperator rather than an assignment. The "/\" character combination means 168*4882a593Smuzhiyun"and". Similarly, "\/" stands for "or". Both of these are ASCII-art 169*4882a593Smuzhiyunrepresentations of the corresponding mathematical symbols. Finally, 170*4882a593Smuzhiyun"~" stands for "logical not", which is "!" in C, and not to be confused 171*4882a593Smuzhiyunwith the C-language "~" operator which instead stands for "bitwise not". 172*4882a593SmuzhiyunParentheses may be used to override precedence. 173*4882a593Smuzhiyun 174*4882a593SmuzhiyunThe "exists" assertion on line 20 is satisfied if the consumer sees the 175*4882a593Smuzhiyunflag ("y") set but the buffer ("x") as not yet filled in, that is, if P1() 176*4882a593Smuzhiyunloaded a value from "x" that was equal to 1 but loaded a value from "y" 177*4882a593Smuzhiyunthat was still equal to zero. 178*4882a593Smuzhiyun 179*4882a593SmuzhiyunThis example can be checked by running the following command, which 180*4882a593Smuzhiyunabsolutely must be run from the tools/memory-model directory and from 181*4882a593Smuzhiyunthis directory only: 182*4882a593Smuzhiyun 183*4882a593Smuzhiyunherd7 -conf linux-kernel.cfg litmus-tests/MP+pooncerelease+poacquireonce.litmus 184*4882a593Smuzhiyun 185*4882a593SmuzhiyunThe output is the result of something similar to a full state-space 186*4882a593Smuzhiyunsearch, and is as follows: 187*4882a593Smuzhiyun 188*4882a593Smuzhiyun 1 Test MP+pooncerelease+poacquireonce Allowed 189*4882a593Smuzhiyun 2 States 3 190*4882a593Smuzhiyun 3 1:r0=0; 1:r1=0; 191*4882a593Smuzhiyun 4 1:r0=0; 1:r1=1; 192*4882a593Smuzhiyun 5 1:r0=1; 1:r1=1; 193*4882a593Smuzhiyun 6 No 194*4882a593Smuzhiyun 7 Witnesses 195*4882a593Smuzhiyun 8 Positive: 0 Negative: 3 196*4882a593Smuzhiyun 9 Condition exists (1:r0=1 /\ 1:r1=0) 197*4882a593Smuzhiyun10 Observation MP+pooncerelease+poacquireonce Never 0 3 198*4882a593Smuzhiyun11 Time MP+pooncerelease+poacquireonce 0.00 199*4882a593Smuzhiyun12 Hash=579aaa14d8c35a39429b02e698241d09 200*4882a593Smuzhiyun 201*4882a593SmuzhiyunThe most pertinent line is line 10, which contains "Never 0 3", which 202*4882a593Smuzhiyunindicates that the bad result flagged by the "exists" clause never 203*4882a593Smuzhiyunhappens. This line might instead say "Sometimes" to indicate that the 204*4882a593Smuzhiyunbad result happened in some but not all executions, or it might say 205*4882a593Smuzhiyun"Always" to indicate that the bad result happened in all executions. 206*4882a593Smuzhiyun(The herd7 tool doesn't judge, so it is only an LKMM convention that the 207*4882a593Smuzhiyun"exists" clause indicates a bad result. To see this, invert the "exists" 208*4882a593Smuzhiyunclause's condition and run the test.) The numbers ("0 3") at the end 209*4882a593Smuzhiyunof this line indicate the number of end states satisfying the "exists" 210*4882a593Smuzhiyunclause (0) and the number not not satisfying that clause (3). 211*4882a593Smuzhiyun 212*4882a593SmuzhiyunAnother important part of this output is shown in lines 2-5, repeated here: 213*4882a593Smuzhiyun 214*4882a593Smuzhiyun 2 States 3 215*4882a593Smuzhiyun 3 1:r0=0; 1:r1=0; 216*4882a593Smuzhiyun 4 1:r0=0; 1:r1=1; 217*4882a593Smuzhiyun 5 1:r0=1; 1:r1=1; 218*4882a593Smuzhiyun 219*4882a593SmuzhiyunLine 2 gives the total number of end states, and each of lines 3-5 list 220*4882a593Smuzhiyunone of these states, with the first ("1:r0=0; 1:r1=0;") indicating that 221*4882a593Smuzhiyunboth of P1()'s loads returned the value "0". As expected, given the 222*4882a593Smuzhiyun"Never" on line 10, the state flagged by the "exists" clause is not 223*4882a593Smuzhiyunlisted. This full list of states can be helpful when debugging a new 224*4882a593Smuzhiyunlitmus test. 225*4882a593Smuzhiyun 226*4882a593SmuzhiyunThe rest of the output is not normally needed, either due to irrelevance 227*4882a593Smuzhiyunor due to being redundant with the lines discussed above. However, the 228*4882a593Smuzhiyunfollowing paragraph lists them for the benefit of readers possessed of 229*4882a593Smuzhiyunan insatiable curiosity. Other readers should feel free to skip ahead. 230*4882a593Smuzhiyun 231*4882a593SmuzhiyunLine 1 echos the test name, along with the "Test" and "Allowed". Line 6's 232*4882a593Smuzhiyun"No" says that the "exists" clause was not satisfied by any execution, 233*4882a593Smuzhiyunand as such it has the same meaning as line 10's "Never". Line 7 is a 234*4882a593Smuzhiyunlead-in to line 8's "Positive: 0 Negative: 3", which lists the number 235*4882a593Smuzhiyunof end states satisfying and not satisfying the "exists" clause, just 236*4882a593Smuzhiyunlike the two numbers at the end of line 10. Line 9 repeats the "exists" 237*4882a593Smuzhiyunclause so that you don't have to look it up in the litmus-test file. 238*4882a593SmuzhiyunThe number at the end of line 11 (which begins with "Time") gives the 239*4882a593Smuzhiyuntime in seconds required to analyze the litmus test. Small tests such 240*4882a593Smuzhiyunas this one complete in a few milliseconds, so "0.00" is quite common. 241*4882a593SmuzhiyunLine 12 gives a hash of the contents for the litmus-test file, and is used 242*4882a593Smuzhiyunby tooling that manages litmus tests and their output. This tooling is 243*4882a593Smuzhiyunused by people modifying LKMM itself, and among other things lets such 244*4882a593Smuzhiyunpeople know which of the several thousand relevant litmus tests were 245*4882a593Smuzhiyunaffected by a given change to LKMM. 246*4882a593Smuzhiyun 247*4882a593Smuzhiyun 248*4882a593SmuzhiyunInitialization 249*4882a593Smuzhiyun-------------- 250*4882a593Smuzhiyun 251*4882a593SmuzhiyunThe previous example relied on the default zero initialization for 252*4882a593Smuzhiyun"x" and "y", but a similar litmus test could instead initialize them 253*4882a593Smuzhiyunto some other value: 254*4882a593Smuzhiyun 255*4882a593Smuzhiyun 1 C MP+pooncerelease+poacquireonce 256*4882a593Smuzhiyun 2 257*4882a593Smuzhiyun 3 { 258*4882a593Smuzhiyun 4 x=42; 259*4882a593Smuzhiyun 5 y=42; 260*4882a593Smuzhiyun 6 } 261*4882a593Smuzhiyun 7 262*4882a593Smuzhiyun 8 P0(int *x, int *y) 263*4882a593Smuzhiyun 9 { 264*4882a593Smuzhiyun10 WRITE_ONCE(*x, 1); 265*4882a593Smuzhiyun11 smp_store_release(y, 1); 266*4882a593Smuzhiyun12 } 267*4882a593Smuzhiyun13 268*4882a593Smuzhiyun14 P1(int *x, int *y) 269*4882a593Smuzhiyun15 { 270*4882a593Smuzhiyun16 int r0; 271*4882a593Smuzhiyun17 int r1; 272*4882a593Smuzhiyun18 273*4882a593Smuzhiyun19 r0 = smp_load_acquire(y); 274*4882a593Smuzhiyun20 r1 = READ_ONCE(*x); 275*4882a593Smuzhiyun21 } 276*4882a593Smuzhiyun22 277*4882a593Smuzhiyun23 exists (1:r0=1 /\ 1:r1=42) 278*4882a593Smuzhiyun 279*4882a593SmuzhiyunLines 3-6 now initialize both "x" and "y" to the value 42. This also 280*4882a593Smuzhiyunmeans that the "exists" clause on line 23 must change "1:r1=0" to 281*4882a593Smuzhiyun"1:r1=42". 282*4882a593Smuzhiyun 283*4882a593SmuzhiyunRunning the test gives the same overall result as before, but with the 284*4882a593Smuzhiyunvalue 42 appearing in place of the value zero: 285*4882a593Smuzhiyun 286*4882a593Smuzhiyun 1 Test MP+pooncerelease+poacquireonce Allowed 287*4882a593Smuzhiyun 2 States 3 288*4882a593Smuzhiyun 3 1:r0=1; 1:r1=1; 289*4882a593Smuzhiyun 4 1:r0=42; 1:r1=1; 290*4882a593Smuzhiyun 5 1:r0=42; 1:r1=42; 291*4882a593Smuzhiyun 6 No 292*4882a593Smuzhiyun 7 Witnesses 293*4882a593Smuzhiyun 8 Positive: 0 Negative: 3 294*4882a593Smuzhiyun 9 Condition exists (1:r0=1 /\ 1:r1=42) 295*4882a593Smuzhiyun10 Observation MP+pooncerelease+poacquireonce Never 0 3 296*4882a593Smuzhiyun11 Time MP+pooncerelease+poacquireonce 0.02 297*4882a593Smuzhiyun12 Hash=ab9a9b7940a75a792266be279a980156 298*4882a593Smuzhiyun 299*4882a593SmuzhiyunIt is tempting to avoid the open-coded repetitions of the value "42" 300*4882a593Smuzhiyunby defining another global variable "initval=42" and replacing all 301*4882a593Smuzhiyunoccurrences of "42" with "initval". This will not, repeat *not*, 302*4882a593Smuzhiyuninitialize "x" and "y" to 42, but instead to the address of "initval" 303*4882a593Smuzhiyun(try it!). See the section below on linked lists to learn more about 304*4882a593Smuzhiyunwhy this approach to initialization can be useful. 305*4882a593Smuzhiyun 306*4882a593Smuzhiyun 307*4882a593SmuzhiyunControl Structures 308*4882a593Smuzhiyun------------------ 309*4882a593Smuzhiyun 310*4882a593SmuzhiyunLKMM supports the C-language "if" statement, which allows modeling of 311*4882a593Smuzhiyunconditional branches. In LKMM, conditional branches can affect ordering, 312*4882a593Smuzhiyunbut only if you are *very* careful (compilers are surprisingly able 313*4882a593Smuzhiyunto optimize away conditional branches). The following example shows 314*4882a593Smuzhiyunthe "load buffering" (LB) use case that is used in the Linux kernel to 315*4882a593Smuzhiyunsynchronize between ring-buffer producers and consumers. In the example 316*4882a593Smuzhiyunbelow, P0() is one side checking to see if an operation may proceed and 317*4882a593SmuzhiyunP1() is the other side completing its update. 318*4882a593Smuzhiyun 319*4882a593Smuzhiyun 1 C LB+fencembonceonce+ctrlonceonce 320*4882a593Smuzhiyun 2 321*4882a593Smuzhiyun 3 {} 322*4882a593Smuzhiyun 4 323*4882a593Smuzhiyun 5 P0(int *x, int *y) 324*4882a593Smuzhiyun 6 { 325*4882a593Smuzhiyun 7 int r0; 326*4882a593Smuzhiyun 8 327*4882a593Smuzhiyun 9 r0 = READ_ONCE(*x); 328*4882a593Smuzhiyun10 if (r0) 329*4882a593Smuzhiyun11 WRITE_ONCE(*y, 1); 330*4882a593Smuzhiyun12 } 331*4882a593Smuzhiyun13 332*4882a593Smuzhiyun14 P1(int *x, int *y) 333*4882a593Smuzhiyun15 { 334*4882a593Smuzhiyun16 int r0; 335*4882a593Smuzhiyun17 336*4882a593Smuzhiyun18 r0 = READ_ONCE(*y); 337*4882a593Smuzhiyun19 smp_mb(); 338*4882a593Smuzhiyun20 WRITE_ONCE(*x, 1); 339*4882a593Smuzhiyun21 } 340*4882a593Smuzhiyun22 341*4882a593Smuzhiyun23 exists (0:r0=1 /\ 1:r0=1) 342*4882a593Smuzhiyun 343*4882a593SmuzhiyunP1()'s "if" statement on line 10 works as expected, so that line 11 is 344*4882a593Smuzhiyunexecuted only if line 9 loads a non-zero value from "x". Because P1()'s 345*4882a593Smuzhiyunwrite of "1" to "x" happens only after P1()'s read from "y", one would 346*4882a593Smuzhiyunhope that the "exists" clause cannot be satisfied. LKMM agrees: 347*4882a593Smuzhiyun 348*4882a593Smuzhiyun 1 Test LB+fencembonceonce+ctrlonceonce Allowed 349*4882a593Smuzhiyun 2 States 2 350*4882a593Smuzhiyun 3 0:r0=0; 1:r0=0; 351*4882a593Smuzhiyun 4 0:r0=1; 1:r0=0; 352*4882a593Smuzhiyun 5 No 353*4882a593Smuzhiyun 6 Witnesses 354*4882a593Smuzhiyun 7 Positive: 0 Negative: 2 355*4882a593Smuzhiyun 8 Condition exists (0:r0=1 /\ 1:r0=1) 356*4882a593Smuzhiyun 9 Observation LB+fencembonceonce+ctrlonceonce Never 0 2 357*4882a593Smuzhiyun10 Time LB+fencembonceonce+ctrlonceonce 0.00 358*4882a593Smuzhiyun11 Hash=e5260556f6de495fd39b556d1b831c3b 359*4882a593Smuzhiyun 360*4882a593SmuzhiyunHowever, there is no "while" statement due to the fact that full 361*4882a593Smuzhiyunstate-space search has some difficulty with iteration. However, there 362*4882a593Smuzhiyunare tricks that may be used to handle some special cases, which are 363*4882a593Smuzhiyundiscussed below. In addition, loop-unrolling tricks may be applied, 364*4882a593Smuzhiyunalbeit sparingly. 365*4882a593Smuzhiyun 366*4882a593Smuzhiyun 367*4882a593SmuzhiyunTricks and Traps 368*4882a593Smuzhiyun================ 369*4882a593Smuzhiyun 370*4882a593SmuzhiyunThis section covers extracting debug output from herd7, emulating 371*4882a593Smuzhiyunspin loops, handling trivial linked lists, adding comments to litmus tests, 372*4882a593Smuzhiyunemulating call_rcu(), and finally tricks to improve herd7 performance 373*4882a593Smuzhiyunin order to better handle large litmus tests. 374*4882a593Smuzhiyun 375*4882a593Smuzhiyun 376*4882a593SmuzhiyunDebug Output 377*4882a593Smuzhiyun------------ 378*4882a593Smuzhiyun 379*4882a593SmuzhiyunBy default, the herd7 state output includes all variables mentioned 380*4882a593Smuzhiyunin the "exists" clause. But sometimes debugging efforts are greatly 381*4882a593Smuzhiyunaided by the values of other variables. Consider this litmus test 382*4882a593Smuzhiyun(tools/memory-order/litmus-tests/SB+rfionceonce-poonceonces.litmus but 383*4882a593Smuzhiyunslightly modified), which probes an obscure corner of hardware memory 384*4882a593Smuzhiyunordering: 385*4882a593Smuzhiyun 386*4882a593Smuzhiyun 1 C SB+rfionceonce-poonceonces 387*4882a593Smuzhiyun 2 388*4882a593Smuzhiyun 3 {} 389*4882a593Smuzhiyun 4 390*4882a593Smuzhiyun 5 P0(int *x, int *y) 391*4882a593Smuzhiyun 6 { 392*4882a593Smuzhiyun 7 int r1; 393*4882a593Smuzhiyun 8 int r2; 394*4882a593Smuzhiyun 9 395*4882a593Smuzhiyun10 WRITE_ONCE(*x, 1); 396*4882a593Smuzhiyun11 r1 = READ_ONCE(*x); 397*4882a593Smuzhiyun12 r2 = READ_ONCE(*y); 398*4882a593Smuzhiyun13 } 399*4882a593Smuzhiyun14 400*4882a593Smuzhiyun15 P1(int *x, int *y) 401*4882a593Smuzhiyun16 { 402*4882a593Smuzhiyun17 int r3; 403*4882a593Smuzhiyun18 int r4; 404*4882a593Smuzhiyun19 405*4882a593Smuzhiyun20 WRITE_ONCE(*y, 1); 406*4882a593Smuzhiyun21 r3 = READ_ONCE(*y); 407*4882a593Smuzhiyun22 r4 = READ_ONCE(*x); 408*4882a593Smuzhiyun23 } 409*4882a593Smuzhiyun24 410*4882a593Smuzhiyun25 exists (0:r2=0 /\ 1:r4=0) 411*4882a593Smuzhiyun 412*4882a593SmuzhiyunThe herd7 output is as follows: 413*4882a593Smuzhiyun 414*4882a593Smuzhiyun 1 Test SB+rfionceonce-poonceonces Allowed 415*4882a593Smuzhiyun 2 States 4 416*4882a593Smuzhiyun 3 0:r2=0; 1:r4=0; 417*4882a593Smuzhiyun 4 0:r2=0; 1:r4=1; 418*4882a593Smuzhiyun 5 0:r2=1; 1:r4=0; 419*4882a593Smuzhiyun 6 0:r2=1; 1:r4=1; 420*4882a593Smuzhiyun 7 Ok 421*4882a593Smuzhiyun 8 Witnesses 422*4882a593Smuzhiyun 9 Positive: 1 Negative: 3 423*4882a593Smuzhiyun10 Condition exists (0:r2=0 /\ 1:r4=0) 424*4882a593Smuzhiyun11 Observation SB+rfionceonce-poonceonces Sometimes 1 3 425*4882a593Smuzhiyun12 Time SB+rfionceonce-poonceonces 0.01 426*4882a593Smuzhiyun13 Hash=c7f30fe0faebb7d565405d55b7318ada 427*4882a593Smuzhiyun 428*4882a593Smuzhiyun(This output indicates that CPUs are permitted to "snoop their own 429*4882a593Smuzhiyunstore buffers", which all of Linux's CPU families other than s390 will 430*4882a593Smuzhiyunhappily do. Such snooping results in disagreement among CPUs on the 431*4882a593Smuzhiyunorder of stores from different CPUs, which is rarely an issue.) 432*4882a593Smuzhiyun 433*4882a593SmuzhiyunBut the herd7 output shows only the two variables mentioned in the 434*4882a593Smuzhiyun"exists" clause. Someone modifying this test might wish to know the 435*4882a593Smuzhiyunvalues of "x", "y", "0:r1", and "0:r3" as well. The "locations" 436*4882a593Smuzhiyunstatement on line 25 shows how to cause herd7 to display additional 437*4882a593Smuzhiyunvariables: 438*4882a593Smuzhiyun 439*4882a593Smuzhiyun 1 C SB+rfionceonce-poonceonces 440*4882a593Smuzhiyun 2 441*4882a593Smuzhiyun 3 {} 442*4882a593Smuzhiyun 4 443*4882a593Smuzhiyun 5 P0(int *x, int *y) 444*4882a593Smuzhiyun 6 { 445*4882a593Smuzhiyun 7 int r1; 446*4882a593Smuzhiyun 8 int r2; 447*4882a593Smuzhiyun 9 448*4882a593Smuzhiyun10 WRITE_ONCE(*x, 1); 449*4882a593Smuzhiyun11 r1 = READ_ONCE(*x); 450*4882a593Smuzhiyun12 r2 = READ_ONCE(*y); 451*4882a593Smuzhiyun13 } 452*4882a593Smuzhiyun14 453*4882a593Smuzhiyun15 P1(int *x, int *y) 454*4882a593Smuzhiyun16 { 455*4882a593Smuzhiyun17 int r3; 456*4882a593Smuzhiyun18 int r4; 457*4882a593Smuzhiyun19 458*4882a593Smuzhiyun20 WRITE_ONCE(*y, 1); 459*4882a593Smuzhiyun21 r3 = READ_ONCE(*y); 460*4882a593Smuzhiyun22 r4 = READ_ONCE(*x); 461*4882a593Smuzhiyun23 } 462*4882a593Smuzhiyun24 463*4882a593Smuzhiyun25 locations [0:r1; 1:r3; x; y] 464*4882a593Smuzhiyun26 exists (0:r2=0 /\ 1:r4=0) 465*4882a593Smuzhiyun 466*4882a593SmuzhiyunThe herd7 output then displays the values of all the variables: 467*4882a593Smuzhiyun 468*4882a593Smuzhiyun 1 Test SB+rfionceonce-poonceonces Allowed 469*4882a593Smuzhiyun 2 States 4 470*4882a593Smuzhiyun 3 0:r1=1; 0:r2=0; 1:r3=1; 1:r4=0; x=1; y=1; 471*4882a593Smuzhiyun 4 0:r1=1; 0:r2=0; 1:r3=1; 1:r4=1; x=1; y=1; 472*4882a593Smuzhiyun 5 0:r1=1; 0:r2=1; 1:r3=1; 1:r4=0; x=1; y=1; 473*4882a593Smuzhiyun 6 0:r1=1; 0:r2=1; 1:r3=1; 1:r4=1; x=1; y=1; 474*4882a593Smuzhiyun 7 Ok 475*4882a593Smuzhiyun 8 Witnesses 476*4882a593Smuzhiyun 9 Positive: 1 Negative: 3 477*4882a593Smuzhiyun10 Condition exists (0:r2=0 /\ 1:r4=0) 478*4882a593Smuzhiyun11 Observation SB+rfionceonce-poonceonces Sometimes 1 3 479*4882a593Smuzhiyun12 Time SB+rfionceonce-poonceonces 0.01 480*4882a593Smuzhiyun13 Hash=40de8418c4b395388f6501cafd1ed38d 481*4882a593Smuzhiyun 482*4882a593SmuzhiyunWhat if you would like to know the value of a particular global variable 483*4882a593Smuzhiyunat some particular point in a given process's execution? One approach 484*4882a593Smuzhiyunis to use a READ_ONCE() to load that global variable into a new local 485*4882a593Smuzhiyunvariable, then add that local variable to the "locations" clause. 486*4882a593SmuzhiyunBut be careful: In some litmus tests, adding a READ_ONCE() will change 487*4882a593Smuzhiyunthe outcome! For one example, please see the C-READ_ONCE.litmus and 488*4882a593SmuzhiyunC-READ_ONCE-omitted.litmus tests located here: 489*4882a593Smuzhiyun 490*4882a593Smuzhiyun https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/ 491*4882a593Smuzhiyun 492*4882a593Smuzhiyun 493*4882a593SmuzhiyunSpin Loops 494*4882a593Smuzhiyun---------- 495*4882a593Smuzhiyun 496*4882a593SmuzhiyunThe analysis carried out by herd7 explores full state space, which is 497*4882a593Smuzhiyunat best of exponential time complexity. Adding processes and increasing 498*4882a593Smuzhiyunthe amount of code in a give process can greatly increase execution time. 499*4882a593SmuzhiyunPotentially infinite loops, such as those used to wait for locks to 500*4882a593Smuzhiyunbecome available, are clearly problematic. 501*4882a593Smuzhiyun 502*4882a593SmuzhiyunFortunately, it is possible to avoid state-space explosion by specially 503*4882a593Smuzhiyunmodeling such loops. For example, the following litmus tests emulates 504*4882a593Smuzhiyunlocking using xchg_acquire(), but instead of enclosing xchg_acquire() 505*4882a593Smuzhiyunin a spin loop, it instead excludes executions that fail to acquire the 506*4882a593Smuzhiyunlock using a herd7 "filter" clause. Note that for exclusive locking, you 507*4882a593Smuzhiyunare better off using the spin_lock() and spin_unlock() that LKMM directly 508*4882a593Smuzhiyunmodels, if for no other reason that these are much faster. However, the 509*4882a593Smuzhiyuntechniques illustrated in this section can be used for other purposes, 510*4882a593Smuzhiyunsuch as emulating reader-writer locking, which LKMM does not yet model. 511*4882a593Smuzhiyun 512*4882a593Smuzhiyun 1 C C-SB+l-o-o-u+l-o-o-u-X 513*4882a593Smuzhiyun 2 514*4882a593Smuzhiyun 3 { 515*4882a593Smuzhiyun 4 } 516*4882a593Smuzhiyun 5 517*4882a593Smuzhiyun 6 P0(int *sl, int *x0, int *x1) 518*4882a593Smuzhiyun 7 { 519*4882a593Smuzhiyun 8 int r2; 520*4882a593Smuzhiyun 9 int r1; 521*4882a593Smuzhiyun10 522*4882a593Smuzhiyun11 r2 = xchg_acquire(sl, 1); 523*4882a593Smuzhiyun12 WRITE_ONCE(*x0, 1); 524*4882a593Smuzhiyun13 r1 = READ_ONCE(*x1); 525*4882a593Smuzhiyun14 smp_store_release(sl, 0); 526*4882a593Smuzhiyun15 } 527*4882a593Smuzhiyun16 528*4882a593Smuzhiyun17 P1(int *sl, int *x0, int *x1) 529*4882a593Smuzhiyun18 { 530*4882a593Smuzhiyun19 int r2; 531*4882a593Smuzhiyun20 int r1; 532*4882a593Smuzhiyun21 533*4882a593Smuzhiyun22 r2 = xchg_acquire(sl, 1); 534*4882a593Smuzhiyun23 WRITE_ONCE(*x1, 1); 535*4882a593Smuzhiyun24 r1 = READ_ONCE(*x0); 536*4882a593Smuzhiyun25 smp_store_release(sl, 0); 537*4882a593Smuzhiyun26 } 538*4882a593Smuzhiyun27 539*4882a593Smuzhiyun28 filter (0:r2=0 /\ 1:r2=0) 540*4882a593Smuzhiyun29 exists (0:r1=0 /\ 1:r1=0) 541*4882a593Smuzhiyun 542*4882a593SmuzhiyunThis litmus test may be found here: 543*4882a593Smuzhiyun 544*4882a593Smuzhiyunhttps://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git/tree/CodeSamples/formal/herd/C-SB+l-o-o-u+l-o-o-u-X.litmus 545*4882a593Smuzhiyun 546*4882a593SmuzhiyunThis test uses two global variables, "x1" and "x2", and also emulates a 547*4882a593Smuzhiyunsingle global spinlock named "sl". This spinlock is held by whichever 548*4882a593Smuzhiyunprocess changes the value of "sl" from "0" to "1", and is released when 549*4882a593Smuzhiyunthat process sets "sl" back to "0". P0()'s lock acquisition is emulated 550*4882a593Smuzhiyunon line 11 using xchg_acquire(), which unconditionally stores the value 551*4882a593Smuzhiyun"1" to "sl" and stores either "0" or "1" to "r2", depending on whether 552*4882a593Smuzhiyunthe lock acquisition was successful or unsuccessful (due to "sl" already 553*4882a593Smuzhiyunhaving the value "1"), respectively. P1() operates in a similar manner. 554*4882a593Smuzhiyun 555*4882a593SmuzhiyunRather unconventionally, execution appears to proceed to the critical 556*4882a593Smuzhiyunsection on lines 12 and 13 in either case. Line 14 then uses an 557*4882a593Smuzhiyunsmp_store_release() to store zero to "sl", thus emulating lock release. 558*4882a593Smuzhiyun 559*4882a593SmuzhiyunThe case where xchg_acquire() fails to acquire the lock is handled by 560*4882a593Smuzhiyunthe "filter" clause on line 28, which tells herd7 to keep only those 561*4882a593Smuzhiyunexecutions in which both "0:r2" and "1:r2" are zero, that is to pay 562*4882a593Smuzhiyunattention only to those executions in which both locks are actually 563*4882a593Smuzhiyunacquired. Thus, the bogus executions that would execute the critical 564*4882a593Smuzhiyunsections are discarded and any effects that they might have had are 565*4882a593Smuzhiyunignored. Note well that the "filter" clause keeps those executions 566*4882a593Smuzhiyunfor which its expression is satisfied, that is, for which the expression 567*4882a593Smuzhiyunevaluates to true. In other words, the "filter" clause says what to 568*4882a593Smuzhiyunkeep, not what to discard. 569*4882a593Smuzhiyun 570*4882a593SmuzhiyunThe result of running this test is as follows: 571*4882a593Smuzhiyun 572*4882a593Smuzhiyun 1 Test C-SB+l-o-o-u+l-o-o-u-X Allowed 573*4882a593Smuzhiyun 2 States 2 574*4882a593Smuzhiyun 3 0:r1=0; 1:r1=1; 575*4882a593Smuzhiyun 4 0:r1=1; 1:r1=0; 576*4882a593Smuzhiyun 5 No 577*4882a593Smuzhiyun 6 Witnesses 578*4882a593Smuzhiyun 7 Positive: 0 Negative: 2 579*4882a593Smuzhiyun 8 Condition exists (0:r1=0 /\ 1:r1=0) 580*4882a593Smuzhiyun 9 Observation C-SB+l-o-o-u+l-o-o-u-X Never 0 2 581*4882a593Smuzhiyun10 Time C-SB+l-o-o-u+l-o-o-u-X 0.03 582*4882a593Smuzhiyun 583*4882a593SmuzhiyunThe "Never" on line 9 indicates that this use of xchg_acquire() and 584*4882a593Smuzhiyunsmp_store_release() really does correctly emulate locking. 585*4882a593Smuzhiyun 586*4882a593SmuzhiyunWhy doesn't the litmus test take the simpler approach of using a spin loop 587*4882a593Smuzhiyunto handle failed spinlock acquisitions, like the kernel does? The key 588*4882a593Smuzhiyuninsight behind this litmus test is that spin loops have no effect on the 589*4882a593Smuzhiyunpossible "exists"-clause outcomes of program execution in the absence 590*4882a593Smuzhiyunof deadlock. In other words, given a high-quality lock-acquisition 591*4882a593Smuzhiyunprimitive in a deadlock-free program running on high-quality hardware, 592*4882a593Smuzhiyuneach lock acquisition will eventually succeed. Because herd7 already 593*4882a593Smuzhiyunexplores the full state space, the length of time required to actually 594*4882a593Smuzhiyunacquire the lock does not matter. After all, herd7 already models all 595*4882a593Smuzhiyunpossible durations of the xchg_acquire() statements. 596*4882a593Smuzhiyun 597*4882a593SmuzhiyunWhy not just add the "filter" clause to the "exists" clause, thus 598*4882a593Smuzhiyunavoiding the "filter" clause entirely? This does work, but is slower. 599*4882a593SmuzhiyunThe reason that the "filter" clause is faster is that (in the common case) 600*4882a593Smuzhiyunherd7 knows to abandon an execution as soon as the "filter" expression 601*4882a593Smuzhiyunfails to be satisfied. In contrast, the "exists" clause is evaluated 602*4882a593Smuzhiyunonly at the end of time, thus requiring herd7 to waste time on bogus 603*4882a593Smuzhiyunexecutions in which both critical sections proceed concurrently. In 604*4882a593Smuzhiyunaddition, some LKMM users like the separation of concerns provided by 605*4882a593Smuzhiyunusing the both the "filter" and "exists" clauses. 606*4882a593Smuzhiyun 607*4882a593SmuzhiyunReaders lacking a pathological interest in odd corner cases should feel 608*4882a593Smuzhiyunfree to skip the remainder of this section. 609*4882a593Smuzhiyun 610*4882a593SmuzhiyunBut what if the litmus test were to temporarily set "0:r2" to a non-zero 611*4882a593Smuzhiyunvalue? Wouldn't that cause herd7 to abandon the execution prematurely 612*4882a593Smuzhiyundue to an early mismatch of the "filter" clause? 613*4882a593Smuzhiyun 614*4882a593SmuzhiyunWhy not just try it? Line 4 of the following modified litmus test 615*4882a593Smuzhiyunintroduces a new global variable "x2" that is initialized to "1". Line 23 616*4882a593Smuzhiyunof P1() reads that variable into "1:r2" to force an early mismatch with 617*4882a593Smuzhiyunthe "filter" clause. Line 24 does a known-true "if" condition to avoid 618*4882a593Smuzhiyunand static analysis that herd7 might do. Finally the "exists" clause 619*4882a593Smuzhiyunon line 32 is updated to a condition that is alway satisfied at the end 620*4882a593Smuzhiyunof the test. 621*4882a593Smuzhiyun 622*4882a593Smuzhiyun 1 C C-SB+l-o-o-u+l-o-o-u-X 623*4882a593Smuzhiyun 2 624*4882a593Smuzhiyun 3 { 625*4882a593Smuzhiyun 4 x2=1; 626*4882a593Smuzhiyun 5 } 627*4882a593Smuzhiyun 6 628*4882a593Smuzhiyun 7 P0(int *sl, int *x0, int *x1) 629*4882a593Smuzhiyun 8 { 630*4882a593Smuzhiyun 9 int r2; 631*4882a593Smuzhiyun10 int r1; 632*4882a593Smuzhiyun11 633*4882a593Smuzhiyun12 r2 = xchg_acquire(sl, 1); 634*4882a593Smuzhiyun13 WRITE_ONCE(*x0, 1); 635*4882a593Smuzhiyun14 r1 = READ_ONCE(*x1); 636*4882a593Smuzhiyun15 smp_store_release(sl, 0); 637*4882a593Smuzhiyun16 } 638*4882a593Smuzhiyun17 639*4882a593Smuzhiyun18 P1(int *sl, int *x0, int *x1, int *x2) 640*4882a593Smuzhiyun19 { 641*4882a593Smuzhiyun20 int r2; 642*4882a593Smuzhiyun21 int r1; 643*4882a593Smuzhiyun22 644*4882a593Smuzhiyun23 r2 = READ_ONCE(*x2); 645*4882a593Smuzhiyun24 if (r2) 646*4882a593Smuzhiyun25 r2 = xchg_acquire(sl, 1); 647*4882a593Smuzhiyun26 WRITE_ONCE(*x1, 1); 648*4882a593Smuzhiyun27 r1 = READ_ONCE(*x0); 649*4882a593Smuzhiyun28 smp_store_release(sl, 0); 650*4882a593Smuzhiyun29 } 651*4882a593Smuzhiyun30 652*4882a593Smuzhiyun31 filter (0:r2=0 /\ 1:r2=0) 653*4882a593Smuzhiyun32 exists (x1=1) 654*4882a593Smuzhiyun 655*4882a593SmuzhiyunIf the "filter" clause were to check each variable at each point in the 656*4882a593Smuzhiyunexecution, running this litmus test would display no executions because 657*4882a593Smuzhiyunall executions would be filtered out at line 23. However, the output 658*4882a593Smuzhiyunis instead as follows: 659*4882a593Smuzhiyun 660*4882a593Smuzhiyun 1 Test C-SB+l-o-o-u+l-o-o-u-X Allowed 661*4882a593Smuzhiyun 2 States 1 662*4882a593Smuzhiyun 3 x1=1; 663*4882a593Smuzhiyun 4 Ok 664*4882a593Smuzhiyun 5 Witnesses 665*4882a593Smuzhiyun 6 Positive: 2 Negative: 0 666*4882a593Smuzhiyun 7 Condition exists (x1=1) 667*4882a593Smuzhiyun 8 Observation C-SB+l-o-o-u+l-o-o-u-X Always 2 0 668*4882a593Smuzhiyun 9 Time C-SB+l-o-o-u+l-o-o-u-X 0.04 669*4882a593Smuzhiyun10 Hash=080bc508da7f291e122c6de76c0088e3 670*4882a593Smuzhiyun 671*4882a593SmuzhiyunLine 3 shows that there is one execution that did not get filtered out, 672*4882a593Smuzhiyunso the "filter" clause is evaluated only on the last assignment to 673*4882a593Smuzhiyunthe variables that it checks. In this case, the "filter" clause is a 674*4882a593Smuzhiyundisjunction, so it might be evaluated twice, once at the final (and only) 675*4882a593Smuzhiyunassignment to "0:r2" and once at the final assignment to "1:r2". 676*4882a593Smuzhiyun 677*4882a593Smuzhiyun 678*4882a593SmuzhiyunLinked Lists 679*4882a593Smuzhiyun------------ 680*4882a593Smuzhiyun 681*4882a593SmuzhiyunLKMM can handle linked lists, but only linked lists in which each node 682*4882a593Smuzhiyuncontains nothing except a pointer to the next node in the list. This is 683*4882a593Smuzhiyunof course quite restrictive, but there is nevertheless quite a bit that 684*4882a593Smuzhiyuncan be done within these confines, as can be seen in the litmus test 685*4882a593Smuzhiyunat tools/memory-model/litmus-tests/MP+onceassign+derefonce.litmus: 686*4882a593Smuzhiyun 687*4882a593Smuzhiyun 1 C MP+onceassign+derefonce 688*4882a593Smuzhiyun 2 689*4882a593Smuzhiyun 3 { 690*4882a593Smuzhiyun 4 y=z; 691*4882a593Smuzhiyun 5 z=0; 692*4882a593Smuzhiyun 6 } 693*4882a593Smuzhiyun 7 694*4882a593Smuzhiyun 8 P0(int *x, int **y) 695*4882a593Smuzhiyun 9 { 696*4882a593Smuzhiyun10 WRITE_ONCE(*x, 1); 697*4882a593Smuzhiyun11 rcu_assign_pointer(*y, x); 698*4882a593Smuzhiyun12 } 699*4882a593Smuzhiyun13 700*4882a593Smuzhiyun14 P1(int *x, int **y) 701*4882a593Smuzhiyun15 { 702*4882a593Smuzhiyun16 int *r0; 703*4882a593Smuzhiyun17 int r1; 704*4882a593Smuzhiyun18 705*4882a593Smuzhiyun19 rcu_read_lock(); 706*4882a593Smuzhiyun20 r0 = rcu_dereference(*y); 707*4882a593Smuzhiyun21 r1 = READ_ONCE(*r0); 708*4882a593Smuzhiyun22 rcu_read_unlock(); 709*4882a593Smuzhiyun23 } 710*4882a593Smuzhiyun24 711*4882a593Smuzhiyun25 exists (1:r0=x /\ 1:r1=0) 712*4882a593Smuzhiyun 713*4882a593SmuzhiyunLine 4's "y=z" may seem odd, given that "z" has not yet been initialized. 714*4882a593SmuzhiyunBut "y=z" does not set the value of "y" to that of "z", but instead 715*4882a593Smuzhiyunsets the value of "y" to the *address* of "z". Lines 4 and 5 therefore 716*4882a593Smuzhiyuncreate a simple linked list, with "y" pointing to "z" and "z" having a 717*4882a593SmuzhiyunNULL pointer. A much longer linked list could be created if desired, 718*4882a593Smuzhiyunand circular singly linked lists can also be created and manipulated. 719*4882a593Smuzhiyun 720*4882a593SmuzhiyunThe "exists" clause works the same way, with the "1:r0=x" comparing P1()'s 721*4882a593Smuzhiyun"r0" not to the value of "x", but again to its address. This term of the 722*4882a593Smuzhiyun"exists" clause therefore tests whether line 20's load from "y" saw the 723*4882a593Smuzhiyunvalue stored by line 11, which is in fact what is required in this case. 724*4882a593Smuzhiyun 725*4882a593SmuzhiyunP0()'s line 10 initializes "x" to the value 1 then line 11 links to "x" 726*4882a593Smuzhiyunfrom "y", replacing "z". 727*4882a593Smuzhiyun 728*4882a593SmuzhiyunP1()'s line 20 loads a pointer from "y", and line 21 dereferences that 729*4882a593Smuzhiyunpointer. The RCU read-side critical section spanning lines 19-22 is just 730*4882a593Smuzhiyunfor show in this example. Note that the address used for line 21's load 731*4882a593Smuzhiyundepends on (in this case, "is exactly the same as") the value loaded by 732*4882a593Smuzhiyunline 20. This is an example of what is called an "address dependency". 733*4882a593SmuzhiyunThis particular address dependency extends from the load on line 20 to the 734*4882a593Smuzhiyunload on line 21. Address dependencies provide a weak form of ordering. 735*4882a593Smuzhiyun 736*4882a593SmuzhiyunRunning this test results in the following: 737*4882a593Smuzhiyun 738*4882a593Smuzhiyun 1 Test MP+onceassign+derefonce Allowed 739*4882a593Smuzhiyun 2 States 2 740*4882a593Smuzhiyun 3 1:r0=x; 1:r1=1; 741*4882a593Smuzhiyun 4 1:r0=z; 1:r1=0; 742*4882a593Smuzhiyun 5 No 743*4882a593Smuzhiyun 6 Witnesses 744*4882a593Smuzhiyun 7 Positive: 0 Negative: 2 745*4882a593Smuzhiyun 8 Condition exists (1:r0=x /\ 1:r1=0) 746*4882a593Smuzhiyun 9 Observation MP+onceassign+derefonce Never 0 2 747*4882a593Smuzhiyun10 Time MP+onceassign+derefonce 0.00 748*4882a593Smuzhiyun11 Hash=49ef7a741563570102448a256a0c8568 749*4882a593Smuzhiyun 750*4882a593SmuzhiyunThe only possible outcomes feature P1() loading a pointer to "z" 751*4882a593Smuzhiyun(which contains zero) on the one hand and P1() loading a pointer to "x" 752*4882a593Smuzhiyun(which contains the value one) on the other. This should be reassuring 753*4882a593Smuzhiyunbecause it says that RCU readers cannot see the old preinitialization 754*4882a593Smuzhiyunvalues when accessing a newly inserted list node. This undesirable 755*4882a593Smuzhiyunscenario is flagged by the "exists" clause, and would occur if P1() 756*4882a593Smuzhiyunloaded a pointer to "x", but obtained the pre-initialization value of 757*4882a593Smuzhiyunzero after dereferencing that pointer. 758*4882a593Smuzhiyun 759*4882a593Smuzhiyun 760*4882a593SmuzhiyunComments 761*4882a593Smuzhiyun-------- 762*4882a593Smuzhiyun 763*4882a593SmuzhiyunDifferent portions of a litmus test are processed by different parsers, 764*4882a593Smuzhiyunwhich has the charming effect of requiring different comment syntax in 765*4882a593Smuzhiyundifferent portions of the litmus test. The C-syntax portions use 766*4882a593SmuzhiyunC-language comments (either "/* */" or "//"), while the other portions 767*4882a593Smuzhiyunuse Ocaml comments "(* *)". 768*4882a593Smuzhiyun 769*4882a593SmuzhiyunThe following litmus test illustrates the comment style corresponding 770*4882a593Smuzhiyunto each syntactic unit of the test: 771*4882a593Smuzhiyun 772*4882a593Smuzhiyun 1 C MP+onceassign+derefonce (* A *) 773*4882a593Smuzhiyun 2 774*4882a593Smuzhiyun 3 (* B *) 775*4882a593Smuzhiyun 4 776*4882a593Smuzhiyun 5 { 777*4882a593Smuzhiyun 6 y=z; (* C *) 778*4882a593Smuzhiyun 7 z=0; 779*4882a593Smuzhiyun 8 } // D 780*4882a593Smuzhiyun 9 781*4882a593Smuzhiyun10 // E 782*4882a593Smuzhiyun11 783*4882a593Smuzhiyun12 P0(int *x, int **y) // F 784*4882a593Smuzhiyun13 { 785*4882a593Smuzhiyun14 WRITE_ONCE(*x, 1); // G 786*4882a593Smuzhiyun15 rcu_assign_pointer(*y, x); 787*4882a593Smuzhiyun16 } 788*4882a593Smuzhiyun17 789*4882a593Smuzhiyun18 // H 790*4882a593Smuzhiyun19 791*4882a593Smuzhiyun20 P1(int *x, int **y) 792*4882a593Smuzhiyun21 { 793*4882a593Smuzhiyun22 int *r0; 794*4882a593Smuzhiyun23 int r1; 795*4882a593Smuzhiyun24 796*4882a593Smuzhiyun25 rcu_read_lock(); 797*4882a593Smuzhiyun26 r0 = rcu_dereference(*y); 798*4882a593Smuzhiyun27 r1 = READ_ONCE(*r0); 799*4882a593Smuzhiyun28 rcu_read_unlock(); 800*4882a593Smuzhiyun29 } 801*4882a593Smuzhiyun30 802*4882a593Smuzhiyun31 // I 803*4882a593Smuzhiyun32 804*4882a593Smuzhiyun33 exists (* J *) (1:r0=x /\ (* K *) 1:r1=0) (* L *) 805*4882a593Smuzhiyun 806*4882a593SmuzhiyunIn short, use C-language comments in the C code and Ocaml comments in 807*4882a593Smuzhiyunthe rest of the litmus test. 808*4882a593Smuzhiyun 809*4882a593SmuzhiyunOn the other hand, if you prefer C-style comments everywhere, the 810*4882a593SmuzhiyunC preprocessor is your friend. 811*4882a593Smuzhiyun 812*4882a593Smuzhiyun 813*4882a593SmuzhiyunAsynchronous RCU Grace Periods 814*4882a593Smuzhiyun------------------------------ 815*4882a593Smuzhiyun 816*4882a593SmuzhiyunThe following litmus test is derived from the example show in 817*4882a593SmuzhiyunDocumentation/litmus-tests/rcu/RCU+sync+free.litmus, but converted to 818*4882a593Smuzhiyunemulate call_rcu(): 819*4882a593Smuzhiyun 820*4882a593Smuzhiyun 1 C RCU+sync+free 821*4882a593Smuzhiyun 2 822*4882a593Smuzhiyun 3 { 823*4882a593Smuzhiyun 4 int x = 1; 824*4882a593Smuzhiyun 5 int *y = &x; 825*4882a593Smuzhiyun 6 int z = 1; 826*4882a593Smuzhiyun 7 } 827*4882a593Smuzhiyun 8 828*4882a593Smuzhiyun 9 P0(int *x, int *z, int **y) 829*4882a593Smuzhiyun10 { 830*4882a593Smuzhiyun11 int *r0; 831*4882a593Smuzhiyun12 int r1; 832*4882a593Smuzhiyun13 833*4882a593Smuzhiyun14 rcu_read_lock(); 834*4882a593Smuzhiyun15 r0 = rcu_dereference(*y); 835*4882a593Smuzhiyun16 r1 = READ_ONCE(*r0); 836*4882a593Smuzhiyun17 rcu_read_unlock(); 837*4882a593Smuzhiyun18 } 838*4882a593Smuzhiyun19 839*4882a593Smuzhiyun20 P1(int *z, int **y, int *c) 840*4882a593Smuzhiyun21 { 841*4882a593Smuzhiyun22 rcu_assign_pointer(*y, z); 842*4882a593Smuzhiyun23 smp_store_release(*c, 1); // Emulate call_rcu(). 843*4882a593Smuzhiyun24 } 844*4882a593Smuzhiyun25 845*4882a593Smuzhiyun26 P2(int *x, int *z, int **y, int *c) 846*4882a593Smuzhiyun27 { 847*4882a593Smuzhiyun28 int r0; 848*4882a593Smuzhiyun29 849*4882a593Smuzhiyun30 r0 = smp_load_acquire(*c); // Note call_rcu() request. 850*4882a593Smuzhiyun31 synchronize_rcu(); // Wait one grace period. 851*4882a593Smuzhiyun32 WRITE_ONCE(*x, 0); // Emulate the RCU callback. 852*4882a593Smuzhiyun33 } 853*4882a593Smuzhiyun34 854*4882a593Smuzhiyun35 filter (2:r0=1) (* Reject too-early starts. *) 855*4882a593Smuzhiyun36 exists (0:r0=x /\ 0:r1=0) 856*4882a593Smuzhiyun 857*4882a593SmuzhiyunLines 4-6 initialize a linked list headed by "y" that initially contains 858*4882a593Smuzhiyun"x". In addition, "z" is pre-initialized to prepare for P1(), which 859*4882a593Smuzhiyunwill replace "x" with "z" in this list. 860*4882a593Smuzhiyun 861*4882a593SmuzhiyunP0() on lines 9-18 enters an RCU read-side critical section, loads the 862*4882a593Smuzhiyunlist header "y" and dereferences it, leaving the node in "0:r0" and 863*4882a593Smuzhiyunthe node's value in "0:r1". 864*4882a593Smuzhiyun 865*4882a593SmuzhiyunP1() on lines 20-24 updates the list header to instead reference "z", 866*4882a593Smuzhiyunthen emulates call_rcu() by doing a release store into "c". 867*4882a593Smuzhiyun 868*4882a593SmuzhiyunP2() on lines 27-33 emulates the behind-the-scenes effect of doing a 869*4882a593Smuzhiyuncall_rcu(). Line 30 first does an acquire load from "c", then line 31 870*4882a593Smuzhiyunwaits for an RCU grace period to elapse, and finally line 32 emulates 871*4882a593Smuzhiyunthe RCU callback, which in turn emulates a call to kfree(). 872*4882a593Smuzhiyun 873*4882a593SmuzhiyunOf course, it is possible for P2() to start too soon, so that the 874*4882a593Smuzhiyunvalue of "2:r0" is zero rather than the required value of "1". 875*4882a593SmuzhiyunThe "filter" clause on line 35 handles this possibility, rejecting 876*4882a593Smuzhiyunall executions in which "2:r0" is not equal to the value "1". 877*4882a593Smuzhiyun 878*4882a593Smuzhiyun 879*4882a593SmuzhiyunPerformance 880*4882a593Smuzhiyun----------- 881*4882a593Smuzhiyun 882*4882a593SmuzhiyunLKMM's exploration of the full state-space can be extremely helpful, 883*4882a593Smuzhiyunbut it does not come for free. The price is exponential computational 884*4882a593Smuzhiyuncomplexity in terms of the number of processes, the average number 885*4882a593Smuzhiyunof statements in each process, and the total number of stores in the 886*4882a593Smuzhiyunlitmus test. 887*4882a593Smuzhiyun 888*4882a593SmuzhiyunSo it is best to start small and then work up. Where possible, break 889*4882a593Smuzhiyunyour code down into small pieces each representing a core concurrency 890*4882a593Smuzhiyunrequirement. 891*4882a593Smuzhiyun 892*4882a593SmuzhiyunThat said, herd7 is quite fast. On an unprepossessing x86 laptop, it 893*4882a593Smuzhiyunwas able to analyze the following 10-process RCU litmus test in about 894*4882a593Smuzhiyunsix seconds. 895*4882a593Smuzhiyun 896*4882a593Smuzhiyunhttps://github.com/paulmckrcu/litmus/blob/master/auto/C-RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R.litmus 897*4882a593Smuzhiyun 898*4882a593SmuzhiyunOne way to make herd7 run faster is to use the "-speedcheck true" option. 899*4882a593SmuzhiyunThis option prevents herd7 from generating all possible end states, 900*4882a593Smuzhiyuninstead causing it to focus solely on whether or not the "exists" 901*4882a593Smuzhiyunclause can be satisfied. With this option, herd7 evaluates the above 902*4882a593Smuzhiyunlitmus test in about 300 milliseconds, for more than an order of magnitude 903*4882a593Smuzhiyunimprovement in performance. 904*4882a593Smuzhiyun 905*4882a593SmuzhiyunLarger 16-process litmus tests that would normally consume 15 minutes 906*4882a593Smuzhiyunof time complete in about 40 seconds with this option. To be fair, 907*4882a593Smuzhiyunyou do get an extra 65,535 states when you leave off the "-speedcheck 908*4882a593Smuzhiyuntrue" option. 909*4882a593Smuzhiyun 910*4882a593Smuzhiyunhttps://github.com/paulmckrcu/litmus/blob/master/auto/C-RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R.litmus 911*4882a593Smuzhiyun 912*4882a593SmuzhiyunNevertheless, litmus-test analysis really is of exponential complexity, 913*4882a593Smuzhiyunwhether with or without "-speedcheck true". Increasing by just three 914*4882a593Smuzhiyunprocesses to a 19-process litmus test requires 2 hours and 40 minutes 915*4882a593Smuzhiyunwithout, and about 8 minutes with "-speedcheck true". Each of these 916*4882a593Smuzhiyunresults represent roughly an order of magnitude slowdown compared to the 917*4882a593Smuzhiyun16-process litmus test. Again, to be fair, the multi-hour run explores 918*4882a593Smuzhiyunno fewer than 524,287 additional states compared to the shorter one. 919*4882a593Smuzhiyun 920*4882a593Smuzhiyunhttps://github.com/paulmckrcu/litmus/blob/master/auto/C-RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R+RW-R+RW-R+RW-R+RW-G+RW-G+RW-G+RW-G+RW-R+RW-R+RW-R.litmus 921*4882a593Smuzhiyun 922*4882a593SmuzhiyunIf you don't like command-line arguments, you can obtain a similar speedup 923*4882a593Smuzhiyunby adding a "filter" clause with exactly the same expression as your 924*4882a593Smuzhiyun"exists" clause. 925*4882a593Smuzhiyun 926*4882a593SmuzhiyunHowever, please note that seeing the full set of states can be extremely 927*4882a593Smuzhiyunhelpful when developing and debugging litmus tests. 928*4882a593Smuzhiyun 929*4882a593Smuzhiyun 930*4882a593SmuzhiyunLIMITATIONS 931*4882a593Smuzhiyun=========== 932*4882a593Smuzhiyun 933*4882a593SmuzhiyunLimitations of the Linux-kernel memory model (LKMM) include: 934*4882a593Smuzhiyun 935*4882a593Smuzhiyun1. Compiler optimizations are not accurately modeled. Of course, 936*4882a593Smuzhiyun the use of READ_ONCE() and WRITE_ONCE() limits the compiler's 937*4882a593Smuzhiyun ability to optimize, but under some circumstances it is possible 938*4882a593Smuzhiyun for the compiler to undermine the memory model. For more 939*4882a593Smuzhiyun information, see Documentation/explanation.txt (in particular, 940*4882a593Smuzhiyun the "THE PROGRAM ORDER RELATION: po AND po-loc" and "A WARNING" 941*4882a593Smuzhiyun sections). 942*4882a593Smuzhiyun 943*4882a593Smuzhiyun Note that this limitation in turn limits LKMM's ability to 944*4882a593Smuzhiyun accurately model address, control, and data dependencies. 945*4882a593Smuzhiyun For example, if the compiler can deduce the value of some variable 946*4882a593Smuzhiyun carrying a dependency, then the compiler can break that dependency 947*4882a593Smuzhiyun by substituting a constant of that value. 948*4882a593Smuzhiyun 949*4882a593Smuzhiyun2. Multiple access sizes for a single variable are not supported, 950*4882a593Smuzhiyun and neither are misaligned or partially overlapping accesses. 951*4882a593Smuzhiyun 952*4882a593Smuzhiyun3. Exceptions and interrupts are not modeled. In some cases, 953*4882a593Smuzhiyun this limitation can be overcome by modeling the interrupt or 954*4882a593Smuzhiyun exception with an additional process. 955*4882a593Smuzhiyun 956*4882a593Smuzhiyun4. I/O such as MMIO or DMA is not supported. 957*4882a593Smuzhiyun 958*4882a593Smuzhiyun5. Self-modifying code (such as that found in the kernel's 959*4882a593Smuzhiyun alternatives mechanism, function tracer, Berkeley Packet Filter 960*4882a593Smuzhiyun JIT compiler, and module loader) is not supported. 961*4882a593Smuzhiyun 962*4882a593Smuzhiyun6. Complete modeling of all variants of atomic read-modify-write 963*4882a593Smuzhiyun operations, locking primitives, and RCU is not provided. 964*4882a593Smuzhiyun For example, call_rcu() and rcu_barrier() are not supported. 965*4882a593Smuzhiyun However, a substantial amount of support is provided for these 966*4882a593Smuzhiyun operations, as shown in the linux-kernel.def file. 967*4882a593Smuzhiyun 968*4882a593Smuzhiyun Here are specific limitations: 969*4882a593Smuzhiyun 970*4882a593Smuzhiyun a. When rcu_assign_pointer() is passed NULL, the Linux 971*4882a593Smuzhiyun kernel provides no ordering, but LKMM models this 972*4882a593Smuzhiyun case as a store release. 973*4882a593Smuzhiyun 974*4882a593Smuzhiyun b. The "unless" RMW operations are not currently modeled: 975*4882a593Smuzhiyun atomic_long_add_unless(), atomic_inc_unless_negative(), 976*4882a593Smuzhiyun and atomic_dec_unless_positive(). These can be emulated 977*4882a593Smuzhiyun in litmus tests, for example, by using atomic_cmpxchg(). 978*4882a593Smuzhiyun 979*4882a593Smuzhiyun One exception of this limitation is atomic_add_unless(), 980*4882a593Smuzhiyun which is provided directly by herd7 (so no corresponding 981*4882a593Smuzhiyun definition in linux-kernel.def). atomic_add_unless() is 982*4882a593Smuzhiyun modeled by herd7 therefore it can be used in litmus tests. 983*4882a593Smuzhiyun 984*4882a593Smuzhiyun c. The call_rcu() function is not modeled. As was shown above, 985*4882a593Smuzhiyun it can be emulated in litmus tests by adding another 986*4882a593Smuzhiyun process that invokes synchronize_rcu() and the body of the 987*4882a593Smuzhiyun callback function, with (for example) a release-acquire 988*4882a593Smuzhiyun from the site of the emulated call_rcu() to the beginning 989*4882a593Smuzhiyun of the additional process. 990*4882a593Smuzhiyun 991*4882a593Smuzhiyun d. The rcu_barrier() function is not modeled. It can be 992*4882a593Smuzhiyun emulated in litmus tests emulating call_rcu() via 993*4882a593Smuzhiyun (for example) a release-acquire from the end of each 994*4882a593Smuzhiyun additional call_rcu() process to the site of the 995*4882a593Smuzhiyun emulated rcu-barrier(). 996*4882a593Smuzhiyun 997*4882a593Smuzhiyun e. Although sleepable RCU (SRCU) is now modeled, there 998*4882a593Smuzhiyun are some subtle differences between its semantics and 999*4882a593Smuzhiyun those in the Linux kernel. For example, the kernel 1000*4882a593Smuzhiyun might interpret the following sequence as two partially 1001*4882a593Smuzhiyun overlapping SRCU read-side critical sections: 1002*4882a593Smuzhiyun 1003*4882a593Smuzhiyun 1 r1 = srcu_read_lock(&my_srcu); 1004*4882a593Smuzhiyun 2 do_something_1(); 1005*4882a593Smuzhiyun 3 r2 = srcu_read_lock(&my_srcu); 1006*4882a593Smuzhiyun 4 do_something_2(); 1007*4882a593Smuzhiyun 5 srcu_read_unlock(&my_srcu, r1); 1008*4882a593Smuzhiyun 6 do_something_3(); 1009*4882a593Smuzhiyun 7 srcu_read_unlock(&my_srcu, r2); 1010*4882a593Smuzhiyun 1011*4882a593Smuzhiyun In contrast, LKMM will interpret this as a nested pair of 1012*4882a593Smuzhiyun SRCU read-side critical sections, with the outer critical 1013*4882a593Smuzhiyun section spanning lines 1-7 and the inner critical section 1014*4882a593Smuzhiyun spanning lines 3-5. 1015*4882a593Smuzhiyun 1016*4882a593Smuzhiyun This difference would be more of a concern had anyone 1017*4882a593Smuzhiyun identified a reasonable use case for partially overlapping 1018*4882a593Smuzhiyun SRCU read-side critical sections. For more information 1019*4882a593Smuzhiyun on the trickiness of such overlapping, please see: 1020*4882a593Smuzhiyun https://paulmck.livejournal.com/40593.html 1021*4882a593Smuzhiyun 1022*4882a593Smuzhiyun f. Reader-writer locking is not modeled. It can be 1023*4882a593Smuzhiyun emulated in litmus tests using atomic read-modify-write 1024*4882a593Smuzhiyun operations. 1025*4882a593Smuzhiyun 1026*4882a593SmuzhiyunThe fragment of the C language supported by these litmus tests is quite 1027*4882a593Smuzhiyunlimited and in some ways non-standard: 1028*4882a593Smuzhiyun 1029*4882a593Smuzhiyun1. There is no automatic C-preprocessor pass. You can of course 1030*4882a593Smuzhiyun run it manually, if you choose. 1031*4882a593Smuzhiyun 1032*4882a593Smuzhiyun2. There is no way to create functions other than the Pn() functions 1033*4882a593Smuzhiyun that model the concurrent processes. 1034*4882a593Smuzhiyun 1035*4882a593Smuzhiyun3. The Pn() functions' formal parameters must be pointers to the 1036*4882a593Smuzhiyun global shared variables. Nothing can be passed by value into 1037*4882a593Smuzhiyun these functions. 1038*4882a593Smuzhiyun 1039*4882a593Smuzhiyun4. The only functions that can be invoked are those built directly 1040*4882a593Smuzhiyun into herd7 or that are defined in the linux-kernel.def file. 1041*4882a593Smuzhiyun 1042*4882a593Smuzhiyun5. The "switch", "do", "for", "while", and "goto" C statements are 1043*4882a593Smuzhiyun not supported. The "switch" statement can be emulated by the 1044*4882a593Smuzhiyun "if" statement. The "do", "for", and "while" statements can 1045*4882a593Smuzhiyun often be emulated by manually unrolling the loop, or perhaps by 1046*4882a593Smuzhiyun enlisting the aid of the C preprocessor to minimize the resulting 1047*4882a593Smuzhiyun code duplication. Some uses of "goto" can be emulated by "if", 1048*4882a593Smuzhiyun and some others by unrolling. 1049*4882a593Smuzhiyun 1050*4882a593Smuzhiyun6. Although you can use a wide variety of types in litmus-test 1051*4882a593Smuzhiyun variable declarations, and especially in global-variable 1052*4882a593Smuzhiyun declarations, the "herd7" tool understands only int and 1053*4882a593Smuzhiyun pointer types. There is no support for floating-point types, 1054*4882a593Smuzhiyun enumerations, characters, strings, arrays, or structures. 1055*4882a593Smuzhiyun 1056*4882a593Smuzhiyun7. Parsing of variable declarations is very loose, with almost no 1057*4882a593Smuzhiyun type checking. 1058*4882a593Smuzhiyun 1059*4882a593Smuzhiyun8. Initializers differ from their C-language counterparts. 1060*4882a593Smuzhiyun For example, when an initializer contains the name of a shared 1061*4882a593Smuzhiyun variable, that name denotes a pointer to that variable, not 1062*4882a593Smuzhiyun the current value of that variable. For example, "int x = y" 1063*4882a593Smuzhiyun is interpreted the way "int x = &y" would be in C. 1064*4882a593Smuzhiyun 1065*4882a593Smuzhiyun9. Dynamic memory allocation is not supported, although this can 1066*4882a593Smuzhiyun be worked around in some cases by supplying multiple statically 1067*4882a593Smuzhiyun allocated variables. 1068*4882a593Smuzhiyun 1069*4882a593SmuzhiyunSome of these limitations may be overcome in the future, but others are 1070*4882a593Smuzhiyunmore likely to be addressed by incorporating the Linux-kernel memory model 1071*4882a593Smuzhiyuninto other tools. 1072*4882a593Smuzhiyun 1073*4882a593SmuzhiyunFinally, please note that LKMM is subject to change as hardware, use cases, 1074*4882a593Smuzhiyunand compilers evolve. 1075