======================================
Sequence counters and sequential locks
======================================

Introduction
============

Sequence counters are a reader-writer consistency mechanism with
lockless readers (read-only retry loops), and no writer starvation. They
are used for data that's rarely written to (e.g. system time), where the
reader wants a consistent set of information and is willing to retry if
that information changes.

A data set is consistent when the sequence count at the beginning of the
read side critical section is even and the same sequence count value is
read again at the end of the critical section. The data in the set must
be copied out inside the read side critical section. If the sequence
count has changed between the start and the end of the critical section,
the reader must retry.

Writers increment the sequence count at the start and the end of their
critical section. After starting the critical section the sequence count
is odd and indicates to the readers that an update is in progress. At
the end of the write side critical section the sequence count becomes
even again, which lets readers make progress.

A sequence counter write side critical section must never be preempted
or interrupted by read side sections. Otherwise the reader will spin for
the entire scheduler tick due to the odd sequence count value and the
interrupted writer. If that reader belongs to a real-time scheduling
class, it can spin forever and the kernel will livelock.

This mechanism cannot be used if the protected data contains pointers,
as the writer can invalidate a pointer that the reader is following.


.. _seqcount_t:

Sequence counters (``seqcount_t``)
==================================

This is the raw counting mechanism, which does not protect against
multiple writers. Write side critical sections must thus be serialized
by an external lock.

If the write serialization primitive does not implicitly disable
preemption, preemption must be explicitly disabled before entering the
write side section. If the read section can be invoked from hardirq or
softirq contexts, interrupts or bottom halves must also be disabled,
respectively, before entering the write section.

If it's desired to automatically handle the sequence counter
requirements of writer serialization and non-preemptibility, use
:ref:`seqlock_t` instead.

Initialization::

    /* dynamic */
    seqcount_t foo_seqcount;
    seqcount_init(&foo_seqcount);

    /* static */
    static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);

    /* C99 struct init */
    struct {
            .seq = SEQCNT_ZERO(foo.seq),
    } foo;

Write path::

    /* Serialized context with disabled preemption */

    write_seqcount_begin(&foo_seqcount);

    /* ... [[write-side critical section]] ... */

    write_seqcount_end(&foo_seqcount);

Read path::

    do {
            seq = read_seqcount_begin(&foo_seqcount);

            /* ... [[read-side critical section]] ... */

    } while (read_seqcount_retry(&foo_seqcount, seq));
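
The write path above assumes an already serialized and non-preemptible
context. The following is a minimal sketch, assuming writers are
serialized by a hypothetical external mutex ``foo_lock``; since mutexes
do not disable preemption, the writer disables it explicitly as required
above::

    /* Sketch only: foo_lock is a hypothetical external mutex */
    mutex_lock(&foo_lock);
    preempt_disable();

    write_seqcount_begin(&foo_seqcount);

    /* ... [[write-side critical section]] ... */

    write_seqcount_end(&foo_seqcount);

    preempt_enable();
    mutex_unlock(&foo_lock);

The :ref:`seqcount_locktype_t` variants below associate such a lock with
the counter, letting the write side functions and lockdep handle these
requirements.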


.. _seqcount_locktype_t:

Sequence counters with associated locks (``seqcount_LOCKNAME_t``)
------------------------------------------------------------------

As discussed at :ref:`seqcount_t`, sequence count write side critical
sections must be serialized and non-preemptible. This variant of
sequence counters associates the lock used for writer serialization at
initialization time, which enables lockdep to validate that the write
side critical sections are properly serialized.

This lock association is a NOOP if lockdep is disabled and has neither
storage nor runtime overhead. If lockdep is enabled, the lock pointer is
stored in struct seqcount and lockdep's "lock is held" assertions are
injected at the beginning of the write side critical section to validate
that it is properly protected.

For lock types which do not implicitly disable preemption, preemption
protection is enforced in the write side function.

The following sequence counters with associated locks are defined:

  - ``seqcount_spinlock_t``
  - ``seqcount_raw_spinlock_t``
  - ``seqcount_rwlock_t``
  - ``seqcount_mutex_t``
  - ``seqcount_ww_mutex_t``

The sequence counter read and write APIs can take either a plain
seqcount_t or any of the seqcount_LOCKNAME_t variants above.

Initialization (replace "LOCKNAME" with one of the supported locks)::

    /* dynamic */
    seqcount_LOCKNAME_t foo_seqcount;
    seqcount_LOCKNAME_init(&foo_seqcount, &lock);

    /* static */
    static seqcount_LOCKNAME_t foo_seqcount =
            SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock);

    /* C99 struct init */
    struct {
            .seq = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock),
    } foo;

Write path: same as in :ref:`seqcount_t`, while running from a context
with the associated write serialization lock acquired.

Read path: same as in :ref:`seqcount_t`.
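
For example, a minimal sketch using ``seqcount_mutex_t`` (the names
``foo_lock`` and ``foo_seqcount`` are hypothetical): the counter is
associated with the mutex at initialization time, lockdep then asserts
that the mutex is held on the write side, and since mutexes do not
disable preemption, the write side functions provide the preemption
protection themselves::

    static DEFINE_MUTEX(foo_lock);
    static seqcount_mutex_t foo_seqcount =
            SEQCNT_MUTEX_ZERO(foo_seqcount, &foo_lock);

    /* Write path */
    mutex_lock(&foo_lock);

    /* Asserts (via lockdep) that foo_lock is held; disables preemption */
    write_seqcount_begin(&foo_seqcount);

    /* ... [[write-side critical section]] ... */

    write_seqcount_end(&foo_seqcount);

    mutex_unlock(&foo_lock);

The read path is unchanged: read_seqcount_begin() and
read_seqcount_retry() on ``&foo_seqcount``, exactly as shown at
:ref:`seqcount_t`.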


.. _seqcount_latch_t:

Latch sequence counters (``seqcount_latch_t``)
----------------------------------------------

Latch sequence counters are a multiversion concurrency control mechanism
where the embedded seqcount_t counter even/odd value is used to switch
between two copies of protected data. This allows the sequence counter
read path to safely interrupt its own write side critical section.

Use seqcount_latch_t when the write side sections cannot be protected
from interruption by readers. This is typically the case when the read
side can be invoked from NMI handlers.

Check `raw_write_seqcount_latch()` for more information.
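
The following is a rough sketch of the usual pattern, assuming a
hypothetical ``struct latch_data`` holding two copies of the payload and
the latch read/retry helpers raw_read_seqcount_latch() and
read_seqcount_latch_retry() from include/linux/seqlock.h. The writer
flips readers over to the other copy before updating each copy in turn;
readers pick a copy based on the lowest sequence bit::

    struct latch_data {
            seqcount_latch_t        seq;
            struct payload          copy[2];        /* hypothetical payload type */
    };

    /* Write side. Writers still need external serialization. */
    void latch_modify(struct latch_data *ld, const struct payload *val)
    {
            raw_write_seqcount_latch(&ld->seq);     /* readers now use copy[1] */
            ld->copy[0] = *val;
            raw_write_seqcount_latch(&ld->seq);     /* readers now use copy[0] */
            ld->copy[1] = *val;
    }

    /* Read side. May interrupt the writer, e.g. from NMI context. */
    unsigned int seq, idx;

    do {
            seq = raw_read_seqcount_latch(&ld->seq);
            idx = seq & 0x1;

            /* ... read from ld->copy[idx] ... */

    } while (read_seqcount_latch_retry(&ld->seq, seq));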


.. _seqlock_t:

Sequential locks (``seqlock_t``)
================================

This contains the :ref:`seqcount_t` mechanism discussed earlier, plus an
embedded spinlock for writer serialization and non-preemptibility.

If the read side section can be invoked from hardirq or softirq context,
use the write side function variants which disable interrupts or bottom
halves respectively.

Initialization::

    /* dynamic */
    seqlock_t foo_seqlock;
    seqlock_init(&foo_seqlock);

    /* static */
    static DEFINE_SEQLOCK(foo_seqlock);

    /* C99 struct init */
    struct {
            .seql = __SEQLOCK_UNLOCKED(foo.seql)
    } foo;

Write path::

    write_seqlock(&foo_seqlock);

    /* ... [[write-side critical section]] ... */

    write_sequnlock(&foo_seqlock);

Read path, three categories:

1. Normal sequence readers which never block a writer but must retry
   if a writer is in progress, which they detect by a change in the
   sequence number. Writers do not wait for a sequence reader::

      do {
              seq = read_seqbegin(&foo_seqlock);

              /* ... [[read-side critical section]] ... */

      } while (read_seqretry(&foo_seqlock, seq));

2. Locking readers which will wait if a writer or another locking reader
   is in progress. A locking reader in progress will also block a writer
   from entering its critical section. This read lock is
   exclusive. Unlike rwlock_t, only one locking reader can acquire it::

      read_seqlock_excl(&foo_seqlock);

      /* ... [[read-side critical section]] ... */

      read_sequnlock_excl(&foo_seqlock);

3. Conditional lockless reader (as in 1), or locking reader (as in 2),
   according to a passed marker. This is used to avoid lockless readers
   starvation (too many retry loops) in case of a sharp spike in write
   activity. First, a lockless read is tried (even marker passed). If
   that trial fails (an odd sequence counter is returned, which is used
   as the next iteration marker), the lockless read is transformed to a
   full locking read and no retry loop is necessary::

      /* marker; even initialization */
      int seq = 0;
      do {
              read_seqbegin_or_lock(&foo_seqlock, &seq);

              /* ... [[read-side critical section]] ... */

      } while (need_seqretry(&foo_seqlock, seq));
      done_seqretry(&foo_seqlock, seq);


API documentation
=================

.. kernel-doc:: include/linux/seqlock.h