xref: /OK3568_Linux_fs/kernel/Documentation/locking/seqlock.rst (revision 4882a59341e53eb6f0b4789bf948001014eff981)
======================================
Sequence counters and sequential locks
======================================

Introduction
============

Sequence counters are a reader-writer consistency mechanism with
lockless readers (read-only retry loops), and no writer starvation. They
are used for data that's rarely written to (e.g. system time), where the
reader wants a consistent set of information and is willing to retry if
that information changes.

A data set is consistent when the sequence count at the beginning of the
read side critical section is even and the same sequence count value is
read again at the end of the critical section. The data in the set must
be copied out inside the read side critical section. If the sequence
count has changed between the start and the end of the critical section,
the reader must retry.

Writers increment the sequence count at the start and the end of their
critical section. After starting the critical section the sequence count
is odd and indicates to the readers that an update is in progress. At
the end of the write side critical section the sequence count becomes
even again which lets readers make progress.

A sequence counter write side critical section must never be preempted
or interrupted by read side sections. Otherwise the reader will spin for
the entire scheduler tick due to the odd sequence count value and the
interrupted writer. If that reader belongs to a real-time scheduling
class, it can spin forever and the kernel will livelock.

This mechanism cannot be used if the protected data contains pointers,
as the writer can invalidate a pointer that the reader is following.


.. _seqcount_t:

Sequence counters (``seqcount_t``)
==================================

This is the raw counting mechanism, which does not protect against
multiple writers.  Write side critical sections must thus be serialized
by an external lock.

If the write serialization primitive is not implicitly disabling
preemption, preemption must be explicitly disabled before entering the
write side section. If the read section can be invoked from hardirq or
softirq contexts, interrupts or bottom halves must also be respectively
disabled before entering the write section.

If it's desired to automatically handle the sequence counter
requirements of writer serialization and non-preemptibility, use
:ref:`seqlock_t` instead.

Initialization::

	/* dynamic */
	seqcount_t foo_seqcount;
	seqcount_init(&foo_seqcount);

	/* static */
	static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);

	/* C99 struct init */
	struct {
		.seq   = SEQCNT_ZERO(foo.seq),
	} foo;

Write path::

	/* Serialized context with disabled preemption */

	write_seqcount_begin(&foo_seqcount);

	/* ... [[write-side critical section]] ... */

	write_seqcount_end(&foo_seqcount);

Read path::

	do {
		seq = read_seqcount_begin(&foo_seqcount);

		/* ... [[read-side critical section]] ... */

	} while (read_seqcount_retry(&foo_seqcount, seq));


.. _seqcount_locktype_t:

Sequence counters with associated locks (``seqcount_LOCKNAME_t``)
-----------------------------------------------------------------

As discussed at :ref:`seqcount_t`, sequence count write side critical
sections must be serialized and non-preemptible. This variant of
sequence counters associates the lock used for writer serialization at
initialization time, which enables lockdep to validate that the write
side critical sections are properly serialized.

This lock association is a NOOP if lockdep is disabled and has neither
storage nor runtime overhead. If lockdep is enabled, the lock pointer is
stored in struct seqcount and lockdep's "lock is held" assertions are
injected at the beginning of the write side critical section to validate
that it is properly protected.

For lock types which do not implicitly disable preemption, preemption
protection is enforced in the write side function.

The following sequence counters with associated locks are defined:

  - ``seqcount_spinlock_t``
  - ``seqcount_raw_spinlock_t``
  - ``seqcount_rwlock_t``
  - ``seqcount_mutex_t``
  - ``seqcount_ww_mutex_t``

The sequence counter read and write APIs can take either a plain
seqcount_t or any of the seqcount_LOCKNAME_t variants above.

Initialization (replace "LOCKNAME" with one of the supported locks)::

	/* dynamic */
	seqcount_LOCKNAME_t foo_seqcount;
	seqcount_LOCKNAME_init(&foo_seqcount, &lock);

	/* static */
	static seqcount_LOCKNAME_t foo_seqcount =
		SEQCNT_LOCKNAME_ZERO(foo_seqcount, &lock);

	/* C99 struct init */
	struct {
		.seq   = SEQCNT_LOCKNAME_ZERO(foo.seq, &lock),
	} foo;

Write path: same as in :ref:`seqcount_t`, while running from a context
with the associated write serialization lock acquired.

Read path: same as in :ref:`seqcount_t`.


.. _seqcount_latch_t:

Latch sequence counters (``seqcount_latch_t``)
----------------------------------------------

Latch sequence counters are a multiversion concurrency control mechanism
where the embedded seqcount_t counter even/odd value is used to switch
between two copies of protected data. This allows the sequence counter
read path to safely interrupt its own write side critical section.

Use seqcount_latch_t when the write side sections cannot be protected
from interruption by readers. This is typically the case when the read
side can be invoked from NMI handlers.

Check `raw_write_seqcount_latch()` for more information.


.. _seqlock_t:

Sequential locks (``seqlock_t``)
================================

This contains the :ref:`seqcount_t` mechanism discussed earlier, plus an
embedded spinlock for writer serialization and non-preemptibility.

If the read side section can be invoked from hardirq or softirq context,
use the write side function variants which disable interrupts or bottom
halves respectively.

Initialization::

	/* dynamic */
	seqlock_t foo_seqlock;
	seqlock_init(&foo_seqlock);

	/* static */
	static DEFINE_SEQLOCK(foo_seqlock);

	/* C99 struct init */
	struct {
		.seql   = __SEQLOCK_UNLOCKED(foo.seql)
	} foo;

Write path::

	write_seqlock(&foo_seqlock);

	/* ... [[write-side critical section]] ... */

	write_sequnlock(&foo_seqlock);

Read path, three categories:

1. Normal sequence readers, which never block a writer but must retry
   if a writer is in progress, detecting that from a change in the
   sequence number. Writers do not wait for sequence readers::

	do {
		seq = read_seqbegin(&foo_seqlock);

		/* ... [[read-side critical section]] ... */

	} while (read_seqretry(&foo_seqlock, seq));

2. Locking readers which will wait if a writer or another locking reader
   is in progress. A locking reader in progress will also block a writer
   from entering its critical section. This read lock is
   exclusive. Unlike rwlock_t, only one locking reader can acquire it::

	read_seqlock_excl(&foo_seqlock);

	/* ... [[read-side critical section]] ... */

	read_sequnlock_excl(&foo_seqlock);

3. Conditional lockless reader (as in 1), or locking reader (as in 2),
   according to a passed marker. This is used to avoid lockless reader
   starvation (too many retry loops) in case of a sharp spike in write
   activity. First, a lockless read is tried (even marker passed). If
   that trial fails (an odd sequence counter is returned, which is used
   as the next iteration marker), the lockless read is transformed into
   a full locking read and no retry loop is necessary::

	/* marker; even initialization */
	int seq = 0;
	do {
		read_seqbegin_or_lock(&foo_seqlock, &seq);

		/* ... [[read-side critical section]] ... */

	} while (need_seqretry(&foo_seqlock, seq));
	done_seqretry(&foo_seqlock, seq);


API documentation
=================

.. kernel-doc:: include/linux/seqlock.h