===========================================================================
Proper Locking Under a Preemptible Kernel: Keeping Kernel Code Preempt-Safe
===========================================================================

:Author: Robert Love <rml@tech9.net>


Introduction
============


A preemptible kernel creates new locking issues.  The issues are the same as
those under SMP: concurrency and reentrancy.  Thankfully, the Linux preemptible
kernel model leverages existing SMP locking mechanisms.  Thus, the kernel
requires explicit additional locking for very few additional situations.

This document is for all kernel hackers.  Developing code in the kernel
requires protecting these situations.


RULE #1: Per-CPU data structures need explicit protection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Two similar problems arise. An example code snippet::

	struct this_needs_locking tux[NR_CPUS];
	tux[smp_processor_id()] = some_value;
	/* task is preempted here... */
	something = tux[smp_processor_id()];

First, since the data is per-CPU, it may not have explicit SMP locking, but it
nonetheless requires protection here.  Second, when a preempted task is finally
rescheduled, the previous value of smp_processor_id may not equal the current
one.  You must protect these situations by disabling preemption around them.

You can also use get_cpu() and put_cpu(): get_cpu() disables preemption and
returns the current processor id, and put_cpu() re-enables preemption.
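
For example, a minimal preempt-safe sketch of the snippet above, reusing the
hypothetical tux[] array and some_value from that snippet::

	int cpu;

	cpu = get_cpu();	/* disables preemption, returns this CPU's id */
	tux[cpu] = some_value;
	something = tux[cpu];	/* still running on the same CPU */
	put_cpu();		/* re-enables preemption */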


RULE #2: CPU state must be protected.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Under preemption, the state of the CPU must be protected.  This is arch-
dependent, but includes CPU structures and state not preserved over a context
switch.  For example, on x86, entering and exiting FPU mode is now a critical
section that must occur while preemption is disabled.  Think what would happen
if the kernel is executing a floating-point instruction and is then preempted.
Remember, the kernel does not save FPU state except for user tasks.  Therefore,
upon preemption, the FPU registers will be sold to the lowest bidder.  Thus,
preemption must be disabled around such regions.

Note, some FPU functions are already explicitly preempt safe.  For example,
kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
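
For example, a minimal sketch of kernel-mode FPU use on x86 (do_simd_copy() is
a hypothetical SIMD routine; kernel_fpu_begin/kernel_fpu_end are the helpers
named above)::

	kernel_fpu_begin();	/* preemption disabled, FPU use is now safe */
	do_simd_copy(dst, src, len);
	kernel_fpu_end();	/* done with the FPU, preemption re-enabled */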


RULE #3: Lock acquire and release must be performed by same task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


A lock acquired in one task must be released by the same task.  This
means you can't do oddball things like acquire a lock and go off to
play while another task releases it.  If you want to do something
like this, acquire and release the lock in the same code path and
have the caller wait on an event signalled by the other task.
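
For example, a sketch of that pattern using a completion (mylock, work_done
and do_protected_work() are hypothetical names)::

	static DEFINE_SPINLOCK(mylock);
	static DECLARE_COMPLETION(work_done);

	/* task A: the lock is acquired and released on the same code path */
	spin_lock(&mylock);
	do_protected_work();
	spin_unlock(&mylock);
	complete(&work_done);

	/* task B: waits for task A's signal instead of releasing A's lock */
	wait_for_completion(&work_done);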


Solution
========


Data protection under preemption is achieved by disabling preemption for the
duration of the critical region.

::

  preempt_enable()		decrement the preempt counter
  preempt_disable()		increment the preempt counter
  preempt_enable_no_resched()	decrement, but do not immediately preempt
  preempt_check_resched()	if needed, reschedule
  preempt_count()		return the preempt counter

The functions are nestable.  In other words, you can call preempt_disable
n times in a code path, and preemption will not be re-enabled until the n-th
call to preempt_enable.  The preempt statements are defined to nothing if
preemption is not enabled (i.e. the kernel is built without preemption
support).
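
A sketch of the nesting behaviour (the comments show the effect on the
preempt counter, assuming it starts at zero)::

	preempt_disable();	/* counter 0 -> 1, preemption now off */
	preempt_disable();	/* counter 1 -> 2, still off */
	/* ... critical work ... */
	preempt_enable();	/* counter 2 -> 1, preemption still off */
	preempt_enable();	/* counter 1 -> 0, preemption possible again */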

Note that you do not need to explicitly prevent preemption if you are holding
any locks or interrupts are disabled, since preemption is implicitly disabled
in those cases.

But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption - any cond_resched() or cond_resched_lock() might trigger
a reschedule if the preempt count is 0. A simple printk() might trigger a
reschedule. So use this implicit preemption-disabling property only if you
know that the affected codepath does not do any of this. Best policy is to use
this only for small, atomic code that you wrote and which calls no complex
functions.

Example::

	cpucache_t *cc; /* this is per-CPU */
	preempt_disable();
	cc = cc_data(searchp);
	if (cc && cc->avail) {
		__free_block(searchp, cc_entry(cc), cc->avail);
		cc->avail = 0;
	}
	preempt_enable();
	return 0;

Notice how the preemption statements must encompass every reference of the
critical variables.  Another example::

	int buf[NR_CPUS];
	set_cpu_val(buf);
	if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
	spin_lock(&buf_lock);
	/* ... */

This code is not preempt-safe, but see how easily we can fix it by simply
moving the spin_lock up two lines.
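
One way the fixed snippet can look, with the spin_lock moved up as suggested
so that holding the lock (which implicitly disables preemption) covers every
per-CPU access; set_cpu_val() and buf_lock are the hypothetical names from
the example above, and the unlock is added only for completeness::

	int buf[NR_CPUS];
	spin_lock(&buf_lock);
	set_cpu_val(buf);
	if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
	/* ... */
	spin_unlock(&buf_lock);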


Preventing preemption using interrupt disabling
===============================================


It is possible to prevent a preemption event using local_irq_disable and
local_irq_save.  Note, when doing so, you must be very careful to not cause
an event that would set need_resched and result in a preemption check.  When
in doubt, rely on locking or explicit preemption disabling.

Note that in 2.5, interrupt disabling is now only per-CPU (i.e. local).

An additional concern is proper usage of local_irq_disable and local_irq_save.
These may be used to protect from preemption, however, on exit, if preemption
may be enabled, a test to see if preemption is required should be done.  If
these are called from the spin_lock and read/write lock macros, the right thing
is done.  They may also be called within a spin-lock protected region, however,
if they are ever called outside of this context, a test for preemption should
be made.  Do note that calls from interrupt context or bottom halves/tasklets
are also protected by preemption locks and so may use the versions which do
not check preemption.
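
A sketch of the pattern described above, using the preempt_check_resched()
helper listed earlier (touch_per_cpu_state() is a hypothetical routine)::

	unsigned long flags;

	local_irq_save(flags);		/* interrupts off, so no preemption here */
	touch_per_cpu_state();
	local_irq_restore(flags);
	preempt_check_resched();	/* catch any preemption point we missed */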