.. _up_doc:

RCU on Uniprocessor Systems
===========================

A common misconception is that, on UP systems, the call_rcu() primitive
may immediately invoke its function.  The basis of this misconception
is that since there is only one CPU, it should not be necessary to
wait for anything else to get done, since there are no other CPUs for
anything else to be happening on.  Although this approach will *sort of*
work a surprising amount of the time, it is a very bad idea in general.
This document presents three examples that demonstrate exactly how bad
an idea this is.

Example 1: softirq Suicide
--------------------------

Suppose that an RCU-based algorithm scans a linked list containing
elements A, B, and C in process context, and can delete elements from
this same list in softirq context.  Suppose that the process-context scan
is referencing element B when it is interrupted by softirq processing,
which deletes element B, and then invokes call_rcu() to free element B
after a grace period.

Now, if call_rcu() were to directly invoke its arguments, then upon return
from softirq, the list scan would find itself referencing a newly freed
element B.  This situation can greatly decrease the life expectancy of
your kernel.

This same problem can occur if call_rcu() is invoked from a hardware
interrupt handler.

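
The hazard in Example 1 can be mimicked in user space.  The sketch below
is a toy model, not kernel code: toy_call_rcu(), toy_softirq_delete(),
toy_grace_period_end(), and the ``immediate`` flag are hypothetical names
introduced here, and the "softirq" is an ordinary function call made in
the middle of the reader's scan.  A ``freed`` flag stands in for actually
freeing memory so that the stale reference can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy list element; "freed" stands in for kfree() so the bug is observable. */
struct elem {
	struct elem *next;
	char name;
	bool freed;
};

typedef void (*toy_cb_t)(struct elem *e);

/* One-slot deferred-callback queue (enough for this sketch). */
static toy_cb_t pending_cb;
static struct elem *pending_arg;

static void toy_free(struct elem *e)
{
	e->freed = true;
}

/* Hypothetical call_rcu() stand-in: run the callback now, or defer it. */
static void toy_call_rcu(struct elem *e, toy_cb_t cb, bool immediate)
{
	assert(cb != NULL);
	if (immediate) {
		cb(e);			/* the misconception: no grace period */
	} else {
		pending_cb = cb;	/* defer until readers are done */
		pending_arg = e;
	}
}

/* Called once no reader can still hold a reference to the element. */
static void toy_grace_period_end(void)
{
	if (pending_cb) {
		pending_cb(pending_arg);
		pending_cb = NULL;
		pending_arg = NULL;
	}
}

/* "Softirq": unlink victim from the list, then pass it to toy_call_rcu(). */
static void toy_softirq_delete(struct elem *head, struct elem *victim,
			       bool immediate)
{
	struct elem *p;

	for (p = head; p; p = p->next) {
		if (p->next == victim) {
			p->next = victim->next;
			break;
		}
	}
	toy_call_rcu(victim, toy_free, immediate);
}

/*
 * Process-context scan that is "interrupted" while referencing element B.
 * Returns true if the reader was left holding a freed element.
 */
bool toy_scan_sees_freed(bool immediate)
{
	struct elem c = { .next = NULL, .name = 'C', .freed = false };
	struct elem b = { .next = &c, .name = 'B', .freed = false };
	struct elem a = { .next = &b, .name = 'A', .freed = false };
	struct elem *cur = a.next;	/* reader now references B */
	bool saw_freed;

	toy_softirq_delete(&a, cur, immediate);	/* softirq fires mid-scan */
	saw_freed = cur->freed;		/* reader resumes and touches B */
	toy_grace_period_end();		/* reader is done; freeing is now safe */
	return saw_freed;
}
```

Calling toy_scan_sees_freed(true) models the misconception and reports
that the reader ended up holding a freed element B, while
toy_scan_sees_freed(false) defers the callback until the scan has finished
and reports no such problem.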

Example 2: Function-Call Fatality
---------------------------------

Of course, one could avert the suicide described in the preceding example
by having call_rcu() directly invoke its arguments only if it was called
from process context.  However, this can fail in a similar manner.

Suppose that an RCU-based algorithm again scans a linked list containing
elements A, B, and C in process context, but that it invokes a function
on each element as it is scanned.  Suppose further that this function
deletes element B from the list, then passes it to call_rcu() for deferred
freeing.  This may be a bit unconventional, but it is perfectly legal
RCU usage, since call_rcu() must wait for a grace period to elapse.
Therefore, in this case, allowing call_rcu() to immediately invoke
its arguments would cause it to fail to make the fundamental guarantee
underlying RCU, namely that call_rcu() defers invoking its arguments until
all RCU read-side critical sections currently executing have completed.

Quick Quiz #1:
        Why is it *not* legal to invoke synchronize_rcu() in this case?

:ref:`Answers to Quick Quiz <answer_quick_quiz_up>`

Example 3: Death by Deadlock
----------------------------

Suppose that call_rcu() is invoked while holding a lock, and that the
callback function must acquire this same lock.
In this case, if
call_rcu() were to directly invoke the callback, the result would
be self-deadlock.

In some cases, it would be possible to restructure the code so that
the call_rcu() is delayed until after the lock is released.  However,
there are cases where this can be quite ugly:

1.      If a number of items need to be passed to call_rcu() within
        the same critical section, then the code would need to create
        a list of them, then traverse the list once the lock was
        released.

2.      In some cases, the lock will be held across some kernel API,
        so that delaying the call_rcu() until the lock is released
        requires that the data item be passed up via a common API.
        It is far better to guarantee that callbacks are invoked
        with no locks held than to have to modify such APIs to allow
        arbitrary data items to be passed back up through them.

If call_rcu() directly invokes the callback, painful locking restrictions
or API changes would be required.

Quick Quiz #2:
        What locking restriction must RCU callbacks respect?

:ref:`Answers to Quick Quiz <answer_quick_quiz_up>`

Summary
-------

Permitting call_rcu() to immediately invoke its arguments breaks RCU,
even on a UP system.  So do not do it!
Even on a UP system, the RCU
infrastructure *must* respect grace periods, and *must* invoke callbacks
from a known environment in which no locks are held.

Note that it *is* safe for synchronize_rcu() to return immediately on
UP systems, including PREEMPT SMP builds running on UP systems.

Quick Quiz #3:
        Why can't synchronize_rcu() return immediately on UP systems running
        preemptable RCU?

.. _answer_quick_quiz_up:

Answer to Quick Quiz #1:
        Why is it *not* legal to invoke synchronize_rcu() in this case?

        Because the calling function is scanning an RCU-protected linked
        list, and is therefore within an RCU read-side critical section.
        Therefore, the called function has been invoked within an RCU
        read-side critical section, and is not permitted to block.

Answer to Quick Quiz #2:
        What locking restriction must RCU callbacks respect?

        Any lock that is acquired within an RCU callback must be acquired
        elsewhere using a _bh variant of the spinlock primitive.
        For example, if "mylock" is acquired by an RCU callback, then
        a process-context acquisition of this lock must use something
        like spin_lock_bh() to acquire the lock.  Please note that
        it is also OK to use _irq variants of spinlocks, for example,
        spin_lock_irqsave().

        If the process-context code were to simply use spin_lock(),
        then, since RCU callbacks can be invoked from softirq context,
        the callback might be called from a softirq that interrupted
        the process-context critical section.  This would result in
        self-deadlock.

        This restriction might seem gratuitous, since very few RCU
        callbacks acquire locks directly.  However, a great many RCU
        callbacks do acquire locks *indirectly*, for example, via
        the kfree() primitive.

Answer to Quick Quiz #3:
        Why can't synchronize_rcu() return immediately on UP systems
        running preemptable RCU?

        Because some other task might have been preempted in the middle
        of an RCU read-side critical section.  If synchronize_rcu()
        simply immediately returned, it would prematurely signal the
        end of the grace period, which would come as a nasty shock to
        that other task when it started running again.
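
Returning to Example 3, the self-deadlock can also be modeled in user
space.  This is a sketch under stated assumptions, not kernel code:
demo_trylock(), demo_call_rcu(), demo_run_deferred(), and
demo_update_deadlocks() are hypothetical names introduced here, and the
lock is a plain flag whose failed acquisition is reported rather than
spinning forever, so the deadlock shows up as a return value instead of
a hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock: a flag.  A real spinlock would spin forever. */
static bool demo_lock_held;

static bool demo_trylock(void)
{
	if (demo_lock_held)
		return false;	/* a real spin_lock() would deadlock here */
	demo_lock_held = true;
	return true;
}

static void demo_unlock(void)
{
	assert(demo_lock_held);
	demo_lock_held = false;
}

typedef void (*demo_cb_t)(void);

static demo_cb_t deferred_cb;		/* one-slot deferred-callback queue */
static bool callback_deadlocked;

/* Callback that needs the same lock that the updater holds. */
static void demo_callback(void)
{
	if (!demo_trylock()) {
		callback_deadlocked = true;
		return;
	}
	/* ... update shared state under the lock ... */
	demo_unlock();
}

/* Hypothetical call_rcu() stand-in: run the callback now, or defer it. */
static void demo_call_rcu(demo_cb_t cb, bool immediate)
{
	if (immediate)
		cb();			/* runs with the caller's lock still held */
	else
		deferred_cb = cb;	/* runs later, with no locks held */
}

/* Invoke the deferred callback, if any, from a lock-free context. */
static void demo_run_deferred(void)
{
	if (deferred_cb) {
		demo_cb_t cb = deferred_cb;

		deferred_cb = NULL;
		cb();
	}
}

/*
 * Updater that calls demo_call_rcu() while holding the lock.
 * Returns true if the callback hit the self-deadlock.
 */
bool demo_update_deadlocks(bool immediate)
{
	bool locked;

	callback_deadlocked = false;
	locked = demo_trylock();
	assert(locked);
	(void)locked;
	demo_call_rcu(demo_callback, immediate);
	demo_unlock();
	demo_run_deferred();		/* no locks are held at this point */
	return callback_deadlocked;
}
```

With ``immediate`` set, the callback runs inside the updater's critical
section and cannot take the lock; with it clear, the callback runs only
after the lock has been released, which mirrors the requirement that
callbacks be invoked from a known environment in which no locks are held.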