.. _volatile_considered_harmful:

Why the "volatile" type class should not be used
------------------------------------------------

C programmers have often taken volatile to mean that the variable could be
changed outside of the current thread of execution; as a result, they are
sometimes tempted to use it in kernel code when shared data structures are
being used. In other words, they have been known to treat volatile types
as a sort of easy atomic variable, which they are not. The use of volatile in
kernel code is almost never correct; this document describes why.

The key point to understand with regard to volatile is that its purpose is
to suppress optimization, which is almost never what one really wants to
do. In the kernel, one must protect shared data structures against
unwanted concurrent access, which is very much a different task. The
process of protecting against unwanted concurrency will also avoid almost
all optimization-related problems in a more efficient way.

Like volatile, the kernel primitives which make concurrent access to data
safe (spinlocks, mutexes, memory barriers, etc.) are designed to prevent
unwanted optimization. If they are being used properly, there will be no
need to use volatile as well. If volatile is still necessary, there is
almost certainly a bug in the code somewhere. In properly-written kernel
code, volatile can only serve to slow things down.

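
The point that correct locking removes the need for volatile can be
illustrated outside the kernel. The following is a minimal userspace sketch,
assuming POSIX threads rather than kernel spinlocks; the names
``locked_counter``, ``worker`` and ``run_locked_counter`` are hypothetical and
exist only for this illustration. The shared counter is a plain ``long``, and
the mutex alone keeps the result exact.

```c
#include <pthread.h>
#include <stddef.h>

/* Shared state: a plain (non-volatile) counter guarded by a mutex. */
struct locked_counter {
    pthread_mutex_t lock;
    long value;
    long iterations;    /* number of increments each worker performs */
};

/* Each worker takes the lock around every update of the shared counter. */
static void *worker(void *arg)
{
    struct locked_counter *c = arg;

    for (long i = 0; i < c->iterations; i++) {
        pthread_mutex_lock(&c->lock);
        c->value++;
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final count, or -1 on error. */
long run_locked_counter(int nthreads, long iterations)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    struct locked_counter c = { .value = 0, .iterations = iterations };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    if (pthread_mutex_init(&c.lock, NULL) != 0)
        return -1;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &c) != 0)
            break;
        started++;
    }
    /* Joining the workers makes their updates visible to this thread. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&c.lock);
    return started == nthreads ? c.value : -1;
}
```

Calling ``run_locked_counter(4, 100000)`` should return 400000 on every run.
Declaring ``value`` volatile would add nothing here: the lock and unlock calls
already constrain how the compiler may treat the accesses between them.
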

Consider a typical block of kernel code::

    spin_lock(&the_lock);
    do_something_on(&shared_data);
    do_something_else_with(&shared_data);
    spin_unlock(&the_lock);

If all the code follows the locking rules, the value of shared_data cannot
change unexpectedly while the_lock is held. Any other code which might
want to play with that data will be waiting on the lock. The spinlock
primitives act as memory barriers - they are explicitly written to do so -
meaning that data accesses will not be optimized across them. So the
compiler might think it knows what will be in shared_data, but the
spin_lock() call, since it acts as a memory barrier, will force it to
forget anything it knows. There will be no optimization problems with
accesses to that data.

If shared_data were declared volatile, the locking would still be
necessary. But the compiler would also be prevented from optimizing access
to shared_data *within* the critical section, when we know that nobody else
can be working with it. While the lock is held, shared_data is not
volatile. When dealing with shared data, proper locking makes volatile
unnecessary - and potentially harmful.

The volatile qualifier was originally meant for memory-mapped I/O
registers. Within the kernel, register accesses, too, should be protected
by locks, but one also does not want the compiler "optimizing" register
accesses within a critical section.
But, within the kernel, I/O memory
accesses are always done through accessor functions; accessing I/O memory
directly through pointers is frowned upon and does not work on all
architectures. Those accessors are written to prevent unwanted
optimization, so, once again, volatile is unnecessary.

Another situation where one might be tempted to use volatile is
when the processor is busy-waiting on the value of a variable. The right
way to perform a busy wait is::

    while (my_variable != what_i_want)
        cpu_relax();

The cpu_relax() call can lower CPU power consumption or yield to a
hyperthreaded twin processor; it also happens to serve as a compiler
barrier, so, once again, volatile is unnecessary. Of course, busy-waiting
is generally an anti-social act to begin with.

There are still a few rare situations where volatile makes sense in the
kernel:

  - The above-mentioned accessor functions might use volatile on
    architectures where direct I/O memory access does work. Essentially,
    each accessor call becomes a little critical section on its own and
    ensures that the access happens as expected by the programmer.

  - Inline assembly code which changes memory, but which has no other
    visible side effects, risks being deleted by GCC. Adding the volatile
    keyword to asm statements will prevent this removal.

  - The jiffies variable is special in that it can have a different value
    every time it is referenced, but it can be read without any special
    locking. So jiffies can be volatile, but the addition of other
    variables of this type is strongly frowned upon. Jiffies is considered
    to be a "stupid legacy" issue (Linus's words) in this regard; fixing it
    would be more trouble than it is worth.

  - Pointers to data structures in coherent memory which might be modified
    by I/O devices can, sometimes, legitimately be volatile. A ring buffer
    used by a network adapter, where that adapter changes pointers to
    indicate which descriptors have been processed, is an example of this
    type of situation.

For most code, none of the above justifications for volatile apply. As a
result, the use of volatile is likely to be seen as a bug and will bring
additional scrutiny to the code. Developers who are tempted to use
volatile should take a step back and think about what they are truly trying
to accomplish.

Patches to remove volatile variables are generally welcome - as long as
they come with a justification which shows that the concurrency issues have
been properly thought through.
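
As an illustration of the ring-buffer case listed among the exceptions above,
here is a minimal userspace sketch of reading device-updated descriptors
through a pointer to volatile. The descriptor layout, the ``DESC_DONE`` bit
and the ``count_completed`` helper are all hypothetical; real hardware defines
its own format, and a real driver would also need whatever barriers and DMA
synchronization its device requires, which this sketch omits.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_DONE 0x1u  /* hypothetical "device has finished" status bit */

/* Hypothetical descriptor layout, used only for this illustration. */
struct ring_desc {
    uint32_t status;    /* updated by the device when it completes */
    uint32_t length;
};

/*
 * Count the completed descriptors starting at index head, wrapping around
 * the ring. The ring is reached through a pointer to volatile, so every
 * status check is a fresh load rather than a value cached by the compiler.
 */
size_t count_completed(const volatile struct ring_desc *ring,
                       size_t ring_size, size_t head)
{
    size_t done = 0;

    while (done < ring_size &&
           (ring[(head + done) % ring_size].status & DESC_DONE))
        done++;
    return done;
}
```

The volatile qualifier sits on the pointer used at the point of access, not
on every variable in the driver, which keeps its effect confined to the
memory the device can actually change.
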


References
==========

[1] https://lwn.net/Articles/233481/

[2] https://lwn.net/Articles/233482/

Credits
=======

Original impetus and research by Randy Dunlap

Written by Jonathan Corbet

Improvements via comments from Satyam Sharma, Johannes Stezenbach, Jesper
Juhl, Heikki Orsila, H. Peter Anvin, Philipp Hahn, and Stefan
Richter.