==============================
Deadline IO scheduler tunables
==============================

This little file attempts to document how the deadline io scheduler works.
In particular, it will clarify the meaning of the exposed tunables that may be
of interest to power users.

Selecting IO schedulers
-----------------------
Refer to Documentation/block/switching-sched.rst for information on
selecting an io scheduler on a per-device basis.

------------------------------------------------------------------------------

read_expire (in ms)
-----------------------

The goal of the deadline io scheduler is to attempt to guarantee a start
service time for a request. As we focus mainly on read latencies, this is
tunable. When a read request first enters the io scheduler, it is assigned
a deadline that is the current time + the read_expire value in units of
milliseconds.


write_expire (in ms)
-----------------------

Similar to read_expire mentioned above, but for writes.


fifo_batch (number of requests)
------------------------------------

Requests are grouped into ``batches`` of a particular data direction (read or
write) which are serviced in increasing sector order. To limit extra seeking,
deadline expiries are only checked between batches.
fifo_batch controls the maximum number of requests per batch.

This parameter tunes the balance between per-request latency and aggregate
throughput. When low latency is the primary concern, smaller is better (where
a value of 1 yields first-come first-served behaviour). Increasing fifo_batch
generally improves throughput, at the cost of latency variation.


writes_starved (number of dispatches)
--------------------------------------

When we have to move requests from the io scheduler queue to the block
device dispatch queue, we always give a preference to reads. However, we
don't want to starve writes indefinitely either. So writes_starved controls
how many times we give preference to reads over writes. When that has been
done writes_starved number of times, we dispatch some writes based on the
same criteria as reads.


front_merges (bool)
----------------------

Sometimes it happens that a request enters the io scheduler that is contiguous
with a request that is already on the queue. Either it fits in the back of that
request, or it fits at the front. That is called either a back merge candidate
or a front merge candidate. Due to the way files are typically laid out,
back merges are much more common than front merges. For some workloads, you
may even know that it is a waste of time to spend any time attempting to
front merge requests. Setting front_merges to 0 disables this functionality.
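As a sketch of how these tunables are typically inspected and adjusted at
runtime via sysfs (the device name ``sda`` is an illustrative assumption,
not taken from this document; substitute your own block device)::

```shell
# "sda" is a hypothetical example device; substitute your own.
DEV=sda

# Show the current values of all exposed deadline tunables.
grep . /sys/block/$DEV/queue/iosched/*

# Trade latency variation for throughput by allowing larger batches.
echo 32 > /sys/block/$DEV/queue/iosched/fifo_batch

# Skip front merge lookups for a back-merge-dominated workload.
echo 0 > /sys/block/$DEV/queue/iosched/front_merges
```

Writes to these attributes take effect immediately and require root
privileges; the values are per-device, not global.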
Front merges may still occur due to the cached last_merge hint, but since
that comes at basically 0 cost we leave that on. We simply disable the
rbtree front sector lookup when the io scheduler merge function is called.


Nov 11 2002, Jens Axboe <jens.axboe@oracle.com>