==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches.  That means the devices signal I/O completion to the
operating system before the data has actually reached non-volatile storage.
This behavior speeds up many workloads, but it means the operating system
needs to force data out to non-volatile storage when it performs a data
integrity operation such as fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device.  These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.  This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts.  In addition, the REQ_PREFLUSH flag can
be set on an otherwise empty bio structure, which causes only an explicit
cache flush without any dependent I/O.  It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.

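The semantics of the two flags can be illustrated with a small userspace
model.  This is only a sketch and not kernel code: the model_* functions and
MODEL_* constants are hypothetical stand-ins for a device with a volatile
cache and for REQ_PREFLUSH and REQ_FUA, and a real device may behave
differently in detail::

	#include <stdbool.h>
	#include <string.h>

	#define MODEL_SECTORS	8
	#define MODEL_PREFLUSH	(1u << 0)	/* stands in for REQ_PREFLUSH */
	#define MODEL_FUA	(1u << 1)	/* stands in for REQ_FUA */

	/* Toy device: 'cache' is lost on power failure, 'media' survives. */
	struct model_dev {
		int  media[MODEL_SECTORS];
		int  cache[MODEL_SECTORS];
		bool dirty[MODEL_SECTORS];
	};

	/* Write every dirty cache entry back to the media. */
	static void model_flush(struct model_dev *dev)
	{
		for (int i = 0; i < MODEL_SECTORS; i++) {
			if (dev->dirty[i]) {
				dev->media[i] = dev->cache[i];
				dev->dirty[i] = false;
			}
		}
	}

	/* Perform one write; completion is signaled when this returns. */
	static void model_write(struct model_dev *dev, int sector, int value,
				unsigned int flags)
	{
		if (flags & MODEL_PREFLUSH)
			model_flush(dev);	/* earlier writes become stable */

		if (flags & MODEL_FUA) {
			dev->media[sector] = value;	/* stable on completion */
			dev->dirty[sector] = false;
		} else {
			dev->cache[sector] = value;	/* still volatile */
			dev->dirty[sector] = true;
		}
	}

	/* Power failure: data that only lived in the cache is gone. */
	static void model_power_loss(struct model_dev *dev)
	{
		memset(dev->dirty, 0, sizeof(dev->dirty));
	}

In this model, writing sector 0 without flags, then sector 1 with
MODEL_PREFLUSH, then sector 2 with MODEL_FUA, and finally calling
model_power_loss() leaves the data for sectors 0 and 2 on the media, while
the data for sector 1 is lost.  The preflush only covers writes that
completed before the flagged one.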

Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have
to worry about whether the underlying devices need any explicit cache
flushing or how Forced Unit Access is implemented.  The REQ_PREFLUSH and
REQ_FUA flags may both be set on a single bio.


Implementation details for bio based block drivers
--------------------------------------------------

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface.  For remapping drivers the REQ_FUA
bit needs to be propagated to the underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set.  For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work.  Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.

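The remapping case can be sketched as a userspace model of a driver that
stripes over two underlying devices.  All names below are hypothetical, the
model only counts what each underlying device is asked to do, and it assumes
that sectors simply alternate between the two devices::

	#include <stdbool.h>

	#define MODEL_PREFLUSH	(1u << 0)	/* stands in for REQ_PREFLUSH */
	#define MODEL_FUA	(1u << 1)	/* stands in for REQ_FUA */
	#define MODEL_LEGS	2

	/* Counters recording what one underlying device was asked to do. */
	struct model_leg {
		int flushes;
		int writes;
		int fua_writes;
	};

	struct model_stripe {
		struct model_leg leg[MODEL_LEGS];
	};

	/* Remap one bio; has_data is false for an empty flush bio. */
	static void model_stripe_submit(struct model_stripe *s, int sector,
					bool has_data, unsigned int flags)
	{
		if (flags & MODEL_PREFLUSH) {
			/* Global flush: earlier writes may sit in any cache. */
			for (int i = 0; i < MODEL_LEGS; i++)
				s->leg[i].flushes++;
		}

		if (has_data) {
			struct model_leg *leg = &s->leg[sector % MODEL_LEGS];

			leg->writes++;
			if (flags & MODEL_FUA)
				leg->fua_writes++;	/* FUA follows the data */
		}
	}

The flush is sent to every underlying device because the remapping driver
cannot know which of them still holds earlier writes in its cache, whereas
the FUA bit only has to reach the device that receives the data.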

Implementation details for request_fn based block drivers
---------------------------------------------------------

For devices that do not support volatile write caches no driver support is
required: the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.  For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing::

	blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
the block layer automatically turns REQ_PREFLUSH requests with a payload into
a sequence of an empty REQ_OP_FLUSH request followed by the actual write.
For devices that also support the FUA bit the block layer needs to be told
to pass through the REQ_FUA bit using::

	blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn.  If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.

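The handling described above can be summarized as a small userspace model
that lists the requests a driver would see.  It is a sketch based only on
the rules in this section, not the block layer's actual code; the model_*
names are hypothetical, and wc and fua stand for the two boolean arguments
of blk_queue_write_cache()::

	#include <stdbool.h>
	#include <stddef.h>

	#define MODEL_PREFLUSH	(1u << 0)	/* stands in for REQ_PREFLUSH */
	#define MODEL_FUA	(1u << 1)	/* stands in for REQ_FUA */

	enum model_step {
		MODEL_STEP_FLUSH,	/* empty REQ_OP_FLUSH request */
		MODEL_STEP_WRITE,	/* data write without REQ_FUA */
		MODEL_STEP_WRITE_FUA,	/* data write with REQ_FUA passed through */
	};

	/*
	 * Fill steps[] (room for three entries) with the requests that reach
	 * the driver and return their number.  has_data is false for an empty
	 * flush request.
	 */
	static size_t model_decompose(bool wc, bool fua, bool has_data,
				      unsigned int flags,
				      enum model_step steps[3])
	{
		size_t n = 0;

		if (!wc)	/* no volatile cache: both bits are stripped */
			flags &= ~(MODEL_PREFLUSH | MODEL_FUA);

		if (flags & MODEL_PREFLUSH)
			steps[n++] = MODEL_STEP_FLUSH;

		if (has_data) {
			bool want_fua = flags & MODEL_FUA;

			steps[n++] = (want_fua && fua) ? MODEL_STEP_WRITE_FUA
						       : MODEL_STEP_WRITE;
			if (want_fua && !fua)	/* emulate FUA with a flush */
				steps[n++] = MODEL_STEP_FLUSH;
		}
		return n;
	}

For a write with both flags set, the model yields flush, write, flush when
only the write cache is advertised, and flush followed by a FUA write when
FUA is advertised as well.  An empty flush on a device without a volatile
cache yields no steps at all, so it never reaches the driver.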