=======
dm-raid
=======

The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
It allows the MD RAID drivers to be accessed using a device-mapper
interface.


Mapping Table Interface
-----------------------
The target is named "raid" and it accepts the following parameters::

  <raid_type> <#raid_params> <raid_params> \
    <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]

<raid_type>:

  ============= ===============================================================
  raid0         RAID0 striping (no resilience)
  raid1         RAID1 mirroring
  raid4         RAID4 with dedicated last parity disk
  raid5_n       RAID5 with dedicated last parity disk supporting takeover
                Same as raid4

                - Transitory layout
  raid5_la      RAID5 left asymmetric

                - rotating parity 0 with data continuation
  raid5_ra      RAID5 right asymmetric

                - rotating parity N with data continuation
  raid5_ls      RAID5 left symmetric

                - rotating parity 0 with data restart
  raid5_rs      RAID5 right symmetric

                - rotating parity N with data restart
  raid6_zr      RAID6 zero restart

                - rotating parity zero (left-to-right) with data restart
  raid6_nr      RAID6 N restart

                - rotating parity N (right-to-left) with data restart
  raid6_nc      RAID6 N continue

                - rotating parity N (right-to-left) with data continuation
  raid6_n_6     RAID6 with dedicated parity disks

                - parity and Q-syndrome on the last 2 disks;
                  layout for takeover from/to raid4/raid5_n
  raid6_la_6    Same as "raid5_la" plus dedicated last Q-syndrome disk

                - layout for takeover from raid5_la from/to raid6
  raid6_ra_6    Same as "raid5_ra" plus dedicated last Q-syndrome disk

                - layout for takeover from raid5_ra from/to raid6
  raid6_ls_6    Same as "raid5_ls" plus dedicated last Q-syndrome disk

                - layout for takeover from raid5_ls from/to raid6
  raid6_rs_6    Same as "raid5_rs" plus dedicated last Q-syndrome disk

                - layout for takeover from raid5_rs from/to raid6
  raid10        Various RAID10 inspired algorithms chosen by additional params
                (see raid10_format and raid10_copies below)

                - RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
                - RAID1E: Integrated Adjacent Stripe Mirroring
                - RAID1E: Integrated Offset Stripe Mirroring
                - and other similar RAID10 variants
  ============= ===============================================================

  Reference: Chapter 4 of
  https://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
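For orientation before the individual parameters are described, here is a
hypothetical raid0 table line in the same style as the fuller examples
further below; the length and major:minor device numbers are illustrative
only::

  # RAID0 across 4 devices, 64KiB chunks (128 sectors),
  # no metadata devices (raid0 keeps no superblock/bitmap)
  0 1960893648 raid \
          raid0 1 128 \
          4 - 8:17 - 8:33 - 8:49 - 8:65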
<#raid_params>: The number of parameters that follow.

<raid_params> consists of

    Mandatory parameters:
        <chunk_size>:
                Chunk size in sectors.  This parameter is often known as
                "stripe size".  It is the only mandatory parameter and
                is placed first.

    followed by optional parameters (in any order):
        [sync|nosync]
                Force or prevent RAID initialization.

        [rebuild <idx>]
                Rebuild drive number 'idx' (first drive is 0).

        [daemon_sleep <ms>]
                Interval between runs of the bitmap daemon that
                clear bits.  A longer interval means less bitmap I/O but
                resyncing after a failure is likely to take longer.

        [min_recovery_rate <kB/sec/disk>]
                Throttle RAID initialization
        [max_recovery_rate <kB/sec/disk>]
                Throttle RAID initialization
        [write_mostly <idx>]
                Mark drive index 'idx' write-mostly.
        [max_write_behind <sectors>]
                See '--write-behind=' (man mdadm)
        [stripe_cache <sectors>]
                Stripe cache size (RAID 4/5/6 only)
        [region_size <sectors>]
                The region_size multiplied by the number of regions is the
                logical size of the array.  The bitmap records the device
                synchronisation state for each region.

        [raid10_copies <# copies>], [raid10_format <near|far|offset>]
                These two options are used to alter the default layout of
                a RAID10 configuration.  The number of copies can be
                specified, but the default is 2.  There are also three
                variations to how the copies are laid down - the default
                is "near".  Near copies are what most people think of with
                respect to mirroring.  If these options are left unspecified,
                or 'raid10_copies 2' and/or 'raid10_format near' are given,
                then the layouts for 2, 3 and 4 devices are:

                ========   ==========   ==============
                2 drives   3 drives     4 drives
                ========   ==========   ==============
                A1  A1     A1  A1  A2   A1  A1  A2  A2
                A2  A2     A2  A3  A3   A3  A3  A4  A4
                A3  A3     A4  A4  A5   A5  A5  A6  A6
                A4  A4     A5  A6  A6   A7  A7  A8  A8
                ..  ..     ..  ..  ..   ..  ..  ..  ..
                ========   ==========   ==============

                The 2-device layout is equivalent to 2-way RAID1.  The
                4-device layout is what a traditional RAID10 would look
                like.  The 3-device layout is what might be called a
                'RAID1E - Integrated Adjacent Stripe Mirroring'.

                If 'raid10_copies 2' and 'raid10_format far', then the
                layouts for 2, 3 and 4 devices are:

                ========   ============   ===================
                2 drives   3 drives       4 drives
                ========   ============   ===================
                A1  A2     A1   A2  A3    A1   A2   A3   A4
                A3  A4     A4   A5  A6    A5   A6   A7   A8
                A5  A6     A7   A8  A9    A9   A10  A11  A12
                ..  ..     ..   ..  ..    ..   ..   ..   ..
                A2  A1     A3   A1  A2    A2   A1   A4   A3
                A4  A3     A6   A4  A5    A6   A5   A8   A7
                A6  A5     A9   A7  A8    A10  A9   A12  A11
                ..  ..     ..   ..  ..    ..   ..   ..   ..
                ========   ============   ===================

                If 'raid10_copies 2' and 'raid10_format offset', then the
                layouts for 2, 3 and 4 devices are:

                ========   ==========   ================
                2 drives   3 drives     4 drives
                ========   ==========   ================
                A1  A2     A1  A2  A3   A1  A2  A3  A4
                A2  A1     A3  A1  A2   A2  A1  A4  A3
                A3  A4     A4  A5  A6   A5  A6  A7  A8
                A4  A3     A6  A4  A5   A6  A5  A8  A7
                A5  A6     A7  A8  A9   A9  A10 A11 A12
                A6  A5     A9  A7  A8   A10 A9  A12 A11
                ..  ..     ..  ..  ..   ..  ..  ..  ..
                ========   ==========   ================

                Here we see layouts closely akin to 'RAID1E - Integrated
                Offset Stripe Mirroring'.  A hypothetical table line using
                these options is sketched below.
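                As a sketch (hypothetical length and major:minor numbers,
                in the style of the examples further below), a 4-device
                RAID10 set with 2 copies in the 'offset' layout could be
                requested with::

                  0 1960893648 raid \
                          raid10 5 2048 raid10_copies 2 raid10_format offset \
                          4 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66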
        [delta_disks <N>]
                The delta_disks option value (-251 < N < +251) triggers
                device removal (negative value) or device addition (positive
                value) to any reshape supporting raid levels 4/5/6 and 10.
                RAID levels 4/5/6 allow for addition of devices (metadata
                and data device tuple); raid10_near and raid10_offset only
                allow for device addition.  raid10_far does not support any
                reshaping at all.
                A minimum number of devices must be kept to enforce
                resilience, which is 3 devices for raid4/5 and 4 devices
                for raid6.

        [data_offset <sectors>]
                This option value defines the offset into each data device
                where the data starts.  This is used to provide out-of-place
                reshaping space to avoid writing over data while
                changing the layout of stripes, hence an interruption/crash
                may happen at any time without the risk of losing data.
                E.g. when adding devices to an existing raid set during
                forward reshaping, the out-of-place space will be allocated
                at the beginning of each raid device.  The kernel
                raid4/5/6/10 MD personalities supporting such device
                addition will read the data from the existing first stripes
                (those with the smaller number of stripes) starting at
                data_offset to fill up a new stripe with the larger number
                of stripes, calculate the redundancy blocks (CRC/Q-syndrome)
                and write that new stripe to offset 0.  The same will be
                applied to all N-1 other new stripes.  This out-of-place
                scheme is used to change the RAID type (i.e. the allocation
                algorithm) as well, e.g. changing from raid5_ls to raid5_n.
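                Tying delta_disks and data_offset together, here is a
                heavily hedged sketch only (in practice lvm2 orchestrates
                reshapes, reserving the out-of-place space and reloading
                the table): growing a hypothetical 3-device raid5_ls set
                by one device might be requested by loading a new table
                that names one additional metadata/data pair together with
                a positive delta_disks value::

                  0 1960893648 raid \
                          raid5_ls 3 128 delta_disks 1 \
                          4 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66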
        [journal_dev <dev>]
                This option adds a journal device to raid4/5/6 raid sets and
                uses it to close the 'write hole' caused by the non-atomic
                updates to the component devices which can cause data loss
                during recovery.  The journal device is used as writethrough,
                thus causing writes to be throttled versus non-journaled
                raid4/5/6 sets.
                Takeover/reshape is not possible with a raid4/5/6 journal
                device; it has to be deconfigured before requesting these.

        [journal_mode <mode>]
                This option sets the caching mode on journaled raid4/5/6
                raid sets (see 'journal_dev <dev>' above) to 'writethrough'
                or 'writeback'.  If 'writeback' is selected the journal
                device has to be resilient and must not suffer from the
                'write hole' problem itself (e.g. use raid1 or raid10) to
                avoid a single point of failure.
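                For example (hypothetical length and devices again), a
                journaled raid5 set with writeback caching could be
                specified as::

                  0 1960893648 raid \
                          raid5_ls 5 128 journal_dev 8:97 journal_mode writeback \
                          4 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66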
<#raid_devs>: The number of devices composing the array.
        Each device consists of two entries.  The first is the device
        containing the metadata (if any); the second is the one containing
        the data.  A maximum of 64 metadata/data device entries are
        supported up to target version 1.8.0.
        1.9.0 supports up to 253 which is enforced by the used MD kernel
        runtime.

        If a drive has failed or is missing at creation time, a '-' can be
        given for both the metadata and data drives for a given position.


Example Tables
--------------

::

  # RAID4 - 4 data drives, 1 parity (no metadata devices)
  # No metadata devices specified to hold superblock/bitmap info
  # Chunk size of 1MiB
  # (Lines separated for easy reading)

  0 1960893648 raid \
          raid4 1 2048 \
          5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

  # RAID4 - 4 data drives, 1 parity (with metadata devices)
  # Chunk size of 1MiB, force RAID initialization,
  # min recovery rate at 20 kiB/sec/disk

  0 1960893648 raid \
          raid4 4 2048 sync min_recovery_rate 20 \
          5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
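Such a table is loaded in the usual device-mapper fashion; as a sketch,
the second table above could be activated under the (arbitrary) name
'my_raid4' with::

  dmsetup create my_raid4 --table \
      "0 1960893648 raid raid4 4 2048 sync min_recovery_rate 20 \
       5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82"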
Status Output
-------------
'dmsetup table' displays the table used to construct the mapping.
The optional parameters are always printed in the order listed
above with "sync" or "nosync" always output ahead of the other
arguments, regardless of the order used when originally loading the table.
Arguments that can be repeated are ordered by value.


'dmsetup status' yields information on the state and health of the array.
The output is as follows (normally a single line, but expanded here for
clarity)::

  1: <s> <l> raid \
  2:      <raid_type> <#devices> <health_chars> \
  3:      <sync_ratio> <sync_action> <mismatch_cnt> <data_offset> <journal_char>

Line 1 is the standard output produced by device-mapper.

Lines 2 & 3 are produced by the raid target and are best explained by
example::

        0 1960893648 raid raid4 5 AAAAA 2/490221568 resync 0 0 -

Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live - and the array is 2/490221568 complete with its initial
synchronization; the data offset is 0 and no journal device is configured.
Here is a fuller description of the individual fields:

  =============== =========================================================
  <raid_type>     Same as the <raid_type> used to create the array.
  <health_chars>  One char for each device, indicating:

                  - 'A' = alive and in-sync
                  - 'a' = alive but not in-sync
                  - 'D' = dead/failed.
  <sync_ratio>    The ratio indicating how much of the array has undergone
                  the process described by 'sync_action'.  If the
                  'sync_action' is "check" or "repair", then the process
                  of "resync" or "recover" can be considered complete.
  <sync_action>   One of the following possible states:

                  idle
                          - No synchronization action is being performed.
                  frozen
                          - The current action has been halted.
                  resync
                          - Array is undergoing its initial synchronization
                            or is resynchronizing after an unclean shutdown
                            (possibly aided by a bitmap).
                  recover
                          - A device in the array is being rebuilt or
                            replaced.
                  check
                          - A user-initiated full check of the array is
                            being performed.  All blocks are read and
                            checked for consistency.  The number of
                            discrepancies found is recorded in
                            <mismatch_cnt>.  No changes are made to the
                            array by this action.
                  repair
                          - The same as "check", but discrepancies are
                            corrected.
                  reshape
                          - The array is undergoing a reshape.
  <mismatch_cnt>  The number of discrepancies found between mirror copies
                  in RAID1/10 or wrong parity values found in RAID4/5/6.
                  This value is valid only after a "check" of the array
                  is performed.  A healthy array has a 'mismatch_cnt' of 0.
  <data_offset>   The current data offset to the start of the user data on
                  each component device of a raid set (see the 'data_offset'
                  raid parameter above, used to support out-of-place
                  reshaping).
  <journal_char>  - 'A' - active write-through journal device.
                  - 'a' - active write-back journal device.
                  - 'D' - dead journal device.
                  - '-' - no journal device.
  =============== =========================================================


Message Interface
-----------------
The dm-raid target will accept certain actions through the 'message'
interface.  ('man dmsetup' for more information on the message interface.)
These actions include:

        ========= ================================================
        "idle"    Halt the current sync action.
        "frozen"  Freeze the current sync action.
        "resync"  Initiate/continue a resync.
        "recover" Initiate/continue a recover process.
        "check"   Initiate a check (i.e. a "scrub") of the array.
        "repair"  Initiate a repair of the array.
        ========= ================================================
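For example, a scrub of a hypothetical mapped device named 'my_raid'
would be started, and its progress then watched, with::

  dmsetup message my_raid 0 check
  dmsetup status my_raid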
Discard Support
---------------
The implementation of discard support among hardware vendors varies.
When a block is discarded, some storage devices will return zeroes when
the block is read.  These devices set the 'discard_zeroes_data'
attribute.  Other devices will return random data.  Confusingly, some
devices that advertise 'discard_zeroes_data' will not reliably return
zeroes when discarded blocks are read!  Since RAID 4/5/6 uses blocks
from a number of devices to calculate parity blocks and (for performance
reasons) relies on 'discard_zeroes_data' being reliable, it is important
that the devices be consistent.  Blocks may be discarded in the middle
of a RAID 4/5/6 stripe and if subsequent read results are not
consistent, the parity blocks may be calculated differently at any time,
making the parity blocks useless for redundancy.  It is important to
understand how your hardware behaves with discards if you are going to
enable discards with RAID 4/5/6.

Since the behavior of storage devices is unreliable in this respect,
even when reporting 'discard_zeroes_data', by default RAID 4/5/6
discard support is disabled -- this ensures data integrity at the
expense of losing some performance.

Storage devices that properly support 'discard_zeroes_data' are
increasingly whitelisted in the kernel and can thus be trusted.

For trusted devices, the following dm-raid module parameter can be set
to safely enable discard support for RAID 4/5/6:

    'devices_handle_discard_safely'
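Assuming the devices are trusted, the parameter can be given at module
load time (if dm-raid is built as a module) or, where the kernel exposes
it writable, flipped at runtime via sysfs; a sketch::

  # at module load time
  modprobe dm-raid devices_handle_discard_safely=1

  # or at runtime, once the dm-raid module is loaded
  echo 1 > /sys/module/dm_raid/parameters/devices_handle_discard_safely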
Version History
---------------

::

 1.0.0   Initial version.  Support for RAID 4/5/6
 1.1.0   Added support for RAID 1
 1.2.0   Handle creation of arrays that contain failed devices.
 1.3.0   Added support for RAID 10
 1.3.1   Allow device replacement/rebuild for RAID 10
 1.3.2   Fix/improve redundancy checking for RAID10
 1.4.0   Non-functional change.  Removes arg from mapping function.
 1.4.1   RAID10 fix redundancy validation checks (commit 55ebbb5).
 1.4.2   Add RAID10 "far" and "offset" algorithm support.
 1.5.0   Add message interface to allow manipulation of the sync_action.
         New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
 1.5.1   Add ability to restore transiently failed devices on resume.
 1.5.2   'mismatch_cnt' is zero unless [last_]sync_action is "check".
 1.6.0   Add discard support (and devices_handle_discard_safely module param).
 1.7.0   Add support for MD RAID0 mappings.
 1.8.0   Explicitly check for compatible flags in the superblock metadata
         and reject to start the raid set if any are set by a newer
         target version, thus avoiding data corruption on a raid set
         with a reshape in progress.
 1.9.0   Add support for RAID level takeover/reshape/region size
         and set size reduction.
 1.9.1   Fix activation of existing RAID 4/10 mapped devices
 1.9.2   Don't emit '- -' on the status table line in case the constructor
         fails reading a superblock.  Correctly emit 'maj:min1 maj:min2' and
         'D' on the status line.  If '- -' is passed into the constructor,
         emit '- -' on the table line and '-' as the status line health
         character.
 1.10.0  Add support for raid4/5/6 journal device
 1.10.1  Fix data corruption on reshape request
 1.11.0  Fix table line argument order
         (wrong raid10_copies/raid10_format sequence)
 1.11.1  Add raid4/5/6 journal write-back support via journal_mode option
 1.12.1  Fix for MD deadlock between mddev_suspend() and md_write_start()
         available
 1.13.0  Fix dev_health status at end of "recover" (was 'a', now 'A')
 1.13.1  Fix deadlock caused by early md_stop_writes().  Also fix size and
         state races.
 1.13.2  Fix raid redundancy validation and avoid keeping raid set frozen
 1.14.0  Fix reshape race on small devices.  Fix stripe adding reshape
         deadlock/potential data corruption.  Update superblock when
         specific devices are requested via rebuild.  Fix RAID leg
         rebuild errors.
 1.15.0  Fix size extensions not being synchronized in case of new MD bitmap
         pages allocated; also fix those not occurring after previous
         reductions
 1.15.1  Fix argument count and arguments for
         rebuild/write_mostly/journal_(dev|mode) on the status line.