================================
Device-mapper "unstriped" target
================================

Introduction
============

The device-mapper "unstriped" target provides a transparent mechanism to
unstripe a device-mapper "striped" target to access the underlying disks
without having to touch the true backing block-device.  It can also be
used to unstripe a hardware RAID-0 to access backing disks.

Parameters:
<number of stripes> <chunk size> <stripe #> <dev_path> <offset>

<number of stripes>
        The number of stripes in the RAID 0.

<chunk size>
        The number of 512B sectors in each chunk of the stripe.

<dev_path>
        The block device you wish to unstripe.

<stripe #>
        The stripe number within the device that corresponds to the
        physical drive you wish to unstripe.  This must be 0 indexed.


Why use this module?
====================

An example of undoing an existing dm-stripe
-------------------------------------------

This small bash script will set up 4 loop devices and use the existing
striped target to combine the 4 devices into one.  It then will use
the unstriped target on top of the striped device to access the
individual backing loop devices.
We write data to the newly exposed
unstriped devices and verify the data written matches the correct
underlying device on the striped array::

        #!/bin/bash

        MEMBER_SIZE=$((128 * 1024 * 1024))
        NUM=4
        SEQ_END=$((${NUM}-1))
        CHUNK=256
        BS=4096

        # Device-mapper table lengths are given in 512B sectors.
        MEMBER_SECTORS=$((${MEMBER_SIZE}/512))
        RAID_SIZE=$((${MEMBER_SECTORS}*${NUM}))
        DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}"
        COUNT=$((${MEMBER_SIZE} / ${BS}))

        # Create the backing files and attach each one to a loop device.
        for i in $(seq 0 ${SEQ_END}); do
          dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct
          losetup /dev/loop${i} member-${i}
          DM_PARMS+=" /dev/loop${i} 0"
        done

        # Combine the loop devices into one striped device, then expose
        # each stripe again. Each unstriped device is one member in size.
        echo "${DM_PARMS}" | dmsetup create raid0
        for i in $(seq 0 ${SEQ_END}); do
          echo "0 ${MEMBER_SECTORS} unstriped ${NUM} ${CHUNK} ${i} /dev/mapper/raid0 0" | dmsetup create set-${i}
        done

        # Write random data through each unstriped device and compare it
        # with the matching backing file.
        for i in $(seq 0 ${SEQ_END}); do
          dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct
          diff /dev/mapper/set-${i} member-${i}
        done

        # Tear everything down again.
        for i in $(seq 0 ${SEQ_END}); do
          dmsetup remove set-${i}
        done

        dmsetup remove raid0

        for i in $(seq 0 ${SEQ_END}); do
          losetup -d /dev/loop${i}
          rm -f member-${i}
        done

Another example
---------------

Intel NVMe drives contain two cores on the physical device.
Each core of the drive has segregated access to its LBA range.
The current LBA model has a RAID 0 128k chunk on each core, resulting
in a 256k stripe across the two cores::

   Core 0:       Core 1:
  __________    __________
  | LBA 512|    | LBA 768|
  | LBA 0  |    | LBA 256|
  ----------    ----------

The purpose of this unstriping is to provide better QoS in noisy
neighbor environments. When two partitions are created on the
aggregate drive without this unstriping, reads on one partition
can affect writes on another partition. This is because the partitions
are striped across the two cores. When we unstripe this hardware RAID 0
and make partitions on each new exposed device, the two partitions are now
physically separated.

With the dm-unstriped target we're able to segregate an fio script that
has read and write jobs that are independent of each other. Compared to
when we run the test on a combined drive with partitions, we were able
to get a 92% reduction in read latency using this device mapper target.
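
The chunk layout in the diagram above can be written out as plain
arithmetic. The following is only a sketch under stated assumptions: the
function names ``lba_to_core`` and ``unstriped_to_lba`` are hypothetical
helpers made up for illustration, and the LBAs are taken to be 512B
sectors with 256 sectors (128k) per chunk and two stripes, as in the
diagram. It is not the kernel's implementation.

```shell
#!/bin/bash
# Illustrative arithmetic only; these helpers are hypothetical and do
# not touch any device.

STRIPES=2     # two cores, as in the diagram
CHUNK=256     # 128k chunk expressed in 512B sectors

# Print the core (stripe number) that holds a given LBA of the
# aggregate drive.
lba_to_core() {
    local lba=$1
    echo $(( (lba / CHUNK) % STRIPES ))
}

# Print the LBA on the aggregate drive that backs sector $2 of the
# unstriped view of core $1.
unstriped_to_lba() {
    local core=$1 sector=$2
    local row=$(( sector / CHUNK ))       # which chunk of this core
    local within=$(( sector % CHUNK ))    # offset inside that chunk
    echo $(( row * STRIPES * CHUNK + core * CHUNK + within ))
}

for lba in 0 256 512 768; do
    echo "LBA ${lba} -> core $(lba_to_core ${lba})"
done
```

Run as-is, the loop reports LBA 0 and LBA 512 on core 0 and LBA 256 and
LBA 768 on core 1, matching the diagram. ``unstriped_to_lba 1 256``
prints 768: the second chunk of core 1 begins at LBA 768 of the
aggregate drive.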


Example dmsetup usage
=====================

unstriped on top of Intel NVMe device that has 2 cores
------------------------------------------------------

::

  dmsetup create nvmset0 --table '0 512 unstriped 2 256 0 /dev/nvme0n1 0'
  dmsetup create nvmset1 --table '0 512 unstriped 2 256 1 /dev/nvme0n1 0'

There will now be two devices that expose Intel NVMe core 0 and 1
respectively::

  /dev/mapper/nvmset0
  /dev/mapper/nvmset1

unstriped on top of striped with 4 drives using 128K chunk size
---------------------------------------------------------------

::

  dmsetup create raid_disk0 --table '0 512 unstriped 4 256 0 /dev/mapper/striped 0'
  dmsetup create raid_disk1 --table '0 512 unstriped 4 256 1 /dev/mapper/striped 0'
  dmsetup create raid_disk2 --table '0 512 unstriped 4 256 2 /dev/mapper/striped 0'
  dmsetup create raid_disk3 --table '0 512 unstriped 4 256 3 /dev/mapper/striped 0'
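
The second field of each table above is the length of the new device in
512B sectors, so those tables expose only the first 512 sectors of each
stripe. To expose a whole stripe, the length would presumably be the size
of the striped device divided by the number of stripes. The sketch below
shows that calculation. ``unstriped_table`` is a hypothetical helper
name, and rounding the length down to a whole number of chunks is an
assumption made here for safety, not something this document states.

```shell
#!/bin/bash
# Hypothetical helper: print an unstriped table line covering one stripe.
# Arguments: <total sectors> <number of stripes> <chunk size> <stripe #> <dev_path>
unstriped_table() {
    local total=$1 stripes=$2 chunk=$3 stripe=$4 dev=$5
    # Per-stripe length in 512B sectors, rounded down to whole chunks
    # (assumption: the length should be a multiple of the chunk size).
    local len=$(( total / stripes / chunk * chunk ))
    echo "0 ${len} unstriped ${stripes} ${chunk} ${stripe} ${dev} 0"
}

# Example: a 512 MiB striped device is 1048576 sectors long.
unstriped_table 1048576 4 256 0 /dev/mapper/striped
```

The example call prints
``0 262144 unstriped 4 256 0 /dev/mapper/striped 0``, which could be
passed to ``dmsetup create`` through ``--table``. On a live system the
total sector count can be read with ``blockdev --getsz <dev_path>``.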