==============================
Device-mapper snapshot support
==============================

Device-mapper allows you, without massive data copying:

-  To create snapshots of any block device, i.e. mountable, saved states of
   the block device which are also writable without interfering with the
   original content;
-  To create device "forks", i.e. multiple different versions of the
   same data stream;
-  To merge a snapshot of a block device back into the snapshot's origin
   device.

In the first two cases, dm copies only the chunks of data that get
changed and uses a separate copy-on-write (COW) block device for
storage.

For snapshot merge, the contents of the COW storage are merged back into
the origin device.


There are three dm targets available:
snapshot, snapshot-origin, and snapshot-merge.

-  snapshot-origin <origin>

This target will normally have one or more snapshots based on it.
Reads will be mapped directly to the backing device. For each write, the
original data will be saved in the <COW device> of each snapshot to keep
its visible content unchanged, at least until the <COW device> fills up.
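
As a minimal sketch, the table line for this target could be assembled as
shown here.  The helper name "origin_table" is hypothetical, and the values
are taken from the LVM2 example later in this document; in practice the
length would be read from the device, for instance with blockdev --getsz.

```shell
# Hypothetical helper: print a snapshot-origin table line.
# $1 = length of the origin in 512-byte sectors
# $2 = origin device (path or major:minor)
origin_table() {
    printf '0 %s snapshot-origin %s\n' "$1" "$2"
}

# Loading the table needs root; "dmsetup create" reads it from stdin, e.g.:
#   origin_table 2097152 254:11 | dmsetup create volumeGroup-base
origin_table 2097152 254:11
```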


-  snapshot <origin> <COW device> <persistent?> <chunksize>
   [<# feature args> [<arg>]*]

A snapshot of the <origin> block device is created. Changed chunks of
<chunksize> sectors will be stored on the <COW device>.  Writes will
only go to the <COW device>.  Reads will come from the <COW device> or
from <origin> for unchanged data.  <COW device> will often be
smaller than the origin, and if it fills up the snapshot will become
useless and be disabled, returning errors.  So it is important to monitor
the amount of free space and expand the <COW device> before it fills up.
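
One way to monitor COW usage is sketched below.  It assumes the status line
ends with "<sectors_allocated>/<total_sectors> <metadata_sectors>", as
described later in this document.  The helper name "cow_usage_percent" is
hypothetical; a real monitor would feed it the output of
dmsetup status <name> and pick its own threshold.

```shell
# Hypothetical helper: print COW usage as a percentage from one status line.
# Expected input: "<start> <length> snapshot <allocated>/<total> <metadata>".
# Prints "unknown" when the fourth field is not an <allocated>/<total> pair.
cow_usage_percent() {
    printf '%s\n' "$1" | awk '{
        if (split($4, f, "/") != 2) {
            print "unknown"
            exit
        }
        printf "%.2f\n", f[1] * 100 / f[2]
    }'
}

# Status line taken from the merge example in this document.
cow_usage_percent "0 8388608 snapshot 397896/2097152 1560"
```

For the sample line this gives 18.97, which agrees with the Snap% column
that lvs reports in the same example.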

<persistent?> is P (Persistent) or N (Not persistent - will not survive
after reboot).  O (Overflow) can be added as a persistent store option
to allow userspace to advertise its support for seeing "Overflow" in the
snapshot status.  So supported store types are "P", "PO" and "N".
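
The following sketch builds a snapshot table line and rejects any store type
other than the three listed above.  The helper name "snapshot_table" is
hypothetical, and the sample values come from the LVM2 example later in this
document.

```shell
# Hypothetical helper: print a snapshot table line after checking the store type.
# $1 = length in sectors   $2 = origin   $3 = COW device
# $4 = store type (P, PO or N)   $5 = chunk size in sectors
snapshot_table() {
    case "$4" in
        P|PO|N) ;;
        *) echo "unsupported store type: $4" >&2; return 1 ;;
    esac
    printf '0 %s snapshot %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5"
}

snapshot_table 2097152 254:11 254:12 P 16
```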

The difference between persistent and transient is that with transient
snapshots less metadata must be saved on disk - it can be kept in
memory by the kernel.

When loading or unloading the snapshot target, the corresponding
snapshot-origin or snapshot-merge target must be suspended. A failure to
suspend the origin target could result in data corruption.

Optional features:

   discard_zeroes_cow - a discard issued to the snapshot device that
   maps to entire chunks will zero the corresponding exception(s) in
   the snapshot's exception store.

   discard_passdown_origin - a discard to the snapshot device is passed
   down to the snapshot-origin's underlying device.  This doesn't cause
   copy-out to the snapshot exception store because the snapshot-origin
   target is bypassed.

   The discard_passdown_origin feature depends on the discard_zeroes_cow
   feature being enabled.
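
The dependency between the two features can be checked before a table is
loaded.  The sketch below prints the "<# feature args> [<arg>]*" suffix of a
table line; the helper name "feature_args" is hypothetical and only the two
features named above are accepted.

```shell
# Hypothetical helper: print the feature-argument suffix of a snapshot table.
# Fails if discard_passdown_origin is requested without discard_zeroes_cow.
feature_args() {
    zeroes=0
    passdown=0
    for f in "$@"; do
        case "$f" in
            discard_zeroes_cow) zeroes=1 ;;
            discard_passdown_origin) passdown=1 ;;
            *) echo "unknown feature: $f" >&2; return 1 ;;
        esac
    done
    if [ "$passdown" -eq 1 ] && [ "$zeroes" -eq 0 ]; then
        echo "discard_passdown_origin requires discard_zeroes_cow" >&2
        return 1
    fi
    # With no features requested, print nothing (the suffix is optional).
    if [ "$#" -gt 0 ]; then
        echo "$# $*"
    fi
}

feature_args discard_zeroes_cow discard_passdown_origin
```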


-  snapshot-merge <origin> <COW device> <persistent> <chunksize>
   [<# feature args> [<arg>]*]

This target takes the same table arguments as the snapshot target, except
that it only works with persistent snapshots.  It assumes the role of the
"snapshot-origin" target and must not be loaded if the "snapshot-origin"
is still present for <origin>.

It creates a merging snapshot that takes control of the changed chunks
stored in the <COW device> of an existing snapshot, through a handover
procedure, and merges these chunks back into the <origin>.  Once merging
has started (in the background), the <origin> may be opened and the merge
will continue while I/O is flowing to it.  Changes to the <origin> are
deferred until the merging snapshot's corresponding chunk(s) have been
merged.  Once merging has started, the snapshot device associated with
the "snapshot" target will return -EIO when accessed.


How snapshot is used by LVM2
============================
When you create the first LVM2 snapshot of a volume, four dm devices are used:

1) a device containing the original mapping table of the source volume;
2) a device used as the <COW device>;
3) a "snapshot" device, combining #1 and #2, which is the visible snapshot
   volume;
4) the "original" volume (which uses the device number used by the original
   source volume), whose table is replaced by a "snapshot-origin" mapping
   from device #1.

A fixed naming scheme is used, so with the following commands::

  lvcreate -L 1G -n base volumeGroup
  lvcreate -L 100M --snapshot -n snap volumeGroup/base

we'll have this situation (with volumes in above order)::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
  volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
  volumeGroup-base: 0 2097152 snapshot-origin 254:11

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
  brw-------  1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
  brw-------  1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base
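
The numbers in these tables can be cross-checked with a little arithmetic,
given that device-mapper counts in 512-byte sectors.  The variable names
below are illustrative only, and the observation that the COW area starts
right after the "-real" area applies to this particular layout, not to LVM2
in general.

```shell
SECTOR=512                                        # device-mapper sector size in bytes

base_sectors=$((1024 * 1024 * 1024 / SECTOR))     # lvcreate -L 1G
cow_sectors=$((100 * 1024 * 1024 / SECTOR))       # lvcreate -L 100M
cow_start=$((384 + base_sectors))                 # -real starts at sector 384 on 8:19
chunk_bytes=$((16 * SECTOR))                      # chunksize 16 in the snapshot table

echo "$base_sectors $cow_sectors $cow_start $chunk_bytes"
```

This prints 2097152 204800 2097536 8192: the two table lengths, the start
offset of volumeGroup-snap-cow, and a chunk size of 8 KiB.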


How snapshot-merge is used by LVM2
==================================
A merging snapshot assumes the role of the "snapshot-origin" while
merging.  As such, the "snapshot-origin" is replaced with
"snapshot-merge".  The "-real" device is not changed and the "-cow"
device is renamed to <origin name>-cow to aid LVM2's cleanup of the
merging snapshot after it completes.  The "snapshot" that hands over its
COW device to the "snapshot-merge" is deactivated (unless using lvchange
--refresh); but if it is left active it will simply return I/O errors.

A snapshot will merge into its origin with the following command::

  lvconvert --merge volumeGroup/snap

We'll now have this situation::

  # dmsetup table|grep volumeGroup

  volumeGroup-base-real: 0 2097152 linear 8:19 384
  volumeGroup-base-cow: 0 204800 linear 8:19 2097536
  volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16

  # ls -lL /dev/mapper/volumeGroup-*
  brw-------  1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
  brw-------  1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
  brw-------  1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base


How to determine when merging is complete
=========================================
The snapshot-merge and snapshot status lines end with::

  <sectors_allocated>/<total_sectors> <metadata_sectors>

Both <sectors_allocated> and <total_sectors> include both data and metadata.
During merging, the number of sectors allocated gets smaller and
smaller.  Merging has finished when the number of sectors holding data
is zero, in other words <sectors_allocated> == <metadata_sectors>.
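
The completion test can be written as a small check on one status line.
This is a sketch only: the helper name "merge_finished" is hypothetical, and
it treats any line that does not carry the numeric tail shown above as "not
finished".

```shell
# Hypothetical helper: succeed (exit 0) when a status line shows that
# <sectors_allocated> equals <metadata_sectors>, i.e. no data sectors remain.
merge_finished() {
    printf '%s\n' "$1" | awk '
        { split($4, f, "/"); ok = (NF >= 5 && f[1] == $5) }
        END { exit !ok }'
}

# Status lines taken from the example in this document.
merge_finished "0 8388608 snapshot-merge 281688/2097152 1104" || echo "still merging"
merge_finished "0 8388608 snapshot-merge 16/2097152 16" && echo "merge finished"
```

A polling loop could call it as
merge_finished "$(dmsetup status volumeGroup-base)" with a sleep between
attempts; that needs root and assumes the device still carries the
snapshot-merge table when it is queried.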

Here is a practical example (using a hybrid of lvm and dmsetup commands)::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g
    snap    volumeGroup swi-a- 1.00g base  18.97

  # dmsetup status volumeGroup-snap
  0 8388608 snapshot 397896/2097152 1560
                                    ^^^^ metadata sectors

  # lvconvert --merge -b volumeGroup/snap
    Merging of volume snap started.

  # lvs volumeGroup/snap
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup Owi-a- 4.00g          17.23

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 281688/2097152 1104

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 180480/2097152 712

  # dmsetup status volumeGroup-base
  0 8388608 snapshot-merge 16/2097152 16

Merging has finished.

::

  # lvs
    LV      VG          Attr   LSize Origin  Snap%  Move Log Copy%  Convert
    base    volumeGroup owi-a- 4.00g