Notes on Filesystem Layout
--------------------------

These notes describe what mkcramfs generates.  Kernel requirements are
a bit looser: e.g. the kernel doesn't care if the <file_data> items are
swapped around, though it does care that the directory entries (inodes)
in a given directory are contiguous, as readdir relies on this.

All data is currently in host-endian format; neither mkcramfs nor the
kernel ever does swabbing.  (See section `Block Size' below.)

<filesystem>:
	<superblock>
	<directory_structure>
	<data>

<superblock>: struct cramfs_super (see cramfs_fs.h).

<directory_structure>:
	For each file:
		struct cramfs_inode (see cramfs_fs.h).
		Filename.  Not generally null-terminated, but it is
		 null-padded to a multiple of 4 bytes.

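As a rough illustration, one directory entry could be decoded as in
the Python sketch below.  It assumes a little-endian image and the
bitfield layout of struct cramfs_inode in cramfs_fs.h as laid out by
gcc on little-endian hosts (mode:16, uid:16, size:24, gid:8, namelen:6,
offset:26, with namelen and offset counted in units of 4 bytes).  The
name parse_entry is hypothetical and not part of any cramfs tool.

```python
import struct

INODE_SIZE = 12  # size of struct cramfs_inode in bytes


def parse_entry(buf, pos):
    """Decode one inode plus its padded filename starting at buf[pos]."""
    # Assumption: little-endian image, first bitfield in the low bits.
    w0, w1, w2 = struct.unpack_from("<III", buf, pos)
    inode = {
        "mode": w0 & 0xFFFF,
        "uid": w0 >> 16,
        "size": w1 & 0xFFFFFF,
        "gid": w1 >> 24,
        "offset": (w2 >> 6) * 4,      # stored divided by 4
    }
    namelen = (w2 & 0x3F) * 4         # padded length, a multiple of 4
    raw = buf[pos + INODE_SIZE:pos + INODE_SIZE + namelen]
    inode["name"] = raw.rstrip(b"\0")  # drop the NUL padding
    return inode, pos + INODE_SIZE + namelen
```

The second return value is the position of the next entry, so a
directory can be walked by calling parse_entry repeatedly.
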
The order of inode traversal is described as "width-first" (not to be
confused with breadth-first); i.e. like depth-first but listing all of
a directory's entries before recursing down its subdirectories: the
same order as `ls -AUR' (but without the /^\..*:$/ directory header
lines); put another way, the same order as `find -type d -exec
ls -AU1 {} \;'.

Beginning in 2.4.7, directory entries are sorted.  This optimization
allows cramfs_lookup to return more quickly when a filename does not
exist, speeds up user-space directory sorts, etc.

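The width-first order with sorted entries can be sketched in Python as
follows.  The nested-dict tree representation and the name width_first
are assumptions made for the example, and Python's default string
ordering stands in for whatever comparison mkcramfs actually uses.

```python
def width_first(tree, prefix=""):
    """Yield paths in width-first order.

    `tree` maps each name to a nested dict (directory) or None (file).
    """
    names = sorted(tree)              # entries within a directory are sorted
    for name in names:                # first, every entry of this directory
        yield prefix + name
    for name in names:                # then descend into each subdirectory
        if isinstance(tree[name], dict):
            yield from width_first(tree[name], prefix + name + "/")
```

For a tree with entries a, b/ and c/ at the top level, all three are
listed before anything inside b/ or c/.
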
<data>:
	One <file_data> for each file that's either a symlink or a
	 regular file of non-zero st_size.

<file_data>:
	nblocks * <block_pointer>
	 (where nblocks = (st_size - 1) / blksize + 1)
	nblocks * <block>
	padding to multiple of 4 bytes

The i'th <block_pointer> for a file stores the byte offset of the
*end* of the i'th <block> (i.e. one past the last byte, which is the
same as the start of the (i+1)'th <block> if there is one).  The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.

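A minimal Python sketch of how the block boundaries follow from the
pointer table is given below.  It assumes a little-endian image, a
non-zero st_size, and pointers without any of the extended flags; the
name block_extents and its parameters are hypothetical.

```python
import struct


def block_extents(image, data_offset, st_size, blksize=4096):
    """Return the (start, end) byte offsets of each <block> of one file."""
    nblocks = (st_size - 1) // blksize + 1
    # Assumption: 32-bit little-endian pointers, no flag bits set.
    ptrs = struct.unpack_from("<%dI" % nblocks, image, data_offset)
    start = data_offset + 4 * nblocks   # first block follows the last pointer
    extents = []
    for end in ptrs:
        extents.append((start, end))
        start = end                     # next block begins where this one ends
    return extents
```
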
When the CRAMFS_FLAG_EXT_BLOCK_POINTERS capability bit is set, each
<block_pointer>'s top bits may contain special flags as follows:

CRAMFS_BLK_FLAG_UNCOMPRESSED (bit 31):
	The block data is not compressed and should be copied verbatim.

CRAMFS_BLK_FLAG_DIRECT_PTR (bit 30):
	The <block_pointer> stores the actual block start offset and not
	its end, shifted right by 2 bits. The block must therefore be
	aligned to a 4-byte boundary. The block size is blksize if
	CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified; otherwise the
	compressed data length is stored in the first 2 bytes of the
	block data. This is used to allow discontiguous data layout
	and specific data block alignments e.g. for XIP applications.

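The flag handling described above might be sketched in Python like
this.  The constant names mirror the flags in the text; the helper
decode_block_pointer is hypothetical.

```python
CRAMFS_BLK_FLAG_UNCOMPRESSED = 1 << 31
CRAMFS_BLK_FLAG_DIRECT_PTR = 1 << 30
CRAMFS_BLK_FLAGS = CRAMFS_BLK_FLAG_UNCOMPRESSED | CRAMFS_BLK_FLAG_DIRECT_PTR


def decode_block_pointer(ptr):
    """Split a 32-bit <block_pointer> into (offset, uncompressed, direct)."""
    uncompressed = bool(ptr & CRAMFS_BLK_FLAG_UNCOMPRESSED)
    direct = bool(ptr & CRAMFS_BLK_FLAG_DIRECT_PTR)
    value = ptr & ~CRAMFS_BLK_FLAGS
    if direct:
        offset = value << 2   # block start, stored shifted right by 2 bits
    else:
        offset = value        # block end, stored as a plain byte offset
    return offset, uncompressed, direct
```

Note that the returned offset means "start of block" for a direct
pointer and "end of block" otherwise.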

The order of <file_data>'s is a depth-first descent of the directory
tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
-print'.


<block>: The i'th <block> is the output of zlib's compress function
applied to the i'th blksize-sized chunk of the input data if the
corresponding CRAMFS_BLK_FLAG_UNCOMPRESSED <block_pointer> bit is not
set, otherwise it is the input data directly.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size.  (See <block_pointer> above.)

<block>s are merely byte-aligned, not generally u32-aligned.

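To make the layout concrete, here is a hedged Python sketch that
builds one <file_data> item in the plain (no extended flags) format.
It assumes little-endian pointers and uses zlib.compress with default
settings; build_file_data and its `base' parameter (the byte offset at
which the item will be placed in the image) are hypothetical, and this
is not a description of what mkcramfs does internally.

```python
import struct
import zlib


def build_file_data(data, base, blksize=4096):
    """Return <file_data> bytes for `data`, to be placed at offset `base`."""
    chunks = [data[i:i + blksize] for i in range(0, len(data), blksize)]
    blocks = [zlib.compress(c) for c in chunks]
    end = base + 4 * len(blocks)      # blocks start right after the pointers
    ptrs = []
    for b in blocks:
        end += len(b)
        ptrs.append(end)              # pointer i = end offset of block i
    out = struct.pack("<%dI" % len(ptrs), *ptrs) + b"".join(blocks)
    return out + b"\0" * (-len(out) % 4)   # pad to a multiple of 4 bytes
```

The blocks are simply concatenated, so each starts at whatever byte
offset the previous one ended on.
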
When CRAMFS_BLK_FLAG_DIRECT_PTR is specified then the corresponding
<block> may be located anywhere and is not necessarily contiguous with
the previous/next blocks. In that case it is minimally u32-aligned.
If CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified then the size is always
blksize except for the last block which is limited by the file length.
If CRAMFS_BLK_FLAG_DIRECT_PTR is set and CRAMFS_BLK_FLAG_UNCOMPRESSED
is not set then the first 2 bytes of the block contain the size of the
remaining block data as this cannot be determined from the placement of
logically adjacent blocks.

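Reading a block through a direct pointer could look like the Python
sketch below.  It assumes a little-endian image (so the 2-byte length
prefix is read as little-endian); read_direct_block and its
`remaining' parameter are hypothetical names for the example.

```python
import struct
import zlib

UNCOMPRESSED = 1 << 31   # CRAMFS_BLK_FLAG_UNCOMPRESSED
DIRECT_PTR = 1 << 30     # CRAMFS_BLK_FLAG_DIRECT_PTR


def read_direct_block(image, ptr, remaining, blksize=4096):
    """Return the data of a block whose pointer has DIRECT_PTR set.

    `remaining` is the number of file bytes not yet read.
    """
    assert ptr & DIRECT_PTR
    start = (ptr & ~(UNCOMPRESSED | DIRECT_PTR)) << 2
    if ptr & UNCOMPRESSED:
        length = min(blksize, remaining)    # the last block may be short
        return image[start:start + length]
    # Compressed: the first 2 bytes give the size of the data that follows.
    (length,) = struct.unpack_from("<H", image, start)
    return zlib.decompress(image[start + 2:start + 2 + length])
```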

Holes
-----

This kernel supports cramfs holes (i.e. [efficient representation of]
blocks in uncompressed data consisting entirely of NUL bytes), but by
default mkcramfs doesn't test for & create holes, since cramfs in
kernels up to at least 2.3.39 didn't support holes.  Run mkcramfs
with -z if you want it to create files that can have holes in them.

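The test for hole candidates can be sketched in Python as below.  This
only illustrates the "entirely NUL bytes" check from the definition
above; how a hole is then encoded in the image is not covered here,
and hole_blocks is a hypothetical helper.

```python
def hole_blocks(data, blksize=4096):
    """Return the indices of blksize-sized chunks that are entirely NUL."""
    holes = []
    for i in range(0, len(data), blksize):
        chunk = data[i:i + blksize]
        if chunk.count(0) == len(chunk):   # every byte in the chunk is NUL
            holes.append(i // blksize)
    return holes
```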

Tools
-----

The cramfs user-space tools, including mkcramfs and cramfsck, are
located at <http://sourceforge.net/projects/cramfs/>.


Future Development
==================

Block Size
----------

(Block size in cramfs refers to the size of input data that is
compressed at a time.  It's intended to be somewhere around
PAGE_SIZE for cramfs_readpage's convenience.)

The superblock ought to indicate the block size that the fs was
written for, since comments in <linux/pagemap.h> indicate that
PAGE_SIZE may grow in future (if I interpret the comment
correctly).

Currently, mkcramfs #define's PAGE_SIZE as 4096 and uses that
for blksize, whereas Linux-2.3.39 uses the kernel's own PAGE_SIZE
(which can be as large as 32KB on arm).  This discrepancy is a bug,
though it's not clear which should be changed.

One option is to change mkcramfs to take its PAGE_SIZE from
<asm/page.h>.  Personally I don't like this option, but it does
require the least amount of change: just change `#define
PAGE_SIZE (4096)' to `#include <asm/page.h>'.  The disadvantage
is that the generated cramfs cannot always be shared between different
kernels, not even necessarily kernels of the same architecture if
PAGE_SIZE is subject to change between kernel versions
(currently possible with arm and ia64).

The remaining options try to make cramfs more sharable.

One part of that is addressing endianness.  The two options here are
`always use little-endian' (like ext2fs) or `writer chooses
endianness; kernel adapts at runtime'.  Little-endian wins because of
code simplicity and little CPU overhead even on big-endian machines.

The cost of swabbing is changing the code to use the le32_to_cpu
etc. macros as used by ext2fs.  We don't need to swab the compressed
data, only the superblock, inodes and block pointers.

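As a user-space analogy (not kernel code), the difference between the
current host-endian reads and a fixed little-endian format can be
sketched with Python's struct module.  Both helper names are
hypothetical.

```python
import struct


def read_u32_host(buf, pos):
    """Read a u32 in host byte order, as the current format requires."""
    return struct.unpack_from("=I", buf, pos)[0]


def read_u32_le(buf, pos):
    """Read a u32 as little-endian on any host (cf. le32_to_cpu)."""
    return struct.unpack_from("<I", buf, pos)[0]
```

Only read_u32_le returns the same value for the same image bytes on
both little-endian and big-endian machines.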

The other part of making cramfs more sharable is choosing a block
size.  The options are:

  1. Always 4096 bytes.

  2. Writer chooses blocksize; kernel adapts but rejects blocksize >
     PAGE_SIZE.

  3. Writer chooses blocksize; kernel adapts even to blocksize >
     PAGE_SIZE.

It's easy enough to change the kernel to use a smaller value than
PAGE_SIZE: just make cramfs_readpage read multiple blocks.

The cost of option 1 is that kernels with a larger PAGE_SIZE
value don't get as good compression as they can.

The cost of option 2 relative to option 1 is that the code uses
variables instead of #define'd constants.  The gain is that people
with kernels having larger PAGE_SIZE can make use of that if
they don't mind their cramfs being inaccessible to kernels with
smaller PAGE_SIZE values.

Option 3 is easy to implement if we don't mind being CPU-inefficient:
e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
must be no larger than 32KB) and discard what it doesn't need.
Getting readpage to read into all the covered pages is harder.

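The CPU-inefficient variant of option 3 might look roughly like this
Python sketch.  It is a user-space illustration only; the 4096-byte
PAGE_SIZE, the zero-filling of a short final page, and the name
read_page are assumptions made for the example.

```python
import zlib

MAX_BLKSIZE = 32 * 1024   # upper bound on the block size, as stated above
PAGE_SIZE = 4096          # assumed page size for this sketch


def read_page(compressed_block, page_in_block):
    """Decompress a whole block, then keep only the requested page."""
    buf = zlib.decompress(compressed_block)
    assert len(buf) <= MAX_BLKSIZE
    start = page_in_block * PAGE_SIZE
    page = buf[start:start + PAGE_SIZE]   # the rest of buf is discarded
    return page.ljust(PAGE_SIZE, b"\0")   # zero-fill a short final page
```

Every page read decompresses the whole block again, which is where
the wasted CPU time comes from.
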
The main advantage of option 3 over options 1 and 2 is better
compression.  The cost is greater complexity.  Probably not worth it,
but I hope someone will disagree.  (If it is implemented, then I'll
re-use that code in e2compr.)


Another cost of 2 and 3 over 1 is making mkcramfs use a different
block size, but that just means adding and parsing a -b option.


Inode Size
----------

Given that cramfs will probably be used for CDs etc. as well as just
silicon ROMs, it might make sense to expand the inode a little from
its current 12 bytes.  Inodes other than the root inode are followed
by filename, so the expansion doesn't even have to be a multiple of 4
bytes.