Notes on Filesystem Layout
--------------------------

These notes describe what mkcramfs generates. Kernel requirements are
a bit looser, e.g. it doesn't care if the <file_data> items are
swapped around (though it does care that directory entries (inodes) in
a given directory are contiguous, as this is used by readdir).

All data is currently in host-endian format; neither mkcramfs nor the
kernel ever do swabbing. (See section `Block Size' below.)

<filesystem>:
    <superblock>
    <directory_structure>
    <data>

<superblock>: struct cramfs_super (see cramfs_fs.h).

<directory_structure>:
    For each file:
        struct cramfs_inode (see cramfs_fs.h).
        Filename. Not generally null-terminated, but it is
         null-padded to a multiple of 4 bytes.

The order of inode traversal is described as "width-first" (not to be
confused with breadth-first); i.e. like depth-first but listing all of
a directory's entries before recursing down its subdirectories: the
same order as `ls -AUR' (but without the /^\..*:$/ directory header
lines); put another way, the same order as `find -type d -exec
ls -AU1 {} \;'.

Beginning in 2.4.7, directory entries are sorted. This optimization
allows cramfs_lookup to return more quickly when a filename does not
exist, speeds up user-space directory sorts, etc.
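
The width-first order and the filename padding described above can be
sketched as follows. This is an illustration only, not code taken from
mkcramfs: the nested-dict `tree', the helper names, and the use of plain
string order for the per-directory sort are all assumptions.

```python
def width_first(tree, path=""):
    """Yield entry paths in "width-first" order.

    `tree` is a hypothetical in-memory stand-in for a directory: a dict
    mapping each name to None (a non-directory) or to a nested dict.
    """
    names = sorted(tree)          # assumption: plain string order
    for name in names:            # first, every entry of this directory
        yield path + name
    for name in names:            # then descend into each subdirectory
        if isinstance(tree[name], dict):
            yield from width_first(tree[name], path + name + "/")


def padded_name_len(name: bytes) -> int:
    """Length of a filename once null-padded to a multiple of 4 bytes."""
    return (len(name) + 3) & ~3
```

Note that all of a directory's entries are emitted before any of its
subdirectories is entered, which is what distinguishes this order from
an ordinary depth-first walk.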

<data>:
    One <file_data> for each file that's either a symlink or a
     regular file of non-zero st_size.

<file_data>:
    nblocks * <block_pointer>
     (where nblocks = (st_size - 1) / blksize + 1)
    nblocks * <block>
    padding to multiple of 4 bytes

The i'th <block_pointer> for a file stores the byte offset of the
*end* of the i'th <block> (i.e. one past the last byte, which is the
same as the start of the (i+1)'th <block> if there is one). The first
<block> immediately follows the last <block_pointer> for the file.
<block_pointer>s are each 32 bits long.

When the CRAMFS_FLAG_EXT_BLOCK_POINTERS capability bit is set, each
<block_pointer>'s top bits may contain special flags as follows:

CRAMFS_BLK_FLAG_UNCOMPRESSED (bit 31):
    The block data is not compressed and should be copied verbatim.

CRAMFS_BLK_FLAG_DIRECT_PTR (bit 30):
    The <block_pointer> stores the actual block start offset and not
    its end, shifted right by 2 bits. The block must therefore be
    aligned to a 4-byte boundary. The block size is blksize if
    CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified; otherwise the
    compressed data length is stored in the first 2 bytes of
    the block data. This is used to allow discontiguous data layout
    and specific data block alignments e.g. for XIP applications.
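
The nblocks formula and the <block_pointer> flag bits above can be
sketched as follows. This is a minimal illustration, not the kernel's
code: the function names, the `ext_block_pointers' parameter and the
tuple return shape are assumptions; only the bit positions and the
2-bit shift come from the description above.

```python
CRAMFS_BLK_FLAG_UNCOMPRESSED = 1 << 31
CRAMFS_BLK_FLAG_DIRECT_PTR = 1 << 30
FLAG_MASK = CRAMFS_BLK_FLAG_UNCOMPRESSED | CRAMFS_BLK_FLAG_DIRECT_PTR


def nblocks(st_size: int, blksize: int = 4096) -> int:
    """Number of blocks for a file of non-zero st_size."""
    return (st_size - 1) // blksize + 1


def decode_block_pointer(ptr: int, ext_block_pointers: bool = True):
    """Return (kind, byte_offset, uncompressed) for a 32-bit pointer.

    kind is "end" for an ordinary pointer (offset of the block's end)
    and "start" for a direct pointer (offset of the block's start).
    """
    if not ext_block_pointers:
        # Without the capability bit, all 32 bits are the end offset.
        return ("end", ptr, False)
    uncompressed = bool(ptr & CRAMFS_BLK_FLAG_UNCOMPRESSED)
    value = ptr & ~FLAG_MASK
    if ptr & CRAMFS_BLK_FLAG_DIRECT_PTR:
        # The start offset was stored shifted right by 2 bits.
        return ("start", value << 2, uncompressed)
    return ("end", value, uncompressed)
```

For a direct, compressed block the caller would still have to read the
2-byte length at the start of the block data to find where it ends.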


The order of <file_data>'s is a depth-first descent of the directory
tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
-print'.


<block>: The i'th <block> is the output of zlib's compress function
applied to the i'th blksize-sized chunk of the input data if the
corresponding CRAMFS_BLK_FLAG_UNCOMPRESSED <block_pointer> bit is not set,
otherwise it is the input data directly.
(For the last <block> of the file, the input may of course be smaller.)
Each <block> may be a different size. (See <block_pointer> above.)

<block>s are merely byte-aligned, not generally u32-aligned.

When CRAMFS_BLK_FLAG_DIRECT_PTR is specified then the corresponding
<block> may be located anywhere and not necessarily contiguous with
the previous/next blocks. In that case it is minimally u32-aligned.
If CRAMFS_BLK_FLAG_UNCOMPRESSED is also specified then the size is always
blksize except for the last block which is limited by the file length.
If CRAMFS_BLK_FLAG_DIRECT_PTR is set and CRAMFS_BLK_FLAG_UNCOMPRESSED
is not set then the first 2 bytes of the block contain the size of the
remaining block data as this cannot be determined from the placement of
logically adjacent blocks.


Holes
-----

This kernel supports cramfs holes (i.e.
[efficient representation of]
blocks in uncompressed data consisting entirely of NUL bytes), but by
default mkcramfs doesn't test for & create holes, since cramfs in
kernels up to at least 2.3.39 didn't support holes. Run mkcramfs
with -z if you want it to create files that can have holes in them.


Tools
-----

The cramfs user-space tools, including mkcramfs and cramfsck, are
located at <http://sourceforge.net/projects/cramfs/>.


Future Development
==================

Block Size
----------

(Block size in cramfs refers to the size of input data that is
compressed at a time. It's intended to be somewhere around
PAGE_SIZE for cramfs_readpage's convenience.)

The superblock ought to indicate the block size that the fs was
written for, since comments in <linux/pagemap.h> indicate that
PAGE_SIZE may grow in future (if I interpret the comment
correctly).

Currently, mkcramfs #define's PAGE_SIZE as 4096 and uses that
for blksize, whereas Linux-2.3.39 uses the kernel's own PAGE_SIZE
(which can be as large as 32KB on arm).
This discrepancy is a bug, though it's not clear which should be
changed.

One option is to change mkcramfs to take its PAGE_SIZE from
<asm/page.h>.
Personally I don't like this option, but it does
require the least amount of change: just change `#define
PAGE_SIZE (4096)' to `#include <asm/page.h>'. The disadvantage
is that the generated cramfs cannot always be shared between different
kernels, not even necessarily kernels of the same architecture if
PAGE_SIZE is subject to change between kernel versions
(currently possible with arm and ia64).

The remaining options try to make cramfs more sharable.

One part of that is addressing endianness. The two options here are
`always use little-endian' (like ext2fs) or `writer chooses
endianness; kernel adapts at runtime'. Little-endian wins because of
code simplicity and little CPU overhead even on big-endian machines.

The cost of swabbing is changing the code to use the le32_to_cpu
etc. macros as used by ext2fs. We don't need to swab the compressed
data, only the superblock, inodes and block pointers.


The other part of making cramfs more sharable is choosing a block
size. The options are:

  1. Always 4096 bytes.

  2. Writer chooses blocksize; kernel adapts but rejects blocksize >
     PAGE_SIZE.

  3. Writer chooses blocksize; kernel adapts even to blocksize >
     PAGE_SIZE.

It's easy enough to change the kernel to use a smaller value than
PAGE_SIZE: just make cramfs_readpage read multiple blocks.
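
As a rough illustration of "read multiple blocks", the following
sketch works out which blocks a single page would have to cover when
blksize is smaller than PAGE_SIZE. It is not kernel code:
`blocks_for_page' and its parameters are hypothetical, and it assumes
that blksize evenly divides the page size.

```python
def blocks_for_page(page_index, page_size, blksize, nblocks):
    """Indices of the file's blocks that fill page `page_index`.

    Assumes blksize evenly divides page_size (which also means
    blksize <= page_size); `nblocks` is the file's block count.
    """
    if blksize <= 0 or page_size % blksize != 0:
        raise ValueError("blksize must evenly divide page_size")
    per_page = page_size // blksize
    first = page_index * per_page
    last = min(first + per_page, nblocks)   # the file may end mid-page
    return list(range(first, last))
```

With a 16KB page and 4096-byte blocks, each page is filled from up to
four consecutive blocks; the last page of a file may need fewer.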

The cost of option 1 is that kernels with a larger PAGE_SIZE
value don't get as good compression as they can.

The cost of option 2 relative to option 1 is that the code uses
variables instead of #define'd constants. The gain is that people
with kernels having larger PAGE_SIZE can make use of that if
they don't mind their cramfs being inaccessible to kernels with
smaller PAGE_SIZE values.

Option 3 is easy to implement if we don't mind being CPU-inefficient:
e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
must be no larger than 32KB) and discard what it doesn't need.
Getting readpage to read into all the covered pages is harder.

The main advantage of option 3 over 1, 2, is better compression. The
cost is greater complexity. Probably not worth it, but I hope someone
will disagree. (If it is implemented, then I'll re-use that code in
e2compr.)


Another cost of 2 and 3 over 1 is making mkcramfs use a different
block size, but that just means adding and parsing a -b option.


Inode Size
----------

Given that cramfs will probably be used for CDs etc. as well as just
silicon ROMs, it might make sense to expand the inode a little from
its current 12 bytes.
Inodes other than the root inode are followed
by filename, so the expansion doesn't even have to be a multiple of 4
bytes.