.. SPDX-License-Identifier: GPL-2.0


The Second Extended Filesystem
==============================

ext2 was originally released in January 1993.  Written by Rémy Card,
Theodore Ts'o and Stephen Tweedie, it was a major rewrite of the
Extended Filesystem.  It is currently still (April 2001) the predominant
filesystem in use by Linux.  There are also implementations available
for NetBSD, FreeBSD, the GNU HURD, Windows 95/98/NT, OS/2 and RISC OS.

Options
=======

Most defaults are determined by the filesystem superblock, and can be
set using tune2fs(8). Kernel-determined defaults are indicated by (*).

====================    ===     ================================================
bsddf                   (*)     Makes ``df`` act like BSD.
minixdf                         Makes ``df`` act like Minix.

check=none, nocheck     (*)     Don't do extra checking of bitmaps on mount
                                (check=normal and check=strict options removed)

dax                             Use direct access (no page cache).  See
                                Documentation/filesystems/dax.txt.

debug                           Extra debugging information is sent to the
                                kernel syslog.  Useful for developers.

errors=continue                 Keep going on a filesystem error.
errors=remount-ro               Remount the filesystem read-only on an error.
errors=panic                    Panic and halt the machine if an error occurs.

grpid, bsdgroups                Give objects the same group ID as their parent.
nogrpid, sysvgroups             New objects have the group ID of their creator.

nouid32                         Use 16-bit UIDs and GIDs.

oldalloc                        Enable the old block allocator. Orlov should
                                have better performance, we'd like to get some
                                feedback if it's the contrary for you.
orlov                   (*)     Use the Orlov block allocator.
                                (See http://lwn.net/Articles/14633/ and
                                http://lwn.net/Articles/14446/.)

resuid=n                        The user ID which may use the reserved blocks.
resgid=n                        The group ID which may use the reserved blocks.

sb=n                            Use alternate superblock at this location.

user_xattr                      Enable "user." POSIX Extended Attributes
                                (requires CONFIG_EXT2_FS_XATTR).
nouser_xattr                    Don't support "user." extended attributes.

acl                             Enable POSIX Access Control Lists support
                                (requires CONFIG_EXT2_FS_POSIX_ACL).
noacl                           Don't support POSIX ACLs.

nobh                            Do not attach buffer_heads to file pagecache.

quota, usrquota                 Enable user disk quota support
                                (requires CONFIG_QUOTA).

grpquota                        Enable group disk quota support
                                (requires CONFIG_QUOTA).
====================    ===     ================================================

The noquota option is silently ignored by ext2.


Specification
=============

ext2 shares many properties with traditional Unix filesystems.
It has the concepts of blocks, inodes and directories.  It has space in
the specification for Access Control Lists (ACLs), fragments, undeletion
and compression though these are not yet implemented (some are available
as separate patches).  There is also a versioning mechanism to allow new
features (such as journalling) to be added in a maximally compatible
manner.

Blocks
------

The space in the device or file is split up into blocks.  These are
a fixed size, of 1024, 2048 or 4096 bytes (8192 bytes on Alpha systems),
which is decided when the filesystem is created.  Smaller blocks mean
less wasted space per file, but require slightly more accounting overhead,
and also impose other limits on the size of files and the filesystem.

Block Groups
------------

Blocks are clustered into block groups in order to reduce fragmentation
and minimise the amount of head seeking when reading a large amount
of consecutive data.  Information about each block group is kept in a
descriptor table stored in the block(s) immediately after the superblock.
Two blocks near the start of each group are reserved for the block usage
bitmap and the inode usage bitmap which show which blocks and inodes
are in use.  Since each bitmap is limited to a single block, with one bit
per block or inode, this means that a block group can contain at most
8 times as many blocks as a block has bytes.
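
As a rough illustration of the bitmap limit described above, the following
Python sketch computes the largest possible block group for a few block
sizes.  It assumes one bitmap bit per block, as the text describes; the
function names are made up for this example and are not kernel symbols.

```python
def max_blocks_per_group(block_size: int) -> int:
    """One bitmap block holds 8 bits per byte, one bit per tracked block."""
    return 8 * block_size


def max_group_bytes(block_size: int) -> int:
    """Largest amount of space a single block group can cover, in bytes."""
    return max_blocks_per_group(block_size) * block_size


if __name__ == "__main__":
    for block_size in (1024, 2048, 4096):
        blocks = max_blocks_per_group(block_size)
        mib = max_group_bytes(block_size) // (1024 * 1024)
        print(f"{block_size}-byte blocks: {blocks} blocks per group, {mib} MiB")
```

With 4096-byte blocks this gives 32768 blocks per group, i.e. 128 MiB.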

The block(s) following the bitmaps in each block group are designated
as the inode table for that block group and the remainder are the data
blocks.  The block allocation algorithm attempts to allocate data blocks
in the same block group as the inode which contains them.

The Superblock
--------------

The superblock contains all the information about the configuration of
the filing system.  The primary copy of the superblock is stored at an
offset of 1024 bytes from the start of the device, and it is essential
to mounting the filesystem.  Since it is so important, backup copies of
the superblock are stored in block groups throughout the filesystem.
The first version of ext2 (revision 0) stores a copy at the start of
every block group, along with backups of the group descriptor block(s).
Because this can consume a considerable amount of space for large
filesystems, later revisions can optionally reduce the number of backup
copies by only putting backups in specific groups (this is the sparse
superblock feature).  The groups chosen are 0, 1 and powers of 3, 5 and 7.
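
The sparse superblock rule can be sketched in a few lines of Python.  This
is only an illustration of the rule as stated above (groups 0, 1 and powers
of 3, 5 and 7); the helper names are hypothetical and the code is not taken
from the kernel or e2fsprogs.

```python
def holds_superblock_copy(group: int) -> bool:
    """True if the group is 0, 1, or a power of 3, 5 or 7."""
    if group in (0, 1):
        return True
    for base in (3, 5, 7):
        power = base
        # Walk base, base**2, base**3, ... until we reach or pass the group.
        while power < group:
            power *= base
        if power == group:
            return True
    return False


def superblock_groups(group_count: int) -> list:
    """List every group below group_count that carries a superblock copy."""
    return [g for g in range(group_count) if holds_superblock_copy(g)]


if __name__ == "__main__":
    print(superblock_groups(50))
```

For a filesystem with 50 block groups only nine groups carry a copy
(0, 1, 3, 5, 7, 9, 25, 27 and 49), instead of all 50 as in revision 0.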

The information in the superblock contains fields such as the total
number of inodes and blocks in the filesystem and how many are free,
how many inodes and blocks are in each block group, when the filesystem
was mounted (and if it was cleanly unmounted), when it was modified,
what version of the filesystem it is (see the Revisions section below)
and which OS created it.

If the filesystem is revision 1 or higher, then there are extra fields,
such as a volume name, a unique identification number, the inode size,
and space for optional filesystem features to store configuration info.

All fields in the superblock (as in all other ext2 structures) are stored
on the disc in little endian format, so a filesystem is portable between
machines without having to know what machine it was created on.

Inodes
------

The inode (index node) is a fundamental concept in the ext2 filesystem.
Each object in the filesystem is represented by an inode.  The inode
structure contains pointers to the filesystem blocks which contain the
data held in the object and all of the metadata about an object except
its name.  The metadata about an object includes the permissions, owner,
group, flags, size, number of blocks used, access time, change time,
modification time, deletion time, number of links, fragments, version
(for NFS) and extended attributes (EAs) and/or Access Control Lists (ACLs).

There are some reserved fields which are currently unused in the inode
structure and several which are overloaded.  One field is reserved for the
directory ACL if the inode is a directory and alternately for the top 32
bits of the file size if the inode is a regular file (allowing file sizes
larger than 2GB).  The translator field is unused under Linux, but is used
by the HURD to reference the inode of a program which will be used to
interpret this object.  Most of the remaining reserved fields have been
used up for both Linux and the HURD for larger owner and group fields.
The HURD also has a larger mode field so it uses another of the remaining
fields to store the extra mode bits.

There are pointers to the first 12 blocks which contain the file's data
in the inode.  There is a pointer to an indirect block (which contains
pointers to the next set of blocks), a pointer to a doubly-indirect
block (which contains pointers to indirect blocks) and a pointer to a
trebly-indirect block (which contains pointers to doubly-indirect blocks).

The flags field contains some ext2-specific flags which aren't catered
for by the standard chmod flags.  These flags can be listed with lsattr
and changed with the chattr command, and allow specific filesystem
behaviour on a per-file basis.  There are flags for secure deletion,
undeletable, compression, synchronous updates, immutability, append-only,
dumpable, no-atime, indexed directories, and data-journaling.
Not all of these are supported yet.

Directories
-----------

A directory is a filesystem object and has an inode just like a file.
It is a specially formatted file containing records which associate
each name with an inode number.  Later revisions of the filesystem also
encode the type of the object (file, directory, symlink, device, fifo,
socket) to avoid the need to check the inode itself for this information
(support for taking advantage of this feature does not yet exist in
Glibc 2.2).

The inode allocation code tries to assign inodes which are in the same
block group as the directory in which they are first created.

The current implementation of ext2 uses a singly-linked list to store
the filenames in the directory; a pending enhancement uses hashing of the
filenames to allow lookup without the need to scan the entire directory.

The current implementation never removes empty directory blocks once they
have been allocated to hold more files.

Special files
-------------

Symbolic links are also filesystem objects with inodes.  They deserve
special mention because the data for them is stored within the inode
itself if the symlink is less than 60 bytes long.  It uses the fields
which would normally be used to store the pointers to data blocks.
This is a worthwhile optimisation as it avoids allocating a full
block for the symlink, and most symlinks are less than 60 characters long.

Character and block special devices never have data blocks assigned to
them.  Instead, their device number is stored in the inode, again reusing
the fields which would be used to point to the data blocks.

Reserved Space
--------------

In ext2, there is a mechanism for reserving a certain number of blocks
for a particular user (normally the super-user).  This is intended to
allow for the system to continue functioning even if non-privileged users
fill up all the space available to them (this is independent of filesystem
quotas).  It also keeps the filesystem from filling up entirely which
helps combat fragmentation.

Filesystem check
----------------

At boot time, most systems run a consistency check (e2fsck) on their
filesystems.  The superblock of the ext2 filesystem contains several
fields which indicate whether fsck should actually run (since checking
the filesystem at boot can take a long time if it is large).  fsck will
run if the filesystem was not cleanly unmounted, if the maximum mount
count has been exceeded or if the maximum time between checks has been
exceeded.
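
The three conditions above can be written down as a small Python sketch.
The ``CheckState`` class and its field names are hypothetical stand-ins for
the superblock fields mentioned in the text, and treating a zero or negative
limit as "disabled" is an assumption of this example, not a statement about
what e2fsck does.

```python
from dataclasses import dataclass


@dataclass
class CheckState:
    """Hypothetical view of the superblock fields that drive the decision."""
    cleanly_unmounted: bool
    mount_count: int       # mounts since the last check
    max_mount_count: int   # assumed: values <= 0 disable this condition
    last_check: int        # time of the last check, seconds since the epoch
    check_interval: int    # assumed: 0 disables the time-based condition


def needs_check(state: CheckState, now: int) -> bool:
    """Apply the three conditions from the text, in order."""
    if not state.cleanly_unmounted:
        return True
    if state.max_mount_count > 0 and state.mount_count > state.max_mount_count:
        return True
    if state.check_interval > 0 and now - state.last_check > state.check_interval:
        return True
    return False


if __name__ == "__main__":
    clean = CheckState(True, mount_count=5, max_mount_count=20,
                       last_check=0, check_interval=0)
    print(needs_check(clean, now=1000))
```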

Feature Compatibility
---------------------

The compatibility feature mechanism used in ext2 is sophisticated.
It safely allows features to be added to the filesystem, without
unnecessarily sacrificing compatibility with older versions of the
filesystem code.  The feature compatibility mechanism is not supported by
the original revision 0 (EXT2_GOOD_OLD_REV) of ext2, but was introduced in
revision 1.  There are three 32-bit fields, one for compatible features
(COMPAT), one for read-only compatible (RO_COMPAT) features and one for
incompatible (INCOMPAT) features.

These feature flags have specific meanings for the kernel as follows:

A COMPAT flag indicates that a feature is present in the filesystem,
but the on-disk format is 100% compatible with older on-disk formats, so
a kernel which didn't know anything about this feature could read/write
the filesystem without any chance of corrupting the filesystem (or even
making it inconsistent).  This is essentially just a flag which says
"this filesystem has a (hidden) feature" that the kernel or e2fsck may
want to be aware of (more on e2fsck and feature flags later).  The ext3
HAS_JOURNAL feature is a COMPAT flag because the ext3 journal is simply
a regular file with data blocks in it so the kernel does not need to
take any special notice of it if it doesn't understand ext3 journaling.

An RO_COMPAT flag indicates that the on-disk format is 100% compatible
with older on-disk formats for reading (i.e. the feature does not change
the visible on-disk format).  However, an old kernel writing to such a
filesystem would/could corrupt the filesystem, so this is prevented.  The
most common such feature, SPARSE_SUPER, is an RO_COMPAT feature because
sparse groups allow file data blocks where superblock/group descriptor
backups used to live, and ext2_free_blocks() refuses to free these blocks,
which would lead to inconsistent bitmaps.  An old kernel would also
get an error if it tried to free a series of blocks which crossed a group
boundary, but this is a legitimate layout in a SPARSE_SUPER filesystem.

An INCOMPAT flag indicates the on-disk format has changed in some
way that makes it unreadable by older kernels, or would otherwise
cause a problem if an old kernel tried to mount it.  FILETYPE is an
INCOMPAT flag because older kernels would think a filename was longer
than 256 characters, which would lead to corrupt directory listings.
The COMPRESSION flag is an obvious INCOMPAT flag - if the kernel
doesn't understand compression, you would just get garbage back from
read() instead of it automatically decompressing your data.  The ext3
RECOVER flag is needed to prevent a kernel which does not understand the
ext3 journal from mounting the filesystem without replaying the journal.
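
The kernel-side rules for the three feature fields can be summarised in a
short Python sketch.  The bit values and the ``KNOWN_*`` masks below are
invented for illustration and do not correspond to the real flag values;
the function only mirrors the behaviour described in the paragraphs above.

```python
# Hypothetical masks of the feature bits this "kernel" understands.
KNOWN_RO_COMPAT = 0x1
KNOWN_INCOMPAT = 0x2


def mount_decision(compat: int, ro_compat: int, incompat: int,
                   want_write: bool) -> str:
    """Return "refuse", "read-only" or "read-write" for a mount request."""
    # Unknown INCOMPAT bits: the format cannot be interpreted at all.
    if incompat & ~KNOWN_INCOMPAT:
        return "refuse"
    # Unknown RO_COMPAT bits: reading is safe, writing is not.
    if ro_compat & ~KNOWN_RO_COMPAT:
        return "refuse" if want_write else "read-only"
    # COMPAT bits never restrict the mount, known or not, so the
    # compat argument is deliberately not inspected.
    return "read-write" if want_write else "read-only"


if __name__ == "__main__":
    print(mount_decision(compat=0xFF, ro_compat=0x1, incompat=0x2,
                         want_write=True))
    print(mount_decision(compat=0, ro_compat=0x8, incompat=0,
                         want_write=False))
```

Note that the unknown COMPAT bits in the first call do not affect the
outcome, which is the point of that field.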

e2fsck needs to be stricter in its handling of these flags than the
kernel.  If it doesn't understand ANY of the COMPAT, RO_COMPAT, or
INCOMPAT flags it will refuse to check the filesystem, because it has
no way of verifying whether a given feature is valid or not.  Allowing
e2fsck to succeed on a filesystem with an unknown feature would give the
user a false sense of security.  Refusing to check a filesystem with
unknown features is a good incentive for the user to update to the
latest e2fsck.  This also means that anyone adding feature flags to ext2
also needs to update e2fsck to verify these features.

Metadata
--------

It is frequently claimed that the ext2 implementation of writing
asynchronous metadata is faster than the ffs synchronous metadata
scheme but less reliable.  Both methods are equally resolvable by their
respective fsck programs.

If you're exceptionally paranoid, there are 3 ways of making metadata
writes synchronous on ext2:

- per-file if you have the program source: use the O_SYNC flag to open()
- per-file if you don't have the source: use "chattr +S" on the file
- per-filesystem: add the "sync" option to mount (or in /etc/fstab)

The first and last are not ext2 specific but do force the metadata to
be written synchronously.  See also Journaling below.
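
The first option in the list above, opening a file with O_SYNC, might look
like this from Python on a POSIX system.  This is a minimal sketch: the
helper name ``write_synchronously`` is made up here, and ``os.O_SYNC`` is
assumed to be available, which is not the case on every platform.

```python
import os
import tempfile


def write_synchronously(path: str, payload: bytes) -> int:
    """Write payload to path through a descriptor opened with O_SYNC."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        # With O_SYNC each write returns only after the data has been
        # handed to the underlying storage, not just the page cache.
        return os.write(fd, payload)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "example.dat")
        written = write_synchronously(target, b"important record\n")
        print(f"wrote {written} bytes to {target}")
```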

Limitations
-----------

There are various limits imposed by the on-disk layout of ext2.  Other
limits are imposed by the current implementation of the kernel code.
Many of the limits are determined at the time the filesystem is first
created, and depend upon the block size chosen.  The ratio of inodes to
data blocks is fixed at filesystem creation time, so the only way to
increase the number of inodes is to increase the size of the filesystem.
No tools currently exist which can change the ratio of inodes to blocks.

Most of these limits could be overcome with slight changes in the on-disk
format and using a compatibility flag to signal the format change (at
the expense of some compatibility).

===================== ======= ======= ======= ========
Filesystem block size 1kB     2kB     4kB     8kB
===================== ======= ======= ======= ========
File size limit       16GB    256GB   2048GB  2048GB
Filesystem size limit 2047GB  8192GB  16384GB 32768GB
===================== ======= ======= ======= ========

There is a 2.4 kernel limit of 2048GB for a single block device, so no
filesystem larger than that can be created at this time.  There is also
an upper limit on the block size imposed by the page size of the kernel,
so 8kB blocks are only allowed on Alpha systems (and other architectures
which support larger pages).
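
The "File size limit" row can be approximated from the inode pointer
layout described in the Inodes section (12 direct pointers plus one
indirect, one doubly-indirect and one trebly-indirect block).  The Python
sketch below rests on two assumptions that the text does not state: block
pointers are 4 bytes wide, and the 2048GB ceiling comes from a 32-bit
per-inode count of 512-byte sectors.  All names are illustrative.

```python
POINTER_SIZE = 4            # assumed: 32-bit block numbers
DIRECT_POINTERS = 12        # from the Inodes section
SECTOR_CAP = 2**32 * 512    # assumed: 32-bit count of 512-byte sectors


def addressable_blocks(block_size: int) -> int:
    """Blocks reachable through direct and 1x/2x/3x indirect pointers."""
    per_block = block_size // POINTER_SIZE
    return DIRECT_POINTERS + per_block + per_block**2 + per_block**3


def file_size_limit(block_size: int) -> int:
    """Approximate maximum file size in bytes for a given block size."""
    return min(addressable_blocks(block_size) * block_size, SECTOR_CAP)


if __name__ == "__main__":
    for block_size in (1024, 2048, 4096, 8192):
        gib = file_size_limit(block_size) // 2**30
        print(f"{block_size // 1024}kB blocks: about {gib}GB")
```

Under these assumptions the pointer tree is the limiting factor for 1kB and
2kB blocks, while the sector count caps the 4kB and 8kB cases.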

There is an upper limit of 32000 subdirectories in a single directory.

There is a "soft" upper limit of about 10-15k files in a single directory
with the current linear linked-list directory implementation.  This limit
stems from performance problems when creating and deleting (and also
finding) files in such large directories.  Using a hashed directory index
(under development) allows 100k-1M+ files in a single directory without
performance problems (although RAM size becomes an issue at this point).

The (meaningless) absolute upper limit of files in a single directory
(imposed by the file size, the realistic limit is obviously much less)
is over 130 trillion files.  It would be higher except there are not
enough 4-character names to make up unique directory entries, so they
have to be 8-character filenames; even then we are fairly close to
running out of unique filenames.

Journaling
----------

A journaling extension to the ext2 code has been developed by Stephen
Tweedie.  It avoids the risks of metadata corruption and the need to
wait for e2fsck to complete after a crash, without requiring a change
to the on-disk ext2 layout.  In a nutshell, the journal is a regular
file which stores whole metadata (and optionally data) blocks that have
been modified, prior to writing them into the filesystem.
This means it is possible to add a journal to an existing ext2 filesystem
without the need for data conversion.

When changes are made to the filesystem (e.g. a file is renamed), they
are stored in a transaction in the journal and can either be complete or
incomplete at the time of a crash.  If a transaction is complete at the
time of a crash (or in the normal case where the system does not crash),
then any blocks in that transaction are guaranteed to represent a valid
filesystem state, and are copied into the filesystem.  If a transaction
is incomplete at the time of the crash, then there is no guarantee of
consistency for the blocks in that transaction so they are discarded
(which means any filesystem changes they represent are also lost).
Check Documentation/filesystems/ext4/ if you want to read more about
ext4 and journaling.
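
The complete/incomplete distinction can be shown with a toy model in
Python.  It represents the filesystem as a mapping from block number to
contents and the journal as a list of transactions; this layout and the
``replay`` helper are invented for this example and say nothing about the
real on-disk journal format.

```python
def replay(filesystem: dict, journal: list) -> dict:
    """Return the filesystem state after applying the journal."""
    result = dict(filesystem)
    for transaction in journal:
        if not transaction["complete"]:
            # No consistency guarantee: drop the whole transaction.
            continue
        # A complete transaction holds whole blocks that are copied as-is.
        result.update(transaction["blocks"])
    return result


if __name__ == "__main__":
    on_disk = {10: b"old directory block"}
    journal = [
        {"complete": True, "blocks": {10: b"new directory block"}},
        {"complete": False, "blocks": {11: b"half-written change"}},
    ]
    print(replay(on_disk, journal))
```

Block 10 is updated from the complete transaction, while the change to
block 11 is lost because its transaction never completed.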

References
==========

======================= ===============================================
The kernel source       file:/usr/src/linux/fs/ext2/
e2fsprogs (e2fsck)      http://e2fsprogs.sourceforge.net/
Design & Implementation http://e2fsprogs.sourceforge.net/ext2intro.html
Journaling (ext3)       ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/
Filesystem Resizing     http://ext2resize.sourceforge.net/
Compression [1]_        http://e2compr.sourceforge.net/
======================= ===============================================

Implementations for:

======================= ===========================================================
Windows 95/98/NT/2000   http://www.chrysocome.net/explore2fs
Windows 95 [1]_         http://www.yipton.net/content.html#FSDEXT2
DOS client [1]_         ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
OS/2 [2]_               ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
RISC OS client          http://www.esw-heim.tu-clausthal.de/~marco/smorbrod/IscaFS/
======================= ===========================================================

.. [1] no longer actively developed/supported (as of Apr 2001)
.. [2] no longer actively developed/supported (as of Mar 2009)