.. SPDX-License-Identifier: GPL-2.0

======================
The SGI XFS Filesystem
======================

XFS is a high performance journaling filesystem which originated
on the SGI IRIX platform.  It is completely multi-threaded, can
support large files and large filesystems, extended attributes,
variable block sizes, is extent based, and makes extensive use of
Btrees (directories, extents, free space) to aid both performance
and scalability.

Refer to the documentation at https://xfs.wiki.kernel.org/
for further details.  This implementation is on-disk compatible
with the IRIX version of XFS.


Mount Options
=============

When mounting an XFS filesystem, the following options are accepted.

  allocsize=size
        Sets the buffered I/O end-of-file preallocation size when
        doing delayed allocation writeout (default size is 64KiB).
        Valid values for this option are page size (typically 4KiB)
        through to 1GiB, inclusive, in power-of-2 increments.

        The default behaviour is for dynamic end-of-file
        preallocation size, which uses a set of heuristics to
        optimise the preallocation size based on the current
        allocation patterns within the file and the access patterns
        to the file. Specifying a fixed ``allocsize`` value turns off
        the dynamic behaviour.
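The ``allocsize`` constraints described above (a power of two between the
page size and 1GiB, inclusive) can be sketched as a small check. This is an
illustrative Python sketch only, not the kernel's option parser; the function
name ``is_valid_allocsize`` and the 4KiB default page size are assumptions
made for the example.

```python
def is_valid_allocsize(size_bytes: int, page_size: int = 4096) -> bool:
    """Return True if size_bytes satisfies the documented allocsize rules.

    Hypothetical helper for illustration; page_size defaults to the
    "typically 4KiB" value mentioned in the text.
    """
    one_gib = 1 << 30
    # The value must lie within [page size, 1GiB], inclusive.
    if size_bytes < page_size or size_bytes > one_gib:
        return False
    # Power-of-2 check: a power of two has exactly one bit set.
    return size_bytes & (size_bytes - 1) == 0
```

Under these assumptions the documented 64KiB figure passes the check, while
a value such as 12KiB (three pages) does not, because it is not a power of
two.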

  attr2 or noattr2
        The options enable/disable an "opportunistic" improvement to
        be made in the way inline extended attributes are stored
        on-disk.  When the new form is used for the first time when
        ``attr2`` is selected (either when setting or removing extended
        attributes) the on-disk superblock feature bit field will be
        updated to reflect this format being in use.

        The default behaviour is determined by the on-disk feature
        bit indicating that ``attr2`` behaviour is active. If either
        mount option is set, then that becomes the new default used
        by the filesystem.

        CRC enabled filesystems always use the ``attr2`` format, and so
        will reject the ``noattr2`` mount option if it is set.

  discard or nodiscard (default)
        Enable/disable the issuing of commands to let the block
        device reclaim space freed by the filesystem.  This is
        useful for SSD devices, thinly provisioned LUNs and virtual
        machine images, but may have a performance impact.

        Note: It is currently recommended that you use the ``fstrim``
        application to ``discard`` unused blocks rather than the ``discard``
        mount option because the performance impact of this option
        is quite severe.

  grpid/bsdgroups or nogrpid/sysvgroups (default)
        These options define what group ID a newly created file
        gets.
        When ``grpid`` is set, it takes the group ID of the
        directory in which it is created; otherwise it takes the
        ``fsgid`` of the current process, unless the directory has the
        ``setgid`` bit set, in which case it takes the ``gid`` from the
        parent directory, and also gets the ``setgid`` bit set if it is
        a directory itself.

  filestreams
        Make the data allocator use the filestreams allocation mode
        across the entire filesystem rather than just on directories
        configured to use it.

  ikeep or noikeep (default)
        When ``ikeep`` is specified, XFS does not delete empty inode
        clusters and keeps them around on disk.  When ``noikeep`` is
        specified, empty inode clusters are returned to the free
        space pool.

  inode32 or inode64 (default)
        When ``inode32`` is specified, it indicates that XFS limits
        inode creation to locations which will not result in inode
        numbers with more than 32 bits of significance.

        When ``inode64`` is specified, it indicates that XFS is allowed
        to create inodes at any location in the filesystem,
        including those which will result in inode numbers occupying
        more than 32 bits of significance.

        ``inode32`` is provided for backwards compatibility with older
        systems and applications, since 64-bit inode numbers might
        cause problems for some applications that cannot handle
        large inode numbers.
        If applications are in use which do
        not handle inode numbers bigger than 32 bits, the ``inode32``
        option should be specified.

  largeio or nolargeio (default)
        If ``nolargeio`` is specified, the optimal I/O reported in
        ``st_blksize`` by **stat(2)** will be as small as possible to allow
        user applications to avoid inefficient read/modify/write
        I/O.  This is typically the page size of the machine, as
        this is the granularity of the page cache.

        If ``largeio`` is specified, a filesystem that was created with a
        ``swidth`` specified will return the ``swidth`` value (in bytes)
        in ``st_blksize``. If the filesystem does not have a ``swidth``
        specified but does specify an ``allocsize`` then ``allocsize``
        (in bytes) will be returned instead. Otherwise the behaviour
        is the same as if ``nolargeio`` was specified.

  logbufs=value
        Set the number of in-memory log buffers.  Valid numbers
        range from 2-8 inclusive.

        The default value is 8 buffers.

        If the memory cost of 8 log buffers is too high on small
        systems, then it may be reduced at some cost to performance
        on metadata intensive workloads. The ``logbsize`` option below
        controls the size of each buffer and so is also relevant to
        this case.

  logbsize=value
        Set the size of each in-memory log buffer.
        The size may be
        specified in bytes, or in kilobytes with a "k" suffix.
        Valid sizes for version 1 and version 2 logs are 16384 (16k)
        and 32768 (32k).  Valid sizes for version 2 logs also
        include 65536 (64k), 131072 (128k) and 262144 (256k). The
        logbsize must be an integer multiple of the log
        stripe unit configured at **mkfs(8)** time.

        The default value for version 1 logs is 32768, while the
        default value for version 2 logs is MAX(32768, log_sunit).

  logdev=device and rtdev=device
        Use an external log (metadata journal) and/or real-time device.
        An XFS filesystem has up to three parts: a data section, a log
        section, and a real-time section.  The real-time section is
        optional, and the log section can be separate from the data
        section or contained within it.

  noalign
        Data allocations will not be aligned at stripe unit
        boundaries. This is only relevant to filesystems created
        with non-zero data alignment parameters (``sunit``, ``swidth``) by
        **mkfs(8)**.

  norecovery
        The filesystem will be mounted without running log recovery.
        If the filesystem was not cleanly unmounted, it is likely to
        be inconsistent when mounted in ``norecovery`` mode.
        Some files or directories may not be accessible because of this.
        Filesystems mounted ``norecovery`` must be mounted read-only or
        the mount will fail.

  nouuid
        Don't check for double mounted file systems using the file
        system ``uuid``.  This is useful to mount LVM snapshot volumes,
        and often used in combination with ``norecovery`` for mounting
        read-only snapshots.

  noquota
        Forcibly turns off all quota accounting and enforcement
        within the filesystem.

  uquota/usrquota/uqnoenforce/quota
        User disk quota accounting enabled, and limits (optionally)
        enforced.  Refer to **xfs_quota(8)** for further details.

  gquota/grpquota/gqnoenforce
        Group disk quota accounting enabled and limits (optionally)
        enforced.  Refer to **xfs_quota(8)** for further details.

  pquota/prjquota/pqnoenforce
        Project disk quota accounting enabled and limits (optionally)
        enforced.  Refer to **xfs_quota(8)** for further details.

  sunit=value and swidth=value
        Used to specify the stripe unit and width for a RAID device
        or a stripe volume.  "value" must be specified in 512-byte
        block units. These options are only relevant to filesystems
        that were created with non-zero data alignment parameters.

        The ``sunit`` and ``swidth`` parameters specified must be compatible
        with the existing filesystem alignment characteristics.  In
        general, that means the only valid changes to ``sunit`` are
        increasing it by a power-of-2 multiple.
        Valid ``swidth`` values
        are any integer multiple of a valid ``sunit`` value.

        Typically the only time these mount options are necessary is
        after an underlying RAID device has had its geometry
        modified, such as adding a new disk to a RAID5 lun and
        reshaping it.

  swalloc
        Data allocations will be rounded up to stripe width boundaries
        when the current end of file is being extended and the file
        size is larger than the stripe width size.

  wsync
        When specified, all filesystem namespace operations are
        executed synchronously. This ensures that when the namespace
        operation (create, unlink, etc) completes, the change to the
        namespace is on stable storage. This is useful in HA setups
        where failover must not result in clients seeing
        inconsistent namespace presentation during or after a
        failover event.

Deprecation of V4 Format
========================

The V4 filesystem format lacks certain features that are supported by
the V5 format, such as metadata checksumming, strengthened metadata
verification, and the ability to store timestamps past the year 2038.
Because of this, the V4 format is deprecated.  All users should upgrade
by backing up their files, reformatting, and restoring from the backup.

Administrators and users can detect a V4 filesystem by running xfs_info
against a filesystem mountpoint and checking for a string containing
"crc=".  If no such string is found, please upgrade xfsprogs to the
latest version and try again.

The deprecation will take place in two parts.  Support for mounting V4
filesystems can now be disabled at kernel build time via Kconfig option.
The option will default to yes until September 2025, at which time it
will be changed to default to no.  In September 2030, support will be
removed from the codebase entirely.

Note: Distributors may choose to withdraw V4 format support earlier than
the dates listed above.

Deprecated Mount Options
========================

===========================     ================
  Name                          Removal Schedule
===========================     ================
Mounting with V4 filesystem     September 2030
ikeep/noikeep                   September 2025
attr2/noattr2                   September 2025
===========================     ================


Removed Mount Options
=====================

===========================     =======
  Name                          Removed
===========================     =======
  delaylog/nodelaylog           v4.0
  ihashsize                     v4.0
  irixsgid                      v4.0
  osyncisdsync/osyncisosync     v4.0
  barrier                       v4.19
  nobarrier                     v4.19
===========================     =======

sysctls
=======

The following sysctls are available for the XFS filesystem:

  fs.xfs.stats_clear            (Min: 0  Default: 0  Max: 1)
        Setting this to "1" clears accumulated XFS statistics
        in /proc/fs/xfs/stat.  It then immediately resets to "0".

  fs.xfs.xfssyncd_centisecs     (Min: 100  Default: 3000  Max: 720000)
        The interval at which the filesystem flushes metadata
        out to disk and runs internal cache cleanup routines.

  fs.xfs.filestream_centisecs   (Min: 1  Default: 3000  Max: 360000)
        The interval at which the filesystem ages filestreams cache
        references and returns timed-out AGs back to the free stream
        pool.

  fs.xfs.speculative_prealloc_lifetime
        (Units: seconds   Min: 1  Default: 300  Max: 86400)
        The interval at which the background scanning for inodes
        with unused speculative preallocation runs. The scan
        removes unused preallocation from clean inodes and releases
        the unused space back to the free pool.

  fs.xfs.error_level            (Min: 0  Default: 3  Max: 11)
        A volume knob for error reporting when internal errors occur.
        This will generate detailed messages & backtraces for filesystem
        shutdowns, for example.
        Current threshold values are:

                XFS_ERRLEVEL_OFF:       0
                XFS_ERRLEVEL_LOW:       1
                XFS_ERRLEVEL_HIGH:      5

  fs.xfs.panic_mask             (Min: 0  Default: 0  Max: 256)
        Causes certain error conditions to call BUG(). Value is a bitmask;
        OR together the tags which represent errors which should cause panics:

                XFS_NO_PTAG                     0
                XFS_PTAG_IFLUSH                 0x00000001
                XFS_PTAG_LOGRES                 0x00000002
                XFS_PTAG_AILDELETE              0x00000004
                XFS_PTAG_ERROR_REPORT           0x00000008
                XFS_PTAG_SHUTDOWN_CORRUPT       0x00000010
                XFS_PTAG_SHUTDOWN_IOERROR       0x00000020
                XFS_PTAG_SHUTDOWN_LOGERROR      0x00000040
                XFS_PTAG_FSBLOCK_ZERO           0x00000080
                XFS_PTAG_VERIFIER_ERROR         0x00000100

        This option is intended for debugging only.

  fs.xfs.irix_symlink_mode      (Min: 0  Default: 0  Max: 1)
        Controls whether symlinks are created with mode 0777 (default)
        or whether their mode is affected by the umask (irix mode).

  fs.xfs.irix_sgid_inherit      (Min: 0  Default: 0  Max: 1)
        Controls files created in SGID directories.
        If the group ID of the new file does not match the effective group
        ID or one of the supplementary group IDs of the parent dir, the
        ISGID bit is cleared if the irix_sgid_inherit compatibility sysctl
        is set.
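The ``fs.xfs.panic_mask`` entry above describes its value as a bitmask built
by OR-ing tags together. The following Python sketch illustrates that
arithmetic using the tag values listed in the table; the helper names
``build_panic_mask`` and ``decode_panic_mask`` are hypothetical and exist only
for this example.

```python
# Tag values copied from the fs.xfs.panic_mask table in this document.
XFS_PTAGS = {
    "XFS_PTAG_IFLUSH": 0x00000001,
    "XFS_PTAG_LOGRES": 0x00000002,
    "XFS_PTAG_AILDELETE": 0x00000004,
    "XFS_PTAG_ERROR_REPORT": 0x00000008,
    "XFS_PTAG_SHUTDOWN_CORRUPT": 0x00000010,
    "XFS_PTAG_SHUTDOWN_IOERROR": 0x00000020,
    "XFS_PTAG_SHUTDOWN_LOGERROR": 0x00000040,
    "XFS_PTAG_FSBLOCK_ZERO": 0x00000080,
    "XFS_PTAG_VERIFIER_ERROR": 0x00000100,
}


def build_panic_mask(*tag_names: str) -> int:
    """OR together the named tags (hypothetical helper, illustration only)."""
    mask = 0  # XFS_NO_PTAG: no bits set
    for name in tag_names:
        mask |= XFS_PTAGS[name]  # raises KeyError for an unknown tag name
    return mask


def decode_panic_mask(mask: int) -> list:
    """Return the names of the tags whose bits are set in mask."""
    return [name for name, bit in XFS_PTAGS.items() if mask & bit]
```

For instance, combining ``XFS_PTAG_SHUTDOWN_CORRUPT`` and
``XFS_PTAG_SHUTDOWN_IOERROR`` gives 0x30 (48 decimal), which an administrator
could then write to the sysctl, for example with
``sysctl -w fs.xfs.panic_mask=48``.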

  fs.xfs.inherit_sync           (Min: 0  Default: 1  Max: 1)
        Setting this to "1" will cause the "sync" flag set
        by the **xfs_io(8)** chattr command on a directory to be
        inherited by files in that directory.

  fs.xfs.inherit_nodump         (Min: 0  Default: 1  Max: 1)
        Setting this to "1" will cause the "nodump" flag set
        by the **xfs_io(8)** chattr command on a directory to be
        inherited by files in that directory.

  fs.xfs.inherit_noatime        (Min: 0  Default: 1  Max: 1)
        Setting this to "1" will cause the "noatime" flag set
        by the **xfs_io(8)** chattr command on a directory to be
        inherited by files in that directory.

  fs.xfs.inherit_nosymlinks     (Min: 0  Default: 1  Max: 1)
        Setting this to "1" will cause the "nosymlinks" flag set
        by the **xfs_io(8)** chattr command on a directory to be
        inherited by files in that directory.

  fs.xfs.inherit_nodefrag       (Min: 0  Default: 1  Max: 1)
        Setting this to "1" will cause the "nodefrag" flag set
        by the **xfs_io(8)** chattr command on a directory to be
        inherited by files in that directory.

  fs.xfs.rotorstep              (Min: 1  Default: 1  Max: 256)
        In "inode32" allocation mode, this option determines how many
        files the allocator attempts to allocate in the same allocation
        group before moving to the next allocation group.
        The intent
        is to control the rate at which the allocator moves between
        allocation groups when allocating extents for new files.

Deprecated Sysctls
==================

===========================     ================
  Name                          Removal Schedule
===========================     ================
fs.xfs.irix_sgid_inherit        September 2025
fs.xfs.irix_symlink_mode        September 2025
===========================     ================


Removed Sysctls
===============

=============================   =======
  Name                          Removed
=============================   =======
  fs.xfs.xfsbufd_centisec       v4.0
  fs.xfs.age_buffer_centisecs   v4.0
=============================   =======

Error handling
==============

XFS can act differently according to the type of error found during its
operation. The implementation introduces the following concepts to the error
handler:

 -failure speed:
        Defines how fast XFS should propagate an error upwards when a specific
        error is found during the filesystem operation. It can propagate
        immediately, after a defined number of retries, after a set time period,
        or simply retry forever.

 -error classes:
        Specifies the subsystem the error configuration will apply to, such as
        metadata IO or memory allocation. Different subsystems will have
        different error handlers for which behaviour can be configured.

 -error handlers:
        Defines the behavior for a specific error.

The filesystem behavior during an error can be set via ``sysfs`` files. Each
error handler works independently - the first condition met by an error handler
for a specific class will cause the error to be propagated rather than reset and
retried.

The action taken by the filesystem when the error is propagated is context
dependent - it may cause a shut down in the case of an unrecoverable error,
it may be reported back to userspace, or it may even be ignored because
there's nothing useful we can do with the error or anyone we can report it to
(e.g. during unmount).

The configuration files are organized into the following hierarchy for each
mounted filesystem:

  /sys/fs/xfs/<dev>/error/<class>/<error>/

Where:
  <dev>
        The short device name of the mounted filesystem. This is the same device
        name that shows up in XFS kernel error messages as "XFS(<dev>): ..."

  <class>
        The subsystem the error configuration belongs to.
        As of 4.9, the defined
        classes are:

                - "metadata": applies to metadata buffer write IO

  <error>
        The individual error handler configurations.


Each filesystem has "global" error configuration options defined in their top
level directory:

  /sys/fs/xfs/<dev>/error/

  fail_at_unmount               (Min:  0  Default:  1  Max: 1)
        Defines the filesystem error behavior at unmount time.

        If set to a value of 1, XFS will override all other error configurations
        during unmount and replace them with "immediate fail" characteristics.
        i.e. no retries, no retry timeout. This will always allow unmount to
        succeed when there are persistent errors present.

        If set to 0, the configured retry behaviour will continue until all
        retries and/or timeouts have been exhausted. This will delay unmount
        completion when there are persistent errors, and it may prevent the
        filesystem from ever unmounting fully in the case of "retry forever"
        handler configurations.

        Note: there is no guarantee that fail_at_unmount can be set while an
        unmount is in progress. It is possible that the ``sysfs`` entries are
        removed by the unmounting filesystem before a "retry forever" error
        handler configuration causes unmount to hang, and hence the filesystem
        must be configured appropriately before unmount begins to prevent
        unmount hangs.
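The ``sysfs`` hierarchy described above can be sketched as a pair of path
helpers. This is a minimal Python illustration that only builds path strings
and does not touch the filesystem; the function names and the device name
``sda1`` used below are hypothetical.

```python
from pathlib import PurePosixPath

# Root of the per-filesystem XFS sysfs tree, as given in the text.
XFS_SYSFS_ROOT = PurePosixPath("/sys/fs/xfs")


def error_dir(dev: str) -> PurePosixPath:
    """Top-level error configuration directory for a mounted filesystem."""
    return XFS_SYSFS_ROOT / dev / "error"


def fail_at_unmount_path(dev: str) -> PurePosixPath:
    """Path of the "global" fail_at_unmount option for this filesystem."""
    return error_dir(dev) / "fail_at_unmount"


def handler_dir(dev: str, error_class: str, error: str) -> PurePosixPath:
    """Directory holding one error handler: <dev>/error/<class>/<error>/."""
    return error_dir(dev) / error_class / error
```

With a hypothetical device ``sda1``, ``fail_at_unmount_path("sda1")`` yields
``/sys/fs/xfs/sda1/error/fail_at_unmount``, and
``handler_dir("sda1", "metadata", "ENODEV")`` yields
``/sys/fs/xfs/sda1/error/metadata/ENODEV``.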

Each filesystem has specific error class handlers that define the error
propagation behaviour for specific errors. There is also a "default" error
handler defined, which defines the behaviour for all errors that don't have
specific handlers defined. Where multiple retry constraints are configured for
a single error, the first retry configuration that expires will cause the error
to be propagated. The handler configurations are found in the directory:

  /sys/fs/xfs/<dev>/error/<class>/<error>/

  max_retries                   (Min: -1  Default: Varies  Max: INTMAX)
        Defines the allowed number of retries of a specific error before
        the filesystem will propagate the error. The retry count for a given
        error context (e.g. a specific metadata buffer) is reset every time
        there is a successful completion of the operation.

        Setting the value to "-1" will cause XFS to retry forever for this
        specific error.

        Setting the value to "0" will cause XFS to fail immediately when the
        specific error is reported.

        Setting the value to "N" (where 0 < N < Max) will make XFS retry the
        operation "N" times before propagating the error.

  retry_timeout_seconds         (Min:  -1  Default:  Varies  Max: 1 day)
        Define the amount of time (in seconds) that the filesystem is
        allowed to retry its operations when the specific error is
        found.

        Setting the value to "-1" will allow XFS to retry forever for this
        specific error.

        Setting the value to "0" will cause XFS to fail immediately when the
        specific error is reported.

        Setting the value to "N" (where 0 < N < Max) will allow XFS to retry the
        operation for up to "N" seconds before propagating the error.

**Note:** The default behaviour for a specific error handler is dependent on both
the class and error context. For example, the default values for
"metadata/ENODEV" are "0" rather than "-1" so that this error handler defaults
to "fail immediately" behaviour. This is done because ENODEV is a fatal,
unrecoverable error no matter how many times the metadata IO is retried.
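The interaction between ``max_retries``, ``retry_timeout_seconds`` and
``fail_at_unmount`` described in this section can be modelled with a small
decision function. This is a Python sketch of the documented semantics only,
not the kernel implementation; the function name ``should_propagate`` and its
parameters are assumptions made for the example.

```python
RETRY_FOREVER = -1  # the documented "-1" value for both settings


def should_propagate(max_retries: int,
                     retry_timeout_seconds: int,
                     retries_done: int,
                     elapsed_seconds: float,
                     unmounting: bool = False,
                     fail_at_unmount: bool = True) -> bool:
    """Return True if the error should be propagated instead of retried.

    Hypothetical model: retries_done counts retries already attempted for
    this error context, elapsed_seconds is the time since it first failed.
    """
    # fail_at_unmount=1 overrides the handler with "immediate fail"
    # behaviour while the filesystem is unmounting.
    if unmounting and fail_at_unmount:
        return True
    # The first constraint to expire causes propagation; a value of -1
    # means that constraint never expires.
    retries_expired = (max_retries != RETRY_FOREVER
                       and retries_done >= max_retries)
    timeout_expired = (retry_timeout_seconds != RETRY_FOREVER
                       and elapsed_seconds >= retry_timeout_seconds)
    return retries_expired or timeout_expired
```

In this model a handler configured with both values set to "0", like the
"metadata/ENODEV" default mentioned above, propagates on the first failure,
while a handler with both values set to "-1" keeps retrying unless
``fail_at_unmount`` applies.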