===================================
Documentation for /proc/sys/kernel/
===================================

.. See scripts/check-sysctl-docs to keep this up to date


Copyright (c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>

Copyright (c) 2009,        Shen Feng<shen@cn.fujitsu.com>

For general info and legal blurb, please look in :doc:`index`.

------------------------------------------------------------------------------

This file contains documentation for the sysctl files in
``/proc/sys/kernel/`` and is valid for Linux kernel version 2.2.

The files in this directory can be used to tune and monitor
miscellaneous and general things in the operation of the Linux
kernel. Since some of the files *can* be used to screw up your
system, it is advisable to read both documentation and source
before actually making adjustments.

Currently, these files might (depending on your configuration)
show up in ``/proc/sys/kernel``:

.. contents:: :local:


acct
====

::

    highwater lowwater frequency

If BSD-style process accounting is enabled these values control
its behaviour. If free space on filesystem where the log lives
goes below ``lowwater``% accounting suspends. If free space gets
above ``highwater``% accounting resumes. ``frequency`` determines
how often do we check the amount of free space (value is in
seconds). Default:

::

    4 2 30

That is, suspend accounting if free space drops below 2%; resume it
if it increases to at least 4%; consider information about amount of
free space valid for 30 seconds.


acpi_video_flags
================

See :doc:`/power/video`. This allows the video resume mode to be set,
in a similar fashion to the ``acpi_sleep`` kernel parameter, by
combining the following values:

= =======
1 s3_bios
2 s3_mode
4 s3_beep
= =======


auto_msgmni
===========

This variable has no effect and may be removed in future kernel
releases. Reading it always returns 0.
Up to Linux 3.17, it enabled/disabled automatic recomputing of
`msgmni`_
upon memory add/remove or upon IPC namespace creation/removal.
Echoing "1" into this file enabled msgmni automatic recomputing.
Echoing "0" turned it off. The default value was 1.


bootloader_type (x86 only)
==========================

This gives the bootloader type number as indicated by the bootloader,
shifted left by 4, and OR'd with the low four bits of the bootloader
version. The reason for this encoding is that this used to match the
``type_of_loader`` field in the kernel header; the encoding is kept for
backwards compatibility. That is, if the full bootloader type number
is 0x15 and the full version number is 0x234, this file will contain
the value 340 = 0x154.

See the ``type_of_loader`` and ``ext_loader_type`` fields in
:doc:`/x86/boot` for additional information.


bootloader_version (x86 only)
=============================

The complete bootloader version number. In the example above, this
file will contain the value 564 = 0x234.

See the ``type_of_loader`` and ``ext_loader_ver`` fields in
:doc:`/x86/boot` for additional information.


bpf_stats_enabled
=================

Controls whether the kernel should collect statistics on BPF programs
(total time spent running, number of times run...). Enabling
statistics causes a slight reduction in performance on each program
run. The statistics can be seen using ``bpftool``.

= ===================================
0 Don't collect statistics (default).
1 Collect statistics.
= ===================================


cad_pid
=======

This is the pid which will be signalled on reboot (notably, by
Ctrl-Alt-Delete). Writing a value to this file which doesn't
correspond to a running process will result in ``-ESRCH``.

See also `ctrl-alt-del`_.


cap_last_cap
============

Highest valid capability of the running kernel. Exports
``CAP_LAST_CAP`` from the kernel.


core_pattern
============

``core_pattern`` is used to specify a core dumpfile pattern name.

* max length 127 characters; default value is "core"
* ``core_pattern`` is used as a pattern template for the output
  filename; certain string patterns (beginning with '%') are
  substituted with their actual values.
* backward compatibility with ``core_uses_pid``:

  If ``core_pattern`` does not include "%p" (default does not)
  and ``core_uses_pid`` is set, then .PID will be appended to
  the filename.

* corename format specifiers

  ========  ==========================================
  %<NUL>    '%' is dropped
  %%        output one '%'
  %p        pid
  %P        global pid (init PID namespace)
  %i        tid
  %I        global tid (init PID namespace)
  %u        uid (in initial user namespace)
  %g        gid (in initial user namespace)
  %d        dump mode, matches ``PR_SET_DUMPABLE`` and
            ``/proc/sys/fs/suid_dumpable``
  %s        signal number
  %t        UNIX time of dump
  %h        hostname
  %e        executable filename (may be shortened, could be changed by prctl etc)
  %f        executable filename
  %E        executable path
  %c        maximum size of core file by resource limit RLIMIT_CORE
  %<OTHER>  both are dropped
  ========  ==========================================

* If the first character of the pattern is a '|', the kernel will treat
  the rest of the pattern as a command to run. The core dump will be
  written to the standard input of that program instead of to a file.


core_pipe_limit
===============

This sysctl is only applicable when `core_pattern`_ is configured to
pipe core files to a user space helper (when the first character of
``core_pattern`` is a '|', see above).
When collecting cores via a pipe to an application, it is occasionally
useful for the collecting application to gather data about the
crashing process from its ``/proc/pid`` directory.
In order to do this safely, the kernel must wait for the collecting
process to exit, so as not to remove the crashing process's proc files
prematurely.
This in turn creates the possibility that a misbehaving userspace
collecting process can block the reaping of a crashed process simply
by never exiting.
This sysctl defends against that.
It defines how many concurrent crashing processes may be piped to user
space applications in parallel.
If this value is exceeded, then those crashing processes above that
value are noted via the kernel log and their cores are skipped.
0 is a special value, indicating that unlimited processes may be
captured in parallel, but that no waiting will take place (i.e. the
collecting process is not guaranteed access to ``/proc/<crashing
pid>/``).
This value defaults to 0.


core_uses_pid
=============

The default coredump filename is "core". By setting
``core_uses_pid`` to 1, the coredump filename becomes core.PID.
If `core_pattern`_ does not include "%p" (default does not)
and ``core_uses_pid`` is set, then .PID will be appended to
the filename.
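
The interaction between ``core_uses_pid`` and ``core_pattern`` can be
sketched in Python. This is an illustrative approximation only, not the
kernel's implementation: the function name ``expand_core_name`` and its
parameters are hypothetical, only ``%p``, ``%%`` and caller-supplied
specifiers are expanded, and the pipe ('|') case is ignored.

```python
def expand_core_name(pattern, pid, core_uses_pid=False, values=None):
    """Approximate the documented core_pattern expansion (hypothetical helper).

    `values` maps single-letter specifiers (e.g. "e", "h") to strings;
    "%p" is always taken from `pid`.
    """
    values = dict(values or {})
    values["p"] = str(pid)
    out = []
    saw_pid = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        spec = pattern[i + 1] if i + 1 < len(pattern) else ""
        if spec == "":
            # A trailing '%' (the %<NUL> case) is dropped.
            i += 1
            continue
        if spec == "%":
            out.append("%")
        elif spec in values:
            out.append(values[spec])
            if spec == "p":
                saw_pid = True
        # Any other specifier: both characters are dropped.
        i += 2
    name = "".join(out)
    # Backward compatibility: append .PID when the pattern had no "%p".
    if core_uses_pid and not saw_pid:
        name += "." + str(pid)
    return name
```

With the default pattern "core", setting ``core_uses_pid`` changes the
result from "core" to "core.PID"; a pattern that already contains "%p"
is left alone.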


ctrl-alt-del
============

When the value in this file is 0, ctrl-alt-del is trapped and
sent to the ``init(1)`` program to handle a graceful restart.
When, however, the value is > 0, Linux's reaction to a Vulcan
Nerve Pinch (tm) will be an immediate reboot, without even
syncing its dirty buffers.

Note:
  when a program (like dosemu) has the keyboard in 'raw'
  mode, the ctrl-alt-del is intercepted by the program before it
  ever reaches the kernel tty layer, and it's up to the program
  to decide what to do with it.


dmesg_restrict
==============

This toggle indicates whether unprivileged users are prevented
from using ``dmesg(8)`` to view messages from the kernel's log
buffer.
When ``dmesg_restrict`` is set to 0 there are no restrictions.
When ``dmesg_restrict`` is set to 1, users must have
``CAP_SYSLOG`` to use ``dmesg(8)``.

The kernel config option ``CONFIG_SECURITY_DMESG_RESTRICT`` sets the
default value of ``dmesg_restrict``.


domainname & hostname
=====================

These files can be used to set the NIS/YP domainname and the
hostname of your box in exactly the same way as the commands
domainname and hostname, i.e.::

    # echo "darkstar" > /proc/sys/kernel/hostname
    # echo "mydomain" > /proc/sys/kernel/domainname

has the same effect as::

    # hostname "darkstar"
    # domainname "mydomain"

Note, however, that the classic darkstar.frop.org has the
hostname "darkstar" and DNS (Internet Domain Name Server)
domainname "frop.org", not to be confused with the NIS (Network
Information Service) or YP (Yellow Pages) domainname. These two
domain names are in general different. For a detailed discussion
see the ``hostname(1)`` man page.


firmware_config
===============

See :doc:`/driver-api/firmware/fallback-mechanisms`.

The entries in this directory allow the firmware loader helper
fallback to be controlled:

* ``force_sysfs_fallback``, when set to 1, forces the use of the
  fallback;
* ``ignore_sysfs_fallback``, when set to 1, ignores any fallback.


ftrace_dump_on_oops
===================

Determines whether ``ftrace_dump()`` should be called on an oops (or
kernel panic). This will output the contents of the ftrace buffers to
the console. This is very useful for capturing traces that lead to
crashes and outputting them to a serial console.

= ===================================================
0 Disabled (default).
1 Dump buffers of all CPUs.
2 Dump the buffer of the CPU that triggered the oops.
= ===================================================


ftrace_enabled, stack_tracer_enabled
====================================

See :doc:`/trace/ftrace`.


hardlockup_all_cpu_backtrace
============================

This value controls the hard lockup detector behavior when a hard
lockup condition is detected as to whether or not to gather further
debug information. If enabled, arch-specific all-CPU stack dumping
will be initiated.

= ============================================
0 Do nothing. This is the default behavior.
1 On detection capture more debug information.
= ============================================


hardlockup_panic
================

This parameter can be used to control whether the kernel panics
when a hard lockup is detected.

= ===========================
0 Don't panic on hard lockup.
1 Panic on hard lockup.
= ===========================

See :doc:`/admin-guide/lockup-watchdogs` for more information.
This can also be set using the nmi_watchdog kernel parameter.


hotplug
=======

Path for the hotplug policy agent.
Default value is "``/sbin/hotplug``".


hung_task_all_cpu_backtrace
===========================

If this option is set, the kernel will send an NMI to all CPUs to dump
their backtraces when a hung task is detected. This file shows up if
CONFIG_DETECT_HUNG_TASK and CONFIG_SMP are enabled.

0: Won't show all CPUs backtraces when a hung task is detected.
This is the default behavior.

1: Will non-maskably interrupt all CPUs and dump their backtraces when
a hung task is detected.


hung_task_panic
===============

Controls the kernel's behavior when a hung task is detected.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

= =================================================
0 Continue operation. This is the default behavior.
1 Panic immediately.
= =================================================


hung_task_check_count
=====================

The upper bound on the number of tasks that are checked.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.


hung_task_timeout_secs
======================

When a task in D state has not been scheduled
for longer than this value, a warning is reported.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

0 means infinite timeout, no checking is done.

Possible values to set are in range {0:``LONG_MAX``/``HZ``}.


hung_task_check_interval_secs
=============================

Hung task check interval. If hung task checking is enabled
(see `hung_task_timeout_secs`_), the check is done every
``hung_task_check_interval_secs`` seconds.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

0 (default) means use ``hung_task_timeout_secs`` as checking
interval.

Possible values to set are in range {0:``LONG_MAX``/``HZ``}.
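
How ``hung_task_check_interval_secs`` falls back to
``hung_task_timeout_secs`` can be sketched as follows. The helper name
``effective_check_interval`` is hypothetical, and the sketch models only
the behaviour documented above; it is not the kernel's code and it does
not validate the ``LONG_MAX``/``HZ`` upper bound.

```python
def effective_check_interval(timeout_secs, check_interval_secs=0):
    """Return the interval between hung task checks, in seconds.

    Returns None when checking is disabled (timeout of 0).
    Simplified model of the documented sysctl semantics only.
    """
    if timeout_secs < 0 or check_interval_secs < 0:
        raise ValueError("sysctl values must be non-negative")
    if timeout_secs == 0:
        # 0 means infinite timeout: no checking is done.
        return None
    if check_interval_secs == 0:
        # 0 (default) means use hung_task_timeout_secs as the interval.
        return timeout_secs
    return check_interval_secs
```

For instance, a timeout of 120 with the default interval of 0 yields a
check every 120 seconds, while an explicit interval of 30 yields a check
every 30 seconds.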


hung_task_warnings
==================

The maximum number of warnings to report. During a check interval
if a hung task is detected, this value is decreased by 1.
When this value reaches 0, no more warnings will be reported.
This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.

-1: report an infinite number of warnings.


hyperv_record_panic_msg
=======================

Controls whether the panic kmsg data should be reported to Hyper-V.

= =========================================================
0 Do not report panic kmsg data.
1 Report the panic kmsg data. This is the default behavior.
= =========================================================


ignore-unaligned-usertrap
=========================

On architectures where unaligned accesses cause traps, and where this
feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN``;
currently, ``arc`` and ``ia64``), controls whether all unaligned traps
are logged.

= =============================================================
0 Log all unaligned accesses.
1 Only warn the first time a process traps. This is the default
  setting.
= =============================================================

See also `unaligned-trap`_ and `unaligned-dump-stack`_. On ``ia64``,
this allows system administrators to override the
``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.


kexec_load_disabled
===================

A toggle indicating if the ``kexec_load`` syscall has been disabled.
This value defaults to 0 (false: ``kexec_load`` enabled), but can be
set to 1 (true: ``kexec_load`` disabled).
Once true, kexec can no longer be used, and the toggle cannot be set
back to false.
This allows a kexec image to be loaded before disabling the syscall,
allowing a system to set up (and later use) an image without it being
altered.
Generally used together with the `modules_disabled`_ sysctl.


kptr_restrict
=============

This toggle indicates whether restrictions are placed on
exposing kernel addresses via ``/proc`` and other interfaces.

When ``kptr_restrict`` is set to 0 (the default) the address is hashed
before printing.
(This is the equivalent to %p.)

When ``kptr_restrict`` is set to 1, kernel pointers printed using the
%pK format specifier will be replaced with 0s unless the user has
``CAP_SYSLOG`` and effective user and group ids are equal to the real
ids.
This is because %pK checks are done at read() time rather than open()
time, so if permissions are elevated between the open() and the read()
(e.g. via a setuid binary) then %pK will not leak kernel pointers to
unprivileged users.
Note, this is a temporary solution only.
The correct long-term solution is to do the permission checks at
open() time.
Consider removing world read permissions from files that use %pK, and
using `dmesg_restrict`_ to protect against uses of %pK in ``dmesg(8)``
if leaking kernel pointer values to unprivileged users is a concern.

When ``kptr_restrict`` is set to 2, kernel pointers printed using
%pK will be replaced with 0s regardless of privileges.


modprobe
========

The full path to the usermode helper for autoloading kernel modules,
by default "/sbin/modprobe". This binary is executed when the kernel
requests a module. For example, if userspace passes an unknown
filesystem type to mount(), then the kernel will automatically request
the corresponding filesystem module by executing this usermode helper.
This usermode helper should insert the needed module into the kernel.

This sysctl only affects module autoloading. It has no effect on the
ability to explicitly insert modules.

This sysctl can be used to debug module loading requests::

    echo '#! /bin/sh' > /tmp/modprobe
    echo 'echo "$@" >> /tmp/modprobe.log' >> /tmp/modprobe
    echo 'exec /sbin/modprobe "$@"' >> /tmp/modprobe
    chmod a+x /tmp/modprobe
    echo /tmp/modprobe > /proc/sys/kernel/modprobe

Alternatively, if this sysctl is set to the empty string, then module
autoloading is completely disabled. The kernel will not try to
execute a usermode helper at all, nor will it call the
kernel_module_request LSM hook.

If CONFIG_STATIC_USERMODEHELPER=y is set in the kernel configuration,
then the configured static usermode helper overrides this sysctl,
except that the empty string is still accepted to completely disable
module autoloading as described above.


modules_disabled
================

A toggle value indicating if modules are allowed to be loaded
in an otherwise modular kernel. This toggle defaults to off
(0), but can be set true (1). Once true, modules can be
neither loaded nor unloaded, and the toggle cannot be set back
to false. Generally used with the `kexec_load_disabled`_ toggle.


.. _msgmni:

msgmax, msgmnb, and msgmni
==========================

``msgmax`` is the maximum size of an IPC message, in bytes. 8192 by
default (``MSGMAX``).

``msgmnb`` is the maximum size of an IPC queue, in bytes. 16384 by
default (``MSGMNB``).

``msgmni`` is the maximum number of IPC queues. 32000 by default
(``MSGMNI``).


msg_next_id, sem_next_id, and shm_next_id (System V IPC)
========================================================

These three toggles allow the desired id to be specified for the next
allocated IPC object: message, semaphore or shared memory respectively.

By default they are equal to -1, which means generic allocation logic.
Possible values to set are in range {0:``INT_MAX``}.

Notes:
  1) The kernel doesn't guarantee that the new object will have the
     desired id. So it's up to userspace how to handle an object with
     a "wrong" id.
  2) A toggle with a non-default value will be set back to -1 by the
     kernel after successful IPC object allocation. If an IPC object
     allocation syscall fails, it is undefined if the value remains
     unmodified or is reset to -1.


ngroups_max
===========

Maximum number of supplementary groups, i.e. the maximum size which
``setgroups`` will accept. Exports ``NGROUPS_MAX`` from the kernel.



nmi_watchdog
============

This parameter can be used to control the NMI watchdog
(i.e. the hard lockup detector) on x86 systems.

= =================================
0 Disable the hard lockup detector.
1 Enable the hard lockup detector.
= =================================

The hard lockup detector monitors each CPU for its ability to respond to
timer interrupts. The mechanism utilizes CPU performance counter registers
that are programmed to generate Non-Maskable Interrupts (NMIs) periodically
while a CPU is busy. Hence, the alternative name 'NMI watchdog'.

The NMI watchdog is disabled by default if the kernel is running as a guest
in a KVM virtual machine. This default can be overridden by adding::

   nmi_watchdog=1

to the guest kernel command line (see :doc:`/admin-guide/kernel-parameters`).


numa_balancing
==============

Enables/disables automatic page fault based NUMA memory
balancing. Memory is moved automatically to nodes
that access it often.

Enables/disables automatic NUMA memory balancing. On NUMA machines, there
is a performance penalty if remote memory is accessed by a CPU. When this
feature is enabled the kernel samples what task thread is accessing memory
by periodically unmapping pages and later trapping a page fault. At the
time of the page fault, it is determined if the data being accessed should
be migrated to a local memory node.
600*4882a593Smuzhiyun 601*4882a593SmuzhiyunThe unmapping of pages and trapping faults incur additional overhead that 602*4882a593Smuzhiyunideally is offset by improved memory locality but there is no universal 603*4882a593Smuzhiyunguarantee. If the target workload is already bound to NUMA nodes then this 604*4882a593Smuzhiyunfeature should be disabled. Otherwise, if the system overhead from the 605*4882a593Smuzhiyunfeature is too high then the rate the kernel samples for NUMA hinting 606*4882a593Smuzhiyunfaults may be controlled by the `numa_balancing_scan_period_min_ms, 607*4882a593Smuzhiyunnuma_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, 608*4882a593Smuzhiyunnuma_balancing_scan_size_mb`_, and numa_balancing_settle_count sysctls. 609*4882a593Smuzhiyun 610*4882a593Smuzhiyun 611*4882a593Smuzhiyunnuma_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb 612*4882a593Smuzhiyun=============================================================================================================================== 613*4882a593Smuzhiyun 614*4882a593Smuzhiyun 615*4882a593SmuzhiyunAutomatic NUMA balancing scans tasks address space and unmaps pages to 616*4882a593Smuzhiyundetect if pages are properly placed or if the data should be migrated to a 617*4882a593Smuzhiyunmemory node local to where the task is running. Every "scan delay" the task 618*4882a593Smuzhiyunscans the next "scan size" number of pages in its address space. When the 619*4882a593Smuzhiyunend of the address space is reached the scanner restarts from the beginning. 620*4882a593Smuzhiyun 621*4882a593SmuzhiyunIn combination, the "scan delay" and "scan size" determine the scan rate. 622*4882a593SmuzhiyunWhen "scan delay" decreases, the scan rate increases. The scan delay and 623*4882a593Smuzhiyunhence the scan rate of every task is adaptive and depends on historical 624*4882a593Smuzhiyunbehaviour. 
If pages are properly placed then the scan delay increases,
otherwise the scan delay decreases. The "scan size" is not adaptive but
the higher the "scan size", the higher the scan rate.

Higher scan rates incur higher system overhead as page faults must be
trapped and potentially data must be migrated. However, the higher the scan
rate, the more quickly a task's memory is migrated to a local node if the
workload pattern changes, which minimises the performance impact of remote
memory accesses. These sysctls control the thresholds for scan delays and
the number of pages scanned.

``numa_balancing_scan_period_min_ms`` is the minimum time in milliseconds to
scan a task's virtual memory. It effectively controls the maximum scanning
rate for each task.

``numa_balancing_scan_delay_ms`` is the starting "scan delay" used for a task
when it initially forks.

``numa_balancing_scan_period_max_ms`` is the maximum time in milliseconds to
scan a task's virtual memory. It effectively controls the minimum scanning
rate for each task.

``numa_balancing_scan_size_mb`` is how many megabytes worth of pages are
scanned for a given scan.


oops_all_cpu_backtrace
======================

If this option is set, the kernel will send an NMI to all CPUs to dump
their backtraces when an oops event occurs.
It should be used as a last
resort in case a panic cannot be triggered (to protect VMs running, for
example) or kdump can't be collected. This file shows up if ``CONFIG_SMP``
is enabled.

0: Won't show all CPUs' backtraces when an oops is detected.
This is the default behavior.

1: Will non-maskably interrupt all CPUs and dump their backtraces when
an oops event is detected.


osrelease, ostype & version
===========================

::

  # cat osrelease
  2.1.88
  # cat ostype
  Linux
  # cat version
  #5 Wed Feb 25 21:49:24 MET 1998

The files ``osrelease`` and ``ostype`` should be clear enough.
``version``
needs a little more clarification however. The '#5' means that
this is the fifth kernel built from this source base and the
date behind it indicates the time the kernel was built.
The only way to tune these values is to rebuild the kernel :-)


overflowgid & overflowuid
=========================

If your architecture did not always support 32-bit UIDs (i.e. arm,
i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to
applications that use the old 16-bit UID/GID system calls, if the
actual UID or GID would exceed 65535.
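
The fallback rule above can be sketched in a few lines. This is only an
illustration, not kernel code: the helper name ``legacy_id`` and its
parameters are made up for the example, and 65534 is the documented default
of these sysctls.

```python
# Illustrative model of the 16-bit UID/GID fallback; not kernel code.
MAX_16BIT_ID = 65535          # largest ID the old 16-bit system calls can report
DEFAULT_OVERFLOW_ID = 65534   # documented default of overflowuid/overflowgid


def legacy_id(real_id: int, overflow_id: int = DEFAULT_OVERFLOW_ID) -> int:
    """Return the ID an old 16-bit system call would report (hypothetical helper)."""
    if real_id > MAX_16BIT_ID:
        # The real ID does not fit in 16 bits, so the fixed value is returned.
        return overflow_id
    return real_id
```

With the default setting, ``legacy_id(70000)`` yields 65534, while IDs up to
65535 are passed through unchanged.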

These sysctls allow you to change the value of the fixed UID and GID.
The default is 65534.


panic
=====

The value in this file determines the behaviour of the kernel on a
panic:

* if zero, the kernel will loop forever;
* if negative, the kernel will reboot immediately;
* if positive, the kernel will reboot after the corresponding number
  of seconds.

When you use the software watchdog, the recommended setting is 60.


panic_on_io_nmi
===============

Controls the kernel's behavior when a CPU receives an NMI caused by
an IO error.

= ==================================================================
0 Try to continue operation (default).
1 Panic immediately. The IO error triggered an NMI. This indicates a
  serious system condition which could result in IO data corruption.
  Rather than continuing, panicking might be a better choice. Some
  servers issue this sort of NMI when the dump button is pushed,
  and you can use this option to take a crash dump.
= ==================================================================


panic_on_oops
=============

Controls the kernel's behaviour when an oops or BUG is encountered.

= ===================================================================
0 Try to continue operation.
1 Panic immediately. If the `panic` sysctl is also non-zero then the
  machine will be rebooted.
= ===================================================================


panic_on_stackoverflow
======================

Controls the kernel's behavior when it detects an overflow of the
kernel, IRQ or exception stacks (user stacks are not covered).
This file shows up if ``CONFIG_DEBUG_STACKOVERFLOW`` is enabled.

= ==========================
0 Try to continue operation.
1 Panic immediately.
= ==========================


panic_on_unrecovered_nmi
========================

The default Linux behaviour on an NMI caused by a memory error or of
unknown origin is to continue operation. For many environments such as
scientific computing it is preferable to take the box out of service and
deal with the error rather than let an uncorrected parity/ECC error
propagate.

A small number of systems do generate NMIs for bizarre random reasons
such as power management so the default is off. This sysctl works like
the other panic controls in this directory.


panic_on_warn
=============

Calls panic() in the WARN() path when set to 1.
This is useful to avoid
a kernel rebuild when attempting to kdump at the location of a WARN().

= ================================================
0 Only WARN(), default behaviour.
1 Call panic() after printing out WARN() location.
= ================================================


panic_print
===========

Bitmask for printing system info when a panic happens. The user can choose
a combination of the following bits:

===== ============================================
bit 0 print all tasks info
bit 1 print system memory info
bit 2 print timer info
bit 3 print locks info if ``CONFIG_LOCKDEP`` is on
bit 4 print ftrace buffer
bit 5 print all printk messages in buffer
===== ============================================

So, for example, to print tasks and memory info on panic, the user can::

  echo 3 > /proc/sys/kernel/panic_print


panic_on_rcu_stall
==================

When set to 1, calls panic() after RCU stall detection messages. This
is useful to define the root cause of RCU stalls using a vmcore.

= ============================================================
0 Do not panic() when RCU stall takes place, default behavior.
1 panic() after printing RCU stall messages.
= ============================================================


perf_cpu_time_max_percent
=========================

Hints to the kernel how much CPU time it should be allowed to
use to handle perf sampling events. If the perf subsystem
is informed that its samples are exceeding this limit, it
will drop its sampling frequency to attempt to reduce its CPU
usage.

Some perf sampling happens in NMIs. If these samples
unexpectedly take too long to execute, the NMIs can become
stacked up next to each other so much that nothing else is
allowed to execute.

===== ========================================================
0     Disable the mechanism. Do not monitor or correct perf's
      sampling rate no matter how much CPU time it takes.

1-100 Attempt to throttle perf's sample rate to this
      percentage of CPU. Note: the kernel calculates an
      "expected" length of each sample event. 100 here means
      100% of that expected length. Even if this is set to
      100, you may still see sample throttling if this
      length is exceeded. Set to 0 if you truly do not care
      how much CPU is consumed.
===== ========================================================


perf_event_paranoid
===================

Controls use of the performance events system by unprivileged
users (without ``CAP_PERFMON``). The default value is 2.

For backward compatibility, access to system performance
monitoring and observability remains open to processes with
``CAP_SYS_ADMIN``, but using ``CAP_SYS_ADMIN`` for secure system
performance monitoring and observability operations is discouraged
in favour of ``CAP_PERFMON``.

=== ==================================================================
 -1 Allow use of (almost) all events by all users.

    Ignore mlock limit after perf_event_mlock_kb without
    ``CAP_IPC_LOCK``.

>=0 Disallow ftrace function tracepoint by users without
    ``CAP_PERFMON``.

    Disallow raw tracepoint access by users without ``CAP_PERFMON``.

>=1 Disallow CPU event access by users without ``CAP_PERFMON``.

>=2 Disallow kernel profiling by users without ``CAP_PERFMON``.
=== ==================================================================


perf_event_max_stack
====================

Controls the maximum number of stack frames to copy for (``attr.sample_type &
PERF_SAMPLE_CALLCHAIN``) configured events, for instance, when using
'``perf record -g``' or '``perf trace --call-graph fp``'.

The value can only be changed when no events with callchains enabled are
in use; otherwise writing to this file returns ``-EBUSY``.

The default value is 127.


perf_event_mlock_kb
===================

Controls the size of the per-CPU ring buffer that is not counted against
the mlock limit.

The default value is 512 KiB + 1 page.


perf_event_max_contexts_per_stack
=================================

Controls the maximum number of stack frame context entries for
(``attr.sample_type & PERF_SAMPLE_CALLCHAIN``) configured events, for
instance, when using '``perf record -g``' or '``perf trace --call-graph fp``'.

The value can only be changed when no events with callchains enabled are
in use; otherwise writing to this file returns ``-EBUSY``.

The default value is 8.


pid_max
=======

PID allocation wrap value.
When the kernel's next PID value
reaches this value, it wraps back to a minimum PID value.
PIDs of value ``pid_max`` or larger are not allocated.


ns_last_pid
===========

The last PID allocated in the current PID namespace (the one in which the
task using this sysctl lives). When selecting a PID for the next task on
fork, the kernel tries to allocate a number starting from this one.


powersave-nap (PPC only)
========================

If set, Linux-PPC will use the 'nap' mode of powersaving,
otherwise the 'doze' mode will be used.


==============================================================

printk
======

The four values in printk denote: ``console_loglevel``,
``default_message_loglevel``, ``minimum_console_loglevel`` and
``default_console_loglevel`` respectively.

These values influence printk() behavior when printing or
logging error messages. See '``man 2 syslog``' for more info on
the different loglevels.
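
As a sketch of how the four fields fit together, the snippet below parses the
contents of the file into named values. The helper name ``parse_printk`` and
the sample string are assumptions made for the example; only the field names
come from this document.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    """The four fields of the printk sysctl, in the order they appear."""
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(text: str) -> PrintkLevels:
    """Parse four whitespace-separated integers (hypothetical helper)."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 values, got {len(fields)}")
    return PrintkLevels(*(int(field) for field in fields))


# Sample contents, assumed for illustration; real systems may differ.
levels = parse_printk("4\t4\t1\t7\n")
```

On a running system the input string would be the text read from
``/proc/sys/kernel/printk``.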

======================== =====================================
console_loglevel         messages with a higher priority than
                         this will be printed to the console
default_message_loglevel messages without an explicit priority
                         will be printed with this priority
minimum_console_loglevel minimum (highest) value to which
                         console_loglevel can be set
default_console_loglevel default value for console_loglevel
======================== =====================================


printk_delay
============

Delay each printk message by ``printk_delay`` milliseconds.

Values from 0 to 10000 are allowed.


printk_ratelimit
================

Some warning messages are rate limited. ``printk_ratelimit`` specifies
the minimum length of time between these messages (in seconds).
The default value is 5 seconds.

A value of 0 will disable rate limiting.


printk_ratelimit_burst
======================

While long term we enforce one message per `printk_ratelimit`_
seconds, we do allow a burst of messages to pass through.
``printk_ratelimit_burst`` specifies the number of messages we can
send before ratelimiting kicks in.

The default value is 10 messages.
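
One simple way to picture how the interval and the burst interact is a
window-based limiter. The model below is a simplified sketch, not the
kernel's implementation: it assumes the burst allowance is replenished once
per interval, and the class name ``RateLimit`` is made up for the example.

```python
from typing import Optional


class RateLimit:
    """Simplified interval/burst limiter (illustration only)."""

    def __init__(self, interval: float = 5.0, burst: int = 10) -> None:
        self.interval = interval   # seconds, in the role of printk_ratelimit
        self.burst = burst         # messages, in the role of printk_ratelimit_burst
        self.window_start: Optional[float] = None  # start of the current window
        self.printed = 0           # messages let through in this window

    def allow(self, now: float) -> bool:
        """Return True if a message arriving at time `now` may be printed."""
        if self.interval == 0:
            return True            # an interval of 0 disables rate limiting
        if self.window_start is None or now - self.window_start >= self.interval:
            # Open a new window and restore the full burst allowance.
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        return False


# Twelve messages arriving at the same instant with the default settings.
limiter = RateLimit(interval=5.0, burst=10)
allowed = sum(limiter.allow(0.0) for _ in range(12))
```

In this model only the first ten of the twelve messages pass; further
messages are dropped until the interval has elapsed.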


printk_devkmsg
==============

Control the logging to ``/dev/kmsg`` from userspace:

========= =============================================
ratelimit default, ratelimited
on        unlimited logging to /dev/kmsg from userspace
off       logging to /dev/kmsg disabled
========= =============================================

The kernel command line parameter ``printk.devkmsg=`` overrides this and is
a one-time setting until next reboot: once set, it cannot be changed by
this sysctl interface anymore.

==============================================================


pty
===

See Documentation/filesystems/devpts.rst.


random
======

This is a directory, with the following entries:

* ``boot_id``: a UUID generated the first time this is retrieved, and
  unvarying after that;

* ``uuid``: a UUID generated every time this is retrieved (this can
  thus be used to generate UUIDs at will);

* ``entropy_avail``: the pool's entropy count, in bits;

* ``poolsize``: the entropy pool size, in bits;

* ``urandom_min_reseed_secs``: obsolete (used to determine the minimum
  number of seconds between urandom pool reseeding). This file is
  writable for compatibility purposes, but writing to it has no effect
  on any RNG behavior;

* ``write_wakeup_threshold``: when the entropy count drops below this
  (as a number of bits), processes waiting to write to ``/dev/random``
  are woken up. This file is writable for compatibility purposes, but
  writing to it has no effect on any RNG behavior.


randomize_va_space
==================

This option can be used to select the type of process address
space randomization that is used in the system, for architectures
that support this feature.

== ===========================================================================
0  Turn the process address space randomization off. This is the
   default for architectures that do not support this feature anyway,
   and kernels that are booted with the "norandmaps" parameter.

1  Make the addresses of mmap base, stack and VDSO page randomized.
   This, among other things, implies that shared libraries will be
   loaded to random addresses. Also for PIE-linked binaries, the
   location of code start is randomized. This is the default if the
   ``CONFIG_COMPAT_BRK`` option is enabled.

2  Additionally enable heap randomization. This is the default if
   ``CONFIG_COMPAT_BRK`` is disabled.

   There are a few legacy applications out there (such as some ancient
   versions of libc.so.5 from 1996) that assume that brk area starts
   just after the end of the code+bss. These applications break when
   start of the brk area is randomized. There are however no known
   non-legacy applications that would be broken this way, so for most
   systems it is safe to choose full randomization.

   Systems with ancient and/or broken binaries should be configured
   with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process
   address space randomization.
== ===========================================================================


real-root-dev
=============

See :doc:`/admin-guide/initrd`.


reboot-cmd (SPARC only)
=======================

??? This seems to be a way to give an argument to the Sparc
ROM/Flash boot loader. Maybe to tell it what to do after
rebooting. ???


sched_energy_aware
==================

Enables/disables Energy Aware Scheduling (EAS). EAS starts
automatically on platforms where it can run (that is,
platforms with asymmetric CPU topologies and having an Energy
Model available). If your platform happens to meet the
requirements for EAS but you do not want to use it, change
this value to 0.


sched_schedstats
================

Enables/disables scheduler statistics. Enabling this feature
incurs a small amount of overhead in the scheduler but is
useful for debugging and performance tuning.

sched_util_clamp_min:
=====================

Max allowed *minimum* utilization.

Default value is 1024, which is the maximum possible value.

It means that any requested uclamp.min value cannot be greater than
sched_util_clamp_min, i.e., it is restricted to the range
[0:sched_util_clamp_min].

sched_util_clamp_max:
=====================

Max allowed *maximum* utilization.

Default value is 1024, which is the maximum possible value.

It means that any requested uclamp.max value cannot be greater than
sched_util_clamp_max, i.e., it is restricted to the range
[0:sched_util_clamp_max].

sched_util_clamp_min_rt_default:
================================

By default Linux is tuned for performance, which means that RT tasks always
run at the highest frequency and on the most capable (highest capacity) CPU
(in heterogeneous systems).

Uclamp achieves this by setting the requested uclamp.min of all RT tasks to
1024 by default, which effectively boosts the tasks to run at the highest
frequency and biases them to run on the biggest CPU.

This knob allows admins to change the default behavior when uclamp is being
used. In battery powered devices particularly, running at the maximum
capacity and frequency will increase energy consumption and shorten the battery
life.
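
A minimal sketch of how this knob combines with the range limit imposed by
sched_util_clamp_min. The helper name ``effective_rt_boost`` is hypothetical
and the real scheduler logic is more involved; the sketch only shows the
clamping arithmetic.

```python
UCLAMP_MAX = 1024  # documented maximum (and default) of the clamp sysctls


def effective_rt_boost(rt_default: int, clamp_min: int = UCLAMP_MAX) -> int:
    """Boost for an RT task that has not set its own uclamp.min (hypothetical helper)."""
    for value in (rt_default, clamp_min):
        if not 0 <= value <= UCLAMP_MAX:
            raise ValueError("values must lie in the range [0:1024]")
    # The default boost is restricted to the range [0:clamp_min].
    return min(rt_default, clamp_min)
```

With the defaults left in place the boost stays at the requested value; it is
reduced only when sched_util_clamp_min is lower than the requested default.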

This knob only affects RT tasks whose requested uclamp.min value has not
been modified by the user via the sched_setattr() syscall.

This knob will not escape the range constraint imposed by sched_util_clamp_min
defined above.

For example, if::

  sched_util_clamp_min_rt_default = 800
  sched_util_clamp_min = 600

then the boost will be clamped to 600 because 800 is outside of the permissible
range of [0:600]. This could happen, for instance, if a powersave mode
temporarily restricts all boosts by modifying sched_util_clamp_min. As soon as
this restriction is lifted, the requested sched_util_clamp_min_rt_default
will take effect.

seccomp
=======

See :doc:`/userspace-api/seccomp_filter`.


sg-big-buff
===========

This file shows the size of the generic SCSI (sg) buffer.
You can't tune it just yet, but you could change it at
compile time by editing ``include/scsi/sg.h`` and changing
the value of ``SG_BIG_BUFF``.

There shouldn't be any reason to change this value.
If you can come up with one, you probably know what you
are doing anyway :)


shmall
======

This parameter sets the total number of shared memory pages that
can be used system-wide. Hence, ``shmall`` should always be at least
``ceil(shmmax/PAGE_SIZE)``.

If you are not sure what the default ``PAGE_SIZE`` is on your Linux
system, you can run the following command::

  # getconf PAGE_SIZE


shmmax
======

This value can be used to query and set the run time limit
on the maximum shared memory segment size that can be created.
Shared memory segments up to 1Gb are now supported in the
kernel. This value defaults to ``SHMMAX``.


shmmni
======

This value determines the maximum number of shared memory segments.
4096 by default (``SHMMNI``).


shm_rmid_forced
===============

Linux lets you set resource limits, including how much memory one
process can consume, via ``setrlimit(2)``. Unfortunately, shared memory
segments are allowed to exist without association with any process, and
thus might not be counted against any resource limits.
If enabled,
shared memory segments are automatically destroyed when their attach
count becomes zero after a detach or a process termination. It will
also destroy segments that were created, but never attached to, on exit
from the process. The only use left for ``IPC_RMID`` is to immediately
destroy an unattached segment. Of course, this breaks the way things are
defined, so some applications might stop working. Note that this
feature will do you no good unless you also configure your resource
limits (in particular, ``RLIMIT_AS`` and ``RLIMIT_NPROC``). Most systems don't
need this.

Note that if you change this from 0 to 1, already created segments
that have no users and whose creating process has died will be destroyed.


sysctl_writes_strict
====================

Control how file position affects the behavior of updating sysctl values
via the ``/proc/sys`` interface:

  == ======================================================================
  -1 Legacy per-write sysctl value handling, with no printk warnings.
     Each write syscall must fully contain the sysctl value to be
     written, and multiple writes on the same sysctl file descriptor
     will rewrite the sysctl value, regardless of file position.
   0 Same behavior as above, but warn about processes that perform writes
     to a sysctl file descriptor when the file position is not 0.
1230*4882a593Smuzhiyun 1 (default) Respect file position when writing sysctl strings. Multiple 1231*4882a593Smuzhiyun writes will append to the sysctl value buffer. Anything past the max 1232*4882a593Smuzhiyun length of the sysctl value buffer will be ignored. Writes to numeric 1233*4882a593Smuzhiyun sysctl entries must always be at file position 0 and the value must 1234*4882a593Smuzhiyun be fully contained in the buffer sent in the write syscall. 1235*4882a593Smuzhiyun == ====================================================================== 1236*4882a593Smuzhiyun 1237*4882a593Smuzhiyun 1238*4882a593Smuzhiyunsoftlockup_all_cpu_backtrace 1239*4882a593Smuzhiyun============================ 1240*4882a593Smuzhiyun 1241*4882a593SmuzhiyunThis value controls the soft lockup detector thread's behavior 1242*4882a593Smuzhiyunwhen a soft lockup condition is detected as to whether or not 1243*4882a593Smuzhiyunto gather further debug information. If enabled, each cpu will 1244*4882a593Smuzhiyunbe issued an NMI and instructed to capture stack trace. 1245*4882a593Smuzhiyun 1246*4882a593SmuzhiyunThis feature is only applicable for architectures which support 1247*4882a593SmuzhiyunNMI. 1248*4882a593Smuzhiyun 1249*4882a593Smuzhiyun= ============================================ 1250*4882a593Smuzhiyun0 Do nothing. This is the default behavior. 1251*4882a593Smuzhiyun1 On detection capture more debug information. 1252*4882a593Smuzhiyun= ============================================ 1253*4882a593Smuzhiyun 1254*4882a593Smuzhiyun 1255*4882a593Smuzhiyunsoftlockup_panic 1256*4882a593Smuzhiyun================= 1257*4882a593Smuzhiyun 1258*4882a593SmuzhiyunThis parameter can be used to control whether the kernel panics 1259*4882a593Smuzhiyunwhen a soft lockup is detected. 1260*4882a593Smuzhiyun 1261*4882a593Smuzhiyun= ============================================ 1262*4882a593Smuzhiyun0 Don't panic on soft lockup. 1263*4882a593Smuzhiyun1 Panic on soft lockup. 
1264*4882a593Smuzhiyun= ============================================ 1265*4882a593Smuzhiyun 1266*4882a593SmuzhiyunThis can also be set using the softlockup_panic kernel parameter. 1267*4882a593Smuzhiyun 1268*4882a593Smuzhiyun 1269*4882a593Smuzhiyunsoft_watchdog 1270*4882a593Smuzhiyun============= 1271*4882a593Smuzhiyun 1272*4882a593SmuzhiyunThis parameter can be used to control the soft lockup detector. 1273*4882a593Smuzhiyun 1274*4882a593Smuzhiyun= ================================= 1275*4882a593Smuzhiyun0 Disable the soft lockup detector. 1276*4882a593Smuzhiyun1 Enable the soft lockup detector. 1277*4882a593Smuzhiyun= ================================= 1278*4882a593Smuzhiyun 1279*4882a593SmuzhiyunThe soft lockup detector monitors CPUs for threads that are hogging the CPUs 1280*4882a593Smuzhiyunwithout rescheduling voluntarily, and thus prevent the 'watchdog/N' threads 1281*4882a593Smuzhiyunfrom running. The mechanism depends on the CPUs ability to respond to timer 1282*4882a593Smuzhiyuninterrupts which are needed for the 'watchdog/N' threads to be woken up by 1283*4882a593Smuzhiyunthe watchdog timer function, otherwise the NMI watchdog — if enabled — can 1284*4882a593Smuzhiyundetect a hard lockup condition. 1285*4882a593Smuzhiyun 1286*4882a593Smuzhiyun 1287*4882a593Smuzhiyunstack_erasing 1288*4882a593Smuzhiyun============= 1289*4882a593Smuzhiyun 1290*4882a593SmuzhiyunThis parameter can be used to control kernel stack erasing at the end 1291*4882a593Smuzhiyunof syscalls for kernels built with ``CONFIG_GCC_PLUGIN_STACKLEAK``. 1292*4882a593Smuzhiyun 1293*4882a593SmuzhiyunThat erasing reduces the information which kernel stack leak bugs 1294*4882a593Smuzhiyuncan reveal and blocks some uninitialized stack variable attacks. 1295*4882a593SmuzhiyunThe tradeoff is the performance impact: on a single CPU system kernel 1296*4882a593Smuzhiyuncompilation sees a 1% slowdown, other systems and workloads may vary. 

= ====================================================================
0 Kernel stack erasing is disabled, STACKLEAK_METRICS are not updated.
1 Kernel stack erasing is enabled (default), it is performed before
  returning to the userspace at the end of syscalls.
= ====================================================================


stop-a (SPARC only)
===================

Controls Stop-A:

= ====================================
0 Stop-A has no effect.
1 Stop-A breaks to the PROM (default).
= ====================================

Stop-A is always enabled on a panic, so that the user can return to
the boot PROM.


sysrq
=====

See :doc:`/admin-guide/sysrq`.


tainted
=======

Non-zero if the kernel has been tainted. Numeric values, which can be
ORed together. The letters are seen in the "Tainted" line of Oops reports.

======  =====  ==============================================================
     1  `(P)`  proprietary module was loaded
     2  `(F)`  module was force loaded
     4  `(S)`  SMP kernel oops on an officially SMP incapable processor
     8  `(R)`  module was force unloaded
    16  `(M)`  processor reported a Machine Check Exception (MCE)
    32  `(B)`  bad page referenced or some unexpected page flags
    64  `(U)`  taint requested by userspace application
   128  `(D)`  kernel died recently, i.e. there was an OOPS or BUG
   256  `(A)`  an ACPI table was overridden by user
   512  `(W)`  kernel issued warning
  1024  `(C)`  staging driver was loaded
  2048  `(I)`  workaround for bug in platform firmware applied
  4096  `(O)`  externally-built ("out-of-tree") module was loaded
  8192  `(E)`  unsigned module was loaded
 16384  `(L)`  soft lockup occurred
 32768  `(K)`  kernel has been live patched
 65536  `(X)`  Auxiliary taint, defined for and used by distros
131072  `(T)`  The kernel was built with the struct randomization plugin
======  =====  ==============================================================

See :doc:`/admin-guide/tainted-kernels` for more information.
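
As a minimal sketch of how the bit values in the table above combine, the
following Python snippet decodes a ``tainted`` value into its letters. The
names ``TAINT_FLAGS`` and ``decode_taint`` are hypothetical helpers written
for this illustration; they are not part of any kernel tool.

```python
# Illustrative helper only; the names below are assumptions, not a kernel API.
# Bit values and letters are copied from the table above.
TAINT_FLAGS = {
    1: "P", 2: "F", 4: "S", 8: "R", 16: "M", 32: "B",
    64: "U", 128: "D", 256: "A", 512: "W", 1024: "C", 2048: "I",
    4096: "O", 8192: "E", 16384: "L", 32768: "K", 65536: "X", 131072: "T",
}


def decode_taint(value):
    """Return the taint letters whose bits are set in value, lowest bit first."""
    return [letter for bit, letter in TAINT_FLAGS.items() if value & bit]
```

For instance, a value read from ``/proc/sys/kernel/tainted`` can be passed in
as ``decode_taint(int(text))``; a value of 4097 has bits 1 and 4096 set, which
correspond to the letters P and O.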

Note:
  writes to this sysctl interface will fail with ``EINVAL`` if the kernel is
  booted with the command line option ``panic_on_taint=<bitmask>,nousertaint``
  and any of the ORed together values being written to ``tainted`` match with
  the bitmask declared on panic_on_taint.
  See :doc:`/admin-guide/kernel-parameters` for more details on that particular
  kernel command line option and its optional ``nousertaint`` switch.


threads-max
===========

This value controls the maximum number of threads that can be created
using ``fork()``.

During initialization the kernel sets this value such that even if the
maximum number of threads is created, the thread structures occupy only
a part (1/8th) of the available RAM pages.

The minimum value that can be written to ``threads-max`` is 1.

The maximum value that can be written to ``threads-max`` is given by the
constant ``FUTEX_TID_MASK`` (0x3fffffff).

If a value outside of this range is written to ``threads-max`` an
``EINVAL`` error occurs.


traceoff_on_warning
===================

When set, disables tracing (see :doc:`/trace/ftrace`) when a
``WARN()`` is hit.


tracepoint_printk
=================

When tracepoints are sent to printk() (enabled by the ``tp_printk``
boot parameter), this entry provides runtime control::

    echo 0 > /proc/sys/kernel/tracepoint_printk

will stop tracepoints from being sent to printk(), and::

    echo 1 > /proc/sys/kernel/tracepoint_printk

will send them to printk() again.

This only works if the kernel was booted with ``tp_printk`` enabled.

See :doc:`/admin-guide/kernel-parameters` and
:doc:`/trace/boottime-trace`.


.. _unaligned-dump-stack:

unaligned-dump-stack (ia64)
===========================

When logging unaligned accesses, controls whether the stack is
dumped.

= ===================================================
0 Do not dump the stack. This is the default setting.
1 Dump the stack.
= ===================================================

See also `ignore-unaligned-usertrap`_.


unaligned-trap
==============

On architectures where unaligned accesses cause traps, and where this
feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently,
``arc`` and ``parisc``), controls whether unaligned traps are caught
and emulated (instead of failing).

= ========================================================
0 Do not emulate unaligned accesses.
1 Emulate unaligned accesses. This is the default setting.
= ========================================================

See also `ignore-unaligned-usertrap`_.


unknown_nmi_panic
=================

The value in this file affects how NMIs are handled. When the
value is non-zero, an unknown NMI is trapped and the kernel panics. At
that time, kernel debugging information is displayed on the console.

For example, the NMI switch that most IA32 servers have triggers an
unknown NMI. If a system hangs, try pressing the NMI switch.


unprivileged_bpf_disabled
=========================

Writing 1 to this entry will disable unprivileged calls to ``bpf()``;
once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` or ``CAP_BPF``
will return ``-EPERM``. Once set to 1, this can't be cleared from the
running kernel anymore.

Writing 2 to this entry will also disable unprivileged calls to ``bpf()``;
however, an admin can still change this setting later on, if needed, by
writing 0 or 1 to this entry.

If ``BPF_UNPRIV_DEFAULT_OFF`` is enabled in the kernel config, then this
entry will default to 2 instead of 0.

= =============================================================
0 Unprivileged calls to ``bpf()`` are enabled
1 Unprivileged calls to ``bpf()`` are disabled without recovery
2 Unprivileged calls to ``bpf()`` are disabled
= =============================================================


watchdog
========

This parameter can be used to disable or enable the soft lockup detector
*and* the NMI watchdog (i.e. the hard lockup detector) at the same time.

= ==============================
0 Disable both lockup detectors.
1 Enable both lockup detectors.
= ==============================

The soft lockup detector and the NMI watchdog can also be disabled or
enabled individually, using the ``soft_watchdog`` and ``nmi_watchdog``
parameters.
If the ``watchdog`` parameter is read, for example by executing::

    cat /proc/sys/kernel/watchdog

the output of this command (0 or 1) shows the logical OR of
``soft_watchdog`` and ``nmi_watchdog``.
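
The logical OR described above can be illustrated with a short Python sketch.
The function name ``watchdog_reading`` is hypothetical and only models the
value one would expect to read back; it does not touch ``/proc``.

```python
def watchdog_reading(soft_watchdog, nmi_watchdog):
    """Model the value read from ``watchdog``: 1 if either detector is on."""
    # Logical OR of the two individual settings, reported as 0 or 1.
    return 1 if (soft_watchdog or nmi_watchdog) else 0
```

Under this model, ``watchdog`` reads as 0 only when both ``soft_watchdog`` and
``nmi_watchdog`` are 0.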


watchdog_cpumask
================

This value can be used to control on which cpus the watchdog may run.
The default cpumask is all possible cores, but if ``NO_HZ_FULL`` is
enabled in the kernel config, and cores are specified with the
``nohz_full=`` boot argument, those cores are excluded by default.
Offline cores can be included in this mask, and if the core is later
brought online, the watchdog will be started based on the mask value.

Typically this value would only be touched in the ``nohz_full`` case
to re-enable cores that by default were not running the watchdog,
if a kernel lockup was suspected on those cores.

The argument value is the standard cpulist format for cpumasks,
so for example to enable the watchdog on cores 0, 2, 3, and 4 you
might say::

    echo 0,2-4 > /proc/sys/kernel/watchdog_cpumask


watchdog_thresh
===============

This value can be used to control the frequency of hrtimer and NMI
events and the soft and hard lockup thresholds. The default threshold
is 10 seconds.

The softlockup threshold is (``2 * watchdog_thresh``). Setting this
tunable to zero will disable lockup detection altogether.
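
As a small worked example of the relationship above, the following Python
sketch computes the soft lockup threshold from a ``watchdog_thresh`` value.
The function name ``soft_lockup_threshold`` is an assumption made for this
illustration, and returning ``None`` for zero is simply one way to model
"detection disabled".

```python
def soft_lockup_threshold(watchdog_thresh):
    """Return the soft lockup threshold in seconds, or None if disabled."""
    # A value of zero disables lockup detection altogether.
    if watchdog_thresh == 0:
        return None
    # The soft lockup threshold is twice the watchdog_thresh value.
    return 2 * watchdog_thresh
```

With the default ``watchdog_thresh`` of 10 seconds, this gives a soft lockup
threshold of 20 seconds.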