.. _overcommit_accounting:

=====================
Overcommit Accounting
=====================

The Linux kernel supports the following overcommit handling modes:

0
	Heuristic overcommit handling. Obvious overcommits of address
	space are refused. Used for a typical system. It ensures a
	seriously wild allocation fails while allowing overcommit to
	reduce swap usage. root is allowed to allocate slightly more
	memory in this mode. This is the default.

1
	Always overcommit. Appropriate for some scientific
	applications. The classic example is code that uses sparse
	arrays and relies on the virtual memory consisting almost
	entirely of zero pages.

2
	Don't overcommit. The total address space commit for the
	system is not permitted to exceed swap + a configurable amount
	(default is 50%) of physical RAM. Depending on the amount you
	use, in most situations this means a process will not be
	killed while accessing pages but will receive errors on memory
	allocation as appropriate.

	Useful for applications that want to guarantee their memory
	allocations will be available in the future without having to
	initialize every page.

The overcommit policy is set via the sysctl ``vm.overcommit_memory``.

The overcommit amount can be set via ``vm.overcommit_ratio`` (percentage)
or ``vm.overcommit_kbytes`` (absolute value).

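As a rough illustration of the mode 2 limit described above, the sketch
below computes swap plus the configurable amount of RAM. The function name
``commit_limit_kb`` and its parameters are hypothetical, and the sketch
assumes that a nonzero ``vm.overcommit_kbytes`` value is used in place of
the ratio. It ignores any further adjustments the kernel may make (for
example for huge pages), so treat the result as an estimate only.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, overcommit_kbytes=0):
    """Estimate the mode 2 commit limit in kB (simplified sketch)."""
    if overcommit_kbytes > 0:
        # Assumption: an absolute allowance replaces the percentage.
        allowance_kb = overcommit_kbytes
    else:
        # Percentage of physical RAM; the default ratio is 50.
        allowance_kb = ram_kb * overcommit_ratio // 100
    # The limit is swap plus the configured allowance.
    return swap_kb + allowance_kb


if __name__ == "__main__":
    # Hypothetical machine: 8 GiB of RAM, 2 GiB of swap, default ratio.
    print(commit_limit_kb(8 * 1024 * 1024, 2 * 1024 * 1024))  # prints 6291456
```

With the default ratio, the hypothetical machine above may commit roughly
6 GiB of address space in mode 2: 2 GiB of swap plus half of its 8 GiB of
RAM.
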
The current overcommit limit and amount committed are viewable in
``/proc/meminfo`` as CommitLimit and Committed_AS respectively.

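A minimal sketch of reading these values from user space follows. The
helper names ``parse_commit_fields`` and ``read_overcommit_status`` are
hypothetical. The sketch assumes the usual ``Name:   value kB`` layout of
``/proc/meminfo`` and that the sysctl is exposed as
``/proc/sys/vm/overcommit_memory``; the sample text in the demo is made up
for illustration.

```python
def parse_commit_fields(meminfo_text):
    """Return CommitLimit and Committed_AS (in kB) from /proc/meminfo text."""
    wanted = {"CommitLimit", "Committed_AS"}
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "CommitLimit:     6291456 kB".
            values[name] = int(rest.split()[0])
    return values


def read_overcommit_status():
    """Read the current mode and commit figures on a live Linux system."""
    with open("/proc/sys/vm/overcommit_memory") as f:
        mode = int(f.read().strip())
    with open("/proc/meminfo") as f:
        fields = parse_commit_fields(f.read())
    return mode, fields


if __name__ == "__main__":
    # Made-up sample so the demo does not depend on the host system.
    sample = (
        "MemTotal:        8388608 kB\n"
        "CommitLimit:     6291456 kB\n"
        "Committed_AS:    1234567 kB\n"
    )
    fields = parse_commit_fields(sample)
    print(fields["CommitLimit"] - fields["Committed_AS"])  # prints 5056889
```

On a real system, ``read_overcommit_status()`` returns the mode together
with both figures. The difference between CommitLimit and Committed_AS is
meaningful as remaining headroom only in mode 2, where the limit is
enforced.
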
Gotchas
=======

The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much, but it is a corner case if you really, really care.

In mode 2 the MAP_NORESERVE flag is ignored.


How It Works
============

The overcommit is based on the following rules:

For a file-backed map
	| SHARED or READ-only	-	0 cost (the file is the map, not swap)
	| PRIVATE WRITABLE	-	size of mapping per instance

For an anonymous or ``/dev/zero`` map
	| SHARED			-	size of mapping
	| PRIVATE READ-only	-	0 cost (but of little use)
	| PRIVATE WRITABLE	-	size of mapping per instance

Additional accounting
	| Pages made writable copies by mmap
	| shmfs memory drawn from the same pool

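The per-mapping rules above can be restated as a small lookup. This is only
a sketch of the table, not kernel code: the function ``commit_charge`` and
its string and boolean parameters are hypothetical, and the "additional
accounting" items are not modelled.

```python
def commit_charge(size, backing, shared, writable):
    """Return the bytes charged for one mapping, following the rules above.

    backing is "file" for a file-backed map, or "anon" for an anonymous
    or /dev/zero map.
    """
    if backing == "file":
        # SHARED or READ-only: the file itself backs the pages, so no cost.
        if shared or not writable:
            return 0
        # PRIVATE WRITABLE: charged in full for each instance.
        return size
    if backing == "anon":
        # SHARED: charged the size of the mapping.
        if shared:
            return size
        # PRIVATE: charged only when writable.
        return size if writable else 0
    raise ValueError("backing must be 'file' or 'anon'")


if __name__ == "__main__":
    one_mib = 1 << 20
    print(commit_charge(one_mib, "file", shared=True, writable=True))    # prints 0
    print(commit_charge(one_mib, "file", shared=False, writable=True))   # prints 1048576
    print(commit_charge(one_mib, "anon", shared=False, writable=False))  # prints 0
```

Because private writable mappings are charged per instance, two processes
that each map the same file privately and writably are charged the size of
the mapping twice.
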
Status
======

*	We account mmap memory mappings
*	We account mprotect changes in commit
*	We account mremap changes in size
*	We account brk
*	We account munmap
*	We report the commit status in /proc
*	Account and check on fork
*	Review stack handling/building on exec
*	SHMfs accounting
*	Implement actual limit enforcement

To Do
=====

*	Account ptrace pages (this is hard)