=========================
Process Number Controller
=========================

Abstract
--------

The process number controller is used to allow a cgroup hierarchy to stop any
new tasks from being fork()'d or clone()'d after a certain limit is reached.

Since it is trivial to hit the task limit without hitting any kmemcg limits in
place, PIDs are a fundamental resource. As such, PID exhaustion must be
preventable in the scope of a cgroup hierarchy by allowing resource limiting of
the number of tasks in a cgroup.

Usage
-----

In order to use the `pids` controller, set the maximum number of tasks in
pids.max (this file is not available in the root cgroup). The number of
processes currently in the cgroup is given by pids.current.

Organisational operations are not blocked by cgroup policies, so it is possible
to have pids.current > pids.max. This can be done by either setting the limit to
be smaller than pids.current, or attaching enough processes to the cgroup such
that pids.current > pids.max. However, it is not possible to violate a cgroup
policy through fork() or clone(). fork() and clone() will return -EAGAIN if the
creation of a new process would cause a cgroup policy to be violated.

To set a cgroup to have no limit, set pids.max to "max". This is the default for
all new cgroups (note that PID limits are hierarchical, so the most stringent
limit in the hierarchy is followed).

pids.current tracks all child cgroup hierarchies, so parent/pids.current is a
superset of parent/child/pids.current.

The pids.events file contains event counters:

  - max: Number of times fork failed because the limit was hit.

Example
-------

First, we mount the pids controller::

    # mkdir -p /sys/fs/cgroup/pids
    # mount -t cgroup -o pids none /sys/fs/cgroup/pids

Then we create a hierarchy, set limits and attach processes to it::

    # mkdir -p /sys/fs/cgroup/pids/parent/child
    # echo 2 > /sys/fs/cgroup/pids/parent/pids.max
    # echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    #

It should be noted that attempts to overcome the set limit (2 in this case) will
fail::

    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    # ( /bin/echo "Here's some processes for you." | cat )
    sh: fork: Resource temporarily unavailable
    #

Even if we migrate to a child cgroup (which doesn't have a set limit), we will
not be able to overcome the most stringent limit in the hierarchy (in this case,
parent's)::

    # echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    # cat /sys/fs/cgroup/pids/parent/child/pids.current
    2
    # cat /sys/fs/cgroup/pids/parent/child/pids.max
    max
    # ( /bin/echo "Here's some processes for you." | cat )
    sh: fork: Resource temporarily unavailable
    #

We can set a limit that is smaller than pids.current, which will stop any new
processes from being forked at all (note that the shell itself counts towards
pids.current)::

    # echo 1 > /sys/fs/cgroup/pids/parent/pids.max
    # /bin/echo "We can't even spawn a single process now."
    sh: fork: Resource temporarily unavailable
    # echo 0 > /sys/fs/cgroup/pids/parent/pids.max
    # /bin/echo "We can't even spawn a single process now."
    sh: fork: Resource temporarily unavailable
    #
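
Because the most stringent limit in the hierarchy is the one that applies, it
can be useful to check how many more tasks a cgroup could fork before any
ancestor's pids.max is hit. The following is a minimal sketch, not part of the
controller itself: ``pids_headroom`` is a hypothetical helper name, and its
optional second argument (the mount point, assumed to default to
/sys/fs/cgroup/pids as in the example above) marks where the upward walk stops.
The result is only a snapshot, since pids.current can change at any time.

```shell
#!/bin/sh
# pids_headroom DIR [ROOT]   (hypothetical helper, for illustration only)
#
# Walk from cgroup directory DIR up towards ROOT and print the smallest
# value of (pids.max - pids.current) found on the way, i.e. how many more
# tasks could be created before the most stringent limit is reached.
# Prints "max" if no cgroup on the path sets a limit.
pids_headroom() {
    dir=${1%/}
    root=${2:-/sys/fs/cgroup/pids}   # assumed mount point
    root=${root%/}
    best=max

    while [ -n "$dir" ] && [ "$dir" != "$root" ]; do
        if [ -r "$dir/pids.max" ] && [ -r "$dir/pids.current" ]; then
            # Use the read builtin instead of $(cat ...): it does not fork,
            # so it still works in a shell that is already at its limit.
            read -r limit < "$dir/pids.max"
            read -r current < "$dir/pids.current"
            if [ "$limit" != max ]; then
                room=$((limit - current))
                # pids.current may exceed pids.max; report that as 0.
                if [ "$room" -lt 0 ]; then
                    room=0
                fi
                if [ "$best" = max ] || [ "$room" -lt "$best" ]; then
                    best=$room
                fi
            fi
        fi
        # Step to the parent directory; stop if the path cannot shrink.
        parent=${dir%/*}
        if [ "$parent" = "$dir" ]; then
            break
        fi
        dir=$parent
    done

    echo "$best"
}
```

For instance, ``pids_headroom /sys/fs/cgroup/pids/parent/child`` inspects both
child and parent. The walk stops before the mount point because the root cgroup
has no pids.max file.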