===========================
Device Whitelist Controller
===========================

1. Description
==============

Implement a cgroup to track and enforce open and mknod restrictions
on device files.  A device cgroup associates a device access
whitelist with each cgroup.  A whitelist entry has 4 fields: type,
major, minor, and access.  'type' is a (all), c (char), or b (block).
'all' means it applies to all types and all major and minor numbers.
Major and minor are either an integer or * for all.  Access is a
composition of r (read), w (write), and m (mknod).

The root device cgroup starts with rwm to 'all'.  A child device
cgroup gets a copy of the parent.  Administrators can then remove
devices from the whitelist or add new entries.  A child cgroup can
never receive a device access which is denied by its parent.
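
The entry format described above can be checked before it is written to a
control file.  The following is a minimal sketch: ``valid_entry`` is a
hypothetical helper, not something the kernel provides, and its pattern is
derived only from the field description in this section, so the kernel's own
parser may accept or reject some inputs differently.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not a kernel interface): succeed if the
# argument looks like "TYPE MAJOR:MINOR ACCESS" or the bare shorthand "a".
valid_entry() {
    printf '%s\n' "$1" |
        grep -Eq '^(a|[abc] (\*|[0-9]+):(\*|[0-9]+) [rwm]{1,3})$'
}

# Try a few well-formed and malformed entries.
for entry in 'c 1:3 mr' 'b 8:* rwm' 'a' 'x 1:3 r' 'c 1:3 rwx'; do
    if valid_entry "$entry"; then
        echo "ok:  $entry"
    else
        echo "bad: $entry"
    fi
done
```

The check is purely syntactic; it says nothing about whether the parent
cgroup would permit the entry.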

2. User Interface
=================

An entry is added using devices.allow, and removed using
devices.deny.  For instance::

	echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow

allows cgroup 1 to read and mknod the device usually known as
/dev/null.  Doing::

	echo a > /sys/fs/cgroup/1/devices.deny

will remove the default 'a *:* rwm' entry. Doing::

	echo a > /sys/fs/cgroup/1/devices.allow

will add the 'a *:* rwm' entry to the whitelist.
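
A common pattern built from these two files is to drop the default entry and
then allow a short list of devices.  The sketch below wraps that sequence in a
function; ``restrict_devices`` and its directory argument are illustrative
names only, and the function issues one write per entry, as in the examples
above.

```shell
#!/bin/sh
# Sketch only. restrict_devices is a hypothetical wrapper; DIR is assumed to
# be a device cgroup directory such as /sys/fs/cgroup/1 from the text above.
# Usage: restrict_devices DIR ENTRY...
restrict_devices() {
    dir=$1
    shift
    # Remove the default 'a *:* rwm' entry first.
    echo a > "$dir/devices.deny" || return 1
    # Then add each entry with its own write.
    for entry in "$@"; do
        echo "$entry" > "$dir/devices.allow" || return 1
    done
}

# Example invocation (needs CAP_SYS_ADMIN and a mounted devices hierarchy):
# restrict_devices /sys/fs/cgroup/1 'c 1:3 rwm' 'c 1:5 r'
```

Writing 'a' to devices.deny is refused once the cgroup has children, so a
wrapper like this is only expected to work on a leaf cgroup.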

3. Security
===========

Any task can move itself between cgroups.  This clearly won't
suffice, but we can decide the best way to adequately restrict
movement as people get some experience with this.  We may just want
to require CAP_SYS_ADMIN, which at least is a separate bit from
CAP_MKNOD.  We may want to just refuse moving to a cgroup which
isn't a descendant of the current one.  Or we may want to use
CAP_MAC_ADMIN, since we really are trying to lock down root.

CAP_SYS_ADMIN is needed to modify the whitelist or move another
task to a new cgroup.  (Again, we'll probably want to change that.)
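
Whether the current shell holds CAP_SYS_ADMIN can be estimated from the
``CapEff`` mask in ``/proc/self/status``.  This is a rough sketch: ``has_cap``
is a hypothetical helper, and the bit numbers are the values defined in
``include/uapi/linux/capability.h``.

```shell
#!/bin/sh
# Capability bit numbers as defined in include/uapi/linux/capability.h.
CAP_SYS_ADMIN=21
CAP_MKNOD=27
CAP_MAC_ADMIN=33

# Hypothetical helper: has_cap HEXMASK BIT succeeds if BIT is set in HEXMASK
# (HEXMASK is given without a leading "0x", as printed in /proc).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Read the effective capability mask of this shell, if /proc is available.
mask=$(awk '/^CapEff:/ { print $2 }' /proc/self/status 2>/dev/null || true)
if [ -n "$mask" ] && has_cap "$mask" "$CAP_SYS_ADMIN"; then
    echo "CAP_SYS_ADMIN is effective"
else
    echo "CAP_SYS_ADMIN is not effective; expect whitelist writes to fail"
fi
```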

A cgroup may not be granted more permissions than the cgroup's
parent has.

4. Hierarchy
============

Device cgroups maintain hierarchy by making sure a cgroup never has more
access permissions than its parent.  Every time an entry is written to
a cgroup's devices.deny file, all its children will have that entry removed
from their whitelist and all the locally set whitelist entries will be
re-evaluated.  In case one of the locally set whitelist entries would provide
more access than the cgroup's parent, it'll be removed from the whitelist.

Example::

      A
     / \
        B

    group        behavior       exceptions
    A            allow          "b 8:* rwm", "c 116:1 rw"
    B            deny           "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a device is denied in group A::

	# echo "c 116:* r" > A/devices.deny

it'll propagate down and, after revalidating B's entries, the whitelist entry
"c 116:2 rwm" will be removed::

    group        whitelist entries                        denied devices
    A            all                                      "b 8:* rwm", "c 116:* rw"
    B            "c 1:3 rwm", "b 3:* rwm"                 all the rest

If the parent's exceptions change so that local exceptions are no longer
allowed, the local exceptions are deleted.
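
The revalidation in this example can be approximated as an overlap test
between each of B's entries and the rule newly denied in A.  The sketch below
is an illustration under that assumption, not the kernel's algorithm;
``num_match`` and ``rules_overlap`` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: two major (or minor) fields match when they are equal
# or when either one is the wildcard.
num_match() {
    [ "$1" = "$2" ] || [ "$1" = '*' ] || [ "$2" = '*' ]
}

# Hypothetical helper: succeed if two "TYPE MAJOR:MINOR ACCESS" rules share a
# type, a device number and at least one access letter.
rules_overlap() {
    t1=${1%% *}; r1=${1#* }; d1=${r1%% *}; a1=${r1#* }
    t2=${2%% *}; r2=${2#* }; d2=${r2%% *}; a2=${r2#* }
    [ "$t1" = "$t2" ] || return 1
    num_match "${d1%%:*}" "${d2%%:*}" || return 1
    num_match "${d1#*:}" "${d2#*:}" || return 1
    for bit in r w m; do
        case $a1 in *"$bit"*) case $a2 in *"$bit"*) return 0 ;; esac ;; esac
    done
    return 1
}

# B's entries from the example, checked against the rule denied in A.
for entry in 'c 1:3 rwm' 'c 116:2 rwm' 'b 3:* rwm'; do
    if rules_overlap "$entry" 'c 116:* r'; then
        echo "removed: $entry"
    else
        echo "kept:    $entry"
    fi
done
```

Only "c 116:2 rwm" overlaps the denied rule, which matches the table above.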

Notice that new whitelist entries will not be propagated::

      A
     / \
        B

    group        whitelist entries                        denied devices
    A            "c 1:3 rwm", "c 1:5 r"                   all the rest
    B            "c 1:3 rwm", "c 1:5 r"                   all the rest

when adding ``c *:3 rwm``::

	# echo "c *:3 rwm" >A/devices.allow

the result::

    group        whitelist entries                        denied devices
    A            "c *:3 rwm", "c 1:5 r"                   all the rest
    B            "c 1:3 rwm", "c 1:5 r"                   all the rest

but now it'll be possible to add new entries to B::

	# echo "c 2:3 rwm" >B/devices.allow
	# echo "c 50:3 r" >B/devices.allow

or even::

	# echo "c *:3 rwm" >B/devices.allow
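
Why these writes to B now succeed can be illustrated with a subset check of a
child entry against a single parent entry.  This is a simplified sketch:
``field_covers`` and ``entry_within`` are hypothetical helpers, and comparing
against one parent entry at a time is an assumption made for brevity.

```shell
#!/bin/sh
# Hypothetical helper: field_covers PARENT CHILD succeeds when the parent
# field is the wildcard or is identical to the child field.
field_covers() {
    [ "$1" = '*' ] || [ "$1" = "$2" ]
}

# Hypothetical helper: entry_within CHILD PARENT succeeds if CHILD requests
# no device and no access letter beyond what PARENT grants.
entry_within() {
    ct=${1%% *}; cr=${1#* }; cdev=${cr%% *}; cacc=${cr#* }
    pt=${2%% *}; pr=${2#* }; pdev=${pr%% *}; pacc=${pr#* }
    [ "$ct" = "$pt" ] || return 1
    field_covers "${pdev%%:*}" "${cdev%%:*}" || return 1
    field_covers "${pdev#*:}" "${cdev#*:}" || return 1
    for bit in r w m; do
        case $cacc in
            *"$bit"*) case $pacc in *"$bit"*) ;; *) return 1 ;; esac ;;
        esac
    done
    return 0
}

# The three writes to B, checked against A's new entry.
for entry in 'c 2:3 rwm' 'c 50:3 r' 'c *:3 rwm'; do
    if entry_within "$entry" 'c *:3 rwm'; then
        echo "would fit:     $entry"
    else
        echo "would not fit: $entry"
    fi
done
```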

Allowing or denying all by writing 'a' to devices.allow or devices.deny will
not be possible once the device cgroup has children.

4.1 Hierarchy (internal implementation)
---------------------------------------

Device cgroups are implemented internally using a behavior (ALLOW, DENY) and a
list of exceptions.  The internal state is controlled using the same user
interface to preserve compatibility with the previous whitelist-only
implementation.  Removal or addition of exceptions that will reduce the access
to devices will be propagated down the hierarchy.
For every propagated exception, the effective rules will be re-evaluated based
on the parent's current access rules.
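
The behavior-plus-exceptions model can be sketched as a small decision
function.  This is only an illustration of the idea, not the kernel code:
``check_access`` is a hypothetical name, a request names one concrete device
and one access letter, and wildcards in exceptions are matched with the
shell's own ``case`` patterns.

```shell
#!/bin/sh
# Hypothetical sketch of the model described above.
# Usage: check_access BEHAVIOR "TYPE MAJOR:MINOR" ACCESS_LETTER EXCEPTION...
# BEHAVIOR is "allow" (exceptions are denied) or "deny" (exceptions are allowed).
check_access() {
    behavior=$1 dev=$2 bit=$3
    shift 3
    hit=no
    for exc in "$@"; do
        pat=${exc% *}       # e.g. "c 116:*", used as a glob pattern
        acc=${exc##* }      # e.g. "rw"
        case $dev in
            $pat) case $acc in *"$bit"*) hit=yes ;; esac ;;
        esac
    done
    if [ "$behavior" = allow ]; then
        [ "$hit" = no ]     # default allow: an exception hit means denied
    else
        [ "$hit" = yes ]    # default deny: an exception hit means allowed
    fi
}

# Group A from the first example table: behavior "allow" with two exceptions.
check_access allow 'c 116:1' r 'b 8:* rwm' 'c 116:1 rw' &&
    echo "A: read c 116:1 allowed" || echo "A: read c 116:1 denied"
# Group B from the same table: behavior "deny" with three exceptions.
check_access deny 'b 3:7' w 'c 1:3 rwm' 'c 116:2 rwm' 'b 3:* rwm' &&
    echo "B: write b 3:7 allowed" || echo "B: write b 3:7 denied"
```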