===========================
Device Whitelist Controller
===========================

1. Description
==============

Implement a cgroup to track and enforce open and mknod restrictions
on device files. A device cgroup associates a device access
whitelist with each cgroup. A whitelist entry has 4 fields.
'type' is a (all), c (char), or b (block). 'all' means it applies
to all types and all major and minor numbers. Major and minor are
either an integer or * for all. Access is a composition of r
(read), w (write), and m (mknod).

The root device cgroup starts with rwm to 'all'. A child device
cgroup gets a copy of the parent. Administrators can then remove
devices from the whitelist or add new entries. A child cgroup can
never receive a device access which is denied by its parent.

2. User Interface
=================

An entry is added using devices.allow, and removed using
devices.deny. For instance::

    echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow

allows cgroup 1 to read and mknod the device usually known as
/dev/null. Doing::

    echo a > /sys/fs/cgroup/1/devices.deny

will remove the default 'a *:* rwm' entry. Doing::

    echo a > /sys/fs/cgroup/1/devices.allow

will add the 'a *:* rwm' entry to the whitelist.
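
The entry format described above can be checked before it is written to
a control file. The following is a minimal sketch, not part of the kernel
interface: ``devcg_entry`` is a hypothetical helper name, and it only
covers the ``c`` and ``b`` types, since ``a`` is written on its own as in
the examples above.

```shell
# Hypothetical helper (an assumption for illustration, not a kernel tool):
# validate the four whitelist fields and print them in the
# "type major:minor access" form used by devices.allow and devices.deny.
devcg_entry() {
    kind=$1 major=$2 minor=$3 access=$4

    # type must be c (char) or b (block)
    case $kind in
        b|c) ;;
        *) echo "bad type: $kind" >&2; return 1 ;;
    esac

    # major and minor are either an integer or * for all
    for num in "$major" "$minor"; do
        case $num in
            '*') ;;
            ''|*[!0-9]*) echo "bad number: $num" >&2; return 1 ;;
        esac
    done

    # access is a non-empty composition of r, w and m
    case $access in
        ''|*[!rwm]*) echo "bad access: $access" >&2; return 1 ;;
    esac

    printf '%s %s:%s %s\n' "$kind" "$major" "$minor" "$access"
}

# Possible use, with the path taken from the examples above:
#   devcg_entry c 1 3 mr > /sys/fs/cgroup/1/devices.allow
```

Building the string in one place keeps a malformed entry from reaching
the control file; the kernel still performs its own parsing of whatever is
written.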

3. Security
===========

Any task can move itself between cgroups. This clearly won't
suffice, but we can decide the best way to adequately restrict
movement as people get some experience with this. We may just want
to require CAP_SYS_ADMIN, which at least is a separate bit from
CAP_MKNOD. We may want to just refuse moving to a cgroup which
isn't a descendant of the current one. Or we may want to use
CAP_MAC_ADMIN, since we really are trying to lock down root.

CAP_SYS_ADMIN is needed to modify the whitelist or move another
task to a new cgroup. (Again we'll probably want to change that).

A cgroup may not be granted more permissions than the cgroup's
parent has.

4. Hierarchy
============

Device cgroups maintain hierarchy by making sure a cgroup never has more
access permissions than its parent. Every time an entry is written to
a cgroup's devices.deny file, all its children will have that entry removed
from their whitelist and all the locally set whitelist entries will be
re-evaluated. In case one of the locally set whitelist entries would provide
more access than the cgroup's parent, it'll be removed from the whitelist.

Example::

      A
     / \
        B

    group        behavior        exceptions
    A            allow           "b 8:* rwm", "c 116:1 rw"
    B            deny            "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a device is denied in group A::

    # echo "c 116:* r" > A/devices.deny

it'll propagate down and after revalidating B's entries, the whitelist entry
"c 116:2 rwm" will be removed::

    group        whitelist entries                        denied devices
    A            all                                      "b 8:* rwm", "c 116:* rw"
    B            "c 1:3 rwm", "b 3:* rwm"                 all the rest

In case parent's exceptions change and local exceptions are not allowed
anymore, they'll be deleted.
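
The revalidation in the example above comes down to asking whether a
child's entry requests any access the parent no longer permits. The sketch
below shows only that subset test on the access letters; ``access_within``
is a hypothetical name, and the kernel's real revalidation also has to
match type, major and minor.

```shell
# Hypothetical helper (illustration only): succeed when every access
# letter requested by the child (first argument) is also present in the
# access the parent still permits (second argument).
access_within() {
    child=$1 parent=$2
    while [ -n "$child" ]; do
        letter=${child%"${child#?}"}    # first character of $child
        child=${child#?}                # drop that character
        case $parent in
            *"$letter"*) ;;             # parent permits this letter
            *) return 1 ;;              # child asks for more than parent has
        esac
    done
    return 0
}

# After "c 116:* r" is denied in A, A denies rw on c 116:* and so permits
# only m there. B's "c 116:2 rwm" no longer fits:
access_within rwm m || echo 'c 116:2 rwm exceeds the parent'

# A still permits rwm on c 1:3, so B's "c 1:3 rwm" fits:
access_within rwm rwm && echo 'c 1:3 rwm is still within the parent'
```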

Notice that new whitelist entries will not be propagated::

      A
     / \
        B

    group        whitelist entries                        denied devices
    A            "c 1:3 rwm", "c 1:5 r"                   all the rest
    B            "c 1:3 rwm", "c 1:5 r"                   all the rest

when adding ``c *:3 rwm``::

    # echo "c *:3 rwm" >A/devices.allow

the result::

    group        whitelist entries                        denied devices
    A            "c *:3 rwm", "c 1:5 r"                   all the rest
    B            "c 1:3 rwm", "c 1:5 r"                   all the rest

but now it'll be possible to add new entries to B::

    # echo "c 2:3 rwm" >B/devices.allow
    # echo "c 50:3 r" >B/devices.allow

or even::

    # echo "c *:3 rwm" >B/devices.allow

Allowing or denying all by writing 'a' to devices.allow or devices.deny will
not be possible once the device cgroup has children.

4.1 Hierarchy (internal implementation)
---------------------------------------

Device cgroups are implemented internally using a behavior (ALLOW, DENY) and a
list of exceptions. The internal state is controlled using the same user
interface to preserve compatibility with the previous whitelist-only
implementation. Removal or addition of exceptions that will reduce the access
to devices will be propagated down the hierarchy.
For every propagated exception, the effective rules will be re-evaluated based
on the parent's current access rules.
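
The behavior-plus-exceptions model can be illustrated with a small
lookup. This is a simplified sketch under stated assumptions, not the
kernel's code: ``field_matches`` and ``devcg_permits`` are hypothetical
names, exceptions are passed as plain strings, and a request is for one
device and one access letter. With an ``allow`` behavior the exceptions
are the denied devices; with a ``deny`` behavior they are the whitelist.

```shell
# field_matches PATTERN VALUE: a field matches when the pattern is "*"
# or is equal to the value.
field_matches() {
    [ "$1" = '*' ] || [ "$1" = "$2" ]
}

# devcg_permits BEHAVIOR TYPE MAJOR MINOR LETTER [EXCEPTION...]
# Each EXCEPTION is one string in "type major:minor access" form.
# Succeeds when the request is permitted under this simplified model.
devcg_permits() {
    behavior=$1 kind=$2 major=$3 minor=$4 letter=$5
    shift 5
    hit=no
    for exception in "$@"; do
        # split e.g. "c 116:1 rw" into type, major, minor and access
        e_kind=${exception%% *}
        rest=${exception#* }
        e_dev=${rest%% *}
        e_access=${rest#* }
        e_major=${e_dev%%:*}
        e_minor=${e_dev#*:}
        [ "$e_kind" = "$kind" ] || continue
        field_matches "$e_major" "$major" || continue
        field_matches "$e_minor" "$minor" || continue
        case $e_access in *"$letter"*) hit=yes ;; esac
    done
    # allow: permitted unless an exception matches
    # deny:  permitted only if an exception matches
    case $behavior:$hit in
        allow:no|deny:yes) return 0 ;;
        *) return 1 ;;
    esac
}

# Group A from the first example in section 4: behavior allow,
# exceptions "b 8:* rwm" and "c 116:1 rw".
devcg_permits allow c 116 1 m "b 8:* rwm" "c 116:1 rw" && echo 'A: mknod of c 116:1 permitted'
devcg_permits allow c 116 1 r "b 8:* rwm" "c 116:1 rw" || echo 'A: read of c 116:1 denied'
```

Keeping a single behavior flag and a list of exceptions lets the same two
control files express both "everything except these" and "only these",
which is how the whitelist-only interface is preserved.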