================
Delay accounting
================

Tasks encounter delays in execution when they wait
for some kernel resource to become available, e.g. a
runnable task may wait for a free CPU to run on.

The per-task delay accounting functionality measures
the delays experienced by a task while

a) waiting for a CPU (while being runnable)
b) waiting for completion of synchronous block I/O initiated by the task
c) swapping in pages
d) reclaiming memory

and makes these statistics available to userspace through
the taskstats interface.

Such delays provide feedback for setting a task's cpu priority,
io priority and rss limit values appropriately. Long delays for
an important task could be a trigger for raising its corresponding priority.

The functionality, through its use of the taskstats interface, also provides
delay statistics aggregated for all tasks (or threads) belonging to a
thread group (corresponding to a traditional Unix process). This is a commonly
needed aggregation that is more efficiently done by the kernel.

Userspace utilities, particularly resource management applications, can also
aggregate delay statistics into arbitrary groups. To enable this, delay
statistics of a task are available both during its lifetime and on its
exit, ensuring that continuous and complete monitoring can be done.



Interface
---------

Delay accounting uses the taskstats interface, which is described
in detail in a separate document in this directory. Taskstats returns a
generic data structure to userspace corresponding to per-pid and per-tgid
statistics. The delay accounting functionality populates specific fields of
this structure. See

     include/linux/taskstats.h

for a description of the fields pertaining to delay accounting.
These fields are generally counters returning the cumulative
delay seen for cpu, sync block I/O, swapin, memory reclaim etc.

Taking the difference of two successive readings of a given
counter (say cpu_delay_total) for a task will give the delay
experienced by the task waiting for the corresponding resource
in that interval.

When a task exits, records containing the per-task statistics
are sent to userspace without requiring a command. If it is the last exiting
task of a thread group, the per-tgid statistics are also sent. More details
are given in the taskstats interface description.

The getdelays.c userspace utility in the tools/accounting directory allows
simple commands to be run and the corresponding delay statistics to be
displayed. It also serves as an example of using the taskstats interface.
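
The counter-difference calculation described above can be sketched in a few
lines of Python. This is only an illustration, not part of the kernel tree:
the ``DelaySample`` container and the ``interval_delay`` helper are
hypothetical names, and the sketch assumes the two readings were already
obtained for the same task (for example via taskstats) as a count and a
cumulative delay total such as cpu_count and cpu_delay_total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelaySample:
    """One reading of a cumulative counter pair (hypothetical container)."""
    count: int           # number of delay events recorded, e.g. cpu_count
    delay_total: int     # cumulative delay, e.g. cpu_delay_total


def interval_delay(earlier, later):
    """Return (events, delay, average delay per event) between two readings.

    Both readings are assumed to belong to the same task and to be passed
    in chronological order.
    """
    events = later.count - earlier.count
    delay = later.delay_total - earlier.delay_total
    if events < 0 or delay < 0:
        raise ValueError("readings are out of order or from different tasks")
    average = delay / events if events else 0.0
    return events, delay, average


if __name__ == "__main__":
    first = DelaySample(count=10, delay_total=1_000)
    second = DelaySample(count=14, delay_total=5_000)
    print(interval_delay(first, second))
```

The same helper applies unchanged to the other counter pairs (block I/O,
swapin, memory reclaim), since each is reported as a count and a cumulative
total.
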

Usage
-----

Compile the kernel with::

        CONFIG_TASK_DELAY_ACCT=y
        CONFIG_TASKSTATS=y

Delay accounting is enabled by default at boot up.
To disable, add::

   nodelayacct

to the kernel boot options. The rest of the instructions
below assume this has not been done.

After the system has booted up, use a utility
similar to getdelays.c to access the delays
seen by a given task or a task group (tgid).
The utility also allows a given command to be
executed and the corresponding delays to be
seen.

General format of the getdelays command::

        getdelays [-t tgid] [-p pid] [-c cmd...]


Get delays, since system boot, for pid 10::

        # ./getdelays -p 10
        (output similar to next case)

Get sum of delays, since system boot, for all pids with tgid 5::

        # ./getdelays -t 5


        CPU     count   real total      virtual total   delay total
                7876    92005750        100000000       24001500
        IO      count   delay total
                0       0
        SWAP    count   delay total
                0       0
        RECLAIM count   delay total
                0       0

Get delays seen in executing a given simple command::

        # ./getdelays -c ls /

        bin   data1  data3  data5  dev  home  media  opt   root  srv        sys  usr
        boot  data2  data4  data6  etc  lib   mnt    proc  sbin  subdomain  tmp  var


        CPU     count   real total      virtual total   delay total
                6       4000250         4000000         0
        IO      count   delay total
                0       0
        SWAP    count   delay total
                0       0
        RECLAIM count   delay total
                0       0
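
For scripting, output in the layout shown above can be post-processed. The
following Python sketch is an assumption-laden illustration rather than a
supported tool: ``parse_getdelays`` and ``average_delay`` are hypothetical
helpers, and the parser assumes exactly the layout of the sample output in
this document (a header line starting with CPU, IO, SWAP or RECLAIM, followed
by one line of values whose first number is the count and whose last number
is the delay total). Other versions of getdelays may print a different
layout.

```python
import re

LABELS = ("CPU", "IO", "SWAP", "RECLAIM")

# Sample text copied from the tgid 5 example in this document.
SAMPLE = """
CPU     count   real total      virtual total   delay total
        7876    92005750        100000000       24001500
IO      count   delay total
        0       0
SWAP    count   delay total
        0       0
RECLAIM count   delay total
        0       0
"""


def parse_getdelays(text):
    """Map each label to (count, delay_total) from getdelays-style output."""
    stats = {}
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines[:-1]):
        fields = line.split()
        if fields and fields[0] in LABELS:
            # The values are on the line right after the header line.
            numbers = [int(n) for n in re.findall(r"\d+", lines[index + 1])]
            if len(numbers) >= 2:
                stats[fields[0]] = (numbers[0], numbers[-1])
    return stats


def average_delay(stats, label):
    """Average delay per recorded event, or 0.0 when nothing was recorded."""
    count, total = stats[label]
    return total / count if count else 0.0


if __name__ == "__main__":
    parsed = parse_getdelays(SAMPLE)
    for label in LABELS:
        print(label, parsed[label], average_delay(parsed, label))
```

Lines that do not start with one of the four labels, such as the output of
the command run with ``-c``, are ignored by the parser.
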