Overhead calculation
--------------------
The overhead can be shown in two columns as 'Children' and 'Self' when
perf collects callchains.  The 'self' overhead is simply calculated by
adding all period values of the entry - usually a function (symbol).
This is the value that perf shows traditionally and the sum of all the
'self' overhead values should be 100%.

The 'children' overhead is calculated by adding all period values of
the child functions so that it can show the total overhead of the
higher level functions even if they don't directly execute much.
'Children' here means functions that are called from another (parent)
function.

It might be confusing that the sum of all the 'children' overhead
values exceeds 100% since each of them is already an accumulation of
'self' overhead of its child functions.  But with this enabled, users
can find which function has the most overhead even if samples are
spread over the children.

Consider the following example; there are three functions like below.

-----------------------
void foo(void) {
    /* do something */
}

void bar(void) {
    /* do something */
    foo();
}

int main(void) {
    bar();
    return 0;
}
-----------------------

In this case 'foo' is a child of 'bar', and 'bar' is an immediate
child of 'main' so 'foo' also is a child of 'main'.  In other words,
'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.

Suppose all samples are recorded in 'foo' and 'bar' only.  When it's
recorded with callchains the output will show something like below
in the usual (self-overhead-only) output of perf report:

----------------------------------
Overhead  Symbol
........  .....................
  60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main

  40.00%  bar
          |
          --- bar
              main
              __libc_start_main
----------------------------------

When the --children option is enabled, the 'self' overhead values of
child functions (i.e. 'foo' and 'bar') are added to the parents to
calculate the 'children' overhead.
In this case the report could be displayed as:

-------------------------------------------
Children      Self  Symbol
........  ........  ....................
 100.00%     0.00%  __libc_start_main
          |
          --- __libc_start_main

 100.00%     0.00%  main
          |
          --- main
              __libc_start_main

 100.00%    40.00%  bar
          |
          --- bar
              main
              __libc_start_main

  60.00%    60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main
-------------------------------------------

In the above output, the 'self' overhead of 'foo' (60%) was added to the
'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
Likewise, the 'self' overhead of 'bar' (40%) was added to the
'children' overhead of 'main' and '\_\_libc_start_main'.

So '\_\_libc_start_main' and 'main' are shown first since they have
the same (100%) 'children' overhead (even though they have zero 'self'
overhead) and they are the parents of 'foo' and 'bar'.

Since v3.16 the 'children' overhead is shown by default and the output
is sorted by its values.
The 'children' overhead is disabled by
specifying the --no-children option on the command line or by adding
'report.children = false' or 'top.children = false' to the perf config
file.
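
The accumulation described in this section can be sketched in C.  This
is only an illustration of the arithmetic, not perf's actual
implementation: the names 'struct sample', 'struct entry',
'struct report', 'find_or_add', 'add_sample' and 'percent', the fixed
array sizes, and the rule of counting a symbol only once per callchain
(so that a recursive call does not add the same period twice) are all
assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS  16   /* hypothetical limit on distinct symbols */
#define MAX_DEPTH 8    /* hypothetical limit on callchain depth  */

/* One sample: chain[0] is the function the sample hit, followed by
 * its callers up to the outermost one.  'period' is its weight. */
struct sample {
    const char *chain[MAX_DEPTH];
    int depth;
    unsigned long period;
};

/* Accumulated periods for one symbol. */
struct entry {
    const char *sym;
    unsigned long self;      /* periods sampled in the symbol itself   */
    unsigned long children;  /* self plus periods sampled in callees   */
};

struct report {
    struct entry entries[MAX_SYMS];
    int nr;
    unsigned long total;     /* sum of all sample periods */
};

/* Return the entry for 'sym', creating a zeroed one if needed. */
struct entry *find_or_add(struct report *r, const char *sym)
{
    for (int i = 0; i < r->nr; i++)
        if (strcmp(r->entries[i].sym, sym) == 0)
            return &r->entries[i];

    assert(r->nr < MAX_SYMS);
    struct entry *e = &r->entries[r->nr++];
    e->sym = sym;
    e->self = 0;
    e->children = 0;
    return e;
}

/* Account one sample: 'self' goes to the sampled function only,
 * 'children' goes to every distinct function in the callchain. */
void add_sample(struct report *r, const struct sample *s)
{
    assert(s->depth >= 1 && s->depth <= MAX_DEPTH);

    r->total += s->period;
    find_or_add(r, s->chain[0])->self += s->period;

    for (int i = 0; i < s->depth; i++) {
        int seen = 0;

        /* Assumption of this sketch: skip symbols already counted
         * earlier in the same chain (recursion). */
        for (int j = 0; j < i; j++)
            if (strcmp(s->chain[j], s->chain[i]) == 0)
                seen = 1;
        if (!seen)
            find_or_add(r, s->chain[i])->children += s->period;
    }
}

/* Convert an accumulated period into a percentage of the total. */
double percent(unsigned long part, unsigned long total)
{
    return total ? 100.0 * (double)part / (double)total : 0.0;
}
```

Feeding it the two callchains of the example ('foo' called from 'bar',
'main' and '\_\_libc_start_main' with period 60, and 'bar' called from
'main' and '\_\_libc_start_main' with period 40) gives a 'children'
value of 100 for 'bar', 'main' and '\_\_libc_start_main' and 60 for
'foo', while the 'self' values stay at 40 for 'bar' and 60 for 'foo'.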