This is cpufreq-bench, a microbenchmark for the cpufreq framework.

Purpose
=======

What is this benchmark for:
  - Identify worst-case performance loss when doing dynamic frequency
    scaling using Linux kernel governors
  - Identify the average reaction time of a governor to CPU load changes
  - (Stress) test whether a cpufreq low-level driver or governor works
    as expected
  - Identify cpufreq-related performance regressions between kernels
  - Possibly real-time priority testing -> what happens if there are
    processes with a higher priority than the governor's kernel thread?
  - ...

What this benchmark does *not* cover:
  - Power saving related regressions (in fact, the better the performance
    throughput is, the worse the power savings will be, but the former
    should mostly count more here...)
  - Real world workloads


Description
===========

cpufreq-bench helps to test the behavior of a given cpufreq governor.
For that purpose, it compares the performance governor to the configured
powersave module (the governor under test).


How it works
============
You can specify load (100% CPU load) and sleep (0% CPU load) times in us, which
will be run X times in a row (cycles):

         sleep=25000
         load=25000
         cycles=20

This part of the configuration file will create 25ms load/sleep turns,
repeated 20 times.

Adding this:
         sleep_step=25000
         load_step=25000
         rounds=5
will run 5 rounds, increasing the load and sleep time by 25ms in each round
after the first one.
Together you get the following test:
25ms  load/sleep time repeated 20 times (cycles).
50ms  load/sleep time repeated 20 times (cycles).
..
125ms load/sleep time repeated 20 times (cycles).
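
Putting both parts together, the test described above corresponds to a
config file (see the -f/--file option at the end of this document)
containing:

         sleep=25000
         load=25000
         cycles=20
         sleep_step=25000
         load_step=25000
         rounds=5

Further settings such as the CPU, the governor to test or the output
directory are assumed to be settable in the config file as well, mirroring
the command line options listed below.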

First, the benchmark calibrates how long a specific CPU-intensive
calculation takes on this machine when run in a loop using the
performance governor.
Then the above test runs are processed using the performance governor
and the governor to test. The time the calculation really needed
with the dynamic frequency scaling governor is compared with the time
needed at full performance; the difference is the overall performance loss.
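
As a rough illustration of that comparison, here is a minimal stand-alone
sketch (not the benchmark's own code; the variable names and example
values are made up) of how a performance loss figure can be derived from
the two measured times:

  #include <stdio.h>

  int main(void)
  {
          /* time one round's calculation took, in us (example values) */
          long performance_time = 25000; /* with the performance governor */
          long test_time        = 31250; /* with the governor under test  */

          /* relative performance of the tested governor, in percent */
          double relative = 100.0 * performance_time / test_time;

          printf("relative performance: %.1f%%\n", relative);
          printf("performance loss:     %.1f%%\n", 100.0 - relative);
          return 0;
  }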


Example of expected results with the ondemand governor:

This shows the expected results of the first two test rounds from the
above config; there you have:

100% CPU load (load) | 0 % CPU load (sleep)  | round
   25 ms             |    25 ms              |   1
   50 ms             |    50 ms              |   2

For example, if the ondemand governor is configured with a 50ms
sampling rate you get:

In round 1, ondemand should see a rather static 50% load and probably
won't ever switch up (as long as up_threshold is above 50).

In round 2, if the ondemand sampling times exactly match the load/sleep
transitions of cpufreq-bench, you will see no performance loss (compare
with the possible ondemand sample kick-ins (1) below).

But if ondemand always kicks in in the middle of the load/sleep cycles, it
will always see 50% load and you get the worst performance impact, never
switching up (compare with the possible ondemand sample kick-ins (2) below):

      50     50   50   50ms ->time
load -----|     |-----|     |-----|     |-----|
          |     |     |     |     |     |     |
sleep     |-----|     |-----|     |-----|     |----
    |-----|-----|-----|-----|-----|-----|-----|----  ondemand sampling (1)
         100    0    100    0    100    0    100     load seen by ondemand(%)
       |-----|-----|-----|-----|-----|-----|-----|--   ondemand sampling (2)
      50     50    50    50    50    50    50        load seen by ondemand(%)
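
To see the same effect numerically, the following stand-alone sketch (not
part of the benchmark; the 1ms resolution and all names are illustrative
assumptions) computes the load a governor sampling every 50ms would observe
in round 2, once aligned with the load/sleep transitions and once shifted
by 25ms:

  #include <stdio.h>

  /* round 2 pattern: 50ms load followed by 50ms sleep, repeating */
  static int is_load_us(long t)
  {
          return (t % 100000) < 50000;
  }

  /* percentage of load seen in one sampling window starting at 'start' */
  static double sample_load(long start, long period)
  {
          long t, busy = 0;

          for (t = start; t < start + period; t += 1000) /* 1ms steps */
                  if (is_load_us(t))
                          busy += 1000;
          return 100.0 * busy / period;
  }

  int main(void)
  {
          long offsets[] = { 0, 25000 }; /* aligned vs. mid-cycle sampling */
          long period = 50000;
          int i, s;

          for (i = 0; i < 2; i++) {
                  printf("offset %5ld us:", offsets[i]);
                  for (s = 0; s < 4; s++)
                          printf(" %3.0f%%",
                                 sample_load(offsets[i] + s * period, period));
                  printf("\n");
          }
          return 0;
  }

With offset 0 the samples alternate between 100% and 0% (case (1) above);
with a 25ms offset every sample reads 50% (case (2)), which stays below a
typical up_threshold and therefore never triggers a frequency increase.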

You can easily test all kinds of load/sleep times and check whether your
governor, on average, behaves as expected.


ToDo
====

Provide a gnuplot utility script for easy generation of plots to present
the outcome nicely.


cpufreq-bench Command Usage
===========================
-l, --load=<long int>           initial load time in us
-s, --sleep=<long int>          initial sleep time in us
-x, --load-step=<long int>      time to be added to load time, in us
-y, --sleep-step=<long int>     time to be added to sleep time, in us
-c, --cpu=<unsigned int>        CPU number to use, starting at 0
-p, --prio=<priority>           scheduler priority, HIGH, LOW or DEFAULT
-g, --governor=<governor>       cpufreq governor to test
-n, --cycles=<int>              load/sleep cycles to get an average value to compare
-r, --rounds=<int>              load/sleep rounds
-f, --file=<configfile>         config file to use
-o, --output=<dir>              output dir, must exist
-v, --verbose                   verbose output on/off
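
For example, the test from the "How it works" section could be started
without a config file like this (the binary name and output directory are
placeholders; the output directory must already exist):

  cpufreq-bench -l 25000 -s 25000 -x 25000 -y 25000 -n 20 -r 5 \
                -c 0 -g ondemand -p HIGH -o /tmp/bench -v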

Due to the high priority, the application may not be responsive for some time.
After the benchmark, the logfile is saved in OUTPUTDIR/benchmark_TIMESTAMP.log