This is cpufreq-bench, a microbenchmark for the cpufreq framework.

Purpose
=======

What is this benchmark for:
  - Identify worst-case performance loss when doing dynamic frequency
    scaling using Linux kernel governors
  - Identify the average reaction time of a governor to CPU load changes
  - (Stress) test whether a cpufreq low-level driver or governor works
    as expected
  - Identify cpufreq-related performance regressions between kernels
  - Possibly real-time priority testing -> what happens if there are
    processes with a higher priority than the governor's kernel thread?
  - ...

What this benchmark does *not* cover:
  - Power saving related regressions (in fact, the better the performance
    throughput is, the worse the power savings will be, but the former
    should mostly count more here...)
  - Real world workloads


Description
===========

cpufreq-bench helps to test the behavior of a given cpufreq governor.
For that purpose, it compares the performance governor to the configured
powersave module (the governor under test).


How it works
============
You can specify load (100% CPU load) and sleep (0% CPU load) times in us, which
will be run X times in a row (cycles):

         sleep=25000
         load=25000
         cycles=20

This part of the configuration file will create 25ms load/sleep turns,
repeated 20 times.

Adding this:
         sleep_step=25000
         load_step=25000
         rounds=5
will run 5 rounds, increasing the load and sleep time by 25ms in each round
after the first one.
Together you get the following test:
25ms  load/sleep time repeated 20 times (cycles).
50ms  load/sleep time repeated 20 times (cycles).
..
125ms load/sleep time repeated 20 times (cycles).
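
Putting both parts together, the test described above corresponds to a
config file (see the -f/--file option at the end of this document)
containing:

         sleep=25000
         load=25000
         cycles=20
         sleep_step=25000
         load_step=25000
         rounds=5

Further settings such as the CPU, the governor to test or the output
directory are assumed to be settable in the config file as well, mirroring
the command line options listed below.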

First, the benchmark calibrates how long a specific CPU-intensive
calculation takes on this machine when run in a loop using the
performance governor.
Then the above test runs are processed using the performance governor
and the governor to test. The time the calculation really needed
with the dynamic frequency scaling governor is compared with the time
needed at full performance; the difference is the overall performance loss.
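
As a rough illustration of that comparison, here is a minimal stand-alone
sketch (not the benchmark's own code; the variable names and example
values are made up) of how a performance loss figure can be derived from
the two measured times:

  #include <stdio.h>

  int main(void)
  {
          /* time one round's calculation took, in us (example values) */
          long performance_time = 25000; /* with the performance governor */
          long test_time        = 31250; /* with the governor under test  */

          /* relative performance of the tested governor, in percent */
          double relative = 100.0 * performance_time / test_time;

          printf("relative performance: %.1f%%\n", relative);
          printf("performance loss:     %.1f%%\n", 100.0 - relative);
          return 0;
  }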


Example of expected results with the ondemand governor:

This shows the expected results of the first two test rounds from the
above config; there you have:

100% CPU load (load) | 0 % CPU load (sleep)  | round
   25 ms             |    25 ms              |   1
   50 ms             |    50 ms              |   2

For example, if the ondemand governor is configured with a 50ms
sampling rate you get:

In round 1, ondemand should see a rather static 50% load and probably
won't ever switch up (as long as up_threshold is above 50).

In round 2, if the ondemand sampling times exactly match the load/sleep
transitions of cpufreq-bench, you will see no performance loss (compare
with the possible ondemand sample kick-ins (1) below).

But if ondemand always kicks in in the middle of the load/sleep cycles, it
will always see 50% load and you get the worst performance impact, never
switching up (compare with the possible ondemand sample kick-ins (2) below):

      50     50   50   50ms ->time
load -----|     |-----|     |-----|     |-----|
          |     |     |     |     |     |     |
sleep     |-----|     |-----|     |-----|     |----
    |-----|-----|-----|-----|-----|-----|-----|----  ondemand sampling (1)
         100    0    100    0    100    0    100     load seen by ondemand(%)
       |-----|-----|-----|-----|-----|-----|-----|--   ondemand sampling (2)
      50     50    50    50    50    50    50        load seen by ondemand(%)
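
To see the same effect numerically, the following stand-alone sketch (not
part of the benchmark; the 1ms resolution and all names are illustrative
assumptions) computes the load a governor sampling every 50ms would observe
in round 2, once aligned with the load/sleep transitions and once shifted
by 25ms:

  #include <stdio.h>

  /* round 2 pattern: 50ms load followed by 50ms sleep, repeating */
  static int is_load_us(long t)
  {
          return (t % 100000) < 50000;
  }

  /* percentage of load seen in one sampling window starting at 'start' */
  static double sample_load(long start, long period)
  {
          long t, busy = 0;

          for (t = start; t < start + period; t += 1000) /* 1ms steps */
                  if (is_load_us(t))
                          busy += 1000;
          return 100.0 * busy / period;
  }

  int main(void)
  {
          long offsets[] = { 0, 25000 }; /* aligned vs. mid-cycle sampling */
          long period = 50000;
          int i, s;

          for (i = 0; i < 2; i++) {
                  printf("offset %5ld us:", offsets[i]);
                  for (s = 0; s < 4; s++)
                          printf(" %3.0f%%",
                                 sample_load(offsets[i] + s * period, period));
                  printf("\n");
          }
          return 0;
  }

With offset 0 the samples alternate between 100% and 0% (case (1) above);
with a 25ms offset every sample reads 50% (case (2)), which stays below a
typical up_threshold and therefore never triggers a frequency increase.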

You can easily test all kinds of load/sleep times and check whether your
governor, on average, behaves as expected.


ToDo
====

Provide a gnuplot utility script for easy generation of plots to present
the outcome nicely.


cpufreq-bench Command Usage
===========================
-l, --load=<long int>           initial load time in us
-s, --sleep=<long int>          initial sleep time in us
-x, --load-step=<long int>      time to be added to load time, in us
-y, --sleep-step=<long int>     time to be added to sleep time, in us
-c, --cpu=<unsigned int>        CPU number to use, starting at 0
-p, --prio=<priority>           scheduler priority, HIGH, LOW or DEFAULT
-g, --governor=<governor>       cpufreq governor to test
-n, --cycles=<int>              load/sleep cycles to get an average value to compare
-r, --rounds=<int>              load/sleep rounds
-f, --file=<configfile>         config file to use
-o, --output=<dir>              output dir, must exist
-v, --verbose                   verbose output on/off
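
For example, the test from the "How it works" section could be started
without a config file like this (the binary name and output directory are
placeholders; the output directory must already exist):

  cpufreq-bench -l 25000 -s 25000 -x 25000 -y 25000 -n 20 -r 5 \
                -c 0 -g ondemand -p HIGH -o /tmp/bench -v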

Due to the high priority, the application may not be responsive for some time.
After the benchmark, the logfile is saved in OUTPUTDIR/benchmark_TIMESTAMP.log