This is cpufreq-bench, a microbenchmark for the cpufreq framework.

Purpose
=======

What this benchmark is for:
  - Identify worst case performance loss when doing dynamic frequency
    scaling using Linux kernel governors
  - Identify average reaction time of a governor to CPU load changes
  - (Stress) Testing whether a cpufreq low level driver or governor works
    as expected
  - Identify cpufreq related performance regressions between kernels
  - Possibly real time priority testing? -> what happens if there are
    processes with a higher prio than the governor's kernel thread
  - ...

What this benchmark does *not* cover:
  - Power saving related regressions (in fact, the better the performance
    throughput is, the worse the power savings will be, but the former
    should usually count more...)
  - Real world (workloads)


Description
===========

cpufreq-bench helps to test the condition of a given cpufreq governor.
For that purpose, it compares the performance governor to a configured
powersave module.


How it works
============
You can specify load (100% CPU load) and sleep (0% CPU load) times in us,
which will be run X times in a row (cycles):

         sleep=25000
         load=25000
         cycles=20

This part of the configuration file will create 25ms load/sleep turns,
repeated 20 times.

Adding this:

         sleep_step=25000
         load_step=25000
         rounds=5

will increase load and sleep time by 25ms 5 times.
Together you get the following test:

25ms  load/sleep time repeated 20 times (cycles).
50ms  load/sleep time repeated 20 times (cycles).
..
100ms load/sleep time repeated 20 times (cycles).

First, a calibration run under the performance governor measures how long
a specific CPU intensive calculation, run in a loop, takes on this machine.
Then the above test runs are processed using the performance governor
and the governor to test. The time the calculation really needed
with the dynamic freq scaling governor is compared with the time needed
on full performance and you get the overall performance loss.
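
The round schedule and the loss comparison described above can be sketched
in a few lines of Python. This is only an illustration, not the tool's actual
implementation: the names BenchConfig, round_schedule and
performance_loss_percent are hypothetical, and the sketch assumes that round 1
uses the initial times, that each later round adds one step, and that the
loss is the extra time relative to the performance governor.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BenchConfig:
    """Config keys from the example above; all times are in microseconds."""
    sleep: int = 25000
    load: int = 25000
    cycles: int = 20
    sleep_step: int = 25000
    load_step: int = 25000
    rounds: int = 5


def round_schedule(cfg: BenchConfig) -> List[Tuple[int, int, int]]:
    """Return one (round, load_us, sleep_us) tuple per round.

    Assumption: round 1 runs with the initial load/sleep times and every
    following round adds load_step/sleep_step once.
    """
    return [
        (r,
         cfg.load + (r - 1) * cfg.load_step,
         cfg.sleep + (r - 1) * cfg.sleep_step)
        for r in range(1, cfg.rounds + 1)
    ]


def performance_loss_percent(perf_time_us: float, tested_time_us: float) -> float:
    """Extra time needed under the tested governor, in percent.

    Assumption: loss = (t_tested - t_performance) / t_performance.
    """
    if perf_time_us <= 0:
        raise ValueError("performance governor time must be positive")
    return (tested_time_us - perf_time_us) / perf_time_us * 100.0


if __name__ == "__main__":
    cfg = BenchConfig()
    for rnd, load_us, sleep_us in round_schedule(cfg):
        print(f"round {rnd}: {load_us // 1000} ms load / "
              f"{sleep_us // 1000} ms sleep, {cfg.cycles} cycles")
```

Under these assumptions, rounds=4 reproduces the 25ms to 100ms sequence
listed above, while rounds=5 would add a fifth round with 125ms load/sleep
times. Whether the tool counts the initial round this way is not stated here.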


Example of expected results with ondemand governor:

This shows the expected results of the first two test rounds from the
above config, where you have:

100% CPU load (load) | 0% CPU load (sleep) | round
        25 ms        |        25 ms        |   1
        50 ms        |        50 ms        |   2

For example, if the ondemand governor is configured to have a 50ms
sampling rate you get:

In round 1, ondemand should see a rather static 50% load and probably
won't ever switch up (as long as up_threshold is above 50%).

In round 2, if the ondemand sampling times exactly match the load/sleep
trigger of cpufreq-bench, you will see no performance loss (compare with
the possible ondemand sample kick-ins (1) below).

But if ondemand always kicks in in the middle of the load/sleep cycles, it
will always see 50% load and you get the worst performance impact, never
switching up (compare with the possible ondemand sample kick-ins (2) below):

      50    50    50    50ms ->time
load -----|     |-----|     |-----|     |-----|
          |     |     |     |     |     |     |
sleep     |-----|     |-----|     |-----|     |----
    |-----|-----|-----|-----|-----|-----|-----|----   ondemand sampling (1)
      100    0    100    0    100    0    100         load seen by ondemand(%)
       |-----|-----|-----|-----|-----|-----|-----|--  ondemand sampling (2)
         50    50    50    50    50    50    50       load seen by ondemand(%)

You can easily test all kinds of load/sleep times and check whether your
governor on average behaves as expected.


ToDo
====

Provide a gnuplot utility script for easy generation of plots to present
the outcome nicely.


cpufreq-bench Command Usage
===========================
-l, --load=<long int>           initial load time in us
-s, --sleep=<long int>          initial sleep time in us
-x, --load-step=<long int>      time to be added to load time, in us
-y, --sleep-step=<long int>     time to be added to sleep time, in us
-c, --cpu=<unsigned int>        CPU number to use, starting at 0
-p, --prio=<priority>           scheduler priority, HIGH, LOW or DEFAULT
-g, --governor=<governor>       cpufreq governor to test
-n, --cycles=<int>              load/sleep cycles to get an average value to compare
-r, --rounds=<int>              load/sleep rounds
-f, --file=<configfile>         config file to use
-o, --output=<dir>              output dir, must exist
-v, --verbose                   verbose output on/off

Due to the high priority, the application may not be responsive for some time.
After the benchmark, the logfile is saved in OUTPUTDIR/benchmark_TIMESTAMP.log