Lines Matching +full:short +full:- +full:press +full:- +full:ms

1 .. SPDX-License-Identifier: CC-BY-SA-2.0-UK
20 that it's only for tracing and profiling the kernel - you can indeed use
24 all applications on the system simultaneously to gain a system-wide view
45 ----------
48 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
61 ----------------
63 The perf tool is pretty much self-documenting. To remind yourself of the
69 usage: perf [--version] [--help] COMMAND [ARGS]
73 archive Create archive with object files with build-ids found in perf.data file
75 buildid-cache Manage build-id cache.
76 buildid-list List the buildids in a perf.data file
107 root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \
108 wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
115 root@crownbay:~# perf stat wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
117 …linux-2.6.19.2.tar.b 100% |***************************************************| 41727k 0:00:00 ETA
119 Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
121 4597.223902 task-clock # 0.077 CPUs utilized
122 23568 context-switches # 0.005 M/sec
123 68 CPU-migrations # 0.015 K/sec
124 241 page-faults # 0.052 K/sec
126 <not supported> stalled-cycles-frontend
127 <not supported> stalled-cycles-backend
130 19550329 branch-misses # 11.82% of all branches
134 Many times such a simple-minded test doesn't yield much of
135 interest, but sometimes it does (see Real-world Yocto bug (slow
136 loop-mounted write speed)).
139 - basically any event listed in the output of 'perf list' can be tallied
144 …root@crownbay:~# perf stat -e kmem:* -e cache-references -e cache-misses wget &YOCTO_DL_URL;/mirro…
146 …linux-2.6.19.2.tar.b 100% |***************************************************| 41727k 0:00:00 ETA
148 Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
162 66848602 cache-references
163 2917740 cache-misses # 4.365 % of all cache refs
174 interactive text-based UI (or simply as text if we specify ``--stdio`` to
179 'perf record' and any perf options we hand it - here none - will be
184 root@crownbay:~# perf record wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
187 linux-2.6.19.2.tar.b 100% |************************************************| 41727k 0:00:00 ETA
192 'text-based UI' (tui), simply run 'perf report', which will read the
198 .. image:: figures/perf-wget-flat-stripped.png
205 only above a certain threshold and so on - see the perf documentation
208 containing a [k]). (perf has command-line modifiers that can be used to
213 useful function name in that entry, it displays a not-so-friendly hex
218 between the new profile and the previous one is that we'll add the -g
222 root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
224 linux-2.6.19.2.tar.b 100% |************************************************| 41727k 0:00:00 ETA
231 .. image:: figures/perf-wget-g-copy-to-user-expanded-stripped.png
240 left-hand side. This means that we can expand the entry and drill down
242 one of them will expand the callchain (you can also press 'E' to expand
260 down to wget i.e. a copy-to-user.
267 .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png
270 The above screenshot shows the other half of the journey for the data -
272 disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to
284 .. image:: figures/perf-wget-busybox-expanded-stripped.png
305 ``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example::
307 EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs"
314 PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory'
318 ``build/tmp/deploy/rpm/*`` on the host system. Find the busybox-dbg-...rpm
321 …empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-db…
322 busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01
326 root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm
331 .. image:: figures/perf-wget-busybox-debuginfo.png
334 If we expand one of the entries and press 'enter' on a leaf node, we're
338 .. image:: figures/perf-wget-busybox-dso-zoom-menu.png
342 busybox-centric view of the profiled functions (in this case we've also
345 .. image:: figures/perf-wget-busybox-dso-zoom.png
352 copy-to-user overhead was initiated by the ``handle_input()`` BusyBox
355 .. image:: figures/perf-wget-g-copy-to-user-expanded-debuginfo.png
363 .. image:: figures/perf-wget-busybox-annotate-menu.png
371 .. image:: figures/perf-wget-busybox-annotate-udhcpc.png
389 List of pre-defined events (to be used in -e):
390 cpu-cycles OR cycles [Hardware event]
391 stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
392 stalled-cycles-backend OR idle-cycles-backend [Hardware event]
394 cache-references [Hardware event]
395 cache-misses [Hardware event]
396 branch-instructions OR branches [Hardware event]
397 branch-misses [Hardware event]
398 bus-cycles [Hardware event]
399 ref-cycles [Hardware event]
401 cpu-clock [Software event]
402 task-clock [Software event]
403 page-faults OR faults [Software event]
404 minor-faults [Software event]
405 major-faults [Software event]
406 context-switches OR cs [Software event]
407 cpu-migrations OR migrations [Software event]
408 alignment-faults [Software event]
409 emulation-faults [Software event]
411 L1-dcache-loads [Hardware cache event]
412 L1-dcache-load-misses [Hardware cache event]
413 L1-dcache-prefetch-misses [Hardware cache event]
414 L1-icache-loads [Hardware cache event]
415 L1-icache-load-misses [Hardware cache event]
421 (see 'perf list --help' on how to encode it)
530 …bay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget…
531 Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
592 …root@crownbay:~# perf record -g -e sched:sched_wakeup wget &YOCTO_DL_URL;/mirror/sources/linux-2.6…
594 .. image:: figures/sched-wakeup-profile.png
600 tracepoint - it's actually defined just inside ttwu_do_wakeup(), which
603 .. code-block:: c
606 * Mark the task runnable and perform wakeup-preemption.
624 sample on every event occurrence (this can be changed using the -c
635 high-level view of what's going on with a workload or across the system.
643 individual steps that go into the higher-level behavior exposed by the
644 coarse-grained profiling data.
649 …root@crownbay:~# perf record -g -e skb:* -e net:* -e napi:* -e sched:sched_switch -e sched:sched_w…
650-e syscalls:sys_enter_read -e syscalls:sys_exit_read -e syscalls:sys_enter_write -e syscalls:sys_e…
651 wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
684 In many ways, profiling can be viewed as a subset of tracing -
693 in the end it can't be used in an open-ended way to extract data that
697 Full-blown detailed tracing data does however offer the opportunity to
715 something other than pre-canned profiling tools to do that.
717 Luckily, there is a general-purpose way to handle such needs, called
727 DProbes dpcc compiler, an ANSI C compiler which targeted a low-level
728 assembly language running on an in-kernel interpreter on the target
731 heavily inspired by DTrace, also created its own one-off language,
732 but rather than running the product on an in-kernel interpreter,
733 created an elaborate compiler-based machinery to translate its
737 -g' to generate a skeleton script with handlers for the read/write
740 root@crownbay:~# perf script -g python
741 generated Python script: perf-script.py
747 .. code-block:: python
760 root@crownbay:~# perf script -s perf-script.py
796 per-event count. When the program is run against a perf.data file, each
800 .. code-block:: python
811 hash function that does Perl-like autovivification, a capability that's
812 extremely useful for kinds of multi-level aggregation commonly used in
816 .. code-block:: python
827 result of all the per-event tallies. For that, we use the special
830 .. code-block:: python
834 print "%-40s %10s\n" % (event_name, count)
859 the right kind of trace data, higher-level profiling-type summaries can
863 binding <https://linux.die.net/man/1/perf-script-python>`__.
865 System-Wide Tracing and Profiling
869 workload - in other words, every profiling run has specified the program
870 to profile in the command-line e.g. 'perf record wget ...'.
873 system-wide profile or trace while running the workload in a separate
876 To do system-wide profiling or tracing, you typically use the -a flag to
880 -a flag (press Ctrl-C to stop tracing)::
882 root@crownbay:~# perf record -g -a
888 root@crownbay:~# wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
890 linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
895 .. image:: figures/perf-systemwide.png
908 root@crownbay:~# perf record -g -a -e cycles:u
912 .. image:: figures/perf-report-cycles-u.png
917 Finally, we can press 'enter' on a leaf node and select the 'Zoom into
920 the entries associated with the libc-xxx.so DSO.
922 .. image:: figures/perf-systemwide-libc.png
925 We can also use the system-wide -a switch to do system-wide tracing.
928 root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup
953 the '--filter' option - for each event we specify using -e, we can add a
954 --filter after that to filter out trace events that contain fields with
957 …ot@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf && prev_comm != pe…
994 These event filters are implemented by a special-purpose
995 pseudo-interpreter in the kernel and are an integral and
997 kernel-based event filters provide a mechanism to precisely throttle
1004 project-defined language in order to accommodate that design, or
1006 super-efficient kernel-to-userspace transport mechanism in order to
1028 perf record -e probe:do_fork -aR sleep 1
1036 root@crownbay:/sys/kernel/debug/tracing/events/probe/do_fork# ls -al
1037 drwxr-xr-x 2 root root 0 Oct 28 11:42 .
1038 drwxr-xr-x 3 root root 0 Oct 28 11:42 ..
1039 -rw-r--r-- 1 root root 0 Oct 28 11:42 enable
1040 -rw-r--r-- 1 root root 0 Oct 28 11:42 filter
1041 -r--r--r-- 1 root root 0 Oct 28 11:42 format
1042 -r--r--r-- 1 root root 0 Oct 28 11:42 id
1048 field:unsigned short common_type; offset:0; size:2; signed:0;
1056 print fmt: "(%lx)", REC->__probe_ip
1061 root@crownbay:~# perf probe -l
1065 Let's record system-wide ('sleep 30' is a
1066 trick for recording system-wide but basically do nothing and then wake
1069 root@crownbay:~# perf record -g -a -e probe:do_fork sleep 30
1080 # os release : 3.4.11-yocto-standard
1088 # cmdline : /usr/bin/perf record -g -a -e probe:do_fork sleep 30
1091 # HEADER_CPU_TOPOLOGY info available, use -I to display
1094 matchbox-deskto 1197 [001] 34211.378318: do_fork: (c1028460)
1095 matchbox-deskto 1295 [001] 34211.380388: do_fork: (c1028460)
1098 matchbox-deskto 1197 [001] 34217.541603: do_fork: (c1028460)
1099 matchbox-deskto 1299 [001] 34217.543584: do_fork: (c1028460)
1105 matchbox-deskto 1197 [001] 34224.955965: do_fork: (c1028460)
1106 matchbox-deskto 1306 [001] 34224.957972: do_fork: (c1028460)
1107 matchbox-termin 1307 [000] 34225.038214: do_fork: (c1028460)
1108 matchbox-termin 1307 [001] 34225.044218: do_fork: (c1028460)
1109 matchbox-termin 1307 [000] 34225.046442: do_fork: (c1028460)
1110 matchbox-deskto 1197 [001] 34237.112138: do_fork: (c1028460)
1111 matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460)
1117 .. image:: figures/perf-probe-do_fork-profile.png
1123 in exactly the same way - there's no difference as far as the
1134 ------------------
1139 - The `'perf stat' manpage <https://linux.die.net/man/1/perf-stat>`__.
1141 - The `'perf record'
1142 manpage <https://linux.die.net/man/1/perf-record>`__.
1144 - The `'perf report'
1145 manpage <https://linux.die.net/man/1/perf-report>`__.
1147 - The `'perf probe' manpage <https://linux.die.net/man/1/perf-probe>`__.
1149 - The `'perf script'
1150 manpage <https://linux.die.net/man/1/perf-script>`__.
1152 - Documentation on using the `'perf script' python
1153 binding <https://linux.die.net/man/1/perf-script-python>`__.
1155 - The top-level `perf(1) manpage <https://linux.die.net/man/1/perf>`__.
1163 IMAGE_INSTALL:append = " perf perf-doc"
1164 DISTRO_FEATURES:append = " api-documentation"
1183 ------------
1186 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
1188 ftrace, trace-cmd, and kernelshark run on the target system, and are
1189 ready to go out-of-the-box - no additional setup is necessary. For the
1192 you use the '-X' option to ssh you can have the kernelshark GUI run on
1196 ------------------
1217 - some relate directly to the tracers themselves, others are used to set
1224 We'll start by looking at some of the available built-in tracers.
1257 # entries-in-buffer/entries-written: 310629/766471 #P:8
1259 # _-----=> irqs-off
1260 # / _----=> need-resched
1261 # | / _---=> hardirq/softirq
1262 # || / _--=> preempt-depth
1264 # TASK-PID CPU# |||| TIMESTAMP FUNCTION
1266 <idle>-0 [004] d..1 470.867169: ktime_get_real <-intel_idle
1267 <idle>-0 [004] d..1 470.867170: getnstimeofday <-ktime_get_real
1268 <idle>-0 [004] d..1 470.867171: ns_to_timeval <-intel_idle
1269 <idle>-0 [004] d..1 470.867171: ns_to_timespec <-ns_to_timeval
1270 <idle>-0 [004] d..1 470.867172: smp_apic_timer_interrupt <-apic_timer_interrupt
1271 <idle>-0 [004] d..1 470.867172: native_apic_mem_write <-smp_apic_timer_interrupt
1272 <idle>-0 [004] d..1 470.867172: irq_enter <-smp_apic_timer_interrupt
1273 <idle>-0 [004] d..1 470.867172: rcu_irq_enter <-irq_enter
1274 <idle>-0 [004] d..1 470.867173: rcu_idle_exit_common.isra.33 <-rcu_irq_enter
1275 <idle>-0 [004] d..1 470.867173: local_bh_disable <-irq_enter
1276 <idle>-0 [004] d..1 470.867173: add_preempt_count <-local_bh_disable
1277 <idle>-0 [004] d.s1 470.867174: tick_check_idle <-irq_enter
1278 <idle>-0 [004] d.s1 470.867174: tick_check_oneshot_broadcast <-tick_check_idle
1279 <idle>-0 [004] d.s1 470.867174: ktime_get <-tick_check_idle
1280 <idle>-0 [004] d.s1 470.867174: tick_nohz_stop_idle <-tick_check_idle
1281 <idle>-0 [004] d.s1 470.867175: update_ts_time_stats <-tick_nohz_stop_idle
1282 <idle>-0 [004] d.s1 470.867175: nr_iowait_cpu <-update_ts_time_stats
1283 <idle>-0 [004] d.s1 470.867175: tick_do_update_jiffies64 <-tick_check_idle
1284 <idle>-0 [004] d.s1 470.867175: _raw_spin_lock <-tick_do_update_jiffies64
1285 <idle>-0 [004] d.s1 470.867176: add_preempt_count <-_raw_spin_lock
1286 <idle>-0 [004] d.s2 470.867176: do_timer <-tick_do_update_jiffies64
1287 <idle>-0 [004] d.s2 470.867176: _raw_spin_lock <-do_timer
1288 <idle>-0 [004] d.s2 470.867176: add_preempt_count <-_raw_spin_lock
1289 <idle>-0 [004] d.s3 470.867177: ntp_tick_length <-do_timer
1290 <idle>-0 [004] d.s3 470.867177: _raw_spin_lock_irqsave <-ntp_tick_length
1309 be - luckily there's a variant of the function tracer that displays the
1372 ------------------------------------------
1373 4) kworker-74 => <idle>-0
1374 ------------------------------------------
1440 ----------------------------
1449 root@sugarbay:/sys/kernel/debug/tracing/events# ls -al
1450 drwxr-xr-x 38 root root 0 Nov 14 23:19 .
1451 drwxr-xr-x 5 root root 0 Nov 14 23:19 ..
1452 drwxr-xr-x 19 root root 0 Nov 14 23:19 block
1453 drwxr-xr-x 32 root root 0 Nov 14 23:19 btrfs
1454 drwxr-xr-x 5 root root 0 Nov 14 23:19 drm
1455 -rw-r--r-- 1 root root 0 Nov 14 23:19 enable
1456 drwxr-xr-x 40 root root 0 Nov 14 23:19 ext3
1457 drwxr-xr-x 79 root root 0 Nov 14 23:19 ext4
1458 drwxr-xr-x 14 root root 0 Nov 14 23:19 ftrace
1459 drwxr-xr-x 8 root root 0 Nov 14 23:19 hda
1460 -r--r--r-- 1 root root 0 Nov 14 23:19 header_event
1461 -r--r--r-- 1 root root 0 Nov 14 23:19 header_page
1462 drwxr-xr-x 25 root root 0 Nov 14 23:19 i915
1463 drwxr-xr-x 7 root root 0 Nov 14 23:19 irq
1464 drwxr-xr-x 12 root root 0 Nov 14 23:19 jbd
1465 drwxr-xr-x 14 root root 0 Nov 14 23:19 jbd2
1466 drwxr-xr-x 14 root root 0 Nov 14 23:19 kmem
1467 drwxr-xr-x 7 root root 0 Nov 14 23:19 module
1468 drwxr-xr-x 3 root root 0 Nov 14 23:19 napi
1469 drwxr-xr-x 6 root root 0 Nov 14 23:19 net
1470 drwxr-xr-x 3 root root 0 Nov 14 23:19 oom
1471 drwxr-xr-x 12 root root 0 Nov 14 23:19 power
1472 drwxr-xr-x 3 root root 0 Nov 14 23:19 printk
1473 drwxr-xr-x 8 root root 0 Nov 14 23:19 random
1474 drwxr-xr-x 4 root root 0 Nov 14 23:19 raw_syscalls
1475 drwxr-xr-x 3 root root 0 Nov 14 23:19 rcu
1476 drwxr-xr-x 6 root root 0 Nov 14 23:19 rpm
1477 drwxr-xr-x 20 root root 0 Nov 14 23:19 sched
1478 drwxr-xr-x 7 root root 0 Nov 14 23:19 scsi
1479 drwxr-xr-x 4 root root 0 Nov 14 23:19 signal
1480 drwxr-xr-x 5 root root 0 Nov 14 23:19 skb
1481 drwxr-xr-x 4 root root 0 Nov 14 23:19 sock
1482 drwxr-xr-x 10 root root 0 Nov 14 23:19 sunrpc
1483 drwxr-xr-x 538 root root 0 Nov 14 23:19 syscalls
1484 drwxr-xr-x 4 root root 0 Nov 14 23:19 task
1485 drwxr-xr-x 14 root root 0 Nov 14 23:19 timer
1486 drwxr-xr-x 3 root root 0 Nov 14 23:19 udp
1487 drwxr-xr-x 21 root root 0 Nov 14 23:19 vmscan
1488 drwxr-xr-x 3 root root 0 Nov 14 23:19 vsyscall
1489 drwxr-xr-x 6 root root 0 Nov 14 23:19 workqueue
1490 drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback
1498 root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al
1499 drwxr-xr-x 14 root root 0 Nov 14 23:19 .
1500 drwxr-xr-x 38 root root 0 Nov 14 23:19 ..
1501 -rw-r--r-- 1 root root 0 Nov 14 23:19 enable
1502 -rw-r--r-- 1 root root 0 Nov 14 23:19 filter
1503 drwxr-xr-x 2 root root 0 Nov 14 23:19 kfree
1504 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmalloc
1505 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmalloc_node
1506 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_alloc
1507 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_alloc_node
1508 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_free
1509 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc
1510 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc_extfrag
1511 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc_zone_locked
1512 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_free
1513 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_free_batched
1514 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain
1520 root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al
1521 drwxr-xr-x 2 root root 0 Nov 14 23:19 .
1522 drwxr-xr-x 14 root root 0 Nov 14 23:19 ..
1523 -rw-r--r-- 1 root root 0 Nov 14 23:19 enable
1524 -rw-r--r-- 1 root root 0 Nov 14 23:19 filter
1525 -r--r--r-- 1 root root 0 Nov 14 23:19 format
1526 -r--r--r-- 1 root root 0 Nov 14 23:19 id
1539 field:unsigned short common_type; offset:0; size:2; signed:0;
1551 …ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s", REC->call_site, REC->ptr, REC->bytes_req, REC-
1552 …(REC->gfp_flags) ? __print_flags(REC->gfp_flags, "|", {(unsigned long)(((( gfp_t)0x10u) | (( gfp_t…
1570 trace-cmd) to actually turn the tracepoint on and off. When enabled, the
1590 # entries-in-buffer/entries-written: 1897/1897 #P:8
1592 # _-----=> irqs-off
1593 # / _----=> need-resched
1594 # | / _---=> hardirq/softirq
1595 # || / _--=> preempt-depth
1597 # TASK-PID CPU# |||| TIMESTAMP FUNCTION
1599 …dropbear-1465 [000] ...1 18154.620753: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1600 …<idle>-0 [000] ..s3 18154.621640: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1601 …<idle>-0 [000] ..s3 18154.621656: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1602 …matchbox-termin-1361 [001] ...1 18154.755472: kmalloc: call_site=ffffffff81614050 ptr=ffff88006d5…
1603 …Xorg-1264 [002] ...1 18154.755581: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes…
1604 …Xorg-1264 [002] ...1 18154.755583: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes…
1605 …Xorg-1264 [002] ...1 18154.755589: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes…
1606 …matchbox-termin-1361 [001] ...1 18155.354594: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db…
1607 …Xorg-1264 [002] ...1 18155.354703: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes…
1608 …Xorg-1264 [002] ...1 18155.354705: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes…
1609 …Xorg-1264 [002] ...1 18155.354711: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes…
1610 …<idle>-0 [000] ..s3 18155.673319: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1611 …dropbear-1465 [000] ...1 18155.673525: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1612 …<idle>-0 [000] ..s3 18155.674821: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1613 …<idle>-0 [000] ..s3 18155.793014: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1614 …dropbear-1465 [000] ...1 18155.793219: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1615 …<idle>-0 [000] ..s3 18155.794147: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1616 …<idle>-0 [000] ..s3 18155.936705: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1617 …dropbear-1465 [000] ...1 18155.936910: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1618 …<idle>-0 [000] ..s3 18155.937869: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1619 …matchbox-termin-1361 [001] ...1 18155.953667: kmalloc: call_site=ffffffff81614050 ptr=ffff88006d5…
1620 …Xorg-1264 [002] ...1 18155.953775: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes…
1621 …Xorg-1264 [002] ...1 18155.953777: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes…
1622 …Xorg-1264 [002] ...1 18155.953783: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes…
1623 …<idle>-0 [000] ..s3 18156.176053: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1624 …dropbear-1465 [000] ...1 18156.176257: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1625 …<idle>-0 [000] ..s3 18156.177717: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1626 …<idle>-0 [000] ..s3 18156.399229: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1627 …dropbear-1465 [000] ...1 18156.399434: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1628 …<idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1629 …matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db…
1637 arbitrarily fine-grained idea of what's going on in the system by
1641 trace-cmd and kernelshark in the next section.
1653 Eventually all the special-purpose tracers currently available in
1657 trace-cmd/kernelshark
1658 ---------------------
1660 trace-cmd is essentially an extensive command-line 'wrapper' interface
1669 a per-CPU graphical display. It directly uses 'trace-cmd' as the
1671 displays the trace-cmd command it uses, as we'll see).
1685 .. image:: figures/kernelshark-choose-events.png
1689 previous trace events subsystem section, and in fact is where trace-cmd
1700 .. image:: figures/kernelshark-output-display.png
1703 Notice that the right-hand pane shows the exact trace-cmd command-line
1704 that's used to run the trace, along with the results of the trace-cmd
1708 with a colorful per-cpu display of the trace data, along with the
1711 .. image:: figures/kernelshark-i915-display.png
1717 .. image:: figures/kernelshark-all.png
1720 The tool is pretty self-explanatory, but for more detailed information
1725 --------------------
1737 There is a nice series of articles on using ftrace and trace-cmd at LWN:
1739 - `Debugging the kernel using Ftrace - part
1742 - `Debugging the kernel using Ftrace - part
1745 - `Secrets of the Ftrace function
1748 - `trace-cmd: A front-end for
1754 An amusing yet useful README (a tracing mini-HOWTO) can be found in
1760 SystemTap is a system-wide script-based tracing and profiling tool.
1762 SystemTap scripts are C-like programs that are executed in the kernel to
1773 .. code-block:: none
1780 probe timer.ms(4000) # after 4 seconds
1809 section assumes only the typical embedded use-case.
1819 ---------------
1837 (or a build that includes 'tools-profile') in order to build
1841 - If you're running a pre-built image, download the release
1843 - If you're working from git sources, just clone the metadata
1845 - Make sure you're properly set up to build a new image (see
1848 - Build an -sdk version of the image e.g.:
1849 $ bitbake core-image-sato-sdk
1851 - Build a non-sdk image but include the profiling tools:
1852 [ edit local.conf and add 'tools-profile' to the end of
1854 $ bitbake core-image-sato
1857 boot it (or the equivalent pre-built image) and use 'crosstap'
1860 $ source oe-init-build-env
1870 default in \*-minimal images.
1873 do is build an SDK image or image with 'tools-profile' as detailed in
1874 the ":ref:`profile-manual/intro:General Setup`" section of this
1886 ----------------------------
1892 $ source oe-init-build-env
1899 core-image-minimal
1900 core-image-sato
1901 meta-toolchain
1902 meta-ide-support
1904 You can also run generated QEMU images with a command like 'runqemu qemux86-64'
1926 (enter the password when prompted, or press enter if it's set up to use
1929 .. code-block:: none
1933 matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
1934 matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
1937 -----------------------
1949 Sysprof is a very easy to use system-wide profiler that consists of a
1954 -------------
1957 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
1959 Sysprof is a GUI-based application that runs on the target system. For
1961 be running Sysprof on the target (you can use the '-X' option to ssh and
1966 -------------------
1968 To start profiling the system, you simply press the 'Start' button. To
1970 press the 'Profile' button.
1975 .. image:: figures/sysprof-copy-to-user.png
1980 Note that this caller-oriented display is essentially the inverse of
1981 perf's default callee-oriented callchain display.
1989 .. image:: figures/sysprof-copy-from-user.png
1993 copy-from-user callchain.
2000 .. image:: figures/sysprof-callers.png
2003 Double-clicking on one of those functions will in turn change the focus
2008 If you like sysprof's 'caller-oriented' display, you may be able to
2010 the -g (--call-graph) option that you can experiment with; one of the
2011 options is 'caller' for an inverted caller-based callgraph display.
2014 ---------------------
2017 because it's pretty self-explanatory. The Sysprof website, however, is
2018 here: `Sysprof, System-wide Performance Profiler for
2025 -----------
2028 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
2032 -----------------------------
2035 (you need to build the core-image-sato-sdk image or use one of the other
2036 methods described in the ":ref:`profile-manual/intro:General Setup`" section), you're ready to start
2044 $ ssh -l root 192.168.1.47
2055 Session auto-20121015-232120 created.
2056 Traces will be written in /home/root/lttng-traces/auto-20121015-232120
2060 root@crownbay:~# lttng enable-event --kernel --all
2066 Tracing started for session auto-20121015-232120
2072 Tracing stopped for session auto-20121015-232120
2082 [23:21:56.989351694] (+0.000022349) exit_syscall: { 1 }, { ret = -1247805440 }
2084 …[23:21:56.989477129] (+0.000044140) sched_stat_runtime: { 1 }, { comm = "lttng-consumerd", tid = 1…
2085 …[23:21:56.989486697] (+0.000009568) sched_migrate_task: { 1 }, { comm = "lttng-consumerd", tid = 1…
2094 …[23:21:56.989807130] (+0.000006495) sched_stat_runtime: { 1 }, { comm = "lttng-consumerd", tid = 1…
2095 …[23:21:56.989809993] (+0.000002863) sched_stat_runtime: { 0 }, { comm = "lttng-sessiond", tid = 11…
2106 …[23:21:56.989854831] (+0.000012012) sched_stat_runtime: { 1 }, { comm = "lttng-consumerd", tid = 1…
2107 …[23:21:56.989855949] (+0.000001118) sched_stat_runtime: { 0 }, { comm = "lttng-sessiond", tid = 11…
2119 session (note that this doesn't delete the trace - it's still there in
2120 ~/lttng-traces)::
2123 Session auto-20121015-232120 destroyed at /home/root
2126 'lttng create', under the ~/lttng-traces directory (note that you can change this by
2129 root@crownbay:~# ls -al ~/lttng-traces
2130 drwxrwx--- 3 root root 1024 Oct 15 23:21 .
2131 drwxr-xr-x 5 root root 1024 Oct 15 23:57 ..
2132 drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120
2139 generated by the lttng-ust build.
2141 The 'hello' test program isn't installed on the root filesystem by the lttng-ust
2145 $ cd build/tmp/work/core2_32-poky-linux/lttng-ust/2.0.5-r0/git/tests/hello/.libs
2156 $ ssh -l root 192.168.1.47
2166 Session auto-20190303-021943 created.
2167 Traces will be written in /home/root/lttng-traces/auto-20190303-021943
2171 root@crownbay:~# lttng enable-event --userspace --all
2177 Tracing started for session auto-20190303-021943
2189 Tracing stopped for session auto-20190303-021943
2203 trace - it's still there in ~/lttng-traces)::
2206 Session auto-20190303-021943 destroyed at /home/root
2209 -------------------
2224 blktrace is a tool for tracing and reporting low-level disk I/O.
2227 human-readable form and does some basic analysis:
2230 --------------
2233 outlined in the ":ref:`profile-manual/intro:General Setup`"
2240 ":ref:`profile-manual/usage:Using blktrace Remotely`" section
2245 --------------------
2254 …root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.…
2256 linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
2258 Press Ctrl-C in the blktrace shell to stop the trace. It
2259 will display how many events were logged, along with the per-cpu file
2260 sizes (blktrace records traces in per-cpu kernel buffers and simply
2271 root@crownbay:~# ls -al
2272 drwxr-xr-x 6 root root 1024 Oct 27 22:39 .
2273 drwxr-sr-x 4 root root 1024 Oct 26 18:24 ..
2274 -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0
2275 -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1
2283 8,32 1 1 0.000000000 1225 Q WS 3417048 + 8 [jbd2/sdc-8]
2284 8,32 1 2 0.000025213 1225 G WS 3417048 + 8 [jbd2/sdc-8]
2285 8,32 1 3 0.000033384 1225 P N [jbd2/sdc-8]
2286 8,32 1 4 0.000043301 1225 I WS 3417048 + 8 [jbd2/sdc-8]
2289 8,32 1 5 0.000076336 1225 U N [jbd2/sdc-8] 1
2292 8,32 1 0 0.000104063 0 m N cfq1225 Not idling. st->count:1
2297 8,32 1 6 0.000136889 1225 D WS 3417048 + 8 [jbd2/sdc-8]
2298 8,32 1 7 0.000360381 1225 Q WS 3417056 + 8 [jbd2/sdc-8]
2299 8,32 1 8 0.000377422 1225 G WS 3417056 + 8 [jbd2/sdc-8]
2300 8,32 1 9 0.000388876 1225 P N [jbd2/sdc-8]
2301 8,32 1 10 0.000397886 1225 Q WS 3417064 + 8 [jbd2/sdc-8]
2302 8,32 1 11 0.000404800 1225 M WS 3417064 + 8 [jbd2/sdc-8]
2303 8,32 1 12 0.000412343 1225 Q WS 3417072 + 8 [jbd2/sdc-8]
2304 8,32 1 13 0.000416533 1225 M WS 3417072 + 8 [jbd2/sdc-8]
2305 8,32 1 14 0.000422121 1225 Q WS 3417080 + 8 [jbd2/sdc-8]
2306 8,32 1 15 0.000425194 1225 M WS 3417080 + 8 [jbd2/sdc-8]
2307 8,32 1 16 0.000431968 1225 Q WS 3417088 + 8 [jbd2/sdc-8]
2308 8,32 1 17 0.000435251 1225 M WS 3417088 + 8 [jbd2/sdc-8]
2309 8,32 1 18 0.000440279 1225 Q WS 3417096 + 8 [jbd2/sdc-8]
2310 8,32 1 19 0.000443911 1225 M WS 3417096 + 8 [jbd2/sdc-8]
2311 8,32 1 20 0.000450336 1225 Q WS 3417104 + 8 [jbd2/sdc-8]
2312 8,32 1 21 0.000454038 1225 M WS 3417104 + 8 [jbd2/sdc-8]
2313 8,32 1 22 0.000462070 1225 Q WS 3417112 + 8 [jbd2/sdc-8]
2314 8,32 1 23 0.000465422 1225 M WS 3417112 + 8 [jbd2/sdc-8]
2315 8,32 1 24 0.000474222 1225 I WS 3417056 + 64 [jbd2/sdc-8]
2317 8,32 1 25 0.000489727 1225 U N [jbd2/sdc-8] 1
2318 8,32 1 0 0.000498457 0 m N cfq1225 Not idling. st->count:1
2362 Skips: 0 forward (0 - 0.0%)
2379 root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i -
2381 This enables long-lived tracing sessions
2383 certain conditions in the trace data in 'real-time' by viewing the trace
2405 system being traced with the -l option, along with the device to trace::
2407 root@crownbay:~# blktrace -l /dev/sdc
2410 On the host system, use the -h option to connect to the target system,
2413 $ blktrace -d /dev/sdc -h 192.168.1.43
2423 …root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.…
2425 linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
2427 When it's done, do a Ctrl-C on the host system to stop the
2445 save the target output inside a hostname-timestamp directory::
2447 $ ls -al
2448 drwxr-xr-x 10 root root 1024 Oct 28 02:40 .
2449 drwxr-sr-x 4 root root 1024 Oct 26 18:24 ..
2450 drwxr-xr-x 2 root root 1024 Oct 28 02:40 192.168.1.43-2012-10-28-02:40:56
2454 $ ls -l
2455 -rw-r--r-- 1 root root 369193 Oct 28 02:44 sdc.blktrace.0
2456 -rw-r--r-- 1 root root 197278 Oct 28 02:44 sdc.blktrace.1
2506 Skips: 0 forward (0 - 0.0%)
2515 :ref:`profile-manual/usage:The 'trace events' Subsystem`, which
2538 trace to capture this trace - this allows us to wait around on the pipe
2542 cat-3587 [001] d..1 3023.276361: 8,32 Q R 1699848 + 8 [cat]
2543 cat-3587 [001] d..1 3023.276410: 8,32 m N cfq3587 alloced
2544 cat-3587 [001] d..1 3023.276415: 8,32 G R 1699848 + 8 [cat]
2545 cat-3587 [001] d..1 3023.276424: 8,32 P N [cat]
2546 cat-3587 [001] d..2 3023.276432: 8,32 I R 1699848 + 8 [cat]
2547 cat-3587 [001] d..1 3023.276439: 8,32 m N cfq3587 insert_request
2548 cat-3587 [001] d..1 3023.276445: 8,32 m N cfq3587 add_to_rr
2549 cat-3587 [001] d..2 3023.276454: 8,32 U N [cat] 1
2550 cat-3587 [001] d..1 3023.276464: 8,32 m N cfq workload slice:150
2551 … cat-3587 [001] d..1 3023.276471: 8,32 m N cfq3587 set_active wl_prio:0 wl_type:2
2552 cat-3587 [001] d..1 3023.276478: 8,32 m N cfq3587 fifo= (null)
2553 cat-3587 [001] d..1 3023.276483: 8,32 m N cfq3587 dispatch_insert
2554 cat-3587 [001] d..1 3023.276490: 8,32 m N cfq3587 dispatched a request
2555 cat-3587 [001] d..1 3023.276497: 8,32 m N cfq3587 activate rq, drv=1
2556 cat-3587 [001] d..2 3023.276500: 8,32 D R 1699848 + 8 [cat]
2563 ----------------------
2568 - https://linux.die.net/man/8/blktrace
2570 - https://linux.die.net/man/1/blkparse
2572 - https://linux.die.net/man/8/btrace