Lines Matching +full:cpu +full:- +full:centric

1 .. SPDX-License-Identifier: CC-BY-SA-2.0-UK
20 that it's only for tracing and profiling the kernel - you can indeed use
24 all applications on the system simultaneously to gain a system-wide view
45 ----------
48 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
61 ----------------
63 The perf tool is pretty much self-documenting. To remind yourself of the
69 usage: perf [--version] [--help] COMMAND [ARGS]
73 archive Create archive with object files with build-ids found in perf.data file
75 buildid-cache Manage build-id cache.
76 buildid-list List the buildids in a perf.data file
107 root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \
108 wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
115 root@crownbay:~# perf stat wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
117 …linux-2.6.19.2.tar.b 100% |***************************************************| 41727k 0:00:00 ETA
119 Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
121 4597.223902 task-clock # 0.077 CPUs utilized
122 23568 context-switches # 0.005 M/sec
123 68 CPU-migrations # 0.015 K/sec
124 241 page-faults # 0.052 K/sec
126 <not supported> stalled-cycles-frontend
127 <not supported> stalled-cycles-backend
130 19550329 branch-misses # 11.82% of all branches
134 Many times such a simple-minded test doesn't yield much of
135 interest, but sometimes it does (see Real-world Yocto bug (slow
136 loop-mounted write speed)).
139 - basically any event listed in the output of 'perf list' can be tallied
144 …root@crownbay:~# perf stat -e kmem:* -e cache-references -e cache-misses wget &YOCTO_DL_URL;/mirro…
146 …linux-2.6.19.2.tar.b 100% |***************************************************| 41727k 0:00:00 ETA
148 Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
162 66848602 cache-references
163 2917740 cache-misses # 4.365 % of all cache refs
174 interactive text-based UI (or simply as text if we specify ``--stdio`` to
179 'perf record' and any perf options we hand it - here none - will be
184 root@crownbay:~# perf record wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
187 linux-2.6.19.2.tar.b 100% |************************************************| 41727k 0:00:00 ETA
192 'text-based UI' (tui), simply run 'perf report', which will read the
198 .. image:: figures/perf-wget-flat-stripped.png
205 only above a certain threshold and so on - see the perf documentation
208 containing a [k]). (perf has command-line modifiers that can be used to
213 useful function name in that entry, it displays a not-so-friendly hex
218 between the new profile and the previous one is that we'll add the -g
222 root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
224 linux-2.6.19.2.tar.b 100% |************************************************| 41727k 0:00:00 ETA
231 .. image:: figures/perf-wget-g-copy-to-user-expanded-stripped.png
240 left-hand side. This means that we can expand the entry and drill down
260 down to wget i.e. a copy-to-user.
267 .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png
270 The above screenshot shows the other half of the journey for the data -
272 disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to
284 .. image:: figures/perf-wget-busybox-expanded-stripped.png
305 ``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example::
307 EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs"
314 PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory'
318 ``build/tmp/deploy/rpm/*`` on the host system. Find the busybox-dbg-...rpm
321 …empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-db…
322 busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01
326 root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm
331 .. image:: figures/perf-wget-busybox-debuginfo.png
338 .. image:: figures/perf-wget-busybox-dso-zoom-menu.png
342 busybox-centric view of the profiled functions (in this case we've also
345 .. image:: figures/perf-wget-busybox-dso-zoom.png
352 copy-to-user overhead was initiated by the ``handle_input()`` BusyBox
355 .. image:: figures/perf-wget-g-copy-to-user-expanded-debuginfo.png
363 .. image:: figures/perf-wget-busybox-annotate-menu.png
371 .. image:: figures/perf-wget-busybox-annotate-udhcpc.png
389 List of pre-defined events (to be used in -e):
390 cpu-cycles OR cycles [Hardware event]
391 stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
392 stalled-cycles-backend OR idle-cycles-backend [Hardware event]
394 cache-references [Hardware event]
395 cache-misses [Hardware event]
396 branch-instructions OR branches [Hardware event]
397 branch-misses [Hardware event]
398 bus-cycles [Hardware event]
399 ref-cycles [Hardware event]
401 cpu-clock [Software event]
402 task-clock [Software event]
403 page-faults OR faults [Software event]
404 minor-faults [Software event]
405 major-faults [Software event]
406 context-switches OR cs [Software event]
407 cpu-migrations OR migrations [Software event]
408 alignment-faults [Software event]
409 emulation-faults [Software event]
411 L1-dcache-loads [Hardware cache event]
412 L1-dcache-load-misses [Hardware cache event]
413 L1-dcache-prefetch-misses [Hardware cache event]
414 L1-icache-loads [Hardware cache event]
415 L1-icache-load-misses [Hardware cache event]
420 cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor]
421 (see 'perf list --help' on how to encode it)
530 …bay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget…
531 Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2':
592 …root@crownbay:~# perf record -g -e sched:sched_wakeup wget &YOCTO_DL_URL;/mirror/sources/linux-2.6…
594 .. image:: figures/sched-wakeup-profile.png
600 tracepoint - it's actually defined just inside ttwu_do_wakeup(), which
603 .. code-block:: c
606 * Mark the task runnable and perform wakeup-preemption.
624 sample on every event occurrence (this can be changed using the -c
635 high-level view of what's going on with a workload or across the system.
643 individual steps that go into the higher-level behavior exposed by the
644 coarse-grained profiling data.
649 …root@crownbay:~# perf record -g -e skb:* -e net:* -e napi:* -e sched:sched_switch -e sched:sched_w…
650-e syscalls:sys_enter_read -e syscalls:sys_exit_read -e syscalls:sys_enter_write -e syscalls:sys_e…
651 wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
684 In many ways, profiling can be viewed as a subset of tracing -
693 in the end it can't be used in an open-ended way to extract data that
697 Full-blown detailed tracing data does however offer the opportunity to
715 something other than pre-canned profiling tools to do that.
717 Luckily, there is a general-purpose way to handle such needs, called
727 DProbes dpcc compiler, an ANSI C compiler which targeted a low-level
728 assembly language running on an in-kernel interpreter on the target
731 heavily inspired by DTrace, also created its own one-off language,
732 but rather than running the product on an in-kernel interpreter,
733 created an elaborate compiler-based machinery to translate its
737 -g' to generate a skeleton script with handlers for the read/write
740 root@crownbay:~# perf script -g python
741 generated Python script: perf-script.py
747 .. code-block:: python
760 root@crownbay:~# perf script -s perf-script.py
796 per-event count. When the program is run against a perf.data file, each
800 .. code-block:: python
811 hash function that does Perl-like autovivification, a capability that's
812 extremely useful for kinds of multi-level aggregation commonly used in
816 .. code-block:: python
827 result of all the per-event tallies. For that, we use the special
830 .. code-block:: python
834 print "%-40s %10s\n" % (event_name, count)
859 the right kind of trace data, higher-level profiling-type summaries can
863 binding <https://linux.die.net/man/1/perf-script-python>`__.
865 System-Wide Tracing and Profiling
869 workload - in other words, every profiling run has specified the program
870 to profile in the command-line e.g. 'perf record wget ...'.
873 system-wide profile or trace while running the workload in a separate
876 To do system-wide profiling or tracing, you typically use the -a flag to
880 -a flag (press Ctrl-C to stop tracing)::
882 root@crownbay:~# perf record -g -a
888 root@crownbay:~# wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
890 linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
895 .. image:: figures/perf-systemwide.png
908 root@crownbay:~# perf record -g -a -e cycles:u
912 .. image:: figures/perf-report-cycles-u.png
920 the entries associated with the libc-xxx.so DSO.
922 .. image:: figures/perf-systemwide-libc.png
925 We can also use the system-wide -a switch to do system-wide tracing.
928 root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup
953 the '--filter' option - for each event we specify using -e, we can add a
954 --filter after that to filter out trace events that contain fields with
957 …ot@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf && prev_comm != pe…
994 These event filters are implemented by a special-purpose
995 pseudo-interpreter in the kernel and are an integral and
997 kernel-based event filters provide a mechanism to precisely throttle
1004 project-defined language in order to accommodate that design, or
1006 super-efficient kernel-to-userspace transport mechanism in order to
1028 perf record -e probe:do_fork -aR sleep 1
1036 root@crownbay:/sys/kernel/debug/tracing/events/probe/do_fork# ls -al
1037 drwxr-xr-x 2 root root 0 Oct 28 11:42 .
1038 drwxr-xr-x 3 root root 0 Oct 28 11:42 ..
1039 -rw-r--r-- 1 root root 0 Oct 28 11:42 enable
1040 -rw-r--r-- 1 root root 0 Oct 28 11:42 filter
1041 -r--r--r-- 1 root root 0 Oct 28 11:42 format
1042 -r--r--r-- 1 root root 0 Oct 28 11:42 id
1056 print fmt: "(%lx)", REC->__probe_ip
1061 root@crownbay:~# perf probe -l
1065 Let's record system-wide ('sleep 30' is a
1066 trick for recording system-wide but basically do nothing and then wake
1069 root@crownbay:~# perf record -g -a -e probe:do_fork sleep 30
1080 # os release : 3.4.11-yocto-standard
1085 # cpudesc : Intel(R) Atom(TM) CPU E660 @ 1.30GHz
1088 # cmdline : /usr/bin/perf record -g -a -e probe:do_fork sleep 30
1091 # HEADER_CPU_TOPOLOGY info available, use -I to display
1094 matchbox-deskto 1197 [001] 34211.378318: do_fork: (c1028460)
1095 matchbox-deskto 1295 [001] 34211.380388: do_fork: (c1028460)
1098 matchbox-deskto 1197 [001] 34217.541603: do_fork: (c1028460)
1099 matchbox-deskto 1299 [001] 34217.543584: do_fork: (c1028460)
1105 matchbox-deskto 1197 [001] 34224.955965: do_fork: (c1028460)
1106 matchbox-deskto 1306 [001] 34224.957972: do_fork: (c1028460)
1107 matchbox-termin 1307 [000] 34225.038214: do_fork: (c1028460)
1108 matchbox-termin 1307 [001] 34225.044218: do_fork: (c1028460)
1109 matchbox-termin 1307 [000] 34225.046442: do_fork: (c1028460)
1110 matchbox-deskto 1197 [001] 34237.112138: do_fork: (c1028460)
1111 matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460)
1117 .. image:: figures/perf-probe-do_fork-profile.png
1123 in exactly the same way - there's no difference as far as the
1134 ------------------
1139 - The `'perf stat' manpage <https://linux.die.net/man/1/perf-stat>`__.
1141 - The `'perf record'
1142 manpage <https://linux.die.net/man/1/perf-record>`__.
1144 - The `'perf report'
1145 manpage <https://linux.die.net/man/1/perf-report>`__.
1147 - The `'perf probe' manpage <https://linux.die.net/man/1/perf-probe>`__.
1149 - The `'perf script'
1150 manpage <https://linux.die.net/man/1/perf-script>`__.
1152 - Documentation on using the `'perf script' python
1153 binding <https://linux.die.net/man/1/perf-script-python>`__.
1155 - The top-level `perf(1) manpage <https://linux.die.net/man/1/perf>`__.
1163 IMAGE_INSTALL:append = " perf perf-doc"
1164 DISTRO_FEATURES:append = " api-documentation"
1183 ------------
1186 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
1188 ftrace, trace-cmd, and kernelshark run on the target system, and are
1189 ready to go out-of-the-box - no additional setup is necessary. For the
1192 you use the '-X' option to ssh you can have the kernelshark GUI run on
1196 ------------------
1217 - some relate directly to the tracers themselves, others are used to set
1224 We'll start by looking at some of the available built-in tracers.
1257 # entries-in-buffer/entries-written: 310629/766471 #P:8
1259 # _-----=> irqs-off
1260 # / _----=> need-resched
1261 # | / _---=> hardirq/softirq
1262 # || / _--=> preempt-depth
1264 # TASK-PID CPU# |||| TIMESTAMP FUNCTION
1266 <idle>-0 [004] d..1 470.867169: ktime_get_real <-intel_idle
1267 <idle>-0 [004] d..1 470.867170: getnstimeofday <-ktime_get_real
1268 <idle>-0 [004] d..1 470.867171: ns_to_timeval <-intel_idle
1269 <idle>-0 [004] d..1 470.867171: ns_to_timespec <-ns_to_timeval
1270 <idle>-0 [004] d..1 470.867172: smp_apic_timer_interrupt <-apic_timer_interrupt
1271 <idle>-0 [004] d..1 470.867172: native_apic_mem_write <-smp_apic_timer_interrupt
1272 <idle>-0 [004] d..1 470.867172: irq_enter <-smp_apic_timer_interrupt
1273 <idle>-0 [004] d..1 470.867172: rcu_irq_enter <-irq_enter
1274 <idle>-0 [004] d..1 470.867173: rcu_idle_exit_common.isra.33 <-rcu_irq_enter
1275 <idle>-0 [004] d..1 470.867173: local_bh_disable <-irq_enter
1276 <idle>-0 [004] d..1 470.867173: add_preempt_count <-local_bh_disable
1277 <idle>-0 [004] d.s1 470.867174: tick_check_idle <-irq_enter
1278 <idle>-0 [004] d.s1 470.867174: tick_check_oneshot_broadcast <-tick_check_idle
1279 <idle>-0 [004] d.s1 470.867174: ktime_get <-tick_check_idle
1280 <idle>-0 [004] d.s1 470.867174: tick_nohz_stop_idle <-tick_check_idle
1281 <idle>-0 [004] d.s1 470.867175: update_ts_time_stats <-tick_nohz_stop_idle
1282 <idle>-0 [004] d.s1 470.867175: nr_iowait_cpu <-update_ts_time_stats
1283 <idle>-0 [004] d.s1 470.867175: tick_do_update_jiffies64 <-tick_check_idle
1284 <idle>-0 [004] d.s1 470.867175: _raw_spin_lock <-tick_do_update_jiffies64
1285 <idle>-0 [004] d.s1 470.867176: add_preempt_count <-_raw_spin_lock
1286 <idle>-0 [004] d.s2 470.867176: do_timer <-tick_do_update_jiffies64
1287 <idle>-0 [004] d.s2 470.867176: _raw_spin_lock <-do_timer
1288 <idle>-0 [004] d.s2 470.867176: add_preempt_count <-_raw_spin_lock
1289 <idle>-0 [004] d.s3 470.867177: ntp_tick_length <-do_timer
1290 <idle>-0 [004] d.s3 470.867177: _raw_spin_lock_irqsave <-ntp_tick_length
1296 cpu, to the level of detail of function calls. Each entry shows the function
1309 be - luckily there's a variant of the function tracer that displays the
1317 CPU DURATION FUNCTION CALLS
1372 ------------------------------------------
1373 4) kworker-74 => <idle>-0
1374 ------------------------------------------
1440 ----------------------------
1449 root@sugarbay:/sys/kernel/debug/tracing/events# ls -al
1450 drwxr-xr-x 38 root root 0 Nov 14 23:19 .
1451 drwxr-xr-x 5 root root 0 Nov 14 23:19 ..
1452 drwxr-xr-x 19 root root 0 Nov 14 23:19 block
1453 drwxr-xr-x 32 root root 0 Nov 14 23:19 btrfs
1454 drwxr-xr-x 5 root root 0 Nov 14 23:19 drm
1455 -rw-r--r-- 1 root root 0 Nov 14 23:19 enable
1456 drwxr-xr-x 40 root root 0 Nov 14 23:19 ext3
1457 drwxr-xr-x 79 root root 0 Nov 14 23:19 ext4
1458 drwxr-xr-x 14 root root 0 Nov 14 23:19 ftrace
1459 drwxr-xr-x 8 root root 0 Nov 14 23:19 hda
1460 -r--r--r-- 1 root root 0 Nov 14 23:19 header_event
1461 -r--r--r-- 1 root root 0 Nov 14 23:19 header_page
1462 drwxr-xr-x 25 root root 0 Nov 14 23:19 i915
1463 drwxr-xr-x 7 root root 0 Nov 14 23:19 irq
1464 drwxr-xr-x 12 root root 0 Nov 14 23:19 jbd
1465 drwxr-xr-x 14 root root 0 Nov 14 23:19 jbd2
1466 drwxr-xr-x 14 root root 0 Nov 14 23:19 kmem
1467 drwxr-xr-x 7 root root 0 Nov 14 23:19 module
1468 drwxr-xr-x 3 root root 0 Nov 14 23:19 napi
1469 drwxr-xr-x 6 root root 0 Nov 14 23:19 net
1470 drwxr-xr-x 3 root root 0 Nov 14 23:19 oom
1471 drwxr-xr-x 12 root root 0 Nov 14 23:19 power
1472 drwxr-xr-x 3 root root 0 Nov 14 23:19 printk
1473 drwxr-xr-x 8 root root 0 Nov 14 23:19 random
1474 drwxr-xr-x 4 root root 0 Nov 14 23:19 raw_syscalls
1475 drwxr-xr-x 3 root root 0 Nov 14 23:19 rcu
1476 drwxr-xr-x 6 root root 0 Nov 14 23:19 rpm
1477 drwxr-xr-x 20 root root 0 Nov 14 23:19 sched
1478 drwxr-xr-x 7 root root 0 Nov 14 23:19 scsi
1479 drwxr-xr-x 4 root root 0 Nov 14 23:19 signal
1480 drwxr-xr-x 5 root root 0 Nov 14 23:19 skb
1481 drwxr-xr-x 4 root root 0 Nov 14 23:19 sock
1482 drwxr-xr-x 10 root root 0 Nov 14 23:19 sunrpc
1483 drwxr-xr-x 538 root root 0 Nov 14 23:19 syscalls
1484 drwxr-xr-x 4 root root 0 Nov 14 23:19 task
1485 drwxr-xr-x 14 root root 0 Nov 14 23:19 timer
1486 drwxr-xr-x 3 root root 0 Nov 14 23:19 udp
1487 drwxr-xr-x 21 root root 0 Nov 14 23:19 vmscan
1488 drwxr-xr-x 3 root root 0 Nov 14 23:19 vsyscall
1489 drwxr-xr-x 6 root root 0 Nov 14 23:19 workqueue
1490 drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback
1498 root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al
1499 drwxr-xr-x 14 root root 0 Nov 14 23:19 .
1500 drwxr-xr-x 38 root root 0 Nov 14 23:19 ..
1501 -rw-r--r-- 1 root root 0 Nov 14 23:19 enable
1502 -rw-r--r-- 1 root root 0 Nov 14 23:19 filter
1503 drwxr-xr-x 2 root root 0 Nov 14 23:19 kfree
1504 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmalloc
1505 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmalloc_node
1506 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_alloc
1507 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_alloc_node
1508 drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_free
1509 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc
1510 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc_extfrag
1511 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc_zone_locked
1512 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_free
1513 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_free_batched
1514 drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain
1520 root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al
1521 drwxr-xr-x 2 root root 0 Nov 14 23:19 .
1522 drwxr-xr-x 14 root root 0 Nov 14 23:19 ..
1523 -rw-r--r-- 1 root root 0 Nov 14 23:19 enable
1524 -rw-r--r-- 1 root root 0 Nov 14 23:19 filter
1525 -r--r--r-- 1 root root 0 Nov 14 23:19 format
1526 -r--r--r-- 1 root root 0 Nov 14 23:19 id
1551 …ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s", REC->call_site, REC->ptr, REC->bytes_req, REC-
1552 …(REC->gfp_flags) ? __print_flags(REC->gfp_flags, "|", {(unsigned long)(((( gfp_t)0x10u) | (( gfp_t…
1570 trace-cmd) to actually turn the tracepoint on and off. When enabled, the
1590 # entries-in-buffer/entries-written: 1897/1897 #P:8
1592 # _-----=> irqs-off
1593 # / _----=> need-resched
1594 # | / _---=> hardirq/softirq
1595 # || / _--=> preempt-depth
1597 # TASK-PID CPU# |||| TIMESTAMP FUNCTION
1599 …dropbear-1465 [000] ...1 18154.620753: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1600 …<idle>-0 [000] ..s3 18154.621640: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1601 …<idle>-0 [000] ..s3 18154.621656: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1602 …matchbox-termin-1361 [001] ...1 18154.755472: kmalloc: call_site=ffffffff81614050 ptr=ffff88006d5…
1603 …Xorg-1264 [002] ...1 18154.755581: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes…
1604 …Xorg-1264 [002] ...1 18154.755583: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes…
1605 …Xorg-1264 [002] ...1 18154.755589: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes…
1606 …matchbox-termin-1361 [001] ...1 18155.354594: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db…
1607 …Xorg-1264 [002] ...1 18155.354703: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes…
1608 …Xorg-1264 [002] ...1 18155.354705: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes…
1609 …Xorg-1264 [002] ...1 18155.354711: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes…
1610 …<idle>-0 [000] ..s3 18155.673319: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1611 …dropbear-1465 [000] ...1 18155.673525: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1612 …<idle>-0 [000] ..s3 18155.674821: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1613 …<idle>-0 [000] ..s3 18155.793014: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1614 …dropbear-1465 [000] ...1 18155.793219: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1615 …<idle>-0 [000] ..s3 18155.794147: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1616 …<idle>-0 [000] ..s3 18155.936705: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1617 …dropbear-1465 [000] ...1 18155.936910: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1618 …<idle>-0 [000] ..s3 18155.937869: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1619 …matchbox-termin-1361 [001] ...1 18155.953667: kmalloc: call_site=ffffffff81614050 ptr=ffff88006d5…
1620 …Xorg-1264 [002] ...1 18155.953775: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes…
1621 …Xorg-1264 [002] ...1 18155.953777: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes…
1622 …Xorg-1264 [002] ...1 18155.953783: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes…
1623 …<idle>-0 [000] ..s3 18156.176053: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1624 …dropbear-1465 [000] ...1 18156.176257: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1625 …<idle>-0 [000] ..s3 18156.177717: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1626 …<idle>-0 [000] ..s3 18156.399229: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 byt…
1627 …dropbear-1465 [000] ...1 18156.399434: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 b…
1628 …<idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 byt…
1629 …matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db…
1637 arbitrarily fine-grained idea of what's going on in the system by
1641 trace-cmd and kernelshark in the next section.
1653 Eventually all the special-purpose tracers currently available in
1657 trace-cmd/kernelshark
1658 ---------------------
1660 trace-cmd is essentially an extensive command-line 'wrapper' interface
1669 a per-CPU graphical display. It directly uses 'trace-cmd' as the
1671 displays the trace-cmd command it uses, as we'll see).
1685 .. image:: figures/kernelshark-choose-events.png
1689 previous trace events subsystem section, and in fact is where trace-cmd
1700 .. image:: figures/kernelshark-output-display.png
1703 Notice that the right-hand pane shows the exact trace-cmd command-line
1704 that's used to run the trace, along with the results of the trace-cmd
1708 with a colorful per-cpu display of the trace data, along with the
1711 .. image:: figures/kernelshark-i915-display.png
1717 .. image:: figures/kernelshark-all.png
1720 The tool is pretty self-explanatory, but for more detailed information
1725 --------------------
1737 There is a nice series of articles on using ftrace and trace-cmd at LWN:
1739 - `Debugging the kernel using Ftrace - part
1742 - `Debugging the kernel using Ftrace - part
1745 - `Secrets of the Ftrace function
1748 - `trace-cmd: A front-end for
1754 An amusing yet useful README (a tracing mini-HOWTO) can be found in
1760 SystemTap is a system-wide script-based tracing and profiling tool.
1762 SystemTap scripts are C-like programs that are executed in the kernel to
1773 .. code-block:: none
1809 section assumes only the typical embedded use-case.
1819 ---------------
1837 (or a build that includes 'tools-profile') in order to build
1841 - If you're running a pre-built image, download the release
1843 - If you're working from git sources, just clone the metadata
1845 - Make sure you're properly set up to build a new image (see
1848 - Build an -sdk version of the image e.g.:
1849 $ bitbake core-image-sato-sdk
1851 - Build a non-sdk image but include the profiling tools:
1852 [ edit local.conf and add 'tools-profile' to the end of
1854 $ bitbake core-image-sato
1857 boot it (or the equivalent pre-built image) and use 'crosstap'
1860 $ source oe-init-build-env
1870 default in \*-minimal images.
1873 do is build an SDK image or image with 'tools-profile' as detailed in
1874 the ":ref:`profile-manual/intro:General Setup`" section of this
1886 ----------------------------
1892 $ source oe-init-build-env
1899 core-image-minimal
1900 core-image-sato
1901 meta-toolchain
1902 meta-ide-support
1904 You can also run generated QEMU images with a command like 'runqemu qemux86-64'
1929 .. code-block:: none
1933 matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
1934 matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600)
1937 -----------------------
1949 Sysprof is a very easy to use system-wide profiler that consists of a
1954 -------------
1957 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
1959 Sysprof is a GUI-based application that runs on the target system. For
1961 be running Sysprof on the target (you can use the '-X' option to ssh and
1966 -------------------
1975 .. image:: figures/sysprof-copy-to-user.png
1980 Note that this caller-oriented display is essentially the inverse of
1981 perf's default callee-oriented callchain display.
1989 .. image:: figures/sysprof-copy-from-user.png
1993 copy-from-user callchain.
2000 .. image:: figures/sysprof-callers.png
2003 Double-clicking on one of those functions will in turn change the focus
2008 If you like sysprof's 'caller-oriented' display, you may be able to
2010 the -g (--call-graph) option that you can experiment with; one of the
2011 options is 'caller' for an inverted caller-based callgraph display.
2014 ---------------------
2017 because it's pretty self-explanatory. The Sysprof website, however, is
2018 here: `Sysprof, System-wide Performance Profiler for
2025 -----------
2028 outlined in the ":ref:`profile-manual/intro:General Setup`" section.
2032 -----------------------------
2035 (you need to build the core-image-sato-sdk image or use one of the other
2036 methods described in the ":ref:`profile-manual/intro:General Setup`" section), you're ready to start
2044 $ ssh -l root 192.168.1.47
2055 Session auto-20121015-232120 created.
2056 Traces will be written in /home/root/lttng-traces/auto-20121015-232120
2060 root@crownbay:~# lttng enable-event --kernel --all
2066 Tracing started for session auto-20121015-232120
2072 Tracing stopped for session auto-20121015-232120
2082 [23:21:56.989351694] (+0.000022349) exit_syscall: { 1 }, { ret = -1247805440 }
2084 …[23:21:56.989477129] (+0.000044140) sched_stat_runtime: { 1 }, { comm = "lttng-consumerd", tid = 1…
2085 …[23:21:56.989486697] (+0.000009568) sched_migrate_task: { 1 }, { comm = "lttng-consumerd", tid = 1…
2094 …[23:21:56.989807130] (+0.000006495) sched_stat_runtime: { 1 }, { comm = "lttng-consumerd", tid = 1…
2095 …[23:21:56.989809993] (+0.000002863) sched_stat_runtime: { 0 }, { comm = "lttng-sessiond", tid = 11…
2106 …[23:21:56.989854831] (+0.000012012) sched_stat_runtime: { 1 }, { comm = "lttng-consumerd", tid = 1…
2107 …[23:21:56.989855949] (+0.000001118) sched_stat_runtime: { 0 }, { comm = "lttng-sessiond", tid = 11…
2119 session (note that this doesn't delete the trace - it's still there in
2120 ~/lttng-traces)::
2123 Session auto-20121015-232120 destroyed at /home/root
2126 'lttng create', under the ~/lttng-traces directory (note that you can change this by
2129 root@crownbay:~# ls -al ~/lttng-traces
2130 drwxrwx--- 3 root root 1024 Oct 15 23:21 .
2131 drwxr-xr-x 5 root root 1024 Oct 15 23:57 ..
2132 drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120
2139 generated by the lttng-ust build.
2141 The 'hello' test program isn't installed on the root filesystem by the lttng-ust
2145 $ cd build/tmp/work/core2_32-poky-linux/lttng-ust/2.0.5-r0/git/tests/hello/.libs
2156 $ ssh -l root 192.168.1.47
2166 Session auto-20190303-021943 created.
2167 Traces will be written in /home/root/lttng-traces/auto-20190303-021943
2171 root@crownbay:~# lttng enable-event --userspace --all
2177 Tracing started for session auto-20190303-021943
2189 Tracing stopped for session auto-20190303-021943
2203 trace - it's still there in ~/lttng-traces)::
2206 Session auto-20190303-021943 destroyed at /home/root
2209 -------------------
2224 blktrace is a tool for tracing and reporting low-level disk I/O.
2227 human-readable form and does some basic analysis:
2230 --------------
2233 outlined in the ":ref:`profile-manual/intro:General Setup`"
2240 ":ref:`profile-manual/usage:Using blktrace Remotely`" section
2245 --------------------
2254 …root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.…
2256 linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
2258 Press Ctrl-C in the blktrace shell to stop the trace. It
2259 will display how many events were logged, along with the per-cpu file
2260 sizes (blktrace records traces in per-cpu kernel buffers and simply
2264 CPU 0: 7082 events, 332 KiB data
2265 CPU 1: 1578 events, 74 KiB data
2268 If you examine the files saved to disk, you see multiple files, one per CPU and
2271 root@crownbay:~# ls -al
2272 drwxr-xr-x 6 root root 1024 Oct 27 22:39 .
2273 drwxr-sr-x 4 root root 1024 Oct 26 18:24 ..
2274 -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0
2275 -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1
2283 8,32 1 1 0.000000000 1225 Q WS 3417048 + 8 [jbd2/sdc-8]
2284 8,32 1 2 0.000025213 1225 G WS 3417048 + 8 [jbd2/sdc-8]
2285 8,32 1 3 0.000033384 1225 P N [jbd2/sdc-8]
2286 8,32 1 4 0.000043301 1225 I WS 3417048 + 8 [jbd2/sdc-8]
2289 8,32 1 5 0.000076336 1225 U N [jbd2/sdc-8] 1
2292 8,32 1 0 0.000104063 0 m N cfq1225 Not idling. st->count:1
2297 8,32 1 6 0.000136889 1225 D WS 3417048 + 8 [jbd2/sdc-8]
2298 8,32 1 7 0.000360381 1225 Q WS 3417056 + 8 [jbd2/sdc-8]
2299 8,32 1 8 0.000377422 1225 G WS 3417056 + 8 [jbd2/sdc-8]
2300 8,32 1 9 0.000388876 1225 P N [jbd2/sdc-8]
2301 8,32 1 10 0.000397886 1225 Q WS 3417064 + 8 [jbd2/sdc-8]
2302 8,32 1 11 0.000404800 1225 M WS 3417064 + 8 [jbd2/sdc-8]
2303 8,32 1 12 0.000412343 1225 Q WS 3417072 + 8 [jbd2/sdc-8]
2304 8,32 1 13 0.000416533 1225 M WS 3417072 + 8 [jbd2/sdc-8]
2305 8,32 1 14 0.000422121 1225 Q WS 3417080 + 8 [jbd2/sdc-8]
2306 8,32 1 15 0.000425194 1225 M WS 3417080 + 8 [jbd2/sdc-8]
2307 8,32 1 16 0.000431968 1225 Q WS 3417088 + 8 [jbd2/sdc-8]
2308 8,32 1 17 0.000435251 1225 M WS 3417088 + 8 [jbd2/sdc-8]
2309 8,32 1 18 0.000440279 1225 Q WS 3417096 + 8 [jbd2/sdc-8]
2310 8,32 1 19 0.000443911 1225 M WS 3417096 + 8 [jbd2/sdc-8]
2311 8,32 1 20 0.000450336 1225 Q WS 3417104 + 8 [jbd2/sdc-8]
2312 8,32 1 21 0.000454038 1225 M WS 3417104 + 8 [jbd2/sdc-8]
2313 8,32 1 22 0.000462070 1225 Q WS 3417112 + 8 [jbd2/sdc-8]
2314 8,32 1 23 0.000465422 1225 M WS 3417112 + 8 [jbd2/sdc-8]
2315 8,32 1 24 0.000474222 1225 I WS 3417056 + 64 [jbd2/sdc-8]
2317 8,32 1 25 0.000489727 1225 U N [jbd2/sdc-8] 1
2318 8,32 1 0 0.000498457 0 m N cfq1225 Not idling. st->count:1
2362 Skips: 0 forward (0 - 0.0%)
2379 root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i -
2381 This enables long-lived tracing sessions
2383 certain conditions in the trace data in 'real-time' by viewing the trace
2405 system being traced with the -l option, along with the device to trace::
2407 root@crownbay:~# blktrace -l /dev/sdc
2410 On the host system, use the -h option to connect to the target system,
2413 $ blktrace -d /dev/sdc -h 192.168.1.43
2423 …root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.…
2425 linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA
2427 When it's done, do a Ctrl-C on the host system to stop the
2431 CPU 0: 7691 events, 361 KiB data
2432 CPU 1: 4109 events, 193 KiB data
2440 CPU 0: 7691 events, 361 KiB data
2441 CPU 1: 4109 events, 193 KiB data
2445 save the target output inside a hostname-timestamp directory::
2447 $ ls -al
2448 drwxr-xr-x 10 root root 1024 Oct 28 02:40 .
2449 drwxr-sr-x 4 root root 1024 Oct 26 18:24 ..
2450 drwxr-xr-x 2 root root 1024 Oct 28 02:40 192.168.1.43-2012-10-28-02:40:56
2454 $ ls -l
2455 -rw-r--r-- 1 root root 369193 Oct 28 02:44 sdc.blktrace.0
2456 -rw-r--r-- 1 root root 197278 Oct 28 02:44 sdc.blktrace.1
2506 Skips: 0 forward (0 - 0.0%)
2515 :ref:`profile-manual/usage:The 'trace events' Subsystem`, which
2538 trace to capture this trace - this allows us to wait around on the pipe
2542 cat-3587 [001] d..1 3023.276361: 8,32 Q R 1699848 + 8 [cat]
2543 cat-3587 [001] d..1 3023.276410: 8,32 m N cfq3587 alloced
2544 cat-3587 [001] d..1 3023.276415: 8,32 G R 1699848 + 8 [cat]
2545 cat-3587 [001] d..1 3023.276424: 8,32 P N [cat]
2546 cat-3587 [001] d..2 3023.276432: 8,32 I R 1699848 + 8 [cat]
2547 cat-3587 [001] d..1 3023.276439: 8,32 m N cfq3587 insert_request
2548 cat-3587 [001] d..1 3023.276445: 8,32 m N cfq3587 add_to_rr
2549 cat-3587 [001] d..2 3023.276454: 8,32 U N [cat] 1
2550 cat-3587 [001] d..1 3023.276464: 8,32 m N cfq workload slice:150
2551 … cat-3587 [001] d..1 3023.276471: 8,32 m N cfq3587 set_active wl_prio:0 wl_type:2
2552 cat-3587 [001] d..1 3023.276478: 8,32 m N cfq3587 fifo= (null)
2553 cat-3587 [001] d..1 3023.276483: 8,32 m N cfq3587 dispatch_insert
2554 cat-3587 [001] d..1 3023.276490: 8,32 m N cfq3587 dispatched a request
2555 cat-3587 [001] d..1 3023.276497: 8,32 m N cfq3587 activate rq, drv=1
2556 cat-3587 [001] d..2 3023.276500: 8,32 D R 1699848 + 8 [cat]
2563 ----------------------
2568 - https://linux.die.net/man/8/blktrace
2570 - https://linux.die.net/man/1/blkparse
2572 - https://linux.die.net/man/8/btrace