summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r--tools/perf/Documentation/intel-hybrid.txt214
-rw-r--r--tools/perf/Documentation/perf-annotate.txt7
-rw-r--r--tools/perf/Documentation/perf-buildid-cache.txt2
-rw-r--r--tools/perf/Documentation/perf-config.txt11
-rw-r--r--tools/perf/Documentation/perf-data.txt5
-rw-r--r--tools/perf/Documentation/perf-iostat.txt88
-rw-r--r--tools/perf/Documentation/perf-record.txt1
-rw-r--r--tools/perf/Documentation/perf-report.txt10
-rw-r--r--tools/perf/Documentation/perf-stat.txt29
-rw-r--r--tools/perf/Documentation/perf-top.txt2
-rw-r--r--tools/perf/Documentation/perf.txt12
-rw-r--r--tools/perf/Documentation/topdown.txt18
12 files changed, 394 insertions, 5 deletions
diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt
new file mode 100644
index 000000000000..07f0aa3bf682
--- /dev/null
+++ b/tools/perf/Documentation/intel-hybrid.txt
@@ -0,0 +1,214 @@
+Intel hybrid support
+--------------------
+Support for Intel hybrid events within perf tools.
+
+For some Intel platforms, such as AlderLake, which is hybrid platform and
+it consists of atom cpu and core cpu. Each cpu has dedicated event list.
+Part of events are available on core cpu, part of events are available
+on atom cpu and even part of events are available on both.
+
+Kernel exports two new cpu pmus via sysfs:
+/sys/devices/cpu_core
+/sys/devices/cpu_atom
+
+The 'cpus' files are created under the directories. For example,
+
+cat /sys/devices/cpu_core/cpus
+0-15
+
+cat /sys/devices/cpu_atom/cpus
+16-23
+
+It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
+
+Quickstart
+
+List hybrid event
+-----------------
+
+As before, use perf-list to list the symbolic event.
+
+perf list
+
+inst_retired.any
+ [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
+inst_retired.any
+ [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]
+
+The 'Unit: xxx' is added to brief description to indicate which pmu
+the event is belong to. Same event name but with different pmu can
+be supported.
+
+Enable hybrid event with a specific pmu
+---------------------------------------
+
+To enable a core only event or atom only event, following syntax is supported:
+
+ cpu_core/<event name>/
+or
+ cpu_atom/<event name>/
+
+For example, count the 'cycles' event on core cpus.
+
+ perf stat -e cpu_core/cycles/
+
+Create two events for one hardware event automatically
+------------------------------------------------------
+
+When creating one event and the event is available on both atom and core,
+two events are created automatically. One is for atom, the other is for
+core. Most of hardware events and cache events are available on both
+cpu_core and cpu_atom.
+
+For hardware events, they have pre-defined configs (e.g. 0 for cycles).
+But on hybrid platform, kernel needs to know where the event comes from
+(from atom or from core). The original perf event type PERF_TYPE_HARDWARE
+can't carry pmu information. So now this type is extended to be PMU aware
+type. The PMU type ID is stored at attr.config[63:32].
+
+PMU type ID is retrieved from sysfs.
+/sys/devices/cpu_atom/type
+/sys/devices/cpu_core/type
+
+The new attr.config layout for PERF_TYPE_HARDWARE:
+
+PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA
+ AA: hardware event ID
+ EEEEEEEE: PMU type ID
+
+Cache event is similar. The type PERF_TYPE_HW_CACHE is extended to be
+PMU aware type. The PMU type ID is stored at attr.config[63:32].
+
+The new attr.config layout for PERF_TYPE_HW_CACHE:
+
+PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB
+ BB: hardware cache ID
+ CC: hardware cache op ID
+ DD: hardware cache op result ID
+ EEEEEEEE: PMU type ID
+
+When enabling a hardware event without specified pmu, such as,
+perf stat -e cycles -a (use system-wide in this example), two events
+are created automatically.
+
+ ------------------------------------------------------------
+ perf_event_attr:
+ size 120
+ config 0x400000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ exclude_guest 1
+ ------------------------------------------------------------
+
+and
+
+ ------------------------------------------------------------
+ perf_event_attr:
+ size 120
+ config 0x800000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ exclude_guest 1
+ ------------------------------------------------------------
+
+type 0 is PERF_TYPE_HARDWARE.
+0x4 in 0x400000000 indicates it's cpu_core pmu.
+0x8 in 0x800000000 indicates it's cpu_atom pmu (atom pmu type id is random).
+
+The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
+and create 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus).
+
+For perf-stat result, it displays two events:
+
+ Performance counter stats for 'system wide':
+
+ 6,744,979 cpu_core/cycles/
+ 1,965,552 cpu_atom/cycles/
+
+The first 'cycles' is core event, the second 'cycles' is atom event.
+
+Thread mode example:
+--------------------
+
+perf-stat reports the scaled counts for hybrid event and with a percentage
+displayed. The percentage is the event's running time/enabling time.
+
+One example, 'triad_loop' runs on cpu16 (atom core), while we can see the
+scaled value for core cycles is 160,444,092 and the percentage is 0.47%.
+
+perf stat -e cycles -- taskset -c 16 ./triad_loop
+
+As previous, two events are created.
+
+------------------------------------------------------------
+perf_event_attr:
+ size 120
+ config 0x400000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ enable_on_exec 1
+ exclude_guest 1
+------------------------------------------------------------
+
+and
+
+------------------------------------------------------------
+perf_event_attr:
+ size 120
+ config 0x800000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ enable_on_exec 1
+ exclude_guest 1
+------------------------------------------------------------
+
+ Performance counter stats for 'taskset -c 16 ./triad_loop':
+
+ 233,066,666 cpu_core/cycles/ (0.43%)
+ 604,097,080 cpu_atom/cycles/ (99.57%)
+
+perf-record:
+------------
+
+If there is no '-e' specified in perf record, on hybrid platform,
+it creates two default 'cycles' and adds them to event list. One
+is for core, the other is for atom.
+
+perf-stat:
+----------
+
+If there is no '-e' specified in perf stat, on hybrid platform,
+besides of software events, following events are created and
+added to event list in order.
+
+cpu_core/cycles/,
+cpu_atom/cycles/,
+cpu_core/instructions/,
+cpu_atom/instructions/,
+cpu_core/branches/,
+cpu_atom/branches/,
+cpu_core/branch-misses/,
+cpu_atom/branch-misses/
+
+Of course, both perf-stat and perf-record support to enable
+hybrid event with a specific pmu.
+
+e.g.
+perf stat -e cpu_core/cycles/
+perf stat -e cpu_atom/cycles/
+perf stat -e cpu_core/r1a/
+perf stat -e cpu_atom/L1-icache-loads/
+perf stat -e cpu_core/cycles/,cpu_atom/instructions/
+perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'
+
+But '{cpu_core/cycles/,cpu_atom/instructions/}' will return
+warning and disable grouping, because the pmus in group are
+not matched (cpu_core vs. cpu_atom).
diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt
index 1b5042f134a8..80c1be5d566c 100644
--- a/tools/perf/Documentation/perf-annotate.txt
+++ b/tools/perf/Documentation/perf-annotate.txt
@@ -124,6 +124,13 @@ OPTIONS
--group::
Show event group information together
+--demangle::
+ Demangle symbol names to human readable form. It's enabled by default,
+ disable with --no-demangle.
+
+--demangle-kernel::
+ Demangle kernel symbol names to human readable form (for C++ kernels).
+
--percent-type::
Set annotation percent type from following choices:
global-period, local-period, global-hits, local-hits
diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index bb167e32a1d7..cd8ce6e8ec12 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -57,7 +57,7 @@ OPTIONS
-u::
--update=::
Update specified file of the cache. Note that this doesn't remove
- older entires since those may be still needed for annotating old
+ older entries since those may be still needed for annotating old
(or remote) perf.data. Only if there is already a cache which has
exactly same build-id, that is replaced by new one. It can be used
to update kallsyms and kernel dso to vmlinux in order to support
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 153bde14bbe0..b0872c801866 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -123,6 +123,7 @@ Given a $HOME/.perfconfig like this:
queue-size = 0
children = true
group = true
+ skip-empty = true
[llvm]
dump-obj = true
@@ -393,6 +394,12 @@ annotate.*::
This option works with tui, stdio2 browsers.
+ annotate.demangle::
+ Demangle symbol names to human readable form. Default is 'true'.
+
+ annotate.demangle_kernel::
+ Demangle kernel symbol names to human readable form. Default is 'true'.
+
hist.*::
hist.percentage::
This option control the way to calculate overhead of filtered entries -
@@ -525,6 +532,10 @@ report.*::
0.07% 0.00% noploop ld-2.15.so [.] strcmp
0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del
+ report.skip-empty::
+ This option can change default stat behavior with empty results.
+ If it's set true, 'perf report --stat' will not show 0 stats.
+
top.*::
top.children::
Same as 'report.children'. So if it is enabled, the output of 'top'
diff --git a/tools/perf/Documentation/perf-data.txt b/tools/perf/Documentation/perf-data.txt
index 726b9bc9e1a7..417bf17e265c 100644
--- a/tools/perf/Documentation/perf-data.txt
+++ b/tools/perf/Documentation/perf-data.txt
@@ -17,7 +17,7 @@ Data file related processing.
COMMANDS
--------
convert::
- Converts perf data file into another format (only CTF [1] format is support by now).
+ Converts perf data file into another format.
It's possible to set data-convert debug variable to get debug messages from conversion,
like:
perf --debug data-convert data convert ...
@@ -27,6 +27,9 @@ OPTIONS for 'convert'
--to-ctf::
Triggers the CTF conversion, specify the path of CTF data directory.
+--to-json::
+ Triggers JSON conversion. Specify the JSON filename to output.
+
--tod::
Convert time to wall clock time.
diff --git a/tools/perf/Documentation/perf-iostat.txt b/tools/perf/Documentation/perf-iostat.txt
new file mode 100644
index 000000000000..165176944031
--- /dev/null
+++ b/tools/perf/Documentation/perf-iostat.txt
@@ -0,0 +1,88 @@
+perf-iostat(1)
+===============
+
+NAME
+----
+perf-iostat - Show I/O performance metrics
+
+SYNOPSIS
+--------
+[verse]
+'perf iostat' list
+'perf iostat' <ports> -- <command> [<options>]
+
+DESCRIPTION
+-----------
+Mode is intended to provide four I/O performance metrics per each PCIe root port:
+
+- Inbound Read - I/O devices below root port read from the host memory, in MB
+
+- Inbound Write - I/O devices below root port write to the host memory, in MB
+
+- Outbound Read - CPU reads from I/O devices below root port, in MB
+
+- Outbound Write - CPU writes to I/O devices below root port, in MB
+
+OPTIONS
+-------
+<command>...::
+ Any command you can specify in a shell.
+
+list::
+ List all PCIe root ports.
+
+<ports>::
+ Select the root ports for monitoring. Comma-separated list is supported.
+
+EXAMPLES
+--------
+
+1. List all PCIe root ports (example for 2-S platform):
+
+ $ perf iostat list
+ S0-uncore_iio_0<0000:00>
+ S1-uncore_iio_0<0000:80>
+ S0-uncore_iio_1<0000:17>
+ S1-uncore_iio_1<0000:85>
+ S0-uncore_iio_2<0000:3a>
+ S1-uncore_iio_2<0000:ae>
+ S0-uncore_iio_3<0000:5d>
+ S1-uncore_iio_3<0000:d7>
+
+2. Collect metrics for all PCIe root ports:
+
+ $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
+ 357708+0 records in
+ 357707+0 records out
+ 375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s
+
+ Performance counter stats for 'system wide':
+
+ port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB)
+ 0000:00 1 0 2 3
+ 0000:80 0 0 0 0
+ 0000:17 352552 43 0 21
+ 0000:85 0 0 0 0
+ 0000:3a 3 0 0 0
+ 0000:ae 0 0 0 0
+ 0000:5d 0 0 0 0
+ 0000:d7 0 0 0 0
+
+3. Collect metrics for comma-separated list of PCIe root ports:
+
+ $ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct
+ 357708+0 records in
+ 357707+0 records out
+ 375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s
+
+ Performance counter stats for 'system wide':
+
+ port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB)
+ 0000:17 358559 44 0 22
+ 0000:3a 3 2 0 0
+
+ 197.081983474 seconds time elapsed
+
+SEE ALSO
+--------
+linkperf:perf-stat[1] \ No newline at end of file
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index f3161c9673e9..d71bac847936 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -695,6 +695,7 @@ measurements:
wait -n ${perf_pid}
exit $?
+include::intel-hybrid.txt[]
SEE ALSO
--------
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index f546b5e9db05..24efc0583c93 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -112,6 +112,8 @@ OPTIONS
- ins_lat: Instruction latency in core cycles. This is the global instruction
latency
- local_ins_lat: Local instruction latency version
+ - p_stage_cyc: On powerpc, this presents the number of cycles spent in a
+ pipeline stage. And currently supported only on powerpc.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
@@ -224,6 +226,9 @@ OPTIONS
--dump-raw-trace::
Dump raw trace in ASCII.
+--disable-order::
+ Disable raw trace ordering.
+
-g::
--call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>::
Display call chains using type, min percent threshold, print limit,
@@ -472,7 +477,7 @@ OPTIONS
but probably we'll make the default not to show the switch-on/off events
on the --group mode and if there is only one event besides the off/on ones,
go straight to the histogram browser, just like 'perf report' with no events
- explicitely specified does.
+ explicitly specified does.
--itrace::
Options for decoding instruction tracing data. The options are:
@@ -566,6 +571,9 @@ include::itrace.txt[]
sampled cycles
'Avg Cycles' - block average sampled cycles
+--skip-empty::
+ Do not print 0 results in the --stat output.
+
include::callchain-overhead-calculation.txt[]
SEE ALSO
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 08a1714494f8..45c2467e4eb2 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -93,6 +93,19 @@ report::
1.102235068 seconds time elapsed
+--bpf-counters::
+ Use BPF programs to aggregate readings from perf_events. This
+ allows multiple perf-stat sessions that are counting the same metric (cycles,
+ instructions, etc.) to share hardware counters.
+ To use BPF programs on common events by default, use
+ "perf config stat.bpf-counter-events=<list_of_events>".
+
+--bpf-attr-map::
+ With option "--bpf-counters", different perf-stat sessions share
+ information about shared BPF programs and maps via a pinned hashmap.
+ Use "--bpf-attr-map" to specify the path of this pinned hashmap.
+ The default path is /sys/fs/bpf/perf_attr_map.
+
ifdef::HAVE_LIBPFM[]
--pfm-events events::
Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
@@ -142,7 +155,10 @@ Do not aggregate counts across all monitored CPUs.
-n::
--null::
- null run - don't start any counters
+null run - Don't start any counters.
+
+This can be useful to measure just elapsed wall-clock time - or to assess the
+raw overhead of perf stat itself, without running any counters.
-v::
--verbose::
@@ -468,6 +484,15 @@ convenient for post processing.
--summary::
Print summary for interval mode (-I).
+--no-csv-summary::
+Don't print 'summary' at the first column for CVS summary output.
+This option must be used with -x and --summary.
+
+This option can be enabled in perf config by setting the variable
+'stat.no-csv-summary'.
+
+$ perf config stat.no-csv-summary=true
+
EXAMPLES
--------
@@ -527,6 +552,8 @@ The fields are in this order:
Additional metrics may be printed with all earlier fields being empty.
+include::intel-hybrid.txt[]
+
SEE ALSO
--------
linkperf:perf-top[1], linkperf:perf-list[1]
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index ee2024691d46..bba5ffb05463 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -317,7 +317,7 @@ Default is to monitor all CPUS.
but probably we'll make the default not to show the switch-on/off events
on the --group mode and if there is only one event besides the off/on ones,
go straight to the histogram browser, just like 'perf top' with no events
- explicitely specified does.
+ explicitly specified does.
--stitch-lbr::
Show callgraph with stitched LBRs, which may have more complete
diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
index c130a3c46a90..9c330cdfa973 100644
--- a/tools/perf/Documentation/perf.txt
+++ b/tools/perf/Documentation/perf.txt
@@ -76,3 +76,15 @@ SEE ALSO
linkperf:perf-stat[1], linkperf:perf-top[1],
linkperf:perf-record[1], linkperf:perf-report[1],
linkperf:perf-list[1]
+
+linkperf:perf-annotate[1],linkperf:perf-archive[1],
+linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
+linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
+linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
+linkperf:perf-evlist[1], linkperf:perf-ftrace[1],
+linkperf:perf-help[1], linkperf:perf-inject[1],
+linkperf:perf-intel-pt[1], linkperf:perf-kallsyms[1],
+linkperf:perf-kmem[1], linkperf:perf-kvm[1], linkperf:perf-lock[1],
+linkperf:perf-mem[1], linkperf:perf-probe[1], linkperf:perf-sched[1],
+linkperf:perf-script[1], linkperf:perf-test[1],
+linkperf:perf-trace[1], linkperf:perf-version[1]
diff --git a/tools/perf/Documentation/topdown.txt b/tools/perf/Documentation/topdown.txt
index 10f07f9455b8..c6302df4cf29 100644
--- a/tools/perf/Documentation/topdown.txt
+++ b/tools/perf/Documentation/topdown.txt
@@ -72,6 +72,7 @@ For example, the perf_event_attr structure can be initialized with
The Fixed counter 3 must be the leader of the group.
#include <linux/perf_event.h>
+#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
@@ -95,6 +96,11 @@ int slots_fd = perf_event_open(&slots, 0, -1, -1, 0);
if (slots_fd < 0)
... error ...
+/* Memory mapping the fd permits _rdpmc calls from userspace */
+void *slots_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, slots_fd, 0);
+if (!slot_p)
+ .... error ...
+
/*
* Open metrics event file descriptor for current task.
* Set slots event as the leader of the group.
@@ -110,6 +116,14 @@ int metrics_fd = perf_event_open(&metrics, 0, -1, slots_fd, 0);
if (metrics_fd < 0)
... error ...
+/* Memory mapping the fd permits _rdpmc calls from userspace */
+void *metrics_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, metrics_fd, 0);
+if (!metrics_p)
+ ... error ...
+
+Note: the file descriptors returned by the perf_event_open calls must be memory
+mapped to permit calls to the _rdpmd instruction. Permission may also be granted
+by writing the /sys/devices/cpu/rdpmc sysfs node.
The RDPMC instruction (or _rdpmc compiler intrinsic) can now be used
to read slots and the topdown metrics at different points of the program:
@@ -141,6 +155,10 @@ as the parallelism and overlap in the CPU program execution will
cause too much measurement inaccuracy. For example instrumenting
individual basic blocks is definitely too fine grained.
+_rdpmc calls should not be mixed with reading the metrics and slots counters
+through system calls, as the kernel will reset these counters after each system
+call.
+
Decoding metrics values
=======================