summaryrefslogtreecommitdiff
path: root/tools/perf/util/evsel.c
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2021-02-22 13:59:43 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2021-02-22 13:59:43 -0800
commit3a36281a17199737b468befb826d4a23eb774445 (patch)
tree18bbdfcb0c30215883809f248d0334b7ea84d5ba /tools/perf/util/evsel.c
parent7c70f3a7488d2fa62d32849d138bf2b8420fe788 (diff)
parent3027ce36ccbae74f2e7c1afbfc3f69fee0c2a996 (diff)
Merge tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo: "New features: - Support instruction latency in 'perf report', with both memory latency (weight) and instruction latency information, users can locate expensive load instructions and understand time spent in different stages. - Extend 'perf c2c' to display the number of loads which were blocked by data or address conflict. - Add 'perf stat' support for L2 topdown events in systems such as Intel's Sapphire rapids server. - Add support for PERF_SAMPLE_CODE_PAGE_SIZE in various tools, as a sort key, for instance: perf report --stdio --sort=comm,symbol,code_page_size - New 'perf daemon' command to run long running sessions while providing a way to control the enablement of events without restarting a traditional 'perf record' session. - Enable counting events for BPF programs in 'perf stat' just like for other targets (tid, cgroup, cpu, etc), e.g.: # perf stat -e ref-cycles,cycles -b 254 -I 1000 1.487903822 115,200 ref-cycles 1.487903822 86,012 cycles 2.489147029 80,560 ref-cycles 2.489147029 73,784 cycles ^C The example above counts 'cycles' and 'ref-cycles' of BPF program of id 254. It is similar to bpftool-prog-profile command, but more flexible. - Support the new layout for PERF_RECORD_MMAP2 to carry the DSO build-id using infrastructure generalised from the eBPF subsystem, removing the need for traversing the perf.data file to collect build-ids at the end of 'perf record' sessions and helping with long running sessions where binaries can get replaced in updates, leading to possible mis-resolution of symbols. - Support filtering by hex address in 'perf script'. - Support DSO filter in 'perf script', like in other perf tools. - Add namespaces support to 'perf inject' - Add support for SDT (Dtrace Style Markers) events on ARM64. perf record: - Fix handling of eventfd() when draining a buffer in 'perf record'. - Improvements to the generation of metadata events for pre-existing threads (mmaps, comm, etc), speeding up the work done at the start of system wide or per CPU 'perf record' sessions. Hardware tracing: - Initial support for tracing KVM with Intel PT. - Intel PT fixes for IPC - Support Intel PT PSB (synchronization packets) events. - Automatically group aux-output events to overcome --filter syntax. - Enable PERF_SAMPLE_DATA_SRC on ARMs SPE. - Update ARM's CoreSight hardware tracing OpenCSD library to v1.0.0. perf annotate TUI: - Fix handling of 'k' ("show line number") hotkey - Fix jump parsing for C++ code. perf probe: - Add protection to avoid endless loop. cgroups: - Avoid reading cgroup mountpoint multiple times, caching it. - Fix handling of cgroup v1/v2 in mixed hierarchy. Symbol resolving: - Add OCaml symbol demangling. - Further fixes for handling PE executables when using perf with Wine and .exe/.dll files. - Fix 'perf unwind' DSO handling. - Resolve symbols against debug file first, to deal with artifacts related to LTO. - Fix gap between kernel end and module start on powerpc. Reporting tools: - The DSO filter shouldn't show samples in unresolved maps. - Improve debuginfod support in various tools. build ids: - Fix 16-byte build ids in 'perf buildid-cache', add a 'perf test' entry for that case. perf test: - Support for PERF_SAMPLE_WEIGHT_STRUCT. - Add test case for PERF_SAMPLE_CODE_PAGE_SIZE. - Shell based tests for 'perf daemon's commands ('start', 'stop, 'reconfig', 'list', etc). - ARM cs-etm 'perf test' fixes. - Add parse-metric memory bandwidth testcase. Compiler related: - Fix 'perf probe' kretprobe issue caused by gcc 11 bug when used with -fpatchable-function-entry. - Fix ARM64 build with gcc 11's -Wformat-overflow. - Fix unaligned access in sample parsing test. - Fix printf conversion specifier for IP addresses on arm64, s390 and powerpc. Arch specific: - Support exposing Performance Monitor Counter SPRs as part of extended regs on powerpc. - Add JSON 'perf stat' metrics for ARM64's imx8mp, imx8mq and imx8mn DDR, fix imx8mm ones. - Fix common and uarch events for ARM64's A76 and Ampere eMag" * tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (148 commits) perf buildid-cache: Don't skip 16-byte build-ids perf buildid-cache: Add test for 16-byte build-id perf symbol: Remove redundant libbfd checks perf test: Output the sub testing result in cs-etm perf test: Suppress logs in cs-etm testing perf tools: Fix arm64 build error with gcc-11 perf intel-pt: Add documentation for tracing virtual machines perf intel-pt: Split VM-Entry and VM-Exit branches perf intel-pt: Adjust sample flags for VM-Exit perf intel-pt: Allow for a guest kernel address filter perf intel-pt: Support decoding of guest kernel perf machine: Factor out machine__idle_thread() perf machine: Factor out machines__find_guest() perf intel-pt: Amend decoder to track the NR flag perf intel-pt: Retain the last PIP packet payload as is perf intel_pt: Add vmlaunch and vmresume as branches perf script: Add branch types for VM-Entry and VM-Exit perf auxtrace: Automatically group aux-output events perf test: Fix unaligned access in sample parsing test perf tools: Support arch specific PERF_SAMPLE_WEIGHT_STRUCT processing ...
Diffstat (limited to 'tools/perf/util/evsel.c')
-rw-r--r--tools/perf/util/evsel.c63
1 files changed, 55 insertions, 8 deletions
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c26ea82220bd..1bf76864c4f2 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -25,6 +25,7 @@
#include <stdlib.h>
#include <perf/evsel.h>
#include "asm/bug.h"
+#include "bpf_counter.h"
#include "callchain.h"
#include "cgroup.h"
#include "counts.h"
@@ -247,6 +248,7 @@ void evsel__init(struct evsel *evsel,
evsel->bpf_obj = NULL;
evsel->bpf_fd = -1;
INIT_LIST_HEAD(&evsel->config_terms);
+ INIT_LIST_HEAD(&evsel->bpf_counter_list);
perf_evsel__object.init(evsel);
evsel->sample_size = __evsel__sample_size(attr->sample_type);
evsel__calc_id_pos(evsel);
@@ -1012,6 +1014,11 @@ struct evsel_config_term *__evsel__get_config_term(struct evsel *evsel, enum evs
return found_term;
}
+void __weak arch_evsel__set_sample_weight(struct evsel *evsel)
+{
+ evsel__set_sample_bit(evsel, WEIGHT);
+}
+
/*
* The enable_on_exec/disabled value strategy:
*
@@ -1166,12 +1173,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
}
if (opts->sample_weight)
- evsel__set_sample_bit(evsel, WEIGHT);
+ arch_evsel__set_sample_weight(evsel);
+
+ attr->task = track;
+ attr->mmap = track;
+ attr->mmap2 = track && !perf_missing_features.mmap2;
+ attr->comm = track;
+ attr->build_id = track && opts->build_id;
- attr->task = track;
- attr->mmap = track;
- attr->mmap2 = track && !perf_missing_features.mmap2;
- attr->comm = track;
/*
* ksymbol is tracked separately with text poke because it needs to be
* system wide and enabled immediately.
@@ -1191,6 +1200,9 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
if (opts->sample_data_page_size)
evsel__set_sample_bit(evsel, DATA_PAGE_SIZE);
+ if (opts->sample_code_page_size)
+ evsel__set_sample_bit(evsel, CODE_PAGE_SIZE);
+
if (opts->record_switch_events)
attr->context_switch = track;
@@ -1366,6 +1378,7 @@ void evsel__exit(struct evsel *evsel)
{
assert(list_empty(&evsel->core.node));
assert(evsel->evlist == NULL);
+ bpf_counter__destroy(evsel);
evsel__free_counts(evsel);
perf_evsel__free_fd(&evsel->core);
perf_evsel__free_id(&evsel->core);
@@ -1735,6 +1748,10 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
}
fallback_missing_features:
+ if (perf_missing_features.weight_struct) {
+ evsel__set_sample_bit(evsel, WEIGHT);
+ evsel__reset_sample_bit(evsel, WEIGHT_STRUCT);
+ }
if (perf_missing_features.clockid_wrong)
evsel->core.attr.clockid = CLOCK_MONOTONIC; /* should always work */
if (perf_missing_features.clockid) {
@@ -1781,6 +1798,8 @@ retry_open:
FD(evsel, cpu, thread) = fd;
+ bpf_counter__install_pe(evsel, cpu, fd);
+
if (unlikely(test_attr__enabled)) {
test_attr__open(&evsel->core.attr, pid, cpus->map[cpu],
fd, group_fd, flags);
@@ -1873,7 +1892,17 @@ try_fallback:
* Must probe features in the order they were added to the
* perf_event_attr interface.
*/
- if (!perf_missing_features.data_page_size &&
+ if (!perf_missing_features.weight_struct &&
+ (evsel->core.attr.sample_type & PERF_SAMPLE_WEIGHT_STRUCT)) {
+ perf_missing_features.weight_struct = true;
+ pr_debug2("switching off weight struct support\n");
+ goto fallback_missing_features;
+ } else if (!perf_missing_features.code_page_size &&
+ (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE)) {
+ perf_missing_features.code_page_size = true;
+ pr_debug2_peo("Kernel has no PERF_SAMPLE_CODE_PAGE_SIZE support, bailing out\n");
+ goto out_close;
+ } else if (!perf_missing_features.data_page_size &&
(evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE)) {
perf_missing_features.data_page_size = true;
pr_debug2_peo("Kernel has no PERF_SAMPLE_DATA_PAGE_SIZE support, bailing out\n");
@@ -2076,6 +2105,13 @@ perf_event__check_size(union perf_event *event, unsigned int sample_size)
return 0;
}
+void __weak arch_perf_parse_sample_weight(struct perf_sample *data,
+ const __u64 *array,
+ u64 type __maybe_unused)
+{
+ data->weight = *array;
+}
+
int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
struct perf_sample *data)
{
@@ -2316,9 +2352,9 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
}
}
- if (type & PERF_SAMPLE_WEIGHT) {
+ if (type & PERF_SAMPLE_WEIGHT_TYPE) {
OVERFLOW_CHECK_u64(array);
- data->weight = *array;
+ arch_perf_parse_sample_weight(data, array, type);
array++;
}
@@ -2369,6 +2405,12 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
array++;
}
+ data->code_page_size = 0;
+ if (type & PERF_SAMPLE_CODE_PAGE_SIZE) {
+ data->code_page_size = *array;
+ array++;
+ }
+
if (type & PERF_SAMPLE_AUX) {
OVERFLOW_CHECK_u64(array);
sz = *array++;
@@ -2678,6 +2720,8 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
"We found oprofile daemon running, please stop it and try again.");
break;
case EINVAL:
+ if (evsel->core.attr.sample_type & PERF_SAMPLE_CODE_PAGE_SIZE && perf_missing_features.code_page_size)
+ return scnprintf(msg, size, "Asking for the code page size isn't supported by this kernel.");
if (evsel->core.attr.sample_type & PERF_SAMPLE_DATA_PAGE_SIZE && perf_missing_features.data_page_size)
return scnprintf(msg, size, "Asking for the data page size isn't supported by this kernel.");
if (evsel->core.attr.write_backward && perf_missing_features.write_backward)
@@ -2689,6 +2733,9 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
if (perf_missing_features.aux_output)
return scnprintf(msg, size, "The 'aux_output' feature is not supported, update the kernel.");
break;
+ case ENODATA:
+ return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. "
+ "Please add an auxiliary event in front of the load latency event.");
default:
break;
}