|
With multiple devices, we may have several perf_pmu instances, each with
its own type, so we need to find the right one for the job.
The tests are run with a specific fd, from which we can extract the
appropriate bus-id and find the associated perf-type. The performance
monitoring tools are a little more general and not yet ready to probe
all devices or bind to one in particular, so we just assume the default
igfx for the time being.
v2: Extract the bus address from sysfs
v3: A new name for a new decade!
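As a rough illustration of the lookup described above (not the code from
this patch), the perf type of a PMU can be read from its sysfs "type"
file. The helper names read_pmu_type() and pmu_type_path() are
hypothetical, and the device name is left to the caller; "i915" is
assumed here to be the name of the default igfx PMU.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the path of the sysfs "type" file for a
 * named PMU. Returns the path length, or -1 if it did not fit in buf.
 */
static int pmu_type_path(char *buf, size_t len, const char *device)
{
	int ret;

	assert(buf && device);

	ret = snprintf(buf, len, "/sys/bus/event_source/devices/%s/type",
		       device);
	return (ret < 0 || (size_t)ret >= len) ? -1 : ret;
}

/*
 * Hypothetical helper: read the numeric perf type from a "type" file.
 * Returns the type, or -1 if the file is missing or malformed.
 */
static long read_pmu_type(const char *path)
{
	FILE *file;
	long type = -1;

	assert(path);

	file = fopen(path, "r");
	if (!file)
		return -1;

	if (fscanf(file, "%ld", &type) != 1 || type < 0)
		type = -1;

	fclose(file);
	return type;
}
```

A caller would pass the device name to pmu_type_path() and feed the
result to read_pmu_type(); the returned value is what goes into the type
field of the perf event attributes.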
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Robert M. Fosha" <robert.m.fosha@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: "Robert M. Fosha" <robert.m.fosha@intel.com> #v2
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
intel-gpu-top is a dangerous tool which can hang machines due to unsafe
MMIO register access. This patch rewrites it to use only the PMU.
Only overall command streamer busyness and GPU global data such as power
and frequencies are included in this new version.
For access to more GPU functional unit level data, an OA metric based tool
like gpu-top should be used instead.
v2:
* Sort engines by class and instance.
* Do not wait for one sampling period to display something on screen.
* Move code out of the asserts. (Rinat Ibragimov)
* Continuously adapt to terminal size. (Rinat Ibragimov)
v3:
* Change layout and precision of some fields. (Chris Wilson)
Eero Tamminen:
* Use more user friendly engine names.
* Don't error out if a counter is missing.
* Add IMC read/write bandwidth.
* Report minimum required kernel version.
v4:
* Really support 4.16 by skipping missing engines.
* Simpler and less hacky float printing.
* Preserve copyright header. (Antonio Argenziano)
* Simplify engines_ptr macro. (Rinat Ibragimov)
v5:
* Get RAPL unit from sysfs.
* Consolidate sysfs paths with a macro.
* Tidy error handling by carrying over and reporting errno.
* Check against console height on all prints.
* More readable minimum kernel version message. (Eero Tamminen)
* Column banner for per engine stats. (Eero Tamminen)
v6:
* Man page update. (Eero Tamminen)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Rinat Ibragimov <ibragimovrinat@mail.ru>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v0.5
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Taken from drm-tip:
commit 1e6aa7e55c28ecd842b8b4599e4273c2429ee061
Author: Jani Nikula <jani.nikula@intel.com>
Date: Tue Mar 6 12:41:55 2018 +0200
drm/i915/icl: do not save DDI A/E sharing bit for ICL
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Since i915 PMU is removing separate RC6 counters and now aggregates all
under a single one, catch up the test and intel-gpu-overlay with those
changes.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A bunch of tests for the new i915 PMU feature.
Parts of the code were initially sketched by Dmitry Rogozhkin.
v2: (Most suggestions by Chris Wilson)
* Add new class/instance based engine list.
* Add gem_has_engine/gem_require_engine to work with class/instance.
* Use the above two throughout the test.
* Shorten tests to 100ms busy batches, seems enough.
* Add queued counter sanity checks.
* Use igt_nsec_elapsed.
* Skip on perf -ENODEV in some tests instead of embedding knowledge locally.
* Fix multi ordering for busy accounting.
* Use new guranteed_usleep when sleep time is asserted on.
* Check for no queued when idle/busy.
* Add queued counter init test.
* Add queued tests.
* Consolidate and increase multiple busy engines tests to most-busy and
all-busy tests.
* Guarantee interrupts by using fences.
* Test RC6 via forcewake.
v3:
* Tweak assert in interrupts subtest.
* Sprinkle of comments.
* Fix multi-client test which got broken in v2.
v4:
* Measured instead of guaranteed sleep.
* Missing sync in no_sema.
* Log busyness before asserts for debug.
* access(2) instead of open(2) to determine if cpu0 is hotpluggable.
* Test frequency reporting via min/max setting instead of assuming.
^^ All above suggested by Chris Wilson. ^^
* Drop queued subtests to match i915.
* Use long batches with fences to ensure interrupts.
* Test render node as well.
v5:
* Add to meson build. (Petri Latvala)
* Use 1eN constants. (Chris Wilson)
* Add tests for semaphore and event waiting.
v6:
* Fix interrupts subtest by polling the fence from the "outside".
(Chris Wilson)
v7:
* Assert number of initialized engines matches the expectation.
(Chris Wilson)
* Warn instead of skipping if we couldn't restore the initial
frequency. (Chris Wilson)
* Move all asserts to after the test cleanup (just a tidy).
* More 1eN notation for timeouts.
* Bump the tolerance to 5% since I saw a few noisy runs with
sampling counters.
* Always start the PMU before submitting batches to lower
reliance on i915 doing the delayed engine busy stats disable.
v8:
* Update for upstream engine class enum.
v9:
* Add meson build support.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Wire up to the RAPL PMU for GPU energy readings.
The only complication is that we have to add code to parse:
# cat /sys/devices/power/events/energy-gpu.scale
2.3283064365386962890625e-10
v2: Link with -lm.
v3: strtod can handle scientific notation, even though my initial
reading of the man page did not spot that. (Chris Wilson)
v4: Meson fix.
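As the v3 note says, strtod copes with scientific notation, so the scale
file needs no custom parser. A minimal sketch of that parsing follows;
parse_scale() is a hypothetical name, not the function added by this
patch, and the trailing-whitespace handling is an assumption about how
the sysfs contents are passed in.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a scale string such as the contents of
 * energy-gpu.scale ("2.3283064365386962890625e-10\n").
 * Returns 0 and stores the value on success, -1 on malformed input.
 */
static int parse_scale(const char *text, double *scale)
{
	char *end;
	double val;

	assert(text && scale);

	errno = 0;
	val = strtod(text, &end);
	if (end == text || errno == ERANGE)
		return -1; /* nothing parsed, or value out of range */

	/* sysfs files end with a newline; tolerate trailing whitespace. */
	while (*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if (*end != '\0')
		return -1; /* trailing garbage after the number */

	*scale = val;
	return 0;
}
```

The raw energy counter would then be multiplied by the parsed scale to
obtain a reading in the unit reported by the PMU.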
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
v2: Update for i915 changes.
v3: Use 1eN for large numbers. (Chris Wilson)
v4: Update for upstream engine class enum.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Configuration and format are uint64_t in the perf API.
Tidy some other details as well.
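For reference, a sketch of an open wrapper with the widths mentioned
above: config and read_format are __u64 fields of struct perf_event_attr,
so the wrapper takes them as uint64_t. The name pmu_open() and its
parameter order are hypothetical and do not claim to match the library
function; opening system-wide on CPU 0 is likewise an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/*
 * Hypothetical wrapper around the perf_event_open syscall.
 * Returns a perf event fd, or -1 with errno set on failure.
 */
static int pmu_open(uint32_t type, uint64_t config, uint64_t read_format,
		    int group_fd)
{
	struct perf_event_attr attr;

	assert(group_fd >= -1);

	memset(&attr, 0, sizeof(attr));
	attr.type = type;
	attr.size = sizeof(attr);
	attr.config = config;           /* __u64 in the kernel ABI */
	attr.read_format = read_format; /* __u64 in the kernel ABI */

	/* pid = -1, cpu = 0: count system-wide on CPU 0. */
	return (int)syscall(__NR_perf_event_open, &attr, -1, 0, group_fd, 0);
}
```

Passing -1 as group_fd opens a standalone event (or a group leader);
passing a leader's fd adds the event to that group.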
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Various tool modules implement their own PMU open wrapper, which
can be replaced by calling the library one.
v2:
* Remove extra newline. (Chris Wilson)
* Commit msg.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The idea is to avoid duplication across multiple users in
upcoming patches.
v2: Commit message and use a separate library instead of piggy-
backing to libintel_tools. (Chris Wilson)
v3: Add Petri's meson build recipe.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|