|
After closing the perf stream, parking the GPU engines may easily
take more than 1 second: releasing the FD itself results in a new
request submission via i915_perf_release()->i915_oa_stream_destroy()->
gen8_disable_metric_set(). That means a >1sec delay for the delayed
unpark to be called due to the delay from
queue_delayed_work(retire_work, round_jiffies_up_relative(HZ))
+ the delay from
mod_delayed_work(idle_work, msecs_to_jiffies(100))
Scheduling may push this delay even further, I measured >2sec delays on
my GLK.
Fix this by calling gem_quiescent_gpu() which syncs up with the idle
work, thus making sure we'll see RC6 residency afterwards.
v2:
- Use gem_quiescent_gpu() instead of increasing the timeout. (Chris)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103179
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Similar to sysfs_path - more explicit more better.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Doing this lets us avoid drm_get_card, which we plan to remove
eventually.
v2: Don't break the test
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Those 2 tests fail regularly on HSW, probably because the OA period
aligns slightly differently there because of the difference in the
timestamp frequency between HSW and other generations. Just bump the
max number by 1 to fix the issue.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102252
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We want to allow bpp = 8 or 16, so make sure we set the bpp
in igt_buf. This way we can extend rendercopy to support
other values for bpp.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[mlankhorst: Fix double ;; (Ville)]
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
|
|
It makes the tests more reliable because the expected number of
reports is more accurate (given that we'll have almost no
context-switch reports).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
The behavior of the OA unit is a tiny bit different on ICL. It
appears to be a bit sloppier on the timings of its OA reports (missing
the deadline by one period quite often). Let's add an acceptance delta.
v2: Use larger acceptance delta only on ICL (José)
Tweak indentation (José)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
|
|
Remove gem.has_ppgtt as the information is no longer used.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Store a bit of aux surface state in igt_buf. This will be needed
for rendercopy AUX_CCS_E color compression.
We also have to sprinkle memset()s and whatnot all over to make
sure the current igt_buf users don't leave the aux stuff full
of stack garbage.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
If the GPU is not usable, we will not be able to submit workloads to be
measured and so observing them will fail.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
This makes the perf tests run on Icelake.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
We don't expect to access those registers on Braswell.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105593
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Much like the enable-disable subtest, we're printing a bunch of values
that were meant to try to figure out the issue of the OA unit not
producing reports. After fixing the i915 driver with :
https://patchwork.freedesktop.org/series/39112/
We don't need those values anymore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We're printing a bunch of values that were meant to try to figure out
the issue of the OA unit not producing reports. After fixing the i915
driver with :
https://patchwork.freedesktop.org/series/39112/
We don't need those values anymore. It turns out the issue was simply
a race condition in the driver.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Move variable declaration to top of scope to avoid C90 build warning.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
The previous patch said :
"verify that the time is always longer or equal to the period we've
asked for"
This is an obvious error, it only worked on my machine and the CI
because only one longer period was observed. But another CI run caught
the issue :
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_4280/shard-glkb6/igt@perf@oa-exponents.html
Fixes: c3d11ca104fa ("tests/perf: make oa-exponents subtest more reliable")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We know the OA unit might skip some reports from time to time (reasons
include pressure on memory controller, power management, ...). So
rather than checking that the time between periodic reports is about
the period we asked for, let's verify that the time is always longer
or equal to the period we've asked for.
We still have to leave some room for errors. Here is a dump of an error
in this updated test :
(perf:405) DEBUG: report0019 ts=e217de20 hw_id=0x00000014 delta=64
(perf:405) DEBUG: report0020 ts=e217de60 hw_id=0x00000014 delta=64
(perf:405) DEBUG: report0021 ts=e217dea0 hw_id=0x00000014 delta=64
(perf:405) DEBUG: report0022 ts=e217df66 hw_id=0x00000014 delta=198 ******
(perf:405) DEBUG: report0023 ts=e217dfa0 hw_id=0x00000014 delta=58 ******
(perf:405) DEBUG: report0024 ts=e217dfe0 hw_id=0x00000014 delta=64
(perf:405) DEBUG: report0025 ts=e217e020 hw_id=0x00000014 delta=64
(perf:405) DEBUG: report0026 ts=e217e060 hw_id=0x00000014 delta=64
As you can see there is a discrepancy in the periodic reports. I have
no explanation for it. This isn't a programming error since the same
context has correct periods before and after, so it must be some kind
of hardware glitch/corner-case that hasn't been documented.
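
The deltas in the dump above can be recomputed from the raw timestamps. The following is only a minimal sketch, assuming 32-bit timestamps that may wrap; count_off_period() and dump_ts are hypothetical names for illustration, not code from tests/perf.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from tests/perf.c): count how many consecutive
 * timestamp deltas differ from the expected period. Unsigned 32-bit
 * subtraction keeps each delta correct across a timestamp wraparound.
 */
static size_t count_off_period(const uint32_t *ts, size_t n, uint32_t period)
{
	size_t i, off = 0;

	assert(ts != NULL && n >= 2);

	for (i = 1; i < n; i++) {
		uint32_t delta = ts[i] - ts[i - 1];

		if (delta != period)
			off++;
	}

	return off;
}

/* Timestamps of report0019..report0026, copied from the dump above. */
static const uint32_t dump_ts[] = {
	0xe217de20, 0xe217de60, 0xe217dea0, 0xe217df66,
	0xe217dfa0, 0xe217dfe0, 0xe217e020, 0xe217e060,
};
```

With a period of 64 ticks, count_off_period(dump_ts, 8, 64) flags the two marked reports. Their deltas add up to 198 + 58 = 256, i.e. four periods, so the reports that follow the glitch are back on the original 64-tick grid.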
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Testcase: igt/perf/buffer-fill & igt/perf/enable-disable & igt/perf/gen8-unprivileged-single-ctx-counters
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104658
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We mostly run tests on the most recent kernels but those are failing
on < 4.14.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
On Haswell, at least, MI_REPORT_PERF_COUNT is not flushed by the
PIPECONTROL surrounding the batch. (In theory, before the breadcrumb is
updated the CPU's view of memory is coherent with the GPU, i.e. all
writes have landed and are visible to userspace. This does not appear to
be the case for MI_REPORT_PERF_COUNT.) This makes it an unreliable
method for querying the timestamp, so use MI_STORE_REGISTER_MEM instead.
Testcase: igt/perf/oa-exponents
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
As igt_sysfs exists to provide convenience routines for parsing files
found in the device's sysfs dir, use it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
This will enable running the tests on Cannonlake.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Add the test config uuid for GT3.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
On Cannonlake+ the CS timestamp frequency might vary from one part to
another. We have a new param to query this from the kernel (which
reads the value from registers).
v2: Skip the tests when timestamp frequency cannot be read on CNL+ (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We use this value in several places.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Now that we have drm uapi headers in tree, we can drop this stuff.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Data types are defined differently on 32-bit systems, causing gcc to
complain about printf format specifiers not matching the size of the
variables passed in. Use PRIu64 and %zu where appropriate.
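
As an illustration of the kind of change involved, here is a minimal sketch; format_report_info() is a hypothetical helper, not a function from the test:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit value and a size_t so that the
 * format string matches the argument types on both 32-bit and 64-bit
 * systems. Returns whatever snprintf() returns.
 */
static int format_report_info(char *buf, size_t len, uint64_t ts, size_t size)
{
	assert(buf != NULL);

	/* PRIu64 expands to the correct conversion for uint64_t. */
	return snprintf(buf, len, "timestamp=%" PRIu64 " size=%zu", ts, size);
}
```

Hard-coding %lu or %llu for these arguments would match the types on some architectures only, which is what triggers the gcc warnings.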
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
The I915_OA_FORMAT_C4_B8 format has a different offset on Haswell &
Gen8. Let's split the format lists so we don't mix them.
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Using the same timestamp frequency as Skylake/Kabylake.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Some of our tests measure that the OA unit produces reports at
expected time intervals (as configured through the PERF_OPEN
ioctl). It turns out the power management plays a role in the decision
of the OA unit to write reports to memory. Under normal circumstances
we don't really mind if the unit misses one report here or there, but
for our tests it makes it pretty difficult to verify whether we've made a
mistake in the configuration.
To work around this, let's prevent power management from kicking in by
holding /dev/cpu_dma_latency open for the following tests :
- enable-disable
- blocking
- polling
- buffer-fill
- oa-exponents
Many thanks to Chris Wilson for suggesting this!
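
The mechanism can be sketched as follows. This is a minimal sketch and not the code added to tests/perf.c; hold_cpu_dma_latency() is a hypothetical name, and the path is a parameter only to keep the helper easy to exercise:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Minimal sketch, not the code from tests/perf.c: request a maximum CPU
 * wakeup latency (in microseconds) through the PM QoS character device.
 * The request only stays active while the returned fd is open, so the
 * caller keeps it for the duration of the test and close()s it afterwards.
 * Returns the fd, or -1 on failure (e.g. insufficient privileges).
 */
static int hold_cpu_dma_latency(const char *path, int32_t max_latency_us)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	if (write(fd, &max_latency_us, sizeof(max_latency_us)) !=
	    (ssize_t)sizeof(max_latency_us)) {
		close(fd);
		return -1;
	}

	return fd;
}
```

A test would call hold_cpu_dma_latency("/dev/cpu_dma_latency", 0) before opening the OA stream and close() the returned fd once it is done.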
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Blocking & polling tests define an amount of time to spend in the test
and then estimate the number of syscalls that should successfully
return. The problem is that while running the test we might spend
slightly more time than initially planned. This change estimates the
number of syscalls based on time spent after the fact.
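
The idea can be sketched as below. This is only an illustration under assumed names; expected_wakeups() is hypothetical and the start/end values would come from a monotonic clock sampled around the test loop:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of periodic reports, and therefore of
 * successful blocking reads or polls, that fit in the time the test
 * actually ran, as opposed to the time it planned to run.
 */
static uint64_t expected_wakeups(uint64_t start_ns, uint64_t end_ns,
				 uint64_t period_ns)
{
	assert(period_ns > 0 && end_ns >= start_ns);

	return (end_ns - start_ns) / period_ns;
}
```

For example, with made-up numbers: a test planned for 600 ms with a 40 ms period would expect 15 wakeups, but if the loop really took 650 ms the estimate becomes 16.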
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
The filling rate of the buffer must discard context switch reports as they
do not depend upon the periodicity; instead they depend on the
number of different applications concurrently running on the system.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Estimation of the number of reports can only refer to periodic ones,
as context switch reports completely depend on what happens on the
system. Also generate some load to prevent clock frequency changes from
impacting our measurement.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
New issues that were discovered while making the tests work on Gen8+ :
- we need to measure timings between periodic reports and discard all
other kinds of reports
- it seems periodicity of the reports can be affected outside of RC6
(frequency change), we can detect this by looking at the number of
clock cycles per timestamp delta
v2: Drop some unused variables (Matthew)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Experience shows that most of the issues we face with periodicity of
the reports produced by the OA unit are related to power management,
not frequency.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Make it clear that we're using a 16MB buffer.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
When debugging unstable tests on new platforms we currently don't
clean up everything well in between different tests. Since only a
single OA stream fd can be opened at a time, having the stream_fd as a
global variable helps us clean up the state between tests.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
We don't have flex EU counters on HSW, so don't try to program
them.
Reported-by: CI \o/
Fixes: 609cb5e30b4 ("tests/perf: add tests to verify create/destroy userspace configs")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: adcde8ac ("tests/perf: fix build where system headers don't have Gen8 formats")
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
|
|
v2: Use previous enum to define the new Gen8 enums (Petri)
v3: Duh! (Lionel)
v4: Redefine MAX oa formats value (Daniel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Daniel Stone <daniels@collabora.com>
|
|
v2: Add tests regarding removing configs (Matthew)
Add tests regarding adding/removing configs without permissions
(Matthew)
v3: Add some flex registers (Matthew)
v4: memset oa_config to 0 (Lionel)
Change error code for removing nonexistent config EINVAL->ENOENT (Lionel)
v5: Update i915 uapi (Chris)
Use wrappers to make assertions more readable (Chris)
v6: Add whitelisting test (Lionel)
v7: Add wrapper function for removing configs (Matthew)
Fix an unfinished comment (Matthew)
v8: Add EFAULT check (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
|
|
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|