Age | Commit message (Collapse) | Author |
|
When using the pollable spinner, we often want to use it as a means of
ensuring the task is running on the GPU before switching to something
else. In which case we don't want to add extra delay inside the spinner,
but the current 1000 NOPs add on order of 5us, which is often larger
than the target latency.
v2: Don't change perf_pmu as that is sensitive to the extra CPU latency
from a tight GPU spinner.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com> #v1
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
In order to make adding more options easier, expose the full set of
options to the caller.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Katarzyna Dec <katarzyna.dec@intel.com>
|
|
We want to make sure RT tasks which use a lot of CPU times can submit
batch buffers with roughly the same latency (and certainly not worse)
compared to normal tasks.
v2: Add tests to run across all engines simultaneously to encourage
ksoftirqd to kick in even more often.
v3: More passes to improve measurement stability.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Some tests measure the render's ring size but are actually meant to
measure the smallest across all engines. This patch adds measuring the
smallest size in gem_measure_ring_size_inflight() given the appropriate
parameter.
v2:
- Only expose high level API. (Chris)
v3:
- Use ALL_ENGINES macro.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We current have a single for_each_engine() iterator which we use to
generate both a set of uABI engines and a set of physical engines.
Determining what uABI ring-id corresponds to an actual HW engine is
tricky, so pull that out to a library function and introduce
for_each_physical_engine() for cases where we want to issue requests
once on each HW ring (avoiding aliasing issues).
v2: Remember can_store_dword for gem_sync
v3: Find more open-coded for_each_physical
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
With intel_measure_ring_size and igt_cork added as common utilities we
can use them instead of the local copy of those utilities
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This patch adds a context creation ioctl wrapper that returns the error
for the caller to consume. Multiple tests that implemented this already,
have been changed to use the new library function.
v2:
- Add gem_require_contexts() to check for contexts support (Chris)
v3:
- Add gem_has_contexts to check for contexts support and change
gem_require_contexts to skip if contests support is not available.
(Chris)
v4:
- Cosmetic changes and use lib function in gem_ctx_create where
possible. (Michal)
v5:
- Use gem_contexts_require() in tests and fixtures. (Chris)
Signed-off-by: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For measuring the cost of preemption, inject a low priority spinner
between the two measurements; the difference between the preemption and
the normal dispatch includes both the cost of the spinner dispatch and
of preempting it. Close enough for us to estimate the cost of
preemption, though we don't measure the cost of preemption on the local
ring!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
CC: Michał Winiarski <michal.winiarski@intel.com>
|
|
Since I accidentally broke the build for some, by putting the pretty
printer for submission inside ifdef HAVE_PROCPS, it's time to move the
whole thing into lib/i915 while fixing this mistake.
Let's also rename the pretty printer and add a doc to it as well as the
section.
Fixes: f6dfe556659f ("lib: Extract helpers for determining submission method")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Couple of tests are using either determining submission method, or
pretty printing. Let's move those to helpers in lib.
v2: s/igt_show/gem_show
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Katarzyna Dec <katarzyna.dec@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Katarzyna Dec <katarzyna.dec@intel.com>
Acked-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 25fbae15262cf570e207e62f50e7c5233e06bc67, restoring
commit 301ad44cdf1b868b1ab89096721da91fa8541fdc
Author: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Date: Thu Mar 2 10:37:11 2017 +0100
lib: Open debugfs files for the given DRM device
with fixes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 301ad44cdf1b868b1ab89096721da91fa8541fdc.
When a render-only device is opened and gem_quiescent_gpu is called, we
need to use the debugfs dir for the master device instead.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|
|
When opening a DRM debugfs file, locate the right path based on the
given DRM device FD.
This is needed so, in setups with more than one DRM device, any
operations on debugfs files affect the expected DRM device.
v2: - rebased and fixed new API additions
v3: - updated chamelium test, which was missed previously
- use the minor of the device for the debugfs path, not the major
- have a proper exit handler for calling igt_hpd_storm_reset with the
right device fd.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Measure the ring size and limit the number of our batches to avoid
stalling (and livelocks) by waiting upon queued batches.
References: https://bugs.freedesktop.org/show_bug.cgi?id=99910
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Force the first write to be queued and so exercise a different exec
path.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Similar to benchmarks/gem_latency, but looking more at the dispatch cost
rather than wakeup cost, and looking for inter-engine costs. Still
probably better as a perf test.
ivb over the years:
IGT-Version: 1.16-gebee919 (x86_64) (Linux: 3.10-3-amd64 x86_64)
render: dispatch latency: 50.90, execution latency: 57.29 (target 4.64)
bsd: dispatch latency: 41.45, execution latency: 41.43 (target 4.75)
blt: dispatch latency: 41.02, execution latency: 41.00 (target 4.99)
IGT-Version: 1.16-gebee919 (x86_64) (Linux: 4.8.0-rc5+ x86_64)
render: dispatch latency: 12.61, execution latency: 15.44 (target 1.71)
bsd: dispatch latency: 12.08, execution latency: 12.07 (target 1.80)
blt: dispatch latency: 12.59, execution latency: 12.58 (target 1.85)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|