path: root/tests/i915/gem_ctx_persistence.c
Age | Commit message | Author
2022-02-28 | lib/igt_dummyload: Drop ahnd from igt_spin_t | Ashutosh Dixit
In 4d9396e67930 we have started storing the opts with which the spin was created as part of igt_spin_t. The ahnd stored as part of igt_spin_t is therefore redundant. We can get ahnd from opts.ahnd. Cc: Zbigniew Kempczynski <zbigniew.kempczynski@intel.com> Cc: Jasmine Newsome <jasmine.newsome@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
2022-02-07 | i915/tests: Pass right ctx id into igt_allow_hang() | Chuansheng Liu
As in commit 74fc362b425c (i915/gem_busy: Prevent context ban with right ctx id), some code passes the constant ctx id 0 into igt_allow_hang(), which may cause test failures. Pass the right ctx id in the following tests: tests/i915/prime_busy, tests/i915/gem_ctx_persistence, tests/i915/gem_exec_schedule, tests/i915/gem_wait. Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
2021-10-25 | tests/gem_ctx_persistence: Update saturated_hostile for dependent engine resets | John Harrison
The gem_ctx_persistence test in general has support for twiddling the scheduling parameters via sysfs to tune timeouts. For some reason, this was not being applied to the saturated_hostile test. The test was also broken for platforms with dependent engine resets. The test submits requests to all engines, kills one and expects the rest to survive. However, the other engine requests were all marked as not pre-emptible. On recent platforms, there is a reset dependency across RCS and CCS engines. That is, if one of those engines is reset then all engines must be reset. If a context executing on one of those engines does not pre-empt first then it will be killed. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: Arjun Melkaveri <arjun.melkaveri@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2021-08-13 | tests/gem_ctx_persistence: Adopt to use allocator | Zbigniew Kempczyński
For newer gens we're not able to rely on relocations. The change mostly touches spinner creation, where an allocator handle is now mandatory on gens where relocations are disabled. Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Cc: Petri Latvala <petri.latvala@intel.com> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
2021-07-16 | i915: Improve the precision of command parser checks | Jason Ekstrand
The previous gem_has_cmdparser helper took an engine and did nothing with it. We delete the engine parameter and use the general helper for the ALL_ENGINES cases. For cases where we really do care about something more precise, we add a version which takes an intel_ctx_cfg_t and an engine specifier and is able to say whether or not that particular engine has the command parser enabled. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Lakshminarayana Vudum <lakshminarayana.vudum@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-07-16 | tests/i915/gem_ctx_persistence: Use intel_ctx_t for hang subtests | Jason Ekstrand
We need this for proper cmdparser detection Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2021-07-08 | tests/i915/gem_ctx_persistence: Convert to intel_ctx_t | Jason Ekstrand
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-07-08 | tests/i915/gem_ctx_persistence: Drop the engine replace subtests | Jason Ekstrand
We're going to start disallowing non-trivial uses of setparam for engines precisely to make races like this impossible. It'll also make these test cases invalid. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-07-08 | tests/i915/gem_ctx_persistence: Drop the clone subtest | Jason Ekstrand
The entire CONTEXT_CLONE_* API is being removed from upstream i915. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-01-29 | i915/gem_ctx_persistence: Wait for spinner before reseting | Chris Wilson
Just reset the spinner once before launching and killing many non-persistent contexts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2021-01-23 | i915/gem_ctx_persistence: Check for accidental banning | Chris Wilson
Check that closing many contexts does not cause a ban. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-12-30 | i915: Rename legacy for_each_engine to for_each_ring | Chris Wilson
Improve the differentiation between the legacy ring selector ABI and the more recent engine selection API. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com>
2020-12-22 | i915/gem_ctx_persistence: Reset timeout in hostile / hang on each engine | Matthew Brost
The timeout for a context to be killed / banned with GuC submission is a bit noisier than with execlist submission so reset the timeout to the default value on each engine in the hostile / hang sections rather than using the timeout from the previous engine. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-12-22 | i915/gem_ctx_persistence: Set context with available engines | Rahul Kumar Singh
Fixed engine-hang subtest to set context with available engines Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-10-10 | i915/gem_ctx_persistence: Verify userptr vs context cleanup | Chris Wilson
Verify that the wait for userptr cleanup is after we have cancelled the non-persistent hanging context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-09-29 | i915/gem_ctx_persistence: Fix legacy engine selection | Chris Wilson
For the legacy execbuf engine selection, we have to be careful in handling vcs if there is more than one engine, and specify which one we actually want. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-09-28 | i915/gem_ctx_persistence: Exercise cleanup after disabling heartbeats | Chris Wilson
We expose the heartbeat interval on each engine, allowing the sysadmin to disable them if they prefer avoiding any interruption for their GPU tasks. A caveat to allowing the contexts to run without checks is that we require such contexts to be non-persistent and so cleaned up on closure (including abnormal process termination). However, we also need to flush any persistent contexts that are still inflight at that time, lest they continue to run unchecked. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-05-20 | i915/gem_ctx_persistence: Use "%u" for -1u conversion | Chris Wilson
The debugfs modparams are more picky and refuse to do the implicit unsigned conversion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
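The difference between the two conversions can be illustrated with a minimal sketch. The helper name `format_param` is hypothetical and is not taken from the test or from lib/; the sketch only shows what "%u" produces for -1u.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format a parameter value as an unsigned decimal. */
static int format_param(char *buf, size_t len, unsigned int value)
{
	assert(buf && len > 0);

	/* "%u" prints -1u as UINT_MAX; "%d" would print the string "-1". */
	return snprintf(buf, len, "%u", value);
}
```

With a 32-bit unsigned int, formatting -1u this way yields "4294967295", so a parser that rejects a leading minus sign has nothing to refuse.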
2020-05-19 | lib/i915: Reset all engine properties to defaults prior to the start of a test | Chris Wilson
We need each test in an isolated context, so that bad results from one test do not interfere with the next. In particular, we want to clean up the device and reset it to the defaults so that they are known for the next test, and the test can focus on behaviour it wants to control. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-05-11 | i915/gem_ctx_persistence: Add a small delayed for RCU'ed fd | Chris Wilson
Add a small delay before we wait on the rcu barrier to allow slower machines to flush the process tables first. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1528 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-05-08 | lib/params: start renaming functions igt_params_* | Juha-Pekka Heikkila
Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2020-05-08 | lib/params: add igt_params.c for module parameter access | Jani Nikula
We have generic helpers for sysfs access in igt_sysfs.c, but we also have a number of module parameter access specific helpers scattered here and there. Start gathering the latter into a file of its own. For i915, the long-term goal is to migrate from module parameters to device specific debugfs parameters. With all igt module param access centralized in one place, we can make the transition much easier. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2020-05-08 | i915/gem_ctx_persistence: Fix ring, don't block | Chris Wilson
Beware of using gem_ring_measure_inflight() as it takes a ring identifier and not the engine, should you overwrite the defaults. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1848 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-05-07 | lib/i915: Split igt_require_gem() into i915/ | Chris Wilson
igt_require_gem() is a peculiarity of i915/, so move it out of the core. Similar opportunistic move of gem_reopen_driver() and gem_quiescent_gpu(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-03-24 | i915/gem_ctx_persistence: Force cleanup between tests | Chris Wilson
Since the introduction of dynamic subtests, we run multiple subtests in one binary and encounter situations where a bug in one subtest percolates into the next subtest. Explicitly clean up before each test to disarm our own shotgun. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-03-23 | i915/gem_ctx_persistence: Apply a 'load factor' for the smoker timeout | Chris Wilson
Since we execute a few smokers in parallel, at worst we may have to wait for all smokers to be reset before we ourselves are. We need to increase our leniency for the smoketest and allow a longer timeout to accommodate the parallelism. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-03-13 | i915/gem_ctx_peristence: Use the canonical name for looking up the legacy engine | Chris Wilson
To set a property on an engine, we need to use its canonical name (%class%instance). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
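As a rough illustration of the %class%instance naming, the sketch below builds a name such as "vcs1" from a class abbreviation and an instance number. The helper `engine_name` is hypothetical and is not the lookup code this commit touched.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build a canonical engine name such as "rcs0". */
static int engine_name(char *buf, size_t len,
		       const char *class_name, unsigned int instance)
{
	assert(buf && class_name);

	/* Canonical form is the class abbreviation followed by the instance. */
	return snprintf(buf, len, "%s%u", class_name, instance);
}
```

For example, class "vcs" with instance 1 gives "vcs1", which is distinct from the legacy ring selector that does not name a specific instance.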
2020-03-13 | i915/gem_ctx_persistence: Tune reset-timeout | Chris Wilson
When we can control the preempt_timeout_ms property on an engine, we can specify a much faster timeout and so expect our tests to run much faster. Then we can also avoid the embarrassment if the preempt reset is disabled and the tests start failing because we are not waiting 10+s for the hangcheck. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1440 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2020-02-29 | i915/gem_ctx_persistence: Increase leniency for reset-timeout | Chris Wilson
The default preempt_timeout_ms is a shocking 640ms. To be resilient against false positives, we should include an engineering safety factor of about 2x into our fail criteria, so that we only cry foul when we are truly unresponsive. Closes: https://gitlab.freedesktop.org/drm/intel/issues/679 Closes: https://gitlab.freedesktop.org/drm/intel/issues/570 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
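The arithmetic described in this message can be sketched as below. The function name and the macro are hypothetical; the factor of 2 and the 640 ms default come from the message, while the constant actually used by the test is not shown here.

```c
/* Safety factor quoted in the commit message ("about 2x"). */
#define RESET_SAFETY_FACTOR 2u

/* Hypothetical helper: how long to wait before declaring a failure. */
static unsigned int reset_timeout_ms(unsigned int preempt_timeout_ms)
{
	return RESET_SAFETY_FACTOR * preempt_timeout_ms;
}
```

Under these assumptions, the default of 640 ms gives a fail threshold of 1280 ms.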
2020-02-25 | i915/gem_ctx_persistence: Check precision of hostile cancellation | Chris Wilson
Check that if we have to remove a hostile request from a non-persistent context, we do so without harming any other concurrent users. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2020-02-19 | i915/gem_ctx_persistence: Protect igt_spin_new() from close races | Chris Wilson
Since the library call of igt_spin_new() asserts if it spots an error, we must protect it from the races we are imposing upon ourselves. However, to keep those races active, delegate the potentially failing calls to the children. References: https://gitlab.freedesktop.org/drm/intel/issues/1241 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-02-13 | i915/gem_ctx_persistence: Race context closure with replace-engines | Chris Wilson
Tvrtko spotted a race condition between replacing a set of hanging engines and closing the context. So exercise it. 5s is not much time to hit the small window, but a little bit of testing several times a day is better than nothing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2020-02-05 | i915/gem_ctx_persistence: Check that we cannot hide hangs on old engines | Chris Wilson
As the kernel loses track of the context's old engines, if we request that the context is non-persistent then any request on the untracked engines must be cancelled. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2020-02-02 | i915/gem_ctx_persistence: Scrub i915.reset at start | Chris Wilson
Since gem_ctx_persistence requires and insists upon having working reset, the test will not run on a system without it. If a previous test has clobbered i915.reset, we need to restore the modparam for ourselves. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1099 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-01-31 | lib: Don't feed IGT_SPIN_INVALID_CS to the command parser | Chris Wilson
If using a cmdparser, it may be intelligent enough to not execute the invalid batch leading to an unwritten breadcrumb and igt_spin_busywait_until_started() in an infinite loop. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-01-31 | i915/gem_ctx_persistence: Restore hangcheck on exit | Chris Wilson
As we abuse hangcheck within some of the tests, we then need to make sure we restore hangcheck on exit, in case we detect a failure and abort. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1082 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-01-31 | i915/gem_ctx_persistence: Convert engine subtests to dynamic | Tvrtko Ursulin
Converts all per-engine tests into dynamic subtests and in the process:
 * Put back I915_EXEC_BSD legacy coverage.
 * Remove one added static engine list usage.
 * Compact code by driving two groups of the name/func table.
v2:
 * Convert smoketest to proper all engines.
v3:
 * Undo subgroup mistake. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-01-29 | i915/gem_ctx_persistence: Set context with supported engines | Bommu Krishnaiah
Update the context with the engines supported on the platform, using set_property I915_CONTEXT_PARAM_ENGINES, to make sure the workload is submitted to the available engines only. Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [ickle: fix the flailing around] Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-01-29 | i915/gem_ctx_persistence: Check we detect a genuine hang | Chris Wilson
Just in case the user submits an invalid batch, check we can clean up afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2020-01-28 | i915: Inject invalid CS into hanging spinners | Chris Wilson
Some spinners are used with the intent of never ending and being declared hung by the kernel. In some cases, these are being used to simulate invalid payloads and so we can use an invalid command to trigger a GPU hang. (In other cases, they simulate infinite workloads that truly never end, but which we still need to be able to curtail to provide multi-tasking.) This patch adds IGT_SPIN_INVALID_CS to request the injection of 0xdeadbeef into the command stream, which should trigger a GPU hang. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2019-12-07 | Revert "tests/i915: Use engine query interface for gem_ctx_isolation/persistence" | Chris Wilson
This reverts commit 343aae776a58a67fa153825385e6fe90e3185c5b. __for_each_physical_engine() reprograms the context, invalidating the use of e->flags to select engines, necessitating e->index instead. Without also fixing up the engine selection, the result is that random engines were being used to read registers from the intended engine. This does not end well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com>
2019-12-06 | tests/i915: Use engine query interface for gem_ctx_isolation/persistence | Stuart Summers
Align with gem_exec_basic and other tests using the newer engine query interface into i915 to enumerate active engines. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2019-11-27 | i915/gem_ctx_persistence: Bump the reset timeout | Chris Wilson
As the default preempt-reset timeout has been increased from 100ms to 640ms, we need a corresponding increase in our own timeout so that we allow enough time for the preempt-reset to occur and close the hung contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112401 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2019-11-23 | i915/gem_ctx_persistence: Use the right fd for flushing delayed fput | Chris Wilson
Fixes: 3fa72891269b ("i915/gem_ctx_persistence: Double the fput hammer!") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2019-11-22 | i915/gem_ctx_persistence: Double the fput hammer! | Chris Wilson
Deferred rcu work is tricky to pin down and encourage to run, so try again... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112277 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2019-11-13 | tests/i915/gem_ctx_persistence: fix gcc warning | Juha-Pekka Heikkila
Casting an unsigned char pointer to an int pointer makes gcc complain: "warning: dereferencing type-punned pointer will break strict-aliasing rules" Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
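One common way to avoid this class of warning is to copy the bytes instead of dereferencing a cast pointer. The sketch below is an assumption about the general technique, not the change made in this commit, and `read_int` is a hypothetical name.

```c
#include <string.h>

/* Hypothetical helper: read an int from a byte buffer without type punning. */
static int read_int(const unsigned char *buf)
{
	int value;

	/* memcpy reinterprets the bytes without violating strict aliasing. */
	memcpy(&value, buf, sizeof(value));
	return value;
}
```

Compilers typically turn the fixed-size memcpy into a plain load, so the well-defined form costs nothing at runtime.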
2019-11-04 | i915/gem_ctx_persistence: Apply an rcu-barrier for fput cleanup | Chris Wilson
After any process termination, use an rcu-barrier to be sure that any deferred struct file cleanup has been performed. Being consistent in our paranoia here means that we can rule out more false positives and so focus on what remains. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com>
2019-10-31 | i915/gem_ctx_persistence: Double the rcu barrier | Chris Wilson
It seems the first rcu barrier may race with the addition of the file to the rcu task list; so wait again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com>
2019-10-31 | i915/gem_ctx_persistence: Sanitycheck execbuf state harder for 'queued' | Chris Wilson
And initialise fence to -1 to avoid closing stdin (fd:0)! The delayed fput is first queued with schedule (task_work) before being rcu freed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com>
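The reason for the -1 initialiser can be shown with a minimal sketch: a zero-initialised fd variable is indistinguishable from stdin, so an unconditional close() on a cleanup path would close fd 0. The helper `close_fence` is hypothetical and not part of the test.

```c
#include <unistd.h>

/* Hypothetical helper: close a fence fd only if one was actually obtained. */
static int close_fence(int fence)
{
	if (fence < 0)
		return 0;	/* -1 sentinel: nothing to close, stdin untouched */

	close(fence);
	return 1;
}
```

A caller would declare `int fence = -1;` and overwrite it only once a real fence fd has been returned, so every cleanup path can call the helper safely.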
2019-10-29 | Add i915/gem_ctx_persistence | Chris Wilson
Sanity test existing persistence and new exciting non-persistent context behaviour. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com>