Age | Commit message | Author |
|
A hanging batch is nothing more than a spinning batch that never gets
stopped, so re-use the routines implemented in dummyload.c.
v2: Let caller decide spin loop size
v3: Only use loose loops for hangs (Chris)
v4: No requires
v5: Free the spinner
v6: Chamelium exists.
Signed-off-by: Antonio Argenziano <antonio.argenziano@intel.com> #v3
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com>
|
|
Actually check the error state exists (!"No error state captured") and
that it contains the expected engine dump.
v2: Throw in some debug clues.
v3: Fail if the file doesn't exist or is empty.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Antonio Argenziano <antonio.argenziano@intel.com>
|
|
We currently have a single for_each_engine() iterator which we use to
generate both a set of uABI engines and a set of physical engines.
Determining what uABI ring-id corresponds to an actual HW engine is
tricky, so pull that out to a library function and introduce
for_each_physical_engine() for cases where we want to issue requests
once on each HW ring (avoiding aliasing issues).
v2: Remember can_store_dword for gem_sync
v3: Find more open-coded for_each_physical
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
If the system has bsd2, we do not know which ring the kernel will alias
I915_EXEC_BSD onto and so we do not know what the matching string should be.
Skip the unknown.
v2: Deny the aliased I915_EXEC_BSD exists at all; be specific!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103324
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
CC: Michał Winiarski <michal.winiarski@intel.com>
|
|
This reverts commit 25fbae15262cf570e207e62f50e7c5233e06bc67, restoring
commit 301ad44cdf1b868b1ab89096721da91fa8541fdc
Author: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Date: Thu Mar 2 10:37:11 2017 +0100
lib: Open debugfs files for the given DRM device
with fixes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This reverts commit 301ad44cdf1b868b1ab89096721da91fa8541fdc.
When a render-only device is opened and gem_quiescent_gpu is called, we
need to use the debugfs dir for the master device instead.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
|
|
When opening a DRM debugfs file, locate the right path based on the
given DRM device FD.
This is needed so that, in setups with more than one DRM device, any
operations on debugfs files affect the expected DRM device.
v2: - rebased and fixed new API additions
v3: - updated chamelium test, which was missed previously
- use the minor of the device for the debugfs path, not the major
- have a proper exit handler for calling igt_hpd_storm_reset with the
right device fd.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
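As a rough illustration of the lookup described above, the sketch below derives a per-device debugfs directory from the minor number of an open device fd. The helper names, the `DEBUGFS_ROOT` mount point and the `dri/<minor>` layout are assumptions made for this example, not the library's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Assumption: debugfs is mounted at its conventional location. */
#define DEBUGFS_ROOT "/sys/kernel/debug"

/* Hypothetical helper: format the debugfs directory for a DRM minor.
 * Returns 0 on success, -1 if the buffer is too small. */
int debugfs_path_for_minor(unsigned int minor_nr, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/dri/%u", DEBUGFS_ROOT, minor_nr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: derive the directory from an open device fd,
 * using the minor (not the major) of the character device. */
int debugfs_path_for_fd(int fd, char *buf, size_t len)
{
	struct stat st;

	if (fstat(fd, &st) < 0 || !S_ISCHR(st.st_mode))
		return -1;

	return debugfs_path_for_minor(minor(st.st_rdev), buf, len);
}
```

With two DRM devices present, calling `debugfs_path_for_fd()` on each fd would yield two different directories, so a write to a debugfs file lands on the device the test actually opened.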
|
|
So we don't need to use an artificial delay.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Comparing strcasecmp against a bool promotes the bool to an integer; it
does not force the strcasecmp result to be a bool. Since strcasecmp can
return negative or positive values for a mismatch (not a boolean!), we need
to convert it into a boolean before comparing against our expectation.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
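The pitfall can be reproduced in isolation. The sketch below uses hypothetical helper names and is not the test's real code: it contrasts a direct comparison against a bool with one that first normalises the strcasecmp result.

```c
#include <assert.h>
#include <stdbool.h>
#include <strings.h>

/* Buggy form: expect_differ is promoted to int (0 or 1), so this only
 * matches when strcasecmp() happens to return exactly 0 or 1. */
bool differs_as_expected_buggy(const char *a, const char *b,
			       bool expect_differ)
{
	return strcasecmp(a, b) == expect_differ;
}

/* Fixed form: any non-zero result means "different", so convert to a
 * boolean before comparing against the expectation. */
bool differs_as_expected(const char *a, const char *b, bool expect_differ)
{
	bool differ = strcasecmp(a, b) != 0;

	return differ == expect_differ;
}
```

For "abc" against "xyz" the result is negative, so the buggy form reports a failed expectation even though the strings do differ, while the fixed form accepts it.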
|
|
The call to igt_sysfs_set("") was trying to write the empty string, i.e.
0 bytes and so never made it to the kernel. Use igt_sysfs_write("", 1)
instead to send the NUL byte to the error state in order for it to be
cleared.
Reported-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
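The difference between writing zero bytes and writing a single NUL byte can be observed with plain POSIX calls. In this sketch a pipe stands in for the sysfs file, and `bytes_received()` is a hypothetical helper written only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write the first len bytes of "" (len must be 0 or 1, i.e. nothing or
 * just the terminating NUL) into a pipe and report how many bytes the
 * reading side receives. Returns -1 on setup or write failure. */
int bytes_received(size_t len)
{
	int fds[2];
	char buf[4];
	ssize_t n;

	if (pipe(fds) < 0)
		return -1;

	/* Non-blocking read end: an empty pipe yields EAGAIN, not a stall. */
	fcntl(fds[0], F_SETFL, O_NONBLOCK);

	if (write(fds[1], "", len) < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	n = read(fds[0], buf, sizeof(buf));
	if (n < 0 && errno == EAGAIN)
		n = 0; /* nothing was delivered */

	close(fds[0]);
	close(fds[1]);
	return (int)n;
}
```

`bytes_received(0)` reports nothing delivered, whereas `bytes_received(1)` reports one byte, which mirrors why the one-byte write is needed for the clear request to be seen.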
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Given that we export the equivalent files between debugfs and sysfs, and
sysfs is our ABI, use sysfs for ABI testing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
100 * 1 billion needs a 64bit intermediate
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
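A minimal sketch of the fix, assuming NSEC_PER_SEC is a plain int constant (the definition below is local to the example): with both operands of type int, the product 100 * 1000000000 is evaluated in 32 bits and overflows before it is stored, so one operand has to be widened first.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this example: an int-typed constant. */
#define NSEC_PER_SEC 1000000000

/* Hypothetical helper: widen before multiplying so the product is
 * computed in 64 bits rather than in int. */
int64_t seconds_to_ns(int seconds)
{
	return (int64_t)seconds * NSEC_PER_SEC;
}
```

An equivalent approach is to give the constant itself a 64-bit type, for example with a `ULL` or `LL` suffix, so every expression using it is widened automatically.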
|
|
Lots of test cases are re-declaring this.
v2: Remove definition in benchmarks/gem_syslatency.c
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
|
|
References: https://bugs.freedesktop.org/show_bug.cgi?id=98361
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For the basic error state, we only desire that an error state be created
following a hang. For that purpose, we do not need a real hang (slow
6-12s) but can inject one instead (fast <1s).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
drv_hangman.c: In function ‘hangcheck_unterminated’:
drv_hangman.c:290:27: warning: integer overflow in expression [-Woverflow]
int64_t timeout_ns = 100 * NSEC_PER_SEC; /* 100 seconds */
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The hangcheck logic will not flag a hang if acthd keeps increasing.
However, if a malformed batch jumps to an invalid offset in the ppgtt it
can potentially continue executing through the whole address space
without triggering the hangcheck mechanism.
This patch adds a test to simulate the issue. I've kept the test running
for more than 10 minutes before killing it on a BDW and no hang occurred.
I've sampled i915_hangcheck_info a few times during the run and got the
following:
Hangcheck active, fires in 468ms
render ring:
seqno = fffff55e [current fffff55e]
ACTHD = 0x47df685ecc [current 0x4926b81d90]
max ACTHD = 0x47df685ecc
score = 0
action = 2
instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000
Hangcheck active, fires in 424ms
render ring:
seqno = fffff55e [current fffff55e]
ACTHD = 0x6c953d3a34 [current 0x6de5e76fa4]
max ACTHD = 0x6c953d3a34
score = 0
action = 2
instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000
Hangcheck active, fires in 1692ms
render ring:
seqno = fffff55e [current fffff55e]
ACTHD = 0x1f49b0366dc [current 0x1f4dcbd88ec]
max ACTHD = 0x1f49b0366dc
score = 0
action = 2
instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000
v2: use the new gem_wait() function (Chris)
v3: switch to unterminated batch and rename test, remove redundant
check, update test requirements (Chris), update top comment
v4: force gpu reset if the hang detection fails (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[Mika: removed batch_len=8]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
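The failure mode can be pictured with a deliberately simplified model. This is a toy rule written for illustration, not the kernel's actual hangcheck algorithm; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ring_sample {
	uint32_t seqno; /* last completed request */
	uint64_t acthd; /* address the engine is currently executing */
};

/* Toy rule: a ring only looks stuck when no request completed AND the
 * active head did not advance between two samples. */
bool looks_stuck(const struct ring_sample *prev,
		 const struct ring_sample *cur)
{
	if (cur->seqno != prev->seqno)
		return false; /* requests are still completing */
	if (cur->acthd > prev->acthd)
		return false; /* head keeps increasing: assumed busy */
	return true;
}
```

Feeding in two of the samples quoted above (same seqno fffff55e, ACTHD moving from 0x47df685ecc to 0x6c953d3a34) makes the rule report "not stuck", which is the behaviour an unterminated batch exploits.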
|
|
Because
(drv_hangman:6035) CRITICAL: Failed assertion: !((__extension__
(__builtin_constant_p (l) && ((__builtin_constant_p (tmp) && strlen
(tmp) < ((size_t) (l))) || (__builtin_constant_p (s) && strlen (s) <
((size_t) (l)))) ? __extension__ ({ size_t __s1_len, __s2_len;
(__builtin_constant_p (tmp) && __builtin_constant_p (s) && (__s1_len =
strlen (tmp), __s2_len = strlen (s), (!((size_t)(const void *)((tmp) +
1) - (size_t)(const void *)(tmp) == 1) || __s1_len >= 4) &&
(!((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) ||
__s2_len >= 4)) ? __builtin_strcmp (tmp, s) : (__builtin_constant_p
(tmp) && ((size_t)(const void *)((tmp) + 1) - (size_t)(const void
*)(tmp) == 1) && (__s1_len = strlen (tmp), __s1_len < 4) ?
(__builtin_constant_p (s) && ((size_t)(const void *)((s) + 1) -
(size_t)(const void *)(s) == 1) ? __builtin_strcmp (tmp, s) :
(__extension__ ({ const unsigned char *__s2 = (const unsigned char *)
(const char *) (s); int __result = (((const unsigned char *) (const char
*) (tmp))[0] - __s2[0]); if (__s1_len > 0 && __result == 0) { __result =
(((const unsigned char *) (const char *) (tmp))[1] - __s2[1]); if
(__s1_len > 1 && __result == 0) { __result = (((const unsigned char *)
(const char *) (tmp))[2] - __s2[2]); if (__s1_len > 2 && __result == 0)
__result = (((const unsigned char *) (const char *) (tmp))[3] -
__s2[3]); } } __result; }))) : (__builtin_constant_p (s) &&
((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) &&
(__s2_len = strlen (s), __s2_len < 4) ? (__builtin_constant_p (tmp) &&
((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) ?
__builtin_strcmp (tmp, s) : (- (__extension__ ({ const unsigned char
*__s2 = (const unsigned char *) (const char *) (tmp); int __result =
(((const unsigned char *) (const char *) (s))[0] - __s2[0]); if
(__s2_len > 0 && __result == 0) { __result = (((const unsigned char *)
(const char *) (s))[1] - __s2[1]); if (__s2_len > 1 && __result == 0) {
__result = (((const unsigned char *) (const char *) (s))[2] - __s2[2]);
if (__s2_len > 2 && __result == 0) __result = (((const unsigned char *)
(const char *) (s))[3] - __s2[3]); } } __result; })))) :
__builtin_strcmp (tmp, s)))); }) : strncmp (tmp, s, l))) == 0)
is a little hard to understand at a glance.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Rather than encoding our own list of engines, use the common one for
greater coverage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
All the external viewer expects of the GPU error capture is to extract
the exact batch that triggered the hang. Everything else is internal
detail to aid in post-mortem debugging of the kernel driver (i.e.
subject to change) and not of the userspace portion (under control of
the test).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
No functional changes.
While I'm here, let's also rename gem_uses_aliasing_ppgtt (since it's
being used to indicate if we are using ANY kind of ppgtt) and introduce
gem_uses_full_ppgtt to drop some unnecessary code from tests that were
previously calling getparam directly instead of using the ioctl wrapper.
v2: drop gem_uses_full_48b_ppgtt since it's no longer used anywhere,
s/48b/64b (Chris)
v3: rebase
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This way we correctly auto-skip instead of falling over the
lack of i915 debugfs files first and failing the testcase due to
that.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Apply the new API to all call sites within the test suite using the following
semantic patch:
// Semantic patch for replacing drm_open_any* with arch-specific drm_open_driver* calls
@@
identifier i =~ "\bdrm_open_any\b";
@@
- i()
+ drm_open_driver(DRIVER_INTEL)
@@
identifier i =~ "\bdrm_open_any_master\b";
@@
- i()
+ drm_open_driver_master(DRIVER_INTEL)
@@
identifier i =~ "\bdrm_open_any_render\b";
@@
- i()
+ drm_open_driver_render(DRIVER_INTEL)
@@
identifier i =~ "\b__drm_open_any\b";
@@
- i()
+ __drm_open_driver(DRIVER_INTEL)
Signed-off-by: Micah Fedke <micah.fedke@collabora.co.uk>
Signed-off-by: Thomas Wood <thomas.wood@intel.com>
|
|
Add a header that includes all the headers for the library. This allows
reorganisation of the library without affecting programs using it and
also simplifies the headers that need to be included to use the library.
Signed-off-by: Thomas Wood <thomas.wood@intel.com>
|
|
commit e1f123257a1f7d3af36a31a0fb2d4c6f40039fed
Author: Michel Thierry <michel.thierry@intel.com>
Date: Wed Jul 29 17:23:56 2015 +0100
drm/i915: Expand error state's address width to 64b
changed the batch buffer address to be 64b. Fix the parsing
of gtt offset accordingly.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91638
Cc: Akash Goel <akash.goel@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
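To illustrate why the parser has to change, the sketch below parses a hexadecimal offset into a 64-bit and a 32-bit variable. The helper names and the sample value are hypothetical, and the exact text layout of the error state is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: keep the full 64-bit offset. */
uint64_t parse_gtt_offset(const char *hex)
{
	return strtoull(hex, NULL, 16);
}

/* The old behaviour in spirit: storing into 32 bits silently drops
 * the upper half of any offset above 4GiB. */
uint32_t parse_gtt_offset_32(const char *hex)
{
	return (uint32_t)strtoull(hex, NULL, 16);
}
```

For an input such as "0x1fedc0000" the 64-bit parser returns the full value, while the 32-bit variant yields 0xfedc0000, which would no longer match the address of the submitted batch.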
|
|
The integer comparison macros give us better error output by including
the actual values that failed the comparison.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
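The benefit is easiest to see with a small stand-in. The macro below is a hypothetical sketch, not the library's implementation, and its message format is made up for the example: it records both the compared expressions and their runtime values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the values match; otherwise
 * writes a diagnostic containing both expressions and both values. */
int check_eq_int(const char *lhs, int a, const char *rhs, int b,
		 char *msg, size_t len)
{
	if (a == b)
		return 1;

	snprintf(msg, len, "Failed assertion: %s == %s (%d != %d)",
		 lhs, rhs, a, b);
	return 0;
}

/* Stringify the operands so the message names what was compared. */
#define CHECK_EQ(a, b, msg, len) \
	check_eq_int(#a, (a), #b, (b), (msg), (len))
```

A plain boolean assertion on `got == 4` can only say that the expression was false; with the values included, the log shows immediately that `got` was 3.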
|
|
Also move forcewake and stop_rings code from igt_debugfs to igt_gt
since it fits better. And move the hang injection fork helpers from
igt_aux to igt_gt, too.
Also push the intel_gen call into igt_hang_ring while at it.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
|
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
|
|
Found some open coded min()/max()/swap() macros.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
|
|
This test will not run on Android as the coreu service
remains running even after the android system is stopped.
Coreu is a client of drm and when the test finds this it
fails an assert.
Coreu is started by the init process and there is no
tidy, non-invasive way to stop it (init just restarts it).
Coreu isn't doing anything and would not be expected to
interfere with this test. In addition, all the other
igt tests just rely on the user/test script to ensure
that there are no other drm clients, so this test can
do the same. On Android we must rely on coreu being
dormant when this test runs.
Signed-off-by: Tim Gore <tim.gore@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This test has a few checks that batch buffer addresses in the error
state match the expected address for the userspace supplied batch.
But the batch buffer copy piece of the command parser means that
the logged addresses are actually _supposed_ to be different. So
skip just those checks.
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Re-run with correct igt_fail rules. Again manually fixup missing
includes for igt_core.h.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
This reverts commit 6903ab04e5f9048e3932eb3225e94b6a228681ba.
The igt_assert conversion rule is broken and doesn't invert the check
as it should.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Cocci is awesome
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Guarantees that error capture works at a very basic level.
v2: Also check that the ring object contains a reloc with MI_BB_START
for the presumed batch object's address.
v3: Chris review comments:
- Move variables to local scope.
- Do not assume there is only one request.
- Some gens encode flags into the BB start address.
Also, use igt_set/get_stop_rings as suggested by Mika Kuoppala.
v4: Make as a subtest of drv_hangman.
v5: Rebase
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> <v4>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
|
|
Mixing script and standalone tests didn't mix well with the
strict i915_ring_stop flags handling. Also squash drv_missed_irq_hang
to the new test.
v2: - Remove missed irq test (Daniel Vetter)
- gitignore fixed (Oscar Mateo)
- fix check_other_clients to handle dangling fd's
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78322
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> <v1>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
|