Age | Commit message | Author |
|
Using a filter that doesn't match any test name resulted in the runner
silently failing. Print an error message so that the user understands
why the runner fails.
Signed-off-by: Simon Ser <simon.ser@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
Tests that eat all of the RAM and then some to invoke the oom-killer
deliberately sometimes cause extra casualties. Make sure the runner
stays alive.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This switch allows users to select which dmesg log level is treated as
a warning, overriding the test results to dmesg-fail/dmesg-warn.
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
To aid testing, the function parsing metadata.txt is split into an outer
helper that operates on a dirfd and an inner function that operates on a
FILE*.
This allows us to test the parsing using fmemopen(), limiting the amount
of necessary boilerplate.
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
Which regexp gets compiled is settings-specific, depending on whether we
run piglit-style or not.
If it's optimized to be initialized only once and it is a global
variable, it will be "stuck" in the mode selected on the first run,
which may break tests.
Let's remove this optimization and initialize it each time, as it takes
less than 0.002s on my hardware.
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Simon Ser <simon.ser@intel.com>
|
|
Since not everyone is familiar with kernel taints, and it is easy to get
confused and attribute an abort to an innocent TAINT_USER caused by an
unsafe module option, which is usually the first thing people find
grepping dmesg for "taint", we should provide more guidance.
This patch extends the abort log by printing the taint names, as found
in the kernel, along with a short explanation, so people know what to
look for in the dmesg.
v2: rebase, reword
Cc: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
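A decoder in this spirit might look like the sketch below. The bit
positions follow the kernel's taint flag numbering as I understand it, and
only a few bits are shown; the function name and output format are
assumptions, not the patch's actual code:

```c
#include <stdio.h>
#include <string.h>

/* Sketch: translate a kernel taint bitmask into flag names so an abort
 * log is self-explaining. Bit positions per the kernel's taint flags;
 * only a handful of the defined bits are listed here. */
static void describe_taints(unsigned long taints, char *buf, size_t len)
{
	static const struct {
		unsigned long bit;
		const char *name;
	} flags[] = {
		{ 1UL << 5, "TAINT_BAD_PAGE" },
		{ 1UL << 6, "TAINT_USER" },
		{ 1UL << 7, "TAINT_DIE" },
		{ 1UL << 9, "TAINT_WARN" },
	};
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		if (taints & flags[i].bit) {
			if (buf[0])
				strncat(buf, ", ", len - strlen(buf) - 1);
			strncat(buf, flags[i].name, len - strlen(buf) - 1);
		}
	}
}
```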
|
|
* Refactor to use goto error handling
* Make execute_test_process noreturn to remove uninitialized variable
warning
* Check fork() return value
Signed-off-by: Simon Ser <simon.ser@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
v2: Adjust tests accordingly
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
That leaves exitcode 1 for aborts and initialization failures. Should
maybe differentiate those as well. Not to mention document the exit
codes.
Also fix igt_resume to follow suit to igt_runner: Generate
results.json even when aborting or exceeding overall-timeout.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
We use the timeout status for when the runner had to kill a testcase,
which indicates a more severe issue than an operation failing that we
expected to complete within seconds.
Since it's unused, drop it.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
|
Main use case here is CI, which already builds using meson.
Acked-by: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
|
Actually implement --dry-run to not execute tests. With dry-run
active, attempting to execute will figure out the list of things to
execute, serialize them along with settings, and stop. This will be
useful for CI that wants to post-mortem on failed test rounds to
generate a list of tests that should have been executed and produce
json result files (full of 'notrun') for proper statistics.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
When possible, all tests we know we were going to attempt to execute
now appear in the results as "notrun". The only known case where it's
not possible to add an explicit "notrun" is when running in
multiple-mode, because "no subtests" and "run all subtests, we didn't
list them beforehand" are represented the same.
v2: Rebase and adjust to already landed json changes
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
New piglit bumped its results_version to 10, making glxinfo and pals
optional in practice, not just by accident. Unfortunately reading
results with newer piglit attempts to convert the results to version
10, reading glxinfo and pals, and thus fails. In a hilarious summary:
A commit to piglit making glxinfo optional makes it mandatory for us.
v2: json unit tests confirmed to be working...
Reported-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Tested-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Characters in kernel logs, when read from /dev/kmsg, are escaped as
\xNN if they are not between 32 and 127, or if they are "\". Decode
what we can when creating results.json.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
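A minimal sketch of that decoding is below; the function name is
hypothetical and it simply reverses the \xNN escaping, without attempting
to validate that the result is well-formed text:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: undo the \xNN escapes /dev/kmsg applies to bytes outside
 * 32..127 and to '\', turning them back into raw bytes. Returns the
 * decoded length; out must be at least as large as in. */
static size_t kmsg_unescape(const char *in, char *out)
{
	size_t n = 0;

	while (*in) {
		if (in[0] == '\\' && in[1] == 'x' &&
		    isxdigit((unsigned char)in[2]) &&
		    isxdigit((unsigned char)in[3])) {
			char hex[3] = { in[2], in[3], '\0' };

			out[n++] = (char)strtol(hex, NULL, 16);
			in += 4;
		} else {
			out[n++] = *in++;
		}
	}
	out[n] = '\0';

	return n;
}
```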
|
|
The igt_runner sends a SIGTERM to ask the test to cleanly exit upon an
external timeout. It is useful to know what the code was doing when the
timeout occurred, just in case it was unexpectedly stuck. However, since
we use SIGTERM internally to marshal helper processes, we want to keep
SIGTERM quiet, and so opt to use SIGQUIT for the timeout request
instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
int kill(pid_t pid, int sig)!
Fixes: a6b514d242bd ("runner: Be patient for processes to die")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
Accidentally pushed, believing this was the kill(child, 0) switcheroo.
This reverts commit 0be3f3e7c1613dcaf27267fce778025ea46a36c1.
Acked-by: Petri Latvala <petri.latvala@intel.com>
|
|
I have whinged on for ages about the dmesg-warnings being an expected
part of kernel testing (where else is the kernel meant to log its
errors?) and should be treated the same as our stderr for the test. That
is, if a test fails, it fails and does not need to be conflated with
whether or not there was a dmesg warning (just as the test saying why it
failed on stderr does not need flagging), and that a passing test with a
dmesg warning is simply a warn.
The effect is that we simply remove the "dmesg-" flagging from results
names, as the err/dmesg output is simply collated for the error report
already.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Petri Latvala <petri.latvala@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
|
|
../runner/executor.c:555:54: warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’ [-Wformat=]
fprintf(stderr, "Child refuses to die, tainted %x. Aborting.\n",
~^
Fixes: a6b514d242bd ("runner: Be patient for processes to die")
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Some machines are very slow and some processes hog a lot of resources
and so take much longer than a mere 2s to be terminated. Be patient.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108073
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Deviating a bit from the piglit command line flag, igt_runner takes an
optional comma-separated list as an argument to
--abort-on-monitored-error for the list of conditions to abort
on. Without a list all possible conditions will be checked.
Two conditions implemented:
- "taint" checks the kernel taint level for TAINT_PAGE, TAINT_DIE and
TAINT_OOPS
- "lockdep" checks the kernel lockdep status
Checking is done after every test binary execution, and if an abort
condition is met, the reason is printed to stderr (unless log level is
quiet) and the runner doesn't execute any further tests. Aborting
between subtests (when running in --multiple-mode) is not done.
v2:
- Remember to fclose
- Taints are unsigned long (Chris)
- Use getline instead of fgets (Chris)
v3:
- Fix brainfart with lockdep
v4:
- Rebase
- Refactor the abort condition checking to pass down strings
- Present the abort result in results.json as a pseudo test result
- Unit tests for the pseudo result
v5:
- Refactors (Chris)
- Don't claim lockdep was triggered if debug_locks is not on
anymore. Just say it's not active.
- Dump lockdep_stats when aborting due to lockdep (Chris)
- Use igt@runner@aborted instead for the pseudo result (Martin)
v6:
- If aborting after a test, generate results.json. Like was already
done for aborting at startup.
- Print the test that would be executed next as well when aborting,
as requested by Tomi.
v7:
- Remove the resolved TODO item from commit message
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Test the results.json generation with a top-down approach: With a
directory of test run intermediary logs, check that the resulting json
would match a reference json file.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
This allows testing to skip the file writing.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
const where possible, and cast away const when passing argv to
parse_options, which expects non-const strings, because it passes them
to getopt_long, which expects non-const strings...
getopt and getopt_long take the argv array as char * const *, or in
other words, as pointer-to-const-pointer-to-char. In C, pointer-to-T
implicitly converts to pointer-to-const-T and for a char **, the T is
char* and "const T" is char * const, ergo char ** converts to char *
const *, not const char **. The only const-correctness getopt and
getopt_long can really do is char * const * or they lose the ability
to directly pass in main()'s arguments, which are an array of
non-const pointers to non-const char for legacy reasons.
For testing the argument handling, it's very convenient to use an
array of string literals, which are of type const char[N], convertible
to const char *. To get such an array into getopt, the choices are:
1) Cast away the const in the pointer-to-pointer
2) Cast away the const in the string literal
3) Don't cast anything and eat the compiler warning
Option 1 looked cleanest out of all those.
tl;dr: Choices made in 1972 force our hand.
v2:
- Augment commit message
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
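Option 1 looks like the sketch below in practice; the option table and
function name are made up for illustration. An array of string literals is
`const char *[]`, and the const is cast away at the pointer-to-pointer
level when handing it to getopt_long:

```c
#include <getopt.h>
#include <stddef.h>

/* Sketch: a parse_options-style wrapper taking the char * const * that
 * getopt_long demands. Counts -v/--verbose occurrences. */
static int parse_fake_argv(int argc, char * const *argv)
{
	static const struct option opts[] = {
		{ "verbose", no_argument, NULL, 'v' },
		{ 0 },
	};
	int c, n = 0;

	optind = 1; /* reset getopt state in case of repeated calls */
	while ((c = getopt_long(argc, argv, "v", opts, NULL)) != -1)
		if (c == 'v')
			n++;

	return n;
}
```

The call site then does the one cast:
`parse_fake_argv(argc, (char * const *)argv);`.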
|
|
Fixes compiler warning: function declaration isn’t a prototype
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Turns out the same information is looked up in different places in
different code paths. Piglit's summary module looks up total counts in
['totals']['root'], CI looks up ['totals']['']. The latter is
inherited from piglit, so this has probably changed at some point.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108486
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Tested-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
With --overall-timeout $foo, the runner will stop executing new tests
when $foo seconds have already been used.
A resumed run will start over with no time used, using the same
timeout. This allows for executing a long list of tests piecemeal, in
about $foo length executions.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106127
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
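The check itself is a simple wall-clock comparison before each new test;
this sketch uses hypothetical names, with a non-positive timeout meaning
no limit:

```c
#include <time.h>
#include <stdbool.h>

/* Sketch of the --overall-timeout check: before starting a new test,
 * compare elapsed wall time against the configured budget. */
static bool overall_timeout_exceeded(time_t start, time_t now,
				     double overall_timeout)
{
	if (overall_timeout <= 0) /* 0 or negative: no limit configured */
		return false;

	return difftime(now, start) >= overall_timeout;
}
```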
|
|
After setting the result object text, the string retrieved from the
old object is invalid.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
If we resume a test run with igt_resume, or if resume is done
automatically from a test timeout, the runner will execute the last
attempted test with the subtest selection set to original set minus
the subtests already journaled to have started. If this results in an
empty set, we get a harmless but misleading message from the test
saying
"igt_core-WARNING: Unknown subtest: subtest-name,!subtest-name"
If the journal already contains as many subtests as we have requested
(when we know the set), assume we have them all already and move to
the next job list entry instead.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Pretty much needed, as proven.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
When starting a test run, drop a timestamp file. Do the same when
ending a run. Slap those timestamps directly into the time_elapsed
field in results.json.
Using timestamps instead of measuring actual elapsed time goes against
the naming of the field, but the name is chosen by piglit, even though
piglit itself uses timestamps.
Corner cases:
On incomplete test runs, the end timestamp will be missing. The
time_elapsed field will only have the start timestamp. This matches
piglit behaviour exactly.
On incomplete but resumed test runs, the end timestamp will be the
time when the resume finishes. Piglit doesn't do this, and instead
leaves the end timestamp missing. Discussing which behaviour is better
is left as an exercise to the readers.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Make sure comparefd gets closed in dump_dmesg(). Otherwise we run out
of descriptors after a bit over 1000 tests executed...
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
CI pipeline (namely, cibuglog) doesn't cope well with strings that
have \0 in them. If null characters appear in output files, pretend
the output stops at the first such character. Well behaving tests
should not print them anyway.
The case in CI happened due to some hang/crash/explosion/solar flare
that corrupted the output of a test.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
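Pretending the output stops at the first NUL amounts to clamping the
length with memchr before the buffer is handed to the json writer; the
helper name here is illustrative:

```c
#include <string.h>

/* Sketch: report the length up to (excluding) the first embedded NUL,
 * or the full size if there is none, so downstream tools never see
 * \0 bytes in string fields. */
static size_t length_without_nuls(const char *buf, size_t size)
{
	const char *nul = memchr(buf, '\0', size);

	return nul ? (size_t)(nul - buf) : size;
}
```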
|
|
Actually implement what was already commented to work.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com> #irc
|
|
And thus make it possible to run -t basic-s3 for example.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com> #irc
|
|
and move it to job_list.c
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com> #irc
|
|
If the output of igt_runner is piped or redirected, buffered prints
could be left lingering and read as test executable output if execv()
fails. This can happen easily if CI for example generates a testlist
with an incorrect binary name, or an optional test binary (say,
kms_chamelium) is not built for the deployment in question.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
The law of chosen magic numbers: The number selected is wrong.
Chose another magic number for the size of the buffer used to read
test outputs and kernel log records. It's now 2048, up from 256. Also
added a warning print if that's still not enough for kernel logs.
The lesson to learn here is that the /dev/kmsg interface does not give
you a truncated log record as initially thought, but reports an
undocumented EINVAL instead. Subsequent reads give the next record, so
the failsafe added will make sure any future EINVALs will only drop
the record that is too long instead of everything from that point
onwards.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
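The failsafe can be sketched as a read wrapper; the function name is
hypothetical. Because /dev/kmsg returns EINVAL for an oversized record and
subsequent reads deliver the next record, skipping on EINVAL drops only
the one record that didn't fit:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Sketch: read one kmsg record. On EINVAL (record longer than buf),
 * warn and retry; the kernel has already advanced past the oversized
 * record, so the loop picks up the next one. */
static ssize_t read_kmsg_record(int fd, char *buf, size_t len)
{
	ssize_t n;

	for (;;) {
		n = read(fd, buf, len);
		if (n >= 0 || errno != EINVAL)
			return n;

		fprintf(stderr,
			"kmsg record longer than %zu bytes, skipping\n",
			len);
	}
}
```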
|
|
When draining the rest of kmsg records, read the compare record from
the end of kmsg or you get incomplete dmesg fields in the results.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
If a test with subtests just exits immediately, or the test binary
doesn't exist at all (as is sometimes the case with kms_chamelium),
the existence of subtests doesn't end up in the execution journal. As
was done for timeouts in a797cbf6918a ("runner/resultgen: Be more
robust with incomplete tests"), check if we were attempting to run a
subtest before attributing a 'notrun' result to an incorrect field.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
With the flag, dmesg handling is done exactly as piglit does it: Level
5 (info) and higher dmesg lines, if they match a regexp, cause the test
result to change to dmesg-*.
The default is false (use new method).
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
|
|
Previously, the total runtime of binary foo with subtests bar and quz
was accumulated to the tests field under 'igt@foo' with just a
TimeAttribute field. This confuses piglit-derived scripts deep down
the CI pipeline, so move the overall binary runtime to a new field
'runtimes', with TimeAttribute fields for 'igt@foo'.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
|
|
The totals field in the results json lists the total amount of
particular test results, both overall and by binary.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
If a test is incomplete and didn't have time to print that it's
entering a subtest, the generated results will think the test binary
does not have subtests. If that case is known, make sure to attribute
blame correctly.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Instead of just matching the binary/subtest name.
Originally not implemented to get the runner landed faster. Turned out
to be simple enough.
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
|
|
Cc: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|