path: root/runner/executor.c
Age  Commit message  Author
2022-03-28  runner: Fix handling of per-test-timeout  Petri Latvala
Instead of stopping execution on resume-init success, stop on resume-init failure like intended. Fixes: 4b88a9253443 ("runner: check if it has root permissions") Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-03-21  runner: cleanup code_cov directory, if any  Mauro Carvalho Chehab
Ensure that "-o" parameter will also cleanup the contents of the code coverage results directory. Reviewed-by: Petri Latvala <petri.latvala@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-03-21  runner: Add support for code coverage  Mauro Carvalho Chehab
The gcc compiler has a feature that enables checking the code coverage at runtime[1].

[1] See https://www.kernel.org/doc/html/latest/dev-tools/gcov.html

The Linux Kernel comes with an option to enable such a feature:

	./scripts/config -e DEBUG_FS -e GCOV_KERNEL

The driver's Makefile also needs a change to enable it. For instance, in order to enable GCOV for all DRM drivers, one would need to run:

	for i in $(find drivers/gpu/drm/ -name Makefile); do
		sed '1 a GCOV_PROFILE := y' -i $i
	done

This patch adds support for it by:

a) Implementing logic to clean up the code coverage counters via sysfs;
b) Calling a script responsible for collecting code coverage data.

The implementation works with two modes:

1) It zeroes the counters, runs all IGT tests and collects the code coverage results at the end. This assumes that no test crashes the driver, as otherwise the results won't be collected. This is faster, as collecting code coverage data can take several seconds.

2) For each test, it will clean the code coverage counters, run the test and collect the results. This is more reliable, as a Kernel crash/OOPS won't affect the results of the previously run tests.

Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-03-21  runner: check if it has root permissions  Mauro Carvalho Chehab
Without root permissions, most IGT tests won't actually run, but they would be displayed at the runner's output as if everything went fine. In order to avoid that, check if one attempts to run IGT without root permission. Such a check can be disabled with a new command line option:

	--allow-non-root

As runner_tests runs as non-root, most unit tests need to pass --allow-non-root in order for them to not return an error.

Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
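The check described in this commit can be sketched as follows. This is a minimal illustration rather than the runner's actual code: the function names are hypothetical, and the only assumption is that "root" means an effective uid of 0 and that --allow-non-root is available as a boolean.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: decide whether a run may proceed.
 * Taking the uid as a parameter keeps the decision testable
 * without actually being root.
 */
bool may_execute(uid_t euid, bool allow_non_root)
{
	return euid == 0 || allow_non_root;
}

/* Hypothetical wrapper that queries the calling process's effective uid. */
bool may_execute_now(bool allow_non_root)
{
	return may_execute(geteuid(), allow_non_root);
}
```

Splitting the decision from the geteuid() call mirrors why the unit tests need the override: they run as non-root, so only the flag lets them pass.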
2021-02-09  runner: Handle graceful exit regardless of log level  Petri Latvala
The SIGHUP handling was incorrectly done only when log level was at least 'normal'. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arek@hiler.eu> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2021-01-22  runner: Introduce a way to stop testing without marking tests incomplete  Petri Latvala
Killing igt_runner with SIGHUP will now still kill the currently running test, but it will mark that test as being "notrun" instead of "incomplete". This allows for external tools to interrupt the testing without messing the results.

Incidentally, Intel CI's testing procedures occasionally falsely determine that the machine being tested is unreachable and as its next step, will ssh in and issue a reboot in preparation for the next round of testing, causing igt_runner to be killed with a SIGHUP...

v2:
- Fix typo SIGUP -> SIGHUP
- Make runner print that a graceful exit will be done
- Explain the code flow regarding handling of signals to the runner process
- Use GRACEFUL_EXITCODE instead of -SIGHUP directly

Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Arkadiusz Hiler <arek@hiler.eu>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2021-01-12  runner: Fix constness warning  Petri Latvala
Introduced in commit 532d6e84ab7f ("lib: Process kernel taints"): ../runner/executor.c: In function ‘handle_taint’: ../runner/executor.c:324:18: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] while ((explain = igt_explain_taints(&bad))) { ^ Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2021-01-08  lib: Process kernel taints  Chris Wilson
A small library routine to read '/proc/sys/kernel/tainted' and check for a fatal condition. This is currently used by the runner, but is also useful for some tests. v2,3: function docs Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
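A rough sketch of such a routine is shown below. The kernel exposes the taint bitmask as a decimal number in /proc/sys/kernel/tainted, and the bit positions used here follow the kernel's tainted-kernels documentation. Which taints count as "fatal" is an assumption made for this example, and the function names are hypothetical rather than the library's API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Bit positions as documented by the kernel. */
#define TAINT_BAD_PAGE (1UL << 5)
#define TAINT_DIE      (1UL << 7)
#define TAINT_WARN     (1UL << 9)

/* Assumption for this sketch: these three taints are treated as fatal. */
#define FATAL_TAINTS (TAINT_BAD_PAGE | TAINT_DIE | TAINT_WARN)

/* Return only the fatal bits of a taint mask, 0 if none are set. */
unsigned long fatal_taints_in(unsigned long taints)
{
	return taints & FATAL_TAINTS;
}

/* Read the decimal taint mask from path; false if it cannot be read. */
bool read_taints(const char *path, unsigned long *taints)
{
	FILE *f = fopen(path, "r");
	bool ok;

	if (!f)
		return false;

	ok = fscanf(f, "%lu", taints) == 1;
	fclose(f);

	return ok;
}
```

A caller would typically combine the two, for example read_taints("/proc/sys/kernel/tainted", &t) followed by fatal_taints_in(t). A mask with only bit 6 (TAINT_USER) set yields 0 here, so it would not be reported as fatal.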
2020-12-07  runner: Don't kill a test on taint if watching timeouts  Janusz Krzysztofik
We may still be interested in results of a test even if it has tainted the kernel. On the other hand, we need to kill the test on taint if no other means of killing it on a jam is active. If abort on both kernel taint or a timeout is requested, decrease all potential timeouts significantly while the taint is detected instead of aborting immediately. However, report the taint as the reason of the abort if a timeout decreased by the taint expires. v2: Fix missing show_kernel_task_state() lost on rebase conflict resolution (Chris - thanks!) Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2020-08-06  runner: Only claim the test was killed if it was killed  Petri Latvala
If we don't have --abort=taint active and there is a kernel taint, test exiting normally caused the runner to inject a "this test was killed" message to the test's output. Make sure we only inject that if we really did kill the test, and journal the test exit correctly as well. Same goes for the message for exceeding disk usage limits. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arek@hiler.eu> Cc: Lukasz Fiedorowicz <lukasz.fiedorowicz@intel.com> Reviewed-by: Lukasz Fiedorowicz <lukasz.fiedorowicz@intel.com>
2020-07-30  runner: Also generate igt@runner@aborted when aborting internally  Petri Latvala
If we can't kill the (main) test process, or when the test process exits with IGT_EXIT_ABORT, we abort the execution. Pass that information along to the other machinery that tracks whether we aborted, thus also getting that information to the end user in the form of the pseudo-result igt@runner@aborted. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Lukasz Fiedorowicz <lukasz.fiedorowicz@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-07-22  runner: Print a message when aborting due to IGT_EXIT_ABORT  Petri Latvala
Previously, when a test exited with IGT_EXIT_ABORT, we did abort but did it silently. Print a message so runner logs tell a clear message why we didn't execute the rest of the tests. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
2020-07-20  runner: Introduce --disk-usage-limit  Petri Latvala
Disk usage limit is a limit of disk space taken, per (dynamic) subtest. If the test's output, kernel log included, exceeds this limit, the test is killed, similarly to killing the test when the kernel gets tainted. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
2020-07-20  runner: Inject a message when killing test to taints  Petri Latvala
Normally runner injecting a message to the test's stdout/stderr logs has a race condition; The test outputs have special lines (subtest starting/ending) and accidentally injecting stuff in between would cause funky results. When we're killing a test because the kernel got tainted, we know already that we're not getting a subtest ending line and we can inject, if we make sure we have newlines printed before and after the injection. Having a message in the stdout of the test will aid automatic bug filtering. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
2020-04-17  runner: More task debug!  Chris Wilson
In a few cases, we hit a timeout where no process appears to be deadlocked (i.e. tasks stuck in 'D' with intertwined stacks) but everything appears to be running happily. Often, they appear to be fighting over the shrinker, so one naturally presumes we are running low on memory. But for tests that were designed to run with ample memory to spare, that is a little disconcerting and I would like to know where the memory actually went. sysrq('m'): Will dump current memory info to your console Sounds like that should do the trick. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Acked-by: Petri Latvala <petri.latvala@intel.com>
2020-04-07  runner: Show why we dump the task state  Chris Wilson
Include the reason why we are dumping the task state (test timeout) in the kmsg log prior to the task state. Hopefully this helps when reading the dump. v2: Use asprintf to combine the strings into one to avoid error prone manual string handling and enjoy one single write() into the kmsg. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2020-03-25  runner: Remember to sync journal.txt for all writes  Petri Latvala
One missing fdatasync() for starting a subtest. Fixes: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/81 Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-03-25  runner: Only show the kmsg overflow message once  Chris Wilson
Instead of repeating every single time we overflow the read from kmsg, just once per test is enough warning. v2: Just suppress the multiple s/underflow/overflow/ messages. Having a buffer smaller than a single kmsg packet is unlikely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2020-03-23  runner: Abort the run when test exits with IGT_EXIT_ABORT  Arkadiusz Hiler
Now that the IGT tests have a mechanism for signaling broken testing conditions we can stop the run on the first test that has noticed it, and possibly has triggered that state. Traditionally run would have continued with that test failing and the side effects would trickle down into the other tests causing a lot of skip/fails. v2: extra explanations, small cleanup (Petri) Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2020-03-13  runner: Read all kernel logs when there are logs  Petri Latvala
Instead of reading one record at a time between select() calls and tainted-checks etc, use the same at-the-end dmesg dumper whenever there's activity in /dev/kmsg. It's possible that the occasional chunk of missing dmesg we're sometimes hitting is due to reading too slowly, especially if there's a huge gem traceback. Also print a clear message if we hit a log buffer underrun so we know it. Reference: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/79 Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-03-13  runner: Dump the rest of dmesg also when child refuses to die  Petri Latvala
Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-03-13  runner: Handle outputs before checking for timeout  Petri Latvala
Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-02-19  runner: Introduce per-test timeouts  Petri Latvala
A new config option, --per-test-timeout, sets a time a single test cannot exceed without getting itself killed. The time resets when starting a subtest or a dynamic subtest, so an execution with --per-test-timeout=20 can indeed go over 20 seconds as long as it launches a dynamic subtest within that time. As a bonus, verbose log level from runner now also prints dynamic subtest begin/result. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-02-19  runner: Refactor timeouting  Petri Latvala
Instead of aiming for inactivity_timeout and splitting that into suitable intervals for watchdog pinging, replace the whole logic with one-second select() timeouts and checking if we're reaching a timeout condition based on current time and the time passed since a particular event, be it the last activity or the time of signaling the child processes.

With the refactoring, we gain a couple of new features for free:

- use-watchdog now makes sense even without inactivity-timeout. Previously use-watchdog was silently ignored if inactivity-timeout was not set. Now, watchdogs will be used always if configured so, effectively ensuring the device gets rebooted if userspace dies without other timeout tracking.

- Killing tests early on kernel taint now happens even earlier. Previously on an inactive system we possibly waited for some tens of seconds before checking kernel taints.

Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
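The "time passed since a particular event" check that this commit describes can be sketched as a pair of small helpers that would be called after every one-second select() wakeup. The names are hypothetical, and treating a timeout of 0 as "not configured" is an assumption of this example, not something the commit states.

```c
#include <stdbool.h>
#include <time.h>

/* Seconds elapsed between two timestamps (e.g. from CLOCK_MONOTONIC). */
double seconds_between(const struct timespec *then, const struct timespec *now)
{
	return (double)(now->tv_sec - then->tv_sec) +
	       (double)(now->tv_nsec - then->tv_nsec) / 1e9;
}

/*
 * True when at least timeout_s seconds have passed since the event.
 * Assumption: a timeout of 0 (or less) means the timeout is disabled.
 */
bool timeout_reached(const struct timespec *since,
		     const struct timespec *now,
		     double timeout_s)
{
	return timeout_s > 0.0 && seconds_between(since, now) >= timeout_s;
}
```

With this shape, each kind of timeout only needs its own "since" timestamp: the last activity for an inactivity timeout, or the moment the child processes were signaled for a kill timeout.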
2020-02-11  runner: Support dynamic subtests in testlists  Petri Latvala
In a very rudimentary and undocumented manner, testlist files can now have dynamic subtests specified. This feature is intended for very special cases, and the main supported mode of operation with testlist files is still the CI-style "run it all no matter what". The syntax for testlist files is: igt@binary@subtestname@dynamicsubtestname As dynamic subtests are not easily listable, any helpers for generating such testlists are not implemented. If running in multiple-mode, subtests with dynamic subtests specified will run in single-mode instead. Closes: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/45 Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
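To illustrate the igt@binary@subtestname@dynamicsubtestname syntax, the following sketch splits one testlist line into its components. The struct, the function name and the fixed buffer sizes are hypothetical choices for this example and are not taken from the runner.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define FIELD_MAX 128

/* Hypothetical container for one parsed testlist line. */
struct testlist_entry {
	char binary[FIELD_MAX];
	char subtest[FIELD_MAX];  /* empty if not given */
	char dynamic[FIELD_MAX];  /* empty if not given */
};

/* Parse "igt@binary[@subtest[@dynamic]]"; false on malformed input. */
bool parse_testlist_line(const char *line, struct testlist_entry *out)
{
	char buf[FIELD_MAX];
	char *fields[3] = { NULL, NULL, NULL };
	char *p;
	int n;

	memset(out, 0, sizeof(*out));

	if (strncmp(line, "igt@", 4) != 0 || strlen(line + 4) >= sizeof(buf))
		return false;

	strcpy(buf, line + 4);
	buf[strcspn(buf, "\r\n")] = '\0';  /* drop a trailing newline */

	/* Split on '@' into at most three fields. */
	for (p = buf, n = 0; p && n < 3; n++) {
		fields[n] = p;
		p = strchr(p, '@');
		if (p)
			*p++ = '\0';
	}

	/* Reject a fourth component or an empty binary name. */
	if (p || fields[0][0] == '\0')
		return false;

	snprintf(out->binary, sizeof(out->binary), "%s", fields[0]);
	if (fields[1])
		snprintf(out->subtest, sizeof(out->subtest), "%s", fields[1]);
	if (fields[2])
		snprintf(out->dynamic, sizeof(out->dynamic), "%s", fields[2]);

	return true;
}
```

An entry whose dynamic field is non-empty is the case the commit says will be run in single-mode even when multiple-mode was requested.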
2020-02-03  runner: Make the result an incomplete if a test is killed due to taint  Petri Latvala
If we're checking for taints, we kill the test as soon as we notice a taint. Out of the box, such killing will get marked as such and yields a 'timeout' result, which is misleading. The test didn't spend too much time, it just did nasties. Make sure taint-killing results in an 'incomplete' result instead. It's still not completely truthful for the state of the testing but closer than a 'timeout'. And stands out more in CI result analysis. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-01-29  runner: Make sure output is still collected when killing test due to taint  Petri Latvala
If the kernel is tainted, it stays tainted, so make sure the execution monitoring still reaches the output collectors and other fd change handlers. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-01-23  runner: Don't check for taints when not configured for it  Petri Latvala
If someone wants to execute tests without aborting when tainted, they get all their tests just straight up killed on the first taint without actually aborting execution. Obey their wishes and keep running. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-01-21  runner: Clean up quickly if the kernel OOPSed during a test  Chris Wilson
If the kernel OOPSed during the test, it is unlikely to ever complete. Furthermore, we have the reason why it won't complete and so do not need to burden ourselves with the full stacktrace of every process -- or at least we have a more pressing bug to fix before worrying about the system deadlock. v2: Log the post-taint killing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-12-05  runner: Don't wait forever for processes to die  Petri Latvala
While the originally written timeout for process killing (2 seconds) was way too short, waiting indefinitely is suboptimal as well. We're seeing cases where the test is stuck for possibly hours in uninterruptible sleep (IO). Wait a fairly longer selected time period of 2 minutes, because even making progress for that long means the machine is in bad enough state to require a good kicking and booting. v2: - Abort quicker if kernel is tainted (Chris) - Correctly convert process-exists check with kill() to process-does-not-exist Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
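The v2 note about converting a process-exists check with kill() into a process-does-not-exist check is easy to get wrong, so a small sketch may help. It relies on the POSIX behaviour that signal 0 performs only the existence and permission checks; the function name is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * True only when the process is known not to exist.
 * kill(pid, 0) sends no signal. A return of -1 with EPERM means the
 * process exists but we may not signal it, so only ESRCH counts as gone.
 */
bool process_is_gone(pid_t pid)
{
	return kill(pid, 0) == -1 && errno == ESRCH;
}
```

A waiting loop would poll this until it returns true or until the chosen deadline (2 minutes in this commit) expires.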
2019-12-05  runner: Actually ping watchdogs every interval  Petri Latvala
The split to timeout intervals was made to accommodate watchdogs that cannot use a timeout as high as we wanted. Actually using that feature requires us to ping the watchdog every interval even though we handle actual timeouting after all intervals are used up. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-10-21  runner: Don't add timestamps when cannot exec a test  Petri Latvala
Don't add timestamps when printing that we cannot execute a binary from a child (post fork-failed-execv). Timestamps were meant for runner's direct output only, and this was accidentally converted. v2: Rephrase commit message (Arek) Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
2019-10-14  runner: Show kernel state on detecting test timeout  Chris Wilson
When our watchdog expires and we declare the test has timed out, we send it a signal to terminate. The test will produce a backtrace upon receipt of that signal, but oftentimes (especially as we do test and debug the kernel), the test is hung inside the kernel. So we need the kernel state to see where the live/deadlock is occurring. Enter sysrq-t to show the backtraces of all processes (as the one we are searching for may be sleeping). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com>
2019-09-24  runner: Chomp away trailing spaces from cmdline  Chris Wilson
A minor refinement to remove the trailing spaces after converting the NUL-terminators to spaces. v2: Beware the crafty filename entirely composed of spaces. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
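The refinement can be illustrated with a small in-place helper: NUL separators become spaces, then trailing spaces are chomped. The function name is hypothetical, and the sketch assumes the caller's buffer has room for at least len + 1 bytes so the result can always be terminated.

```c
#include <stddef.h>

/*
 * Turn a /proc/<pid>/cmdline style buffer (NUL-separated arguments)
 * into a printable string, in place. Returns the resulting length.
 * The buffer must hold at least len + 1 bytes.
 */
size_t cmdline_to_printable(char *buf, size_t len)
{
	size_t i;

	/* Replace argument separators with spaces. */
	for (i = 0; i < len; i++)
		if (buf[i] == '\0')
			buf[i] = ' ';

	/*
	 * Chomp trailing spaces. The len > 0 guard covers a buffer made
	 * entirely of spaces, which would otherwise underflow the index.
	 */
	while (len > 0 && buf[len - 1] == ' ')
		len--;

	buf[len] = '\0';

	return len;
}
```

The guard in the trimming loop is the point of the v2 note: a name composed only of spaces shrinks to an empty string instead of reading before the start of the buffer.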
2019-09-23  runner: Show more elements of the signaler's argv[]  Chris Wilson
/proc/$pid/cmdline is the entire argv[] including NUL-terminators. Replace the NULs with spaces so we get a better idea of who the signaler was, as often it is a subprocess (such as a child of sudo, or worse java). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-09-20  runner: Add signal sender name when dying  Chris Wilson
We want to know who sent us the fatal signal, for there are plenty of fingers to go around. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com>
2019-09-17  runner: Add support for aborting on network failure  Petri Latvala
If the network goes down while testing, CI tends to interpret that as the device being down, cutting its power after a while. This causes an incomplete to an innocent test, increasing noise in the results.

A new flag to --abort-on-monitored-error, "ping", uses liboping to ping a host configured in .igtrc with one ping after each test execution and aborts the run if there is no reply in a hardcoded amount of time.

v2:
- Use a higher timeout
- Allow hostname configuration from environment

v3:
- Use runner_c_args for holding c args for runner
- Handle runner's meson options in runner/meson.build
- Instead of one ping with 20 second timeout, ping with 1 second timeout for a duration of 20 seconds

v4:
- Rebase
- Use now-exported igt_load_igtrc instead of copypaste code
- Use define for timeout, clearer var name for single attempt timeout

Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
2019-09-13  runner: Add a timestamp to each log message  Chris Wilson
Very handy for correlating events between different logs. This generates igt_runner0.txt output like:

	[28.112360] Initializing watchdogs
	[28.112424] /dev/watchdog0
	[28.114069] [001/269] (960s left) core_auth (basic-auth)
	Starting subtest: basic-auth
	Subtest basic-auth: SUCCESS (0.000s)
	[28.224898] [002/269] (960s left) debugfs_test (read_all_entries)
	Starting subtest: read_all_entries
	Subtest read_all_entries: SUCCESS (0.035s)

The subtest logs are separate (not part of the runner's logging per se), but the flow of events is clear enough from the runner's timestamp for now.

v2: Concatenate split messages into a single call (so that the timestamp is only added once!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-07-22  runner: Make sure that we are closing watchdogs on signals  Arkadiusz Hiler
There are few short windows of opportunity when watchdogs are primed but there is no signal handling in place, so the process may exit without proper shutdown sequence. This patch rearranges the existing code so that we set up the signalfd and BLOCK the signals before setting up watchdogs and UNBLOCK only after the watchdogs are closed properly. If igt_runner exits due to signal, non-zero status code is returned. v2: more error handling and minor touch ups (Simon) Cc: Petri Latvala <petri.latvala@intel.com> Cc: Simon Ser <simon.ser@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Simon Ser <simon.ser@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-07-22  runner: Warn when watchdogs are being closed from the exit handler  Arkadiusz Hiler
instead of being closed normally on a graceful code path Cc: Petri Latvala <petri.latvala@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Simon Ser <simon.ser@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-07-22  runner: Make sure we don't close watchdogs twice  Arkadiusz Hiler
Setting the watchdog fd lists to NULL for extra fireworks if accessed unintentionally. Cc: Petri Latvala <petri.latvala@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Simon Ser <simon.ser@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-06-24  runner/executor: Make sure that intervals_left is always initialized  Arkadiusz Hiler
intervals_left got initialized only when we had a timeout exceeding watchdog capabilities, meaning we had to use multiple shorter intervals. By moving intervals_left = timeout_intervals down, we always initialize it. Cc: Petri Latvala <petri.latvala@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-06-24  runner: Log which signal was used to terminate the runner  Arkadiusz Hiler
Feed the curious ones, aid the troubleshooters. Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Simon Ser <simon.ser@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-06-24  runner: Handle SIGHUP too  Arkadiusz Hiler
Default handler for SIGHUP is also terminating the process, so let's mask it and handle it manually, like the rest of the bunch. Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Simon Ser <simon.ser@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-06-24  runner: Log when watchdog handling fails  Arkadiusz Hiler
If write or ioctl on a watchdog ever fails it will be logged. Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Simon Ser <simon.ser@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-04-12  runner: Make sure oom-killer doesn't kill the runner  Petri Latvala
Tests that eat all of the RAM and then some to invoke the oom-killer deliberately sometimes cause extra casualties. Make sure the runner stays alive. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2019-04-01  runner: Refactor metadata parsing  Arkadiusz Hiler
To aid testing function parsing metadata.txt is split into outer helper that operates on dirfd and inner function that operates on FILE*. This allows us to test the parsing using fmemopen(), limiting the amount of necessary boilerplate. Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
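The outer/inner split can be sketched as below: the inner function only sees a FILE*, so a test can feed it an in-memory buffer through fmemopen(), while the outer helper deals with the directory fd. The struct, the function names and the "key : value" line format with these two keys are assumptions for illustration, not the runner's real metadata schema.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical subset of settings stored in the metadata file. */
struct demo_settings {
	char name[64];
	int inactivity_timeout;
};

/* Inner function: parses "key : value" lines from any stream. */
bool parse_settings_stream(FILE *f, struct demo_settings *s)
{
	char key[64], value[64];

	memset(s, 0, sizeof(*s));

	while (fscanf(f, "%63s : %63[^\n]", key, value) == 2) {
		if (strcmp(key, "name") == 0)
			snprintf(s->name, sizeof(s->name), "%s", value);
		else if (strcmp(key, "inactivity_timeout") == 0)
			s->inactivity_timeout = atoi(value);
		/* Unknown keys are ignored in this sketch. */
	}

	return true;
}

/* Outer helper: opens the file relative to dirfd, then delegates. */
bool parse_settings_at(int dirfd, const char *filename, struct demo_settings *s)
{
	int fd = openat(dirfd, filename, O_RDONLY);
	FILE *f;
	bool ok;

	if (fd < 0)
		return false;

	f = fdopen(fd, "r");
	if (!f) {
		close(fd);
		return false;
	}

	ok = parse_settings_stream(f, s);
	fclose(f);

	return ok;
}
```

In a unit test, fmemopen() over a string literal copy stands in for the file, so no temporary directory or on-disk fixture is needed to exercise the parser.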
2019-03-27  runner: Make taint abort messages more verbose  Arkadiusz Hiler
Since not everyone is familiar with kernel taints, and it is easy to get confused and attribute an abort to an innocent TAINT_USER caused by an unsafe module option, which is usually the first thing people find grepping dmesg for "taint", we should provide more guidance. This patch extends the abort log by printing the taint names, as found in the kernel, along with a short explanation, so people know what to look for in the dmesg. v2: rebase, reword Cc: Petri Latvala <petri.latvala@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-03-25  runner/executor: refactor error handling  Simon Ser
* Refactor to use goto error handling
* Make execute_test_process noreturn to remove uninitialized variable warning
* Check fork() return value

Signed-off-by: Simon Ser <simon.ser@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2019-02-21  runner: Exit with 0 on dry-run  Petri Latvala
v2: Adjust tests accordingly Signed-off-by: Petri Latvala <petri.latvala@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>