igt-gpu-tools.git - DRM IGT GPU Tools

Age	Commit message	Author
2018-04-11	tests/perf_pmu: Avoid RT thread for accuracy test	Tvrtko Ursulin
	Realtime scheduling interferes with execlists submission (tasklet) so try to simplify the PWM loop in a few ways: * Drop RT. * Longer batches for smaller systematic error. * More truthful test duration calculation. * Less clock queries. * No self-adjust - instead just report the achieved cycle and let the parent check against it. * Report absolute cycle error. v2: * Bring back self-adjust. (Chris Wilson) (But slightly fixed version with no overflow.) v3: * Log average and mean calibration for each pass. v4: * Eliminate development leftovers. * Fix variance logging. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-03-29	tests/perf_pmu: Fix usage of for_each_engine_class_instance	Tvrtko Ursulin
	Wrong file descriptor was passed to the iterator. This had currently no effect, since it wasn't used in the macro, but needs to be fixed. At the same time make the macro consistent by checking for engine presence like the other iterators do. Added __for_each_engine_class_instance which does not check for engine presence and so is useful for enumerating all possible engines - like for instance for subtest enumeration. And another 'wrong fd used' fixlet in the render node subtests. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Michel Thierry <michel.thierry@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2018-03-27	igt/perf_pmu: Most-busy requires at least one busy engine	Chris Wilson
	The test is whether with all but one engine busy we record the correct load on each engine. If we only have one engine, this test degenerates into all-idle/all-busy, so we can skip to avoid crashing on the assumption that we have a busy spinner. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2018-03-26	tests/perf_pmu: Improve accuracy by waiting on spinner to start	Tvrtko Ursulin
	More than one test assumes that the spinner is running pretty much immediately after we have created or submitted it. In actuality there is a variable delay, especially on execlists platforms, between submission and the spin batch starting to run on the hardware. To enable tests which care about this level of timing to account for this, we add a new spin batch constructor which provides an output field which can be polled to determine when the batch actually started running. This is implemented via MI_STOREDW_IMM from the spin batch, writing into a memory-mapped page shared with userspace. Using this facility from perf_pmu, where applicable, should reduce the very occasional test failures across the set and platforms. v2: Chris Wilson: * Use caching mapping if available. * Handle old gens better. * Use gem_can_store_dword. * Cache exec obj array in spin_batch_t for easier resubmit. v3: * Forgot I915_EXEC_NO_RELOC. (Chris Wilson) v4: * Mask out all non-engine flags in gem_can_store_dword. * Added some debug logging. v5: * Fix relocs and batch munmap. (Chris) * Added assert idle spinner batch looks as expected. v6: * Skip accuracy tests when !gem_can_store_dword. v7: * Fix batch recursion reloc address. v8: Chris Wilson: * Pull up gem_can_store_dword check before we start submitting. * Build spinner batch in a way we can skip store dword when not needed so we can run on SandyBridge. v9: * Fix wait on spinner. * More tweaks to accuracy test. v10: * Dropped accuracy subtest changes due to problems with RT thread and tasklet submission. v11: * Use READ_ONCE. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # IRC
2018-03-12	tests/perf_pmu: Use absolute tolerance in accuracy tests	Tvrtko Ursulin
	We need to use absolute tolerance when asserting on percentages. Relative tolerance in this case is unfair and inaccurate since its strictness varies with relative target busyness. v2: * Do not include spin batch edit and submit into measured time. * Open PMU before child is in test PWM phase. * No need to emit test PWM for twice as long with the new explicit synchronization via pipe. * Log test duration in ms for better readability. * Drop inverse assert. (Chris Wilson) v3: Explain tasklet delay. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
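A minimal C sketch of why a relative tolerance is uneven across busyness targets. The helper names `within_relative` and `within_absolute` are hypothetical and are not taken from perf_pmu.c; all values are busyness percentages.

```c
#include <stdbool.h>

/* Hypothetical helper: absolute distance between two percentages. */
static double abs_diff(double a, double b)
{
	return a > b ? a - b : b - a;
}

/* Hypothetical helper: the allowed error scales with the target. */
static bool within_relative(double measured, double target, double tolerance)
{
	return abs_diff(measured, target) <= tolerance * target;
}

/* Hypothetical helper: the allowed error is a fixed number of percentage points. */
static bool within_absolute(double measured, double target, double points)
{
	return abs_diff(measured, target) <= points;
}
```

With a 5% relative tolerance, a 2% target only admits an error of 0.1 percentage points while a 98% target admits 4.9, so the same one-point miss fails at 2% and passes at 98%. A fixed tolerance in percentage points treats both targets alike.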
2018-03-05	tests/perf_pmu: Handle CPU hotplug failures better	Chris Wilson
	CPU hotplug, especially CPU0, can be flaky on commodity hardware. To improve test reliability and response times when testing larger runs we need to handle those cases better. Handle failures to off-line a CPU by immediately skipping the test, and failures to on-line a CPU by immediately rebooting the machine. This patch includes igt_sysrq_reboot implementation from Chris Wilson. v2: Halt by default, reboot if env variable IGT_REBOOT_ON_FATAL_ERROR is set. (Petri Latvala) v3: Add missing docs and update stale comment. (Petri Latvala) v4: Use pause instead of sleep. (Chris Wilson) v5: Newlines! (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>
2018-03-05	tests/perf_pmu: Test busyness reporting in face of GPU hangs	Tvrtko Ursulin
	Verify that the reported busyness is in line with what we would expect from a batch which causes a hang and gets kicked out from the engine. v2: Change to explicit igt_force_gpu_reset instead of guessing when a spin batch will hang. (Chris Wilson) v3: Assert and comment test expectations. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-22	igt/perf_pmu: Fix 64b printf-isms	Chris Wilson
	My bad, perf_pmu.c: In function ‘accuracy’: perf_pmu.c:1533:4: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘uint64_t’ [-Wformat] perf_pmu.c:1533:4: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘uint64_t’ [-Wformat] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
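The usual portable fix for this class of warning is the `PRIu64` macro from `<inttypes.h>`. The sketch below shows the pattern; `format_counter` is a hypothetical helper for illustration, not code from perf_pmu.c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit counter into buf, return its length. */
static int format_counter(char *buf, size_t len, uint64_t value)
{
	/*
	 * PRIu64 expands to the conversion matching uint64_t on the
	 * current target, so the format is correct on both 32-bit and
	 * 64-bit builds, unlike a hard-coded "%lu".
	 */
	return snprintf(buf, len, "%" PRIu64, value);
}
```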
2018-02-22	tests/perf_pmu: Skip hotplug test on Broxton	Tvrtko Ursulin
	Apollolake machine in the shards cannot bring the CPU0 back online so skip the test on all Broxtons for now. v2: Fix inverted check. v3: igt_skip_on. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-21	igt/perf_pmu: Use a self-correcting busy pwm	Chris Wilson
	Convert the busy pwm from using a single calibration pass with a fixed target into a self-correcting pwm that tries to adjust how long to sleep on each pwm in order to converge at the target busy %. Being self-correcting, it should fare better against the more variable systems CI presents. v2: Be fair and equally strict for low/high busy % v3: target_idle_us and calculate expected from timing of each individual pass Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105157 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
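One way such a self-correcting loop can pick its next sleep is sketched below. This is an illustration only, not the code from the commit: `next_idle_ns` is a hypothetical helper, and it assumes the loop keeps running totals of measured busy and idle time in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the busy and idle time accumulated so far,
 * return how long to idle next so that busy / (busy + idle) over the
 * whole run converges on target_busy_pct.
 */
static uint64_t next_idle_ns(uint64_t total_busy_ns, uint64_t total_idle_ns,
			     unsigned int target_busy_pct)
{
	uint64_t idle_wanted;

	assert(target_busy_pct > 0 && target_busy_pct <= 100);

	/* Total idle time that matches the busy time at the target ratio. */
	idle_wanted = total_busy_ns * (100 - target_busy_pct) / target_busy_pct;

	/* Earlier oversleeping is paid back by shortening this sleep. */
	return idle_wanted > total_idle_ns ? idle_wanted - total_idle_ns : 0;
}
```

Because the decision uses accumulated totals rather than a one-off calibration, an oversleep in one cycle shortens the next sleep, which is the converging behaviour the message describes. The multiplication can overflow for very large totals, so a real implementation would need to guard against that.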
2018-02-19	perf_pmu: Fix some compile warnings with old compilers / 32-bit builds	Tvrtko Ursulin
	Correct printf format for uint64_t and one "may be uninitialized". v2: Fix one more "may be uninitialized". (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-19	igt/perf_pmu: Retain original GTT offset when resubmitting the spinner	Chris Wilson
	Since the spin batch contains a relocation to itself, when we resubmit the spinner, we must ensure that it is executed at the same location. While the spinner is busy, resubmitting will reuse the same location, but if it is idle, the kernel may move it between execution. In this case, we need to record the previous location (in obj.offset) and then demand the kernel reuse the location using EXEC_OBJECT_PINNED. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2018-02-16	tests/perf_pmu: Verify engine busyness accuracy	Tvrtko Ursulin
	A subtest to verify that the engine busyness is reported with expected accuracy on platforms where the feature is available. We test three patterns: 2%, 50% and 98% load per engine. v2: * Use spin batch instead of nop calibration. * Various tweaks. v3: * Change loops to be time based. * Use __igt_spin_batch_new inside timing sensitive loops. * Fixed PWM sleep handling. v4: * Use restarting spin batch. * Calibrate more carefully by looking at the real PWM loop. v5: * Made standalone. * Better info messages. * Tweak sleep compensation. v6: * Some final tweaks. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-15	tests/perf_pmu: Log perf timestamp in semaphore wait tests	Tvrtko Ursulin
	We need more data to debug sporadic test failures. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-13	tests/perf_pmu: Give sampling more time	Tvrtko Ursulin
	We get occasional errors like: (perf_pmu:21315) CRITICAL: Test assertion failure function sema_wait, file perf_pmu.c:631: (perf_pmu:21315) CRITICAL: Failed assertion: (double)(val[1] - val[0]) <= (1.0 + (tolerance)) * (double)(slept) && (double)(val[1] - val[0]) >= (1.0 - (tolerance)) * (double)(slept) (perf_pmu:21315) CRITICAL: 'val[1] - val[0]' != 'slept' (450000000.000000 not within 5.000000% tolerance of 500129618.000000) Suggesting a time disagreement between userspace and the PMU. At the moment I got no better ideas than fiddling with delays to see if it improves things. v2: Wait for sampling to start instead of hardcoded sleep. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-13	tests/perf_pmu: Handle thermally throttled devices	Tvrtko Ursulin
	Some systems cannot reach the advertised maximum frequency due to throttling. Handle them by considering a 100MHz lower limit. v2: Use more relaxed tolerance only in the downward direction. (Chris Wilson) v3: Improved assert message. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-13	tests/perf_pmu: Use perf timestamps in a few more places	Tvrtko Ursulin
	Use perf timestamps in more places where possible. v2: Log measure_usleep vs perf timestamps. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-12	igt/perf_pmu: Semaphores do not exist before gen6	Chris Wilson
	We don't expect to be able to open the I915_SAMPLE_SEMA on gen5 and earlier as the HW doesn't support semaphores. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2018-02-07	tests/perf_pmu: Test RC6 during runtime suspend	Tvrtko Ursulin
	Test to check that the RC6 counter works as expected during and after runtime suspend. v2: * Use correct sysfs root by using IGT helpers. * Turn off display to allow runtime suspend. (Imre) * Two subtest flavours. v3: * drmModeFreeResources. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-07	tests/perf_pmu: Use perf timestamps when calculating average frequency	Tvrtko Ursulin
	We can use perf reported timestamps to potentially get a more accurate frequency average. Let's see if this improves the situation for sporadic failures like on APL: Frequency: min=100, max=750, boost=750 MHz Min frequency: requested 90.0, actual 90.0 Max frequency: requested 749.8, actual 647.9 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-07	lib: Move __gem_context_create to common ioctl wrapper library.	Antonio Argenziano
	This patch adds a context creation ioctl wrapper that returns the error for the caller to consume. Multiple tests that already implemented this have been changed to use the new library function. v2: - Add gem_require_contexts() to check for contexts support (Chris) v3: - Add gem_has_contexts to check for contexts support and change gem_require_contexts to skip if contexts support is not available. (Chris) v4: - Cosmetic changes and use lib function in gem_ctx_create where possible. (Michal) v5: - Use gem_contexts_require() in tests and fixtures. (Chris) Signed-off-by: Antonio Argenziano <antonio.argenziano@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Use short batches from hotplug test	Tvrtko Ursulin
	This test emits a spin batch which runs roughly for N CPU cores seconds. As such these can be declared as GPU hangs, so work around that by looping with shorter batches. v2: * Use overlapping spinners. (Chris Wilson) * Go back to igt_fork. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Explicitly test for engine availability in init tests	Tvrtko Ursulin
	Test will succeed if present engine can be opened, or if the missing engine reports the correct error code. v2: * Use the right errno. * Close fd only on success. (Chris Wilson) v3: * Only sample errno on failure. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Always skip missing engines	Tvrtko Ursulin
	Always skip missing engines to make tests skip very early and avoid losing time in tests which need to do setups or waits before they would otherwise detect this. To ensure PMU is rejecting opening missing engines we will add an explicit test later. v2: Use subtest groups for engine checking. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: PMU enable race test	Tvrtko Ursulin
	Test that the PMU can be safely enabled in face of interrupt-heavy load on an engine. The test is probabilistic so run it for ten seconds in a loop to increase the odds of hitting the race. v2: Repeat the test a few times, until a timeout. (Chris Wilson) v3: Added note in code and commit about probabilistic nature of the test. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Add trailing edge idle test variants	Tvrtko Ursulin
	Additional set of tests which stops the batch and sleeps for a bit before sampling the counter in order to test that the busyness stops being recorded correctly. v2: Reorganize end_spin and guards a bit. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Convert to flags	Tvrtko Ursulin
	Will need more modes soon. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Use measured sleep in all time based tests	Tvrtko Ursulin
	Stop relying on timers to end spin batches but use measured sleep, which was established to work better, in all time based tests. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: More busy measurement tightening	Tvrtko Ursulin
	Where we use measured sleeps, take PMU samples immediately before and after and look at their delta in order to minimize the effect of any test setup delays. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-02-05	tests/perf_pmu: Tighten busy measurement	Tvrtko Ursulin
	In cases where we manually terminate the busy batch, we always want to sample busyness while the batch is running, just before we will terminate it, and not the other way around. This way we make the window for unwanted idleness getting sampled smaller. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-15	tests/perf_pmu: Exercise busy stats and lite-restore	Tvrtko Ursulin
	While developing a fix for an accounting hole in busy stats we realized lite-restore is a potential edge case which would be interesting to check is properly handled. It is unfortunately quite timing sensitive to hit lite-restore in the fashion the test needs, so the downside of this test is that it suffers from a high rate of false negatives. v2: * Make the sleep unconditional and use scientific notation for large constants. (Chris Wilson) * Use gem_quiescent_gpu instead of gem_sync+usleep to ensure context complete was received under execlists. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-15	tests/perf_pmu: Verify busyness when PMU is enabled after engine got busy	Tvrtko Ursulin
	Make sure busyness is correctly reported when PMU is enabled after the engine is already busy with a single long batch. v2: * Make the sleep unconditional and use scientific notation for large constants. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-04	igt/perf_pmu: Skip GEM checks for repeated spin_batch allocations	Chris Wilson
	Each call to igt_spin_batch_new_fence will do a stalling check to verify that GEM is functional before submitting the spinning batch. In a loop, this means that we may end up waiting for our earlier spinning batches... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
2017-12-22	tests/perf_pmu: Simplify interrupt testing	Tvrtko Ursulin
	Rather than calibrate and emit nop batches, use a manually signalled chain of spinners to generate the desired interrupts. v2: Two flavours of interrupt generation. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-12-22	igt/perf_pmu: Speed up frequency measurement	Chris Wilson
	Use the normal batch_duration_ns and display the sampled frequency: Frequency: min=100, max=750, boost=750 MHz Min frequency: requested 100.0, actual 100.0 Max frequency: requested 755.6, actual 755.6 v2: Remove the early spin_batch_end and assert the measured frequencies are within tolerance of our target. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-22	igt/perf_pmu: Measure the reference batch for all-busy-check-all	Chris Wilson
	Don't rely on the timer being precise when we can sleep for a known duration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-22	igt/perf_pmu: Measure the reference batch for busy-check-all	Chris Wilson
	Don't rely on the timer being precise when we can sleep for a known duration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-22	igt/perf_pmu: Tighten measurements for most-busy	Chris Wilson
	Create all the spinners before starting the sampler and then measure how long we sleep. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104160 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-22	igt/perf_pmu: Tighten busy measurement	Chris Wilson
	Sleep for a known duration. In particular, CI once saw a measurement for busyness greater than the intended batch_duration! v2: Go back to starting pmu sampling outside of spinner; the GPU should be idle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104241 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-07	igt/perf_pmu: Tweak wait_for_rc6, yet again	Chris Wilson
	Still CI remains obstinate that RC6 is not smoothly incrementing during the sample period. Tweak the wait_for_rc6() to first wait for the initial Evaluation Interval before polling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-05	igt/perf_pmu: Replace hard-coded sleep before rc6 with a probe	Chris Wilson
	Instead of trying to sleep for 2 evaluation intervals and then assuming that rc6 is working, poll the rc6 residency instead. v2: dce References: https://bugs.freedesktop.org/show_bug.cgi?id=103929 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-04	igt/perf_pmu: Tighten semaphore-wait measurement	Chris Wilson
	Record the before/after semaphore-wait values around the sleep to try to reduce the inaccuracy from scheduler delays. Previously, the samples were taken before submitting the batch and then after synchronising its completion. The measurement will then be the total that the semaphore was being sampled, but with the extra syscalls intervening may have drifted from the sleep duration. To further reduce the disparity, wait for the batch to start executing before taking our samples. References: https://bugs.freedesktop.org/show_bug.cgi?id=104013 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-30	igt/perf_pmu: Increase delay for rc6 to start	Chris Wilson
	I was thinking of the RC6 threshold parameter, but needed to consider the RC6 evaluation interval instead. RC6 doesn't enable until activity is below the threshold inside an evaluation interval, therefore we need to wait at least 2 EI after idling before we can expect RC6 to be enabled. Fixes: 55a17bc2d040 ("igt/perf_pmu: Reduce arbitrary delays before rc6") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-28	tests/perf_pmu: Sync invalid-init with i915 changes	Tvrtko Ursulin
	i915 started returning -EINVAL for incorrect CPU. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-27	igt/perf_pmu: Keep batch_duration_ns as the minimum measurement duration	Chris Wilson
	We have chosen batch_duration_ns to be the minimum duration we need to meet our accuracy requirements for legacy ringbuffer PMU sampling. As such, we need to be careful to use multiples of it during tests, and not split it into different phases within a test, like multi_client does. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-24	intel/pmu: Catch-up with i915 RC6 aggregation changes	Tvrtko Ursulin
	Since i915 PMU is removing separate RC6 counters and now aggregates all under a single one, catch up the test and intel-gpu-overlay with those changes. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-24	igt/perf_pmu: Recalibrate interrupt loop.	Chris Wilson
	We have to be careful in our calibration loop, too slow and we timeout, too fast and we don't emit an interrupt! On fast legacy devices, we would overflow the calibration calculation... v2: Give the time constants a name. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-24	igt/perf_pmu: Stop peeking at intel_mmio registers	Chris Wilson
	Program the MI_WAIT_FOR_EVENT without reference to DERRMR by knowing its state is ~0u when not in use, and is only in use when userspace requires it. By not touching intel_register_access we completely eliminate the risk that we leak the forcewake ref, which can cause later rc6 to fail. At the same time, note that vlv/chv use a different mechanism (read none) for coupling between the render engine and display. v2: Note that we assume DERRMR should be ~0u when not in use. For futureproofing one might like to do SRM/LRM (but I believe that if the HW changes that much, we are likely to need a bigger boat). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-23	igt/perf_pmu: Reduce arbitrary delays before rc6	Chris Wilson
	gem_quiescent_gpu() is supposed to ensure that the HW is idle, and in the process kick the GPU into rc6, so we should not need a long delay afterwards to ensure that we are indeed in rc6. We do however need a small delay in order to be sure that rc6 cycle counter has started and stopped. v2: Apply to rc6p as well. v3: The longest rc6 timeout (before the HW kicks in and enables rc6 on an idle GPU) is 50ms, so make sure that at least that time has passed since we were busy. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-11-23	tests/perf_pmu: Bump measuring duration for semaphores as well	Tvrtko Ursulin
	As Chris has discovered, 100ms is not long enough to cover the sampling error in general, so fix the semaphore subtest as well to measure for 500ms. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>