path: root/benchmarks/gem_latency.c
Age | Commit message | Author
2017-09-08 | build: remove _GNU_SOURCE from source files | Daniel Vetter
No need to define it in the source files; the build system takes care of that.
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Acked-by: Petri Latvala <petri.latvala@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-05-31 | lib: Moving gem_execbuf_wr to ioctl_wrappers | Lukasz Fiedorowicz
gem_execbuf_wr was duplicated in multiple places; move everything to lib/.
Signed-off-by: Lukasz Fiedorowicz <lukasz.fiedorowicz@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-04-19 | benchmarks/gem_latency: Provide LOCAL defines for old libdrm | Chris Wilson
In order to bend over backwards to keep supporting Android.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-21 | Restore "lib: Open debugfs files for the given DRM device" | Chris Wilson
This reverts commit 25fbae15262cf570e207e62f50e7c5233e06bc67, restoring

    commit 301ad44cdf1b868b1ab89096721da91fa8541fdc
    Author: Tomeu Vizoso <tomeu.vizoso@collabora.com>
    Date:   Thu Mar 2 10:37:11 2017 +0100

        lib: Open debugfs files for the given DRM device

with fixes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-21 | Revert "lib: Open debugfs files for the given DRM device" | Tomeu Vizoso
This reverts commit 301ad44cdf1b868b1ab89096721da91fa8541fdc. When a render-only device is opened and gem_quiescent_gpu is called, we need to use the debugfs dir for the master device instead.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2017-03-21 | lib: Open debugfs files for the given DRM device | Tomeu Vizoso
When opening a DRM debugfs file, locate the right path based on the given DRM device FD. This is needed so, in setups with more than one DRM device, any operations on debugfs files affect the expected DRM device.

v2: - rebased and fixed new API additions

v3: - updated chamelium test, which was missed previously
    - use the minor of the device for the debugfs path, not the major
    - have a proper exit handler for calling igt_hpd_storm_reset with the right device fd

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
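For reference, the lookup this change describes boils down to deriving the DRM minor from the open fd and building the per-device debugfs path from it. A minimal sketch of that idea (not the actual lib/igt_debugfs.c implementation, and assuming debugfs is mounted at the usual /sys/kernel/debug):

    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/sysmacros.h>

    /* Build the debugfs directory for the DRM device behind drm_fd. */
    static int drm_debugfs_path(int drm_fd, char *path, size_t len)
    {
        struct stat st;

        if (fstat(drm_fd, &st) || !S_ISCHR(st.st_mode))
            return -1;

        /* DRM debugfs entries live under a directory named after the minor. */
        snprintf(path, len, "/sys/kernel/debug/dri/%d", minor(st.st_rdev));
        return 0;
    }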
2017-02-24 | benchmarks/gem_latency: Fix compiler warning | Mika Kahola
Fix compiler warning about I915_EXEC_FENCE_OUT definition/redefinition as it is defined in libdrm/i915_drm.h:890:0

    gem_latency.c:48:0: warning: "I915_EXEC_FENCE_OUT" redefined
     #define I915_EXEC_FENCE_OUT (1 << 17)
     ^
    In file included from ../lib/intel_batchbuffer.h:6:0,
                     from ../lib/drmtest.h:39,
                     from ../lib/igt.h:27,
                     from gem_latency.c:31:

Signed-off-by: Mika Kahola <mika.kahola@intel.com>
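A guarded fallback definition is enough to keep building against old libdrm headers without tripping the redefinition warning; a minimal sketch of that pattern (the 2017-04-19 change above takes the related LOCAL_-prefixed route instead):

    /* Only provide the flag ourselves if the installed i915_drm.h lacks it. */
    #ifndef I915_EXEC_FENCE_OUT
    #define I915_EXEC_FENCE_OUT (1 << 17)
    #endif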
2016-09-02 | benchmarks/gem_latency: Measure fence wakeup latencies | Chris Wilson
Useful for comparing the cost of explicit fences versus implicit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
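On the explicit side the kernel hands back a sync_file fd (via I915_EXEC_FENCE_OUT, per the defines above) that the waiter can block on. A hedged sketch of just the waiter half, leaving out the execbuf setup and the GPU-side completion timestamp the latency is ultimately compared against:

    #include <poll.h>
    #include <time.h>

    /* Block until the fence signals, then return the CPU wakeup time. */
    static double fence_wakeup_time(int fence_fd)
    {
        struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
        struct timespec ts;

        poll(&pfd, 1, -1);                      /* wakes once the fence signals */
        clock_gettime(CLOCK_MONOTONIC, &ts);    /* timestamp the wakeup */

        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }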
2016-05-09 | benchmarks/gem_latency: Revert to unsafe mmio access on gen7 | Chris Wilson
In theory, we only need to worry about concurrent mmio writes to the same cacheline. So far, disabling the spinlock hasn't hung the machine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-02 | benchmarks/gem_latency: Report throughput | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-27 | benchmark/gem_latency: sync startup correctly | Chris Wilson
When waiting for the producers to start, use the cond/mutex of the Nth producer and not always the first.
Spotted-by: "Goel, Akash" <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
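The handshake in question is an ordinary condvar wait, just taken against each producer's own mutex/cond pair. A minimal sketch with hypothetical structure and field names (not the real gem_latency.c types):

    #include <pthread.h>
    #include <stdbool.h>

    struct producer {
        pthread_mutex_t lock;
        pthread_cond_t ready_cond;
        bool ready;
    };

    /* Wait until every producer has flagged itself ready. */
    static void wait_for_producers(struct producer *producers, int count)
    {
        for (int n = 0; n < count; n++) {
            pthread_mutex_lock(&producers[n].lock);     /* Nth producer's own mutex */
            while (!producers[n].ready)
                pthread_cond_wait(&producers[n].ready_cond, &producers[n].lock);
            pthread_mutex_unlock(&producers[n].lock);
        }
    }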
2016-04-03 | benchmarks/gem_latency: Add a -C switch to measure impact of cmdparser | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-08 | benchmarks/gem_latency: Replace igt_stats with igt_mean | Chris Wilson
Use a simpler statically allocated struct for computing the mean, as otherwise we may run out of memory!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
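The appeal of a running mean is that it needs constant storage no matter how many samples arrive, unlike accumulating every sample the way igt_stats does. A minimal sketch of the idea (illustrative only, not the igt_mean API):

    struct running_mean {
        double mean;
        unsigned long count;
    };

    /* Fold one sample into the mean without storing it. */
    static void running_mean_add(struct running_mean *m, double value)
    {
        m->count++;
        m->mean += (value - m->mean) / m->count;
    }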
2016-01-06 | benchmarks/gem_latency: Allow setting an infinite time | Chris Wilson
Well, 24000 years.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-21 | benchmarks/gem_latency: Hide spinlocks for android | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-21 | benchmarks/gem_latency: Serialise mmio reads | Chris Wilson
The joy of our hardware; don't let two threads attempt to read the same register at the same time.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
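In practice that serialisation is just a lock taken around the register read. A sketch of the shape of it, where mmio_read32() stands in for whatever register accessor is actually used and the spinlock is assumed to have been set up with pthread_spin_init():

    #include <pthread.h>
    #include <stdint.h>

    static pthread_spinlock_t mmio_lock;    /* pthread_spin_init() elsewhere */

    /* Let only one thread touch the timestamp register at a time. */
    static uint32_t read_timestamp_serialised(uint32_t (*mmio_read32)(uint32_t),
                                              uint32_t reg)
    {
        uint32_t value;

        pthread_spin_lock(&mmio_lock);
        value = mmio_read32(reg);
        pthread_spin_unlock(&mmio_lock);

        return value;
    }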
2015-12-21 | benchmarks/gem_latency: Guard against inferior pthreads.h | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 | benchmarks/gem_latency: Measure CPU usage | Chris Wilson
Try and gauge the amount of CPU time used for each dispatch/wait cycle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
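One straightforward way to do that per thread is to sample the thread CPU clock around each cycle; a hedged sketch, which may or may not match the accounting gem_latency.c actually does:

    #include <time.h>

    /* CPU time consumed by the calling thread so far, in seconds. */
    static double thread_cpu_seconds(void)
    {
        struct timespec ts;

        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    /* usage: t = thread_cpu_seconds(); dispatch_and_wait(); t = thread_cpu_seconds() - t; */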
2015-12-20 | benchmarks/gem_latency: Measure effect of using RealTime priority | Chris Wilson
Allow the producers to be set with maximum RT priority to verify that the waiters are not exhibiting priority inversion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
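Bumping a producer thread to maximum realtime priority comes down to a couple of scheduler calls; a minimal sketch (requires CAP_SYS_NICE or equivalent, and not necessarily how the benchmark wires up its option):

    #include <pthread.h>
    #include <sched.h>

    /* Move a thread onto SCHED_FIFO at the highest available priority. */
    static int set_max_rt_priority(pthread_t thread)
    {
        struct sched_param param = {
            .sched_priority = sched_get_priority_max(SCHED_FIFO),
        };

        return pthread_setschedparam(thread, SCHED_FIFO, &param);
    }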
2015-12-20 | benchmarks/gem_latency: Use RCS on Sandybridge | Chris Wilson
Reading BCS_TIMESTAMP just returns 0...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 | benchmarks/gem_latency: Rearrange thread cancellation | Chris Wilson
Try a different pattern to cascade the cancellation from producers to their consumers in order to avoid one potential deadlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
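One such pattern is to avoid cancelling consumers directly and instead have each producer flag completion and broadcast on its own condvar so its consumers exit themselves. A sketch of that shape, with hypothetical names and no claim that this is the exact arrangement the commit settles on:

    #include <pthread.h>
    #include <stdbool.h>

    struct producer_state {
        pthread_mutex_t lock;
        pthread_cond_t wake;
        bool done;
    };

    /* Producer side of the cascade: mark done and wake every consumer. */
    static void producer_stop(struct producer_state *p)
    {
        pthread_mutex_lock(&p->lock);
        p->done = true;                    /* consumers re-check this flag */
        pthread_cond_broadcast(&p->wake);
        pthread_mutex_unlock(&p->lock);
    }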
2015-12-20 | benchmarks/gem_latency: Tweak workload | Chris Wilson
Do the workload before the nop, so that if combining both, there is a better chance for the spurious interrupts.

Emit just one workload batch (use the nops to generate spurious interrupts) and apply the factor to the number of copies to make inside the workload - the intention is that this gives sufficient time for all producers to run concurrently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Add output field specifier | Chris Wilson
Just to make it easier to integrate into ezbench.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Split the nop/work/latency measurement | Chris Wilson
Split the distinct phases (generate interrupts, busywork, measure latency) into separate batches for finer control.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Add time control | Chris Wilson
Allow the user to choose a time to run for (default: 10s).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Add nop dispatch latency measurement | Chris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Expose the workload factor | Chris Wilson
Allow the user to select how many batches each producer submits before waiting.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Measure whole execution throughput | Chris Wilson
Knowing how long it takes to execute the workload (and how that scales) is interesting to put the latency figures into perspective.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmarks/gem_latency: Fix for !LLC | Chris Wilson
Late last night I forgot I had only added the LLC CPU mmapping and not the !LLC GTT mapping for byt/bsw.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 | benchmark: Measure of latency of producers -> consumers, gem_latency | Chris Wilson
The goal is to measure how long it takes for clients waiting on results to wake up after a buffer completes, and in doing so ensure scalability of the kernel to a large number of clients.

We spawn a number of producers. Each producer submits a busyload to the system and records in the GPU the BCS timestamp of when the batch completes. Then each producer spawns a number of waiters, who wait upon the batch completion, measure the current BCS timestamp register and compare it against the recorded value.

By varying the number of producers and consumers, we can study different aspects of the design, in particular how many wakeups the kernel does for each interrupt (end of batch). The more wakeups on each batch, the longer it takes for any one client to finish.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
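A CPU-only analogue of that producer/waiter measurement, for illustration only: in the real benchmark the completion timestamp is the BCS timestamp the GPU wrote when the batch finished, and the waiter reads the timestamp register on wakeup, whereas here both sides simply use CLOCK_MONOTONIC:

    #include <pthread.h>
    #include <time.h>

    static double now(void)
    {
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    struct batch {
        pthread_mutex_t lock;
        pthread_cond_t done;
        double complete_at;     /* stands in for the GPU-recorded timestamp */
    };

    /* Waiter: sleep until the batch completes, then report wakeup latency. */
    static double wait_and_measure(struct batch *b)
    {
        double latency;

        pthread_mutex_lock(&b->lock);
        while (b->complete_at == 0.0)
            pthread_cond_wait(&b->done, &b->lock);
        latency = now() - b->complete_at;
        pthread_mutex_unlock(&b->lock);

        return latency;
    }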