igt-gpu-tools.git - DRM IGT GPU Tools

Age	Commit message	Author
2015-12-19	benchmarks: Remove gem_wait	Chris Wilson
	Superseded by gem_latency. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19	benchmark: Measure of latency of producers -> consumers, gem_latency	Chris Wilson
	The goal is to measure how long it takes for clients waiting on results to wake up after a buffer completes, and in doing so ensure scalability of the kernel to a large number of clients. We spawn a number of producers. Each producer submits a busyload to the system and records in the GPU the BCS timestamp of when the batch completes. Then each producer spawns a number of waiters, who wait upon the batch completion and measure the current BCS timestamp register and compare against the recorded value. By varying the number of producers and consumers, we can study different aspects of the design, in particular how many wakeups the kernel does for each interrupt (end of batch). The more wakeups on each batch, the longer it takes for any one client to finish. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-04	benchmarks/gem_exec_nop: Flush retirement lists before executing	Chris Wilson
	wait-ioctl skips a couple of side-effects of retiring, so provoke them using set-domain before we sleep. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-27	benchmarks/gem_exec_ctx: Measure switching between fds	Chris Wilson
	Switching between fds also involves a context switch, include it amongst the measurements. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-22	benchmarks: Add a set-domain benchmark	Chris Wilson
	Benchmark the overhead of changing from GTT to CPU domains and vice versa. Effectively this measures the cost of a clflush, and how well the driver can avoid them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-12	benchmarks/gem_blt: Fixup a couple of non-llc foibles	Chris Wilson
	When extending the batch for multiple copies, we need to remember to flag it as being in the CPU write domain so that the new values get flushed out to main memory before execution. We also have to be careful not to specify NO_RELOC for the extended batch as the execobjects will have been updated but we write the wrong presumed offsets. Subsequent iterations will be correct and we can tell the kernel then to skip the relocations entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-11	Fix comparison of unsigned integers	Thomas Wood
	Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-11-10	benchmarks: Add README	Chris Wilson
	Add a README to introduce the ezbench.sh benchmark runner. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-11-10	benchmarks/gem_blt: Report peak throughput	Chris Wilson
	Report the highest throughput measured from a large set of runs to improve sensitivity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-30	benchmarks/gem_wait: Remove pthread_cancel()	Chris Wilson
	Apparently the pthread shim on Android doesn't have pthread cancellation, so use the plain old volatile to terminate the CPU hogs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-30	benchmark/gem_wait: poc for benchmarking i915_wait_request overhead	Chris Wilson
	One scenario under recent discussion is that of having a thundering herd in i915_wait_request - where the overhead of waking up every waiter for every batchbuffer was significantly impacting customer throughput. This benchmark tries to replicate something to that effect by having a large number of consumers generating a busy load (a large copy followed by lots of small copies to generate lots of interrupts) and tries to wait upon all the consumers concurrently (to reproduce the thundering herd effect). To measure the overhead, we have a bunch of cpu hogs - less kernel overhead in waiting should allow more CPU throughput. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-15	benchmarks/gem_blt: Include igt.h in gem_blt.c	Derek Morton
	To fix a build error on Android. Signed-off-by: Derek Morton <derek.j.morton@intel.com> Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-10-12	Replace __gem_mmap__{cpu,gtt,wc}() + igt_assert() with gem_mmap__{cpu,gtt,wc}()	Ville Syrjälä
	gem_mmap__{cpu,gtt,wc}() already has the assert built in, so replace __gem_mmap__{cpu,gtt,wc}() + igt_assert() with it. Mostly done with coccinelle, with some manual help: @@ identifier I; expression E1, E2, E3, E4, E5, E6; @@ ( - I = __gem_mmap__gtt(E1, E2, E3, E4); + I = gem_mmap__gtt(E1, E2, E3, E4); ... - igt_assert(I); \| - I = __gem_mmap__cpu(E1, E2, E3, E4, E5); + I = gem_mmap__cpu(E1, E2, E3, E4, E5); ... - igt_assert(I); \| - I = __gem_mmap__wc(E1, E2, E3, E4, E5); + I = gem_mmap__wc(E1, E2, E3, E4, E5); ... - igt_assert(I); ) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Stochastically-reviwewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-09	Make gem_mmap__{cpu,gtt,wc}() assert on failure	Ville Syrjälä
	Rename the current gem_mmap__{cpu,gtt,wc}() functions into __gem_mmap__{cpu,gtt,wc}(), and add back wrappers with the original name that assert that the pointer is valid. Most callers will expect a valid pointer and shouldn't have to bother with failures. To avoid changing anything (yet), sed 's/gem_mmap__/__gem_mmap__/g' over the entire codebase. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Stochastically-reviwewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-09	Sprinkle igt_assert(ptr) after gem_mmap__{cpu,gtt,wc}	Ville Syrjälä
	Do the following ptr = gem_mmap__{cpu,gtt,wc}() +igt_assert(ptr); whenever the code doesn't handle the NULL ptr in any kind of specific way. Makes it easier to move the assert into gem_mmap__{cpu,gtt,wc}() itself. Mostly done with coccinelle, with some manual cleanups: @@ identifier I; @@ <... when != igt_assert(I) when != igt_require(I) when != igt_require_f(I, ...) when != I != NULL when != I == NULL ( I = gem_mmap__gtt(...); + igt_assert(I); \| I = gem_mmap__cpu(...); + igt_assert(I); \| I = gem_mmap__wc(...); + igt_assert(I); ) ...> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Stochastically-reviwewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-06	benchmarks/gem_blt: Fix compilation after rebase and add batch-size	Chris Wilson
	Add an option to do more than one copy per batch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-06	benchmarks: Measure BLT performance	Chris Wilson
	Execute N blits and time how long they take to complete to measure both GPU limited bandwidth and submission overhead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-10-02	benchmarks: Fix build errors on Android M-Dessert	Derek Morton
	Android M-Dessert treats implicit declaration of function warnings as errors resulting in igt failing to build. This patch fixes the errors by including missing header files as required. Mostly this involved including igt.h in the benchmarks. Signed-off-by: Derek Morton <derek.j.morton@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-09-11	convert drm_open_any() calls to drm_open_driver(DRIVER_INTEL) calls with cocci	Micah Fedke
	Apply the new API to all call sites within the test suite using the following semantic patch: // Semantic patch for replacing drm_open_any* with arch-specific drm_open_driver* calls @@ identifier i =~ "\bdrm_open_any\b"; @@ - i() + drm_open_driver(DRIVER_INTEL) @@ identifier i =~ "\bdrm_open_any_master\b"; @@ - i() + drm_open_driver_master(DRIVER_INTEL) @@ identifier i =~ "\bdrm_open_any_render\b"; @@ - i() + drm_open_driver_render(DRIVER_INTEL) @@ identifier i =~ "\b__drm_open_any\b"; @@ - i() + __drm_open_driver(DRIVER_INTEL) Signed-off-by: Micah Fedke <micah.fedke@collabora.co.uk> Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-09-08	build: fix unused-result warnings	Thomas Wood
	Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-08-21	benchmarks/gem_exec_reloc: Allow profiling 0 relocs	Chris Wilson
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-14	benchmark/gem_exec_trace: Inline everything	Chris Wilson
	Avoid the globals and make the dispatch one huge function and hope GCC works some magic. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-14	benchmark/gem_exec_tracer: Tweak to handle SNA	Chris Wilson
	SNA starts by feeding in deliberately bad ioctls in order to detect the kernel interface versions. A quick solution is to always feed it to the ioctl and only record the trace if it is valid. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-13	benchmarks/Android.mk: Fix building benchmarks for Android	Derek Morton
	The commit "benchmarks: Do not install to system-wide bin/" changed the benchmark file list from bin_PROGRAMS to benchmarks_PROGRAMS. However Android.mk was not updated, resulting in IGT failing to build for Android. This commit adds that change. It also adds LOCAL_MODULE_PATH to specify where the built benchmarks should be put. v2: I discovered that the existing definitions of LOCAL_MODULE_PATH were creating what should have been an invalid path. Not sure how it was ever working previously, but fixed now. Signed-off-by: Derek Morton <derek.j.morton@intel.com> Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-08-11	benchmarks: Add a microbenchmark for relocation overhead	Chris Wilson
	Allow specification of the many different busyness modes and relocation interfaces, along with the number of buffers to use and relocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-10	benchmarks/gem_exec_trace: Unmap each trace after replay	Chris Wilson
	Just on the off chance someone is replaying a bunch of traces, remember to clean up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-10	benchmarks/gem_exec_trace: Mark the mmap as sequentially read	Chris Wilson
	Use madvise(MADV_SEQUENTIAL) to let the kernel optimise for our straightforward sequential read pattern. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-10	benchmarks: Rename the gem_exec_trace tracer module	Chris Wilson
	Now that we actually install the benchmarks into a sane location, slightly abuse it to put the tracer for gem_exec_trace alongside. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-10	benchmarks/gem_exec_trace: Clear all new bo handles	Chris Wilson
	When reallocing the bo array, remember to set the new entries to 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-10	benchmarks: Do not install to system-wide bin/	Chris Wilson
	These benchmarks are first-and-foremost development tools, not aimed at general users. As such they should not be installed into the system-wide bin/ directory, but installed into libexec/. v2: Now actually install beneath ${libexec} Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-09	benchmarks: Record and replay calls to EXECBUFFER2	Chris Wilson
	This slightly idealises the behaviour of clients with the aim of measuring the kernel overhead of different workloads. This test focuses on the cost of relocating batchbuffers. A trace file is generated with an LD_PRELOAD intercept around execbuffer, which we can then replay at our leisure. The replay replaces the real buffers with a set of empty ones so the only thing that the kernel has to do is parse the relocations, but without a real workload we lose the impact of having to rewrite active buffers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-08-06	benchmarks/Android.mk, tools/Android.mk: Fix android build error	Derek Morton
	Recently added tools / benchmarks have the same module name as existing tests. Android does not allow duplicate modules. This patch appends _benchmark and _tool to the module names used when building benchmarks and tools to prevent clashes with tests of the same name. Signed-off-by: Derek Morton <derek.j.morton@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-07-24	benchmark: Measure allocation time for objects	Chris Wilson
	A basic measurement, how fast can we create and populate an object with backing storage? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-24	benchmarks: Measure mmap fault latency	Chris Wilson
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-24	benchmarks: Benchmarkify gem_exec_ctx	Chris Wilson
	Measure the overhead of execution when doing nothing, switching between a pair of contexts, or creating a new context every time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-24	benchmarks: Add kms_vblank to .gitignore	Chris Wilson
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-23	benchmarks: Measure round-trip time for immediate vblanks	Chris Wilson
	By measuring both the query and the event round trip time, we can make a reasonable estimate of how long it takes for the query to send the vblank following an interrupt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-23	benchmarks: gem_prw add the read/write switch to getopt	Chris Wilson
	In my haste to merge the two gem_pread/gem_pwrite, I forgot to wire up the command line switch to getopt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-23	benchmarks: Add simple mmap benchmarks	Chris Wilson
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-23	benchmarks: Add simple pread/pwrite benchmarks	Chris Wilson
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-07-22	benchmarks: Benchmarkify gem_exec_nop	Chris Wilson
	Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-06-02	gem_userptr_benchmark: Test overlapping bo mmu notifier performance impact	Tvrtko Ursulin
	Current userptr kernel implementation downgrades tracking VMA ranges (real userspace ones) to an inefficient linear walk for any process which has instantiated overlapping userptr objects. This adds a test which shows the performance cliff on, most visibly, generic userspace mmap(2) and munmap(2) operations between unsync, non-overlapping and overlapping userptr objects. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Daniel <thomas.daniel@intel.com>
2015-03-26	lib: print a stack trace when a test assertion fails	Thomas Wood
	Add an optional dependency on libunwind to print stack traces when a test assertion fails. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2014-12-12	Android.mk: replace std=c99 with std=gnu99	Tim Gore
	The Android makefiles were passing the -std=c99 flag to the compiler, which disables the typeof keyword. This causes a build failure for a recent addition to igt_aux.h. Change this to -std=gnu99, which is the flag used in the Linux build. Signed-off-by: Tim Gore <tim.gore@intel.com> Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2014-08-30	batch: Specify number of relocations to accommodate	Chris Wilson
	Since relocations are variable size, depending upon generation, it is easier to handle the resizing of the batch request inside the BEGIN_BATCH macro. This still leaves us with having to resize commands in a few places - which still need adaption for gen8+. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2014-08-29	Prepare for 64bit relocation addresses	Chris Wilson
	This reveals that quite a few locations were writing relocation offsets but only allowing for 32 bit addresses. To reveal such places in active tests, we also now double check that we do not use more batch space than declared. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2014-07-23	igt/gem_userptr_benchmark: Fix for upstream ioctl number	Tvrtko Ursulin
	Hardcoding has upsides and downsides. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-04-25	test/gem_userptr_*: Fix compile fail	Daniel Vetter
	Also shut up warnings. Those revealed incorrect usage of local variables in conjunction with igt_fixture/igt_subtest. Since those use longjmps we need to move them out of the stack frame those magic blocks are declared in. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-04-25	tests/gem_userptr_benchmark: Benchmarking userptr surfaces and impact	Tvrtko Ursulin
	This adds a small benchmark for the new userptr functionality. Apart from basic surface creation and destruction, also tested is the impact of having userptr surfaces in the process address space. Reason for that is the impact of MMU notifiers on common address space operations like munmap() which is per process. v2: * Moved to benchmarks. * Added pointer read/write tests. * Changed output to say iterations per second instead of operations per second. * Multiply result by batch size for multi-create* tests for a more comparable number with create-destroy test. v3: * Use ALIGN macro. * Catchup with big lib/ reorganization. * Removed unused code and one global variable. * Fixed up some warnings. v4: * Fixed feature test, does not matter here but makes it consistent with gem_userptr_blits and clearer. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-04-24	benchmarks: Build them on Android.	Tvrtko Ursulin
	They build fine so give them some exposure. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Thomas Wood <thomas.wood@intel.com>