summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/i915_gem_execbuffer.c
AgeCommit message (Collapse)Author
2011-03-07drm/i915: Disable GPU semaphores by defaultChris Wilson
Andi Kleen narrowed his GPU hangs on his Sugar Bay (SNB desktop) rev 09 down to the use of GPU semaphores, and we already know that they appear broken up to Huron River (mobile) rev 08. (I'm optimistic that disabling GPU semaphores is simply hiding another bug by the latency and side-effects of the additional device interaction it introduces...) However, use of semaphores is a massive performance improvement... Only as long as the system remains stable. Enable at your peril. Reported-by: Andi Kleen <andi-fd@firstfloor.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33921 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-23drm/i915: Fix use of invalid array size for ring->sync_seqnoChris Wilson
There are I915_NUM_RINGS-1 inter-ring synchronisation counters, but we were clearing I915_NUM_RINGS of them. Oops. Reported-by: Jiri Slaby <jirislaby@gmail.com> Tested-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-14drm/i915: Disable GPU semaphores on SandyBridge mobileChris Wilson
Hopefully, this is a temporary measure whilst the root cause is understood. At the moment, we experience a hard hang whilst looping urbanterror that has been identified as a result of the use of semaphores, but so far only on SNB mobile. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32752 Tested-by: mengmeng.meng@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-13drm/i915/execbuffer: Clear domains before beginning reloc processingChris Wilson
After reordering the sequence of relocating objects, commit 6fe4f1404, we can no longer rely on seeing all reloc targets prior to performing the relocation. As a result we were ignoring the need to flush objects from the render cache and invalidate the sampler caches, resulting in rendering glitches. So we need to clear the relocation domains earlier. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-13drm/i915/execbuffer: Reorder relocations to match new object orderChris Wilson
On the fault path, commit 6fe4f140 introduction a regression whereby it changed the sequence of the objects but continued to use the original ordering of relocation entries. The result was that incorrect GTT offsets were being fed into the execbuffer causing lots of misrendering and potential hangs. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915/execbuffer: Reorder binding of objects to favour restrictionsChris Wilson
As the mappable portion of the aperture is always a small subset at the start of the GTT, it is allocated preferentially by drm_mm. This is useful in case we ever need to map an object later. However, if you have a large object that can consume the entire mappable region of the GTT this prevents the batchbuffer from fitting and so causing an error. Instead allocate all those that require a mapping up front in order to improve the likelihood of finding sufficient space to bind them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915/execbuffer: Correctly clear the current object list upon EFAULTChris Wilson
Before releasing the lock in order to copy the relocation list from user pages, we need to drop all the object references as another thread may usurp and execute another batchbuffer before we reacquire the lock. However, the code was buggy and failed to clear the list... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
2011-01-11drm/i915: Propagate error from flushing the ringChris Wilson
... in order to avoid a BUG() and potential unbounded waits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915: Handle ringbuffer stalls when flushingChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915: Enforce write ordering through the GTTChris Wilson
We need to ensure that writes through the GTT land before any modification to the MMIO registers and so must impose a mandatory write barrier when flushing the GTT domain. This was revealed by relaxing the write ordering by experimentally mapping the registers and the GATT as write-combining. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-20drm/i915: Allow the application to choose the constant addressing modeChris Wilson
The relative-to-general state default is useless as it means having to rewrite the streaming kernels for each batch. Relative-to-surface is more useful, as that stream usually needs to be rewritten for each batch. And absolute addressing mode, vital if you start streaming state, is also only available by adjusting the register... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-09drm/i915: Mark the user reloc error paths as unlikelyChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-09drm/i915: Eliminate drm_gem_object_lookup during relocationChris Wilson
As we provide a list of all objects that will be accessed from the batchbuffer, we can build a lut of the handles associated with those objects for this invocation and use that to avoid the overhead of looking up those objects again for every relocation. The cost of building and searching a small hash table is much less than that of acquiring a spinlock, searching a radix tree and manipulating an atomic refcnt per relocation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-05drm/i915: Ignore fenced commands for gpu access on gen4Chris Wilson
Userspace should not have been declaring that it needed fenced GPU access with gen4+ as those GPUs have no fenced commands, but to be on the safe side it is easier to ignore userspace in case they did. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-05drm/i915: Implement GPU semaphores for inter-ring synchronisation on SNBChris Wilson
The bulk of the change is to convert the growing list of rings into an array so that the relationship between the rings and the semaphore sync registers can be easily computed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-02drm/i915: Pipelined fencing [infrastructure]Chris Wilson
With this change, every batchbuffer can use all available fences (save pinned and scanout, of course) without ever stalling the gpu! In theory. Currently the actual pipelined update of the register is disabled due to some stability issues. However, just the deferred update is a significant win. Based on a series of patches by Daniel Vetter. The premise is that before every access to a buffer through the GTT we have to declare whether we need a register or not. If the access is by the GPU, a pipelined update to the register is made via the ringbuffer, and we track the last seqno of the batches that access it. If by the CPU we wait for the last GPU access and update the register (either to clear or to set it for the current buffer). One advantage of being able to pipeline changes is that we can defer the actual updating of the fence register until we first need to access the object through the GTT, i.e. we can eliminate the stall on set_tiling. This is important as the userspace bo cache does not track the tiling status of active buffers which generate frequent stalls on gen3 when enabling tiling for an already bound buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-12-02drm/i915: Prevent stalling for a GTT read back from a read-only GPU targetChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-30drm/i915/ringbuffer: Handle cliprects in the callerChris Wilson
This makes the various rings more consistent by removing the anomalous handing of the rendering ring execbuffer dispatch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-28drm/i915/execbuffer: On error, starting unwinding from the previous objectChris Wilson
As the error occurred on the current object, it means that its state was not changed and so it should be excluded from the unwind. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25drm/i915: Avoid allocation for execbuffer object listChris Wilson
Besides the minimal improvement in reducing the execbuffer overhead, the real benefit is clarifying a few routines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-25drm/i915: Split i915_gem_execbuffer into its own file.Chris Wilson
A number of dragons have been seen lurking within the execbuffer code. The first step is then to isolate them from the rest and begin to scrutinise them in depth. Suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>