Age | Commit message | Author |
|
Some frames from the middle of a demo with corresponding buffers.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
See README for more details.
v2:
* No need to mess with flags. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Add support for defining buffer object working sets and targeting them as
data dependencies. For more information please see the README file.
v2:
* More robustness in parsing here and there. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This tool started as an evaluation of userspace load balancing options,
but we have since settled on doing the balancing in the kernel.
Tomorrow we will want to update the tool for new engine interfaces and all
this legacy code will just be a distraction.
Rip out everything not related to explicit load balancing implemented via
context engine maps and adjust the workloads to use it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A new workload command ('S') is added which allows per context slice
(re-)configuration.
v2:
* Only query device SSEU on first use. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
For simulating frame split workloads it is useful to express a batch which
ends at the same time as the parallel submission on the respective bonded
engine. For this we add support for infinite batch durations and the batch
terminate command ('T'). Syntax looks like this:
1.RCS.*.0.0
T.-1
First step starts an infinite batch, and second command terminates the
infinite batch with the usual relative workload step addressing.
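The relative addressing used by 'T' can be illustrated with a small Python
sketch. It is not the gem_wsim parser (which is written in C); the function
name is hypothetical, and it assumes that every descriptor line counts as one
step and that the duration is the third dot-separated field.

```python
def resolve_terminate_targets(lines):
    """Map the index of each 'T.<offset>' line to the step it terminates.

    Assumption for this sketch: every line counts as one step, so a
    relative offset of -1 means "the line directly before this one".
    """
    targets = {}
    for idx, line in enumerate(lines):
        fields = line.split(".")
        if fields[0] != "T":
            continue
        offset = int(fields[1])
        if offset >= 0:
            raise ValueError(f"step {idx}: offset must be negative")
        target = idx + offset
        if target < 0:
            raise ValueError(f"step {idx}: offset points before the first step")
        target_fields = lines[target].split(".")
        # Only an infinite batch (duration '*') makes sense as a target.
        if len(target_fields) < 3 or target_fields[2] != "*":
            raise ValueError(f"step {idx}: step {target} is not an infinite batch")
        targets[idx] = target
    return targets


print(resolve_terminate_targets(["1.RCS.*.0.0", "T.-1"]))
```

With the two-line example above, the 'T' step at index 1 resolves to the
infinite batch at index 0.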
v2: (Chris)
* Relax the recursive batch with 4096 nops between BB_START.
* Check for at least gen8.
* Simplify relocation entry building.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A few additional workloads useful for experimenting with scheduling.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Engine bonds are an i915 uAPI applicable to load balanced contexts with an
engine map. They allow expressing rules of engine selection between two
contexts when submissions are also tied with submit fences.
Please refer to the README for a more detailed description.
v2:
* Use list of symbolic engine names instead of the mask. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A new workload command for enabling a load balanced context map (aka
Virtual Engine). Example usage:
B.1
This turns on load balancing for context one, assuming it has already been
configured with an engine map. Only the DEFAULT engine specifier can be used
with load balanced engine maps.
v2:
* Lift restriction to only use load balancer when enabled in context map.
(Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Support new i915 uAPI for configuring contexts with engine maps.
Please refer to the README file for more detailed explanation.
v2:
* Allow defining engine maps by class.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Add support for submit fences in a way similar to how normal input fences
are handled. For example:
1.RCS.500-1000.0.0
1.VCS1.3000.s-1.0
1.VCS2.3000.s-2.0
Submit fences are signalled when the originating request enters the
submission backend.
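As a reading aid for the example above, here is a minimal Python sketch that
splits one step into its fields and tells a submit fence ('s' prefix) apart
from a plain data dependency. The field labels (ctx, engine, duration,
dependency, wait) and the Step class are assumptions made for this sketch,
and it only handles a single dependency per step; the real parser lives in
the C tool.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Step:
    ctx: int
    engine: str
    duration: Tuple[int, int]  # (min, max); equal when no range is given
    dep_kind: str              # "none", "data" or "submit"
    dep_offset: int            # relative step offset, 0 when there is none
    wait: int


def parse_step(text):
    """Parse one 'ctx.engine.duration.dependency.wait' step (assumed layout)."""
    ctx, engine, duration, dep, wait = text.split(".")
    low, _, high = duration.partition("-")
    span = (int(low), int(high) if high else int(low))
    if dep.startswith("s"):
        kind, offset = "submit", int(dep[1:])
    elif dep == "0":
        kind, offset = "none", 0
    else:
        kind, offset = "data", int(dep)
    return Step(int(ctx), engine, span, kind, offset, int(wait))


for line in ("1.RCS.500-1000.0.0", "1.VCS1.3000.s-1.0", "1.VCS2.3000.s-2.0"):
    print(parse_step(line))
```

In the example both VCS steps point back at the same RCS step: 's-1' from the
second line and 's-2' from the third.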
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We're not using automake to build tarballs anymore.
Acked-by: Petri Latvala <petri.latvala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
|
|
Allow workloads to specify frequency of preemption points per context.
New workload command ('X') is added to allow this.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A new workload command ('P') is added which enables per context dynamic
priority control.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Petri Latvala <petri.latvala@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
These ones demonstrate fence usage and also mixing them with
data dependencies.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
It was the only one with no randomness.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Simulates a single decoder feeding multiple processing and
encoding pipelines.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Two new workload commands are added, 'f' and 'q.<idx>', which
enable creation and signalling of non-i915 fences.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Add sync fence dependency support to workload steps.
Only one sync fence dependency per step is supported at the
moment.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
hd06mp2, hd12: Want many parallel clients (20+) and lets itself
be balanced.
fhd26u7, 4k12u7: Simulates either encoder or decoder with VCS1
(HEVC) dependency and some balancing VCS usage. Needs fewer
clients (3-6).
hd01, hd17i4: Mostly RCS limited, targeting maximum execution
speed for a single client. Must not be hampered by incorrect
balancing decisions.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Multiple dependencies separated by forward slashes are now supported.
Some media workloads also updated to use this for better efficiency.
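A short Python sketch of how a slash-separated dependency field could be
split. The function name is hypothetical, and the conventions that '0' means
no dependency and that offsets are negative and relative to the current step
are assumptions of this sketch, not taken from the tool's source.

```python
def parse_dependencies(field):
    """Split a dependency field such as '-1/-2' into relative step offsets."""
    if field == "0":
        return []  # assumed convention: '0' means no dependency
    offsets = [int(part) for part in field.split("/")]
    for offset in offsets:
        if offset >= 0:
            raise ValueError(f"dependency {offset} must point at an earlier step")
    return offsets


print(parse_dependencies("-1/-2"))
```

A step carrying '-1/-2' would then depend on both of the two preceding steps.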
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Workloads generated from a high level description of how
things usually work in the transcoding world.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Tool which emits batch buffers to engines with configurable
sequences, durations, contexts, dependencies and userspace waits.
Unfinished but shows promise so sending out for early feedback.
v2:
* Load workload descriptors from files. (also -w)
* Help text.
* Calibration control if needed. (-t)
* NORELOC | LUT to eb flags.
* Added sample workload to wsim/workload1.
v3:
* Multiple parallel different workloads (-w -w ...).
* Multi-context workloads.
* Variable (random) batch length.
* Load balancing (round robin and queue depth estimation).
* Workload delays and explicit sync steps.
* Workload frequency (period) control.
v4:
* Fixed queue-depth estimation by creating separate batches
per engine when qd load balancing is on.
* Dropped separate -s cmd line option. It can turn itself on
automatically when needed.
* Keep a single status page and lie about the write hazard
as suggested by Chris.
* Use batch_start_offset for controlling the batch duration.
(Chris)
* Set status page object cache level. (Chris)
* Moved workload description to a README.
* Tidied example workloads.
* Some other cleanups and refactorings.
v5:
* Master and background workloads (-W / -w).
* Single batch per step is enough even when balancing. (Chris)
* Use hars_petruska_f54_1_random IGT functions and seed to zero
at start. (Chris)
* Use WC cache domain when WC mapping. (Chris)
* Keep seqnos 64-bytes apart in the status page. (Chris)
* Add workload throttling and queue-depth throttling commands.
(Chris)
v6:
* Added two more workloads.
* Merged RT balancer from Chris.
v7:
* Merged NO_RELOC patch from Chris.
* Added missing RT balancer to help text.
TODO list:
* Fence support.
* Batch buffer caching (re-use pool).
* Better error handling.
* Less 1980's workload parsing.
* More workloads.
* Threads?
* ... ?
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
|