Age | Commit message | Author |
|
i915 connector properties have been converted to atomic, so
all properties can now be set.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
This will force fewer watermarks to be allocated to the sprite plane,
making it easy to hit an underrun!
Unfortunately the FIFO underrun was prevented by skl_plane_downscale_amount,
but fixed with Ville's patch series.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
This makes kms_rotation_crc also test a case where a small strip is
tested on the top or left side of the screen. It allows us to ensure
more accurately that the rotation is shown correctly on the screen.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
No functional changes yet, just making sure that it works
as expected.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
|
Changed the "%2X" to "%02X", to prevent padding with spaces, which
breaks qemu command line arguments when first RANDOM is <0x10.
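
The difference between the two format strings can be shown with a short sketch. Python's % operator follows the same width and zero-flag rules as printf for this case; the mac_byte helper is hypothetical and is not taken from the script being fixed.

```python
def mac_byte(value: int, fmt: str) -> str:
    """Format one byte as upper-case hex with a printf-style format (hypothetical helper)."""
    return fmt % value


# A width of 2 without the zero flag pads with a space for values below 0x10,
# which would split a command line argument in two.
print(repr(mac_byte(0x0A, "%2X")))
# With the zero flag the field is always two hex digits.
print(repr(mac_byte(0x0A, "%02X")))
```

Values of 0x10 and above already fill the field, so both formats give the same result for them.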
Signed-off-by: Sarvela Tomi P <tomi.p.sarvela@intel.com>
Signed-off-by: Terrence Xu <terrence.xu@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
Create crtc/connector combinations based on actual adapter
information obtained from drmModeRes.
Also set MAX_CRTCs to 6 for AMD GPUs.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
|
|
IVB+ have the cursor "FBC" feature, meaning they support a
somewhat limited form of non-square cursors. Let's test that.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
|
|
gem_execbuf_wr was duplicated in multiple places.
Moving everything to lib/
Signed-off-by: Lukasz Fiedorowicz <lukasz.fiedorowicz@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
The AM_PROG_LEX macro will set the LEX variable using the missing
script when flex is not present. This will confuse the
configure.ac check, which expects the AC_PROG_LEX behaviour,
and will therefore fail to detect the missing flex:
    AS_IF([test x"$LEX" != "x:" -a x"$YACC" != xyacc],
          [enable_assembler=yes],
          [enable_assembler=no])
This is because AM_PROG_LEX sets the LEX variable to
"${SHELL} /home/sc/intel-gpu-tools/build-aux/missing flex",
while AC_PROG_LEX would set it to ":".
If for some reason we really need to keep AM_PROG_LEX, an
alternative fix could be something like this, placed before
the above AS_IF check:
    AC_MSG_CHECKING([for working flex])
    if ! eval "$LEX --version >/dev/null 2>&1"; then
        AC_MSG_RESULT([failed])
        LEX=:
    else
        AC_MSG_RESULT([pass])
    fi
Note the evil eval needed to recursively expand variables.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
|
|
Just recently I once again made the mistake of thinking we could do a
plain mutex_lock() inside i915_gem_shrink(). However, such a lock is
liable to cyclic deadlocks between multiple reclaimers. This can be
reported by lockdep, but we need contention in the shrinker for it to
spot this particular mistake. The easiest way to explicitly cause
contention is via concurrent calls to debugfs/i915_drop_caches whilst
the GPU is busy.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Plus a help text correction and calibration speed-up in
-R and -T modes.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Record it within this script since trace.pl added support.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
The new option (-w) allows direct pass-through to gem_wsim for cases
when heterogeneous workloads, or even additional parameters to
gem_wsim, need to be tested.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Merge and flatten all the engine timelines to produce an
aggregate stat.
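
One way to picture the flattening is as an interval merge. The following is a minimal sketch only, assuming each engine timeline is a list of (start, end) busy periods; the function names and the data layout are hypothetical and do not come from trace.pl.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) of one busy period


def flatten_timelines(engines: Dict[str, List[Interval]]) -> List[Interval]:
    """Merge the busy periods of all engines into one non-overlapping timeline."""
    merged: List[Interval] = []
    for start, end in sorted(iv for ivs in engines.values() for iv in ivs):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous period: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def aggregate_busy(engines: Dict[str, List[Interval]], window: float) -> float:
    """Fraction of the window during which at least one engine was busy."""
    busy = sum(end - start for start, end in flatten_timelines(engines))
    return busy / window
```

With this layout, periods that overlap on different engines are counted once in the aggregate rather than once per engine.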
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
In addition:
* optimize saturation point finding
* fix wps target modes
* always use -R, it is pointless not to
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
In this mode ('-G' on the command line) all balancing operations
are routed via the first client so the complete balancing state
is shared. In other words, the overall balancing behaviour is
as if there were only one client submitting the aggregate workload.
This can help with the observed metrics and lead to better
balancing decisions in a lot of cases.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Refactoring for upcoming changes.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Will make the userspace balancing daemon simulation easier.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Not all subtests remembered their requires, so do it from the caller and
catch all at once.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Check that writes to adjacent cachelines are functional.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Document priority support in the help text.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The request we await just stores a dword. If the device doesn't
support store-dword we will do no awaits and so not actually test
anything.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Rather than have the code in multiple locations, put a copy in lib/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101079
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Slaves just keep on running, far beyond the repeat target of their
master.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Need to close more pipe ends to support a master with more than
one background workload.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
When evaluating best balancers it is useful to be able to glance
over the range of results for a particular workload since that
determines the weighted scoring.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
I am failing to come up with a smart formula which would
have a little bit of an exponential component and take into
consideration both total throughput and single client
performance.
Simply adding the two scores together might work better
than any complications.
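
A minimal sketch of the additive approach, assuming both inputs are plain mappings from balancer name to throughput and that each scoreboard is normalized so the best entry is 1. The function names and the normalization step are assumptions made for illustration, not the script's actual code.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores so that the best entry becomes 1."""
    best = max(scores.values())
    return {name: value / best for name, value in scores.items()}


def combined_scores(total_wps: Dict[str, float],
                    per_client_wps: Dict[str, float]) -> Dict[str, float]:
    """Add the normalized total and per-client scores, then normalize the sum."""
    total = normalize(total_wps)
    client = normalize(per_client_wps)
    return normalize({name: total[name] + client[name] for name in total})
```

Normalizing before adding keeps either component from dominating just because its raw numbers are larger.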
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Makes sense to keep it around if a different type of analysis
needs to be done later.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Split out the flagging logic to make it easier to read and so
that the complete failure to balance is declared a failure
instead of a warning.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Code just didn't expect '<none>' as the selected balancer.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
It wasn't normalized as the results are.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Just a micro-optimisation to avoid copying back the struct to userspace
if we aren't looking for an output.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Print out a little bit of device information on startup to help diagnose
errors.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Two new workload commands are added, 'f' and 'q.<idx>' which
enable creation and signalling of non-i915 fences.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Introduce an anonymous union so each step type can use its own
name for the metadata.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Add sync fence dependency support to workload steps.
Only one sync fence dependency per step is supported at the
moment.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Just compact it a bit by avoiding the min != max check
duplication and change get_duration to change w_step.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
The high level goal of this script is to programmatically analyze
the simulated media workloads (gem_wsim) by finding an optimal
load balancing strategy, and also detecting any possible
shortcomings of the same.
When run without command line arguments, the script will run through
both of its phases.
In the first phase it will be running all the known balancers
against all the known workloads, and for each combination
look for a point where aggregated system throughput cannot be
increased by running more parallel workload instances.
At that point each balancer gets a score proportional to the
throughput achieved, which is added to the running total for the
complete phase.
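
The saturation search described above can be sketched as a loop that adds clients until throughput stops improving. This is an illustration under stated assumptions: the measure callback, the max_clients cap and the tolerance-based stopping rule are all hypothetical, and the script's actual search may differ.

```python
from typing import Callable, Tuple


def find_saturation(measure: Callable[[int], float],
                    max_clients: int = 16,
                    tolerance: float = 0.01) -> Tuple[int, float]:
    """Return (clients, wps) at the point where adding clients stops helping.

    measure(n) is assumed to run the workload with n parallel clients and
    return the aggregate throughput in workloads per second.
    """
    best_clients, best_wps = 1, measure(1)
    for clients in range(2, max_clients + 1):
        wps = measure(clients)
        # Stop once the gain over the best result is within the tolerance.
        if wps <= best_wps * (1 + tolerance):
            break
        best_clients, best_wps = clients, wps
    return best_clients, best_wps
```

The throughput returned for the saturation point is what would then feed the balancer's score for that workload.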
Several different score boards are kept - total throughput, per
client throughput and combined (total + per client). Weighted
scoreboards are also kept where scores are weighted based on the
total variance detected for a single workload. This means scores
for workloads which respond well to being balanced will be worth
more than those of the ones which do not balance well in any of
the configurations.
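
A rough sketch of such a weighting, assuming the weight of a workload is the relative spread between its best and worst balancer result. That formula, and the weighted_scoreboard name, are assumptions chosen for illustration; the message above only states that the weight derives from the variance seen for a workload.

```python
from typing import Dict


def weighted_scoreboard(results: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Build a normalized scoreboard from {workload: {balancer: wps}}.

    Workloads whose results vary a lot between balancers contribute more.
    """
    board: Dict[str, float] = {}
    for wps in results.values():
        best, worst = max(wps.values()), min(wps.values())
        weight = (best - worst) / best  # assumed weighting: relative spread
        for balancer, value in wps.items():
            board[balancer] = board.get(balancer, 0.0) + weight * value / best
    top = max(board.values())
    if top == 0:
        return board  # no workload showed any spread, nothing to rank
    return {balancer: score / top for balancer, score in board.items()}
```

A workload where every balancer performs the same gets a weight of zero under this assumption, so it does not influence the ranking.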
Based on the first phase a "best" balancing strategy will be
selected based on the combined weighted scoreboard.
The second phase will then proceed to profile all the selected
workloads with this balancer and look at potential problems with
GPU engines not being completely saturated.
If none of the active engines is saturated the workload will be
flagged, as it will if the only saturated engine is one of the
ones which can be balanced, but the other one in the same class
is under-utilized.
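
The two flagging rules can be sketched as follows, taking per-engine idle percentages as input. The flag_workload name, the input layout and the grouping of engines 2 and 3 into one balanceable class are assumptions made for illustration; only the 2% idleness tolerance comes from the example output further down.

```python
from typing import Dict, Iterable, Tuple


def flag_workload(idle_pct: Dict[int, float],
                  balanced_classes: Iterable[Tuple[int, ...]],
                  tolerance: float = 2.0) -> str:
    """Return "Pass" or "WARN" for one profiled workload.

    idle_pct maps an active engine id to its idle percentage; an engine is
    treated as saturated when its idleness is within the tolerance.
    """
    saturated = {engine for engine, idle in idle_pct.items() if idle <= tolerance}
    if not saturated:
        return "WARN"  # nothing is saturated at all
    for group in balanced_classes:
        members = set(group) & set(idle_pct)
        # Only engines of one balanceable class are saturated while
        # another engine of the same class still has idle time.
        if saturated < members:
            return "WARN"
    return "Pass"
```

Under the assumed grouping, this reproduces both verdicts shown in the example output: the first workload saturates engine 0 and passes, while the second saturates only engine 2 and leaves engine 3 largely idle.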
Flagged workloads then need to be analyzed, which can be achieved
by looking at the HTML engine timelines which are
generated during this phase. (These files are all put in the
current working directory.)
It is quite possible that something flagged by the script as
suspect is completely fine and just a consequence of the
workload in question being fundamentally unbalanced.
It is possible to skip directly to the second phase of the
evaluation by using the -b command line option. This option must
contain a string exactly as understood by gem_wsim's -b option.
For example '-b "-b rtavg -R"'.
Apart from being run with no arguments, the script also supports a
selection of command line switches to enable fine tuning.
For example, also including the complete output from the script
in order to be more illustrative:
-8<---
+ scripts/media-bench.pl -n 642317 -r 2 -B rand,rtavg -W media_load_balance_hd12.wsim,media_load_balance_fhd26u7.wsim
Workloads:
media_load_balance_hd12.wsim
media_load_balance_fhd26u7.wsim
Balancers: rand,rtavg,
Target workload duration is 2s.
Calibration tolerance is 0.01.
Nop calibration is 642317.
Evaluating 'media_load_balance_hd12.wsim'... 2s is 990 workloads. (error=0.00750000000000006)
Finding saturation points for 'media_load_balance_hd12.wsim'...
rand balancer ('-b rand'): 6 clients (1412.576 wps, 235.429333333333 wps/client).
rand balancer ('-b rand -R'): 6 clients (1419.639 wps, 236.6065 wps/client).
rtavg balancer ('-b rtavg'): 5 clients (1430.143 wps, 286.0286 wps/client).
rtavg balancer ('-b rtavg -H'): 5 clients (1339.775 wps, 267.955 wps/client).
rtavg balancer ('-b rtavg -R'): 5 clients (1386.384 wps, 277.2768 wps/client).
rtavg balancer ('-b rtavg -R -H'): 6 clients (1365.943 wps, 227.657166666667 wps/client).
Best balancer is '-b rtavg'.
Evaluating 'media_load_balance_fhd26u7.wsim'... 2s is 52 workloads. (error=0.002)
Finding saturation points for 'media_load_balance_fhd26u7.wsim'...
rand balancer ('-b rand'): 3 clients (46.532 wps, 15.5106666666667 wps/client).
rand balancer ('-b rand -R'): 3 clients (46.242 wps, 15.414 wps/client).
rtavg balancer ('-b rtavg'): 6 clients (61.232 wps, 10.2053333333333 wps/client).
rtavg balancer ('-b rtavg -H'): 4 clients (57.608 wps, 14.402 wps/client).
rtavg balancer ('-b rtavg -R'): 6 clients (61.793 wps, 10.2988333333333 wps/client).
rtavg balancer ('-b rtavg -R -H'): 7 clients (60.697 wps, 8.671 wps/client).
Best balancer is '-b rtavg -R'.
Total wps rank:
===============
1: '-b rtavg' (1)
2: '-b rtavg -R' (0.989191465637926)
3: '-b rtavg -R -H' (0.973103630772601)
4: '-b rtavg -H' (0.938804458876241)
5: '-b rand -R' (0.874465740398305)
6: '-b rand' (0.874342391093453)
Total weighted wps rank:
========================
1: '-b rtavg -R' (1)
2: '-b rtavg' (0.998877134022041)
3: '-b rtavg -R -H' (0.982849160383224)
4: '-b rtavg -H' (0.938950446314292)
5: '-b rand' (0.80507369080098)
6: '-b rand -R' (0.80229656623594)
Per client wps rank:
====================
1: '-b rtavg -H' (1)
2: '-b rand' (0.977356849770376)
3: '-b rand -R' (0.976222085591368)
4: '-b rtavg' (0.888825068013012)
5: '-b rtavg -R' (0.875653417817828)
6: '-b rtavg -R -H' (0.726389466714194)
Per client weighted wps rank:
=============================
1: '-b rand' (1)
2: '-b rand -R' (0.996866139192282)
3: '-b rtavg -H' (0.986348733324348)
4: '-b rtavg' (0.811593544774355)
5: '-b rtavg -R' (0.805704548552663)
6: '-b rtavg -R -H' (0.671567075453688)
Combined wps rank:
==================
1: '-b rtavg' (1)
2: '-b rtavg -R' (0.989191465637926)
3: '-b rtavg -H' (0.972251783752137)
4: '-b rtavg -R -H' (0.949708930404222)
5: '-b rand' (0.914594701126905)
6: '-b rand -R' (0.914312395840401)
Combined weighted wps rank:
===========================
1: '-b rtavg' (1)
2: '-b rtavg -R' (0.995945739226824)
3: '-b rtavg -H' (0.984347862855008)
4: '-b rtavg -R -H' (0.956920992185625)
5: '-b rand' (0.899001713089319)
6: '-b rand -R' (0.896984246540919)
Balancer is '-b rtavg'.
Idleness tolerance is 2%.
Profiling 'media_load_balance_hd12.wsim'...
2s is 992 workloads. (error=0.00150000000000006)
Saturation at 6 clients (1434.207 workloads/s).
Pass [ 0: 0.57%, 2: 22.59%, 3: 23.30%, ]
Profiling 'media_load_balance_fhd26u7.wsim'...
2s is 52 workloads. (error=0.001)
Saturation at 6 clients (61.823 workloads/s).
WARN [ 0: 7.77%, 2: 0.66%, 3: 28.70%, ]
Problematic workloads were:
media_load_balance_fhd26u7.wsim -c 6 -r 52 [ 0: 7.77%, 2: 0.66%, 3: 28.70%, ]
-8<---
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Used with the '-a' command line switch, which follows the same
usage as '-w' and '-W', it allows appending workload steps
to the end of all normal workloads.
This for example allows running any workload in the real-time
mode:
gem_wsim -w <some-workload> -a p.16667
This makes the workload run with a 60 Hz period.
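
The value 16667 is consistent with a period given in microseconds, which is inferred here from the 60 Hz figure rather than stated in the message. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def period_step(rate_hz: float) -> str:
    """Build a 'p.<period>' step for the given rate, assuming microseconds."""
    period_us = round(1_000_000 / rate_hz)
    return f"p.{period_us}"


print(period_step(60))
```

Under the same assumption, a 30 Hz workload would use p.33333.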
At the same time fix the periodic mode execution with dropped
frames, or almost dropped frames.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Tidy the presumed offset setting for the last entry in the array,
even though this code path is not used at the moment.
Also use READ_ONCE on all fields we are trying to read from the
status page.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|