diff options
author | Sebastian Andrzej Siewior <bigeasy@linutronix.de> | 2020-08-19 21:55:05 +0200 |
---|---|---|
committer | Peter Zijlstra <peterz@infradead.org> | 2020-08-26 12:41:58 +0200 |
commit | 01ccf592362a984534371b3596d4c953da6a7bb2 (patch) | |
tree | 27cda671f00ea1ffd9a2712b00685104ef815b5a /tools/perf/scripts/python/stackcollapse.py | |
parent | 1724b95b92979a8ea8e55a4817d05b3bb7750958 (diff) |
sched: Bring the PF_IO_WORKER and PF_WQ_WORKER bits closer together
The bits PF_IO_WORKER and PF_WQ_WORKER are tested together in
sched_submit_work() which is considered to be a hot path.
If the two bits cross the 8 or 16 bit boundary then most architecture
require multiple load instructions in order to create the constant
value. Also, such a value can not be encoded within the compare opcode.
By moving the bit definition within the same block, the compiler can
create/use one immediate value.
For some reason gcc-10 on ARM64 requires both bits to be next to each
other in order to issue "tst reg, val; bne label". Otherwise the result
is "mov reg1, val; tst reg, reg1; bne label".
Move PF_VCPU out of the way so that PF_IO_WORKER can be next to
PF_WQ_WORKER.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200819195505.y3fxk72sotnrkczi@linutronix.de
Diffstat (limited to 'tools/perf/scripts/python/stackcollapse.py')
0 files changed, 0 insertions, 0 deletions