author Linus Torvalds <torvalds@linux-foundation.org> 2012-03-19 16:37:28 -0700
committer Linus Torvalds <torvalds@linux-foundation.org> 2012-03-19 16:37:28 -0700
commit b0e37d7ac6ba937c3776ff5111ff6a7fa832fb4f
tree fdb86783c464825a77223e49cc24f632e319d2df /arch
parent 6d7d1a0dc735ea8412769edae7154885021107a9
parent bfcfaa77bdf0f775263e906015982a608df01c76
Merge branch 'dcache-word-accesses'
* branch 'dcache-word-accesses':
  vfs: use 'unsigned long' accesses for dcache name comparison and hashing

This does the name hashing and lookup using word-sized accesses when
that is efficient, namely on x86 (although any little-endian machine
with good unaligned accesses would do).

It does very much depend on little-endian logic, but it's a very hot
couple of functions under some real loads, and this patch improves the
performance of __d_lookup_rcu() and link_path_walk() by up to about 30%.
Giving a 10% improvement on some very pathname-heavy benchmarks.

Because we do make unaligned accesses past the filename, the
optimization is disabled when CONFIG_DEBUG_PAGEALLOC is active, and we
effectively depend on the fact that on x86 we don't really ever have the
last page of usable RAM followed immediately by any IO memory (due to
ACPI tables, BIOS buffer areas etc).

Some of the bit operations we do are a bit "subtle". It's commented,
but you do need to really think about the code. Or just consider it
black magic.

Thanks to people on G+ for some of the optimized bit tricks.
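
For readers who want a feel for the "subtle" bit operations the message mentions, here is a minimal userspace sketch of word-at-a-time name scanning and comparison. It is not the code from the merged patch: the helper names (has_zero_byte, load_word, first_zero_index, word_strlen, names_equal) are hypothetical, the sketch assumes a little-endian machine with 64-bit words, and callers are assumed to pass buffers that stay readable for up to 7 bytes past the end of the name, which mirrors the over-read the message warns about.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONE_BYTES 0x0101010101010101ULL
#define HIGH_BITS 0x8080808080808080ULL

/* Nonzero if any byte of v is 0x00. The lowest flagged byte is always
 * a real zero byte, because borrows only propagate upward. */
static uint64_t has_zero_byte(uint64_t v)
{
    return (v - ONE_BYTES) & ~v & HIGH_BITS;
}

/* Unaligned 8-byte load; memcpy keeps this well-defined in portable C. */
static uint64_t load_word(const char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Byte index of the lowest flagged byte in a nonzero mask
 * (little-endian: byte 0 of memory sits in the low bits). */
static size_t first_zero_index(uint64_t mask)
{
    size_t i = 0;
    while (!(mask & 0x80)) {
        mask >>= 8;
        i++;
    }
    return i;
}

/* Length of a NUL-terminated name, scanned 8 bytes at a time.
 * May read up to 7 bytes past the terminating NUL. */
static size_t word_strlen(const char *s)
{
    size_t len = 0;
    for (;;) {
        uint64_t mask = has_zero_byte(load_word(s + len));
        if (mask)
            return len + first_zero_index(mask);
        len += 8;
    }
}

/* Compare the first len bytes of two names 8 bytes at a time.
 * The tail load may read up to 7 bytes past len; the mask keeps
 * only the low len bytes, which is where little-endian matters. */
static int names_equal(const char *a, const char *b, size_t len)
{
    while (len >= 8) {
        if (load_word(a) != load_word(b))
            return 0;
        a += 8;
        b += 8;
        len -= 8;
    }
    if (len == 0)
        return 1;
    uint64_t mask = ~(~0ULL << (len * 8));
    return ((load_word(a) ^ load_word(b)) & mask) == 0;
}
```

The tail handling in names_equal is where the over-read comes from: the final load always fetches a full word even when only a few bytes of the name remain, so the memory just past the name has to be mapped. A debugging allocator that unmaps adjacent pages would turn that load into a fault, which is presumably why the optimization is tied to CONFIG_DEBUG_PAGEALLOC being off.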
Diffstat (limited to 'arch')
-rw-r--r-- arch/x86/Kconfig | 1
1 files changed, 1 insertions, 0 deletions
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5bed94e189fa..09675d3e0ac3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -82,6 +82,7 @@ config X86
 	select CLKEVT_I8253
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select GENERIC_IOMAP
+	select DCACHE_WORD_ACCESS if !DEBUG_PAGEALLOC
 
 config INSTRUCTION_DECODER
 	def_bool (KPROBES || PERF_EVENTS)
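
The added select line means an x86 build gets DCACHE_WORD_ACCESS enabled unless DEBUG_PAGEALLOC is set, and C code can then test the resulting CONFIG_DCACHE_WORD_ACCESS macro. The sketch below shows one way such a compile-time switch could be consumed. It is a userspace illustration only: name_differs and name_cmp_impl are hypothetical names, the word path assumes a little-endian machine with buffers readable for up to 7 bytes past len, and in a real kernel build the macro would come from the generated configuration header rather than from a -D flag.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef CONFIG_DCACHE_WORD_ACCESS

/* Word-at-a-time path: may read up to 7 bytes past len. */
static int name_differs(const char *a, const char *b, size_t len)
{
    uint64_t wa, wb;

    while (len >= sizeof wa) {
        memcpy(&wa, a, sizeof wa);
        memcpy(&wb, b, sizeof wb);
        if (wa != wb)
            return 1;
        a += sizeof wa;
        b += sizeof wb;
        len -= sizeof wa;
    }
    if (len == 0)
        return 0;
    memcpy(&wa, a, sizeof wa);
    memcpy(&wb, b, sizeof wb);
    /* Keep only the low len bytes (little-endian layout). */
    return ((wa ^ wb) & ~(~0ULL << (len * 8))) != 0;
}

static const char *name_cmp_impl(void) { return "word"; }

#else

/* Byte-at-a-time fallback: never touches memory past len. */
static int name_differs(const char *a, const char *b, size_t len)
{
    while (len--) {
        if (*a++ != *b++)
            return 1;
    }
    return 0;
}

static const char *name_cmp_impl(void) { return "byte"; }

#endif
```

Building the same file with and without -DCONFIG_DCACHE_WORD_ACCESS should give identical comparison results; only the memory access pattern changes, which is the property that lets a Kconfig option switch the fast path off for page-allocation debugging.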