summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/machine_kexec.c
AgeCommit message (Collapse)Author
2022-05-07arm64: kdump: Reimplement crashkernel=XChen Zhou
There are following issues in arm64 kdump: 1. We use crashkernel=X to reserve crashkernel in DMA zone, which will fail when there is not enough low memory. 2. If reserving crashkernel above DMA zone, in this case, crash dump kernel will fail to boot because there is no low memory available for allocation. To solve these issues, introduce crashkernel=X,[high,low]. The "crashkernel=X,high" is used to select a region above DMA zone, and the "crashkernel=Y,low" is used to allocate specified size low memory. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20220506114402.365-4-thunder.leizhen@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-05Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', ↵Catalin Marinas
'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (32 commits) arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: perf: Support new DT compatibles arm64: perf: Simplify registration boilerplate arm64: perf: Support Denver and Carmel PMUs drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses ... * for-next/misc: : Miscellaneous patches arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: errata: Fix exec handling in erratum 1418040 workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64/fp: Add comments documenting the usage of state restore functions arm64: mm: Use asid feature macro for cheanup arm64: mm: Rename asid2idx() to ctxid2asid() arm64: kexec: reduce calls to page_address() arm64: extable: remove unused ex_handler_t definition arm64: entry: Use SDEI event constants arm64: Simplify checking for populated DT arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c * for-next/cache-ops-dzp: : Avoid DC instructions when DCZID_EL0.DZP == 1 arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1 arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 * for-next/stacktrace: : Unify the arm64 unwind code arm64: Make some stacktrace functions private arm64: Make dump_backtrace() use arch_stack_walk() arm64: Make profile_pc() use arch_stack_walk() arm64: Make return_address() use arch_stack_walk() arm64: Make __get_wchan() use arch_stack_walk() arm64: Make perf_callchain_kernel() use arch_stack_walk() arm64: Mark __switch_to() as __sched arm64: Add comment for stack_info::kr_cur arch: Make ARCH_STACKWALK independent of STACKTRACE * for-next/xor-neon: : Use SHA3 instructions to speed up XOR arm64/xor: use EOR3 instructions when available * for-next/kasan: : Log potential KASAN shadow aliases arm64: mm: log potential KASAN shadow alias arm64: mm: use die_kernel_fault() in do_mem_abort() * for-next/armv8_7-fp: : Add HWCAPS for ARMv8.7 FEAT_AFP amd FEAT_RPRES arm64: cpufeature: add HWCAP for FEAT_RPRES arm64: add ID_AA64ISAR2_EL1 sys register arm64: cpufeature: add HWCAP for FEAT_AFP * for-next/atomics: : arm64 atomics clean-ups and codegen improvements arm64: atomics: lse: define RETURN ops in terms of FETCH ops arm64: atomics: lse: improve constraints for simple ops arm64: atomics: lse: define ANDs in terms of ANDNOTs arm64: atomics lse: define SUBs in terms of ADDs arm64: atomics: format whitespace consistently * for-next/bti: : BTI clean-ups arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: Use BTI C directly and unconditionally arm64: Unconditionally override SYM_FUNC macros arm64: Add macro version of the BTI instruction arm64: ftrace: add missing BTIs arm64: kexec: use __pa_symbol(empty_zero_page) arm64: update PAC description for kernel * for-next/sve: : SVE code clean-ups and refactoring in prepararation of Scalable Matrix Extensions arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME * for-next/kselftest: : arm64 kselftest additions kselftest/arm64: Add pidbench for floating point syscall cases kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information * for-next/kcsan: : Enable KCSAN for arm64 arm64: Enable KCSAN
2021-12-10arm64: kexec: reduce calls to page_address()Rongwei Wang
In kexec_page_alloc(), page_address() is called twice. This patch add a new variable to help to reduce calls to page_address(). Signed-off-by: Rongwei Wang <rongwei.wang@linux.alibaba.com> Link: https://lore.kernel.org/r/20211125170600.1608-3-rongwei.wang@linux.alibaba.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-02arm64: kexec: use __pa_symbol(empty_zero_page)Mark Rutland
In machine_kexec_post_load() we use __pa() on `empty_zero_page`, so that we can use the physical address during arm64_relocate_new_kernel() to switch TTBR1 to a new set of tables. While `empty_zero_page` is part of the old kernel, we won't clobber it until after this switch, so using it is benign. However, `empty_zero_page` is part of the kernel image rather than a linear map address, so it is not correct to use __pa(x), and we should instead use __pa_symbol(x) or __pa(lm_alias(x)). Otherwise, when the kernel is built with DEBUG_VIRTUAL, we'll encounter splats as below, as I've seen when fuzzing v5.16-rc3 with Syzkaller: | ------------[ cut here ]------------ | virt_to_phys used for non-linear address: 000000008492561a (empty_zero_page+0x0/0x1000) | WARNING: CPU: 3 PID: 11492 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | CPU: 3 PID: 11492 Comm: syz-executor.0 Not tainted 5.16.0-rc3-00001-g48bd452a045c #1 | Hardware name: linux,dummy-virt (DT) | pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | lr : __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | sp : ffff80001af17bb0 | x29: ffff80001af17bb0 x28: ffff1cc65207b400 x27: ffffb7828730b120 | x26: 0000000000000e11 x25: 0000000000000000 x24: 0000000000000001 | x23: ffffb7828963e000 x22: ffffb78289644000 x21: 0000600000000000 | x20: 000000000000002d x19: 0000b78289644000 x18: 0000000000000000 | x17: 74706d6528206131 x16: 3635323934383030 x15: 303030303030203a | x14: 1ffff000035e2eb8 x13: ffff6398d53f4f0f x12: 1fffe398d53f4f0e | x11: 1fffe398d53f4f0e x10: ffff6398d53f4f0e x9 : ffffb7827c6f76dc | x8 : ffff1cc6a9fa7877 x7 : 0000000000000001 x6 : ffff6398d53f4f0f | x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff1cc66f2a99c0 | x2 : 0000000000040000 x1 : d7ce7775b09b5d00 x0 : 0000000000000000 | Call trace: | __virt_to_phys+0x120/0x1c0 arch/arm64/mm/physaddr.c:12 | machine_kexec_post_load+0x284/0x670 arch/arm64/kernel/machine_kexec.c:150 | do_kexec_load+0x570/0x670 kernel/kexec.c:155 | __do_sys_kexec_load kernel/kexec.c:250 [inline] | __se_sys_kexec_load kernel/kexec.c:231 [inline] | __arm64_sys_kexec_load+0x1d8/0x268 kernel/kexec.c:231 | __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline] | invoke_syscall+0x90/0x2e0 arch/arm64/kernel/syscall.c:52 | el0_svc_common.constprop.2+0x1e4/0x2f8 arch/arm64/kernel/syscall.c:142 | do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:181 | el0_svc+0x60/0x248 arch/arm64/kernel/entry-common.c:603 | el0t_64_sync_handler+0x90/0xb8 arch/arm64/kernel/entry-common.c:621 | el0t_64_sync+0x180/0x184 arch/arm64/kernel/entry.S:572 | irq event stamp: 2428 | hardirqs last enabled at (2427): [<ffffb7827c6f2308>] __up_console_sem+0xf0/0x118 kernel/printk/printk.c:255 | hardirqs last disabled at (2428): [<ffffb7828223df98>] el1_dbg+0x28/0x80 arch/arm64/kernel/entry-common.c:375 | softirqs last enabled at (2424): [<ffffb7827c411c00>] softirq_handle_end kernel/softirq.c:401 [inline] | softirqs last enabled at (2424): [<ffffb7827c411c00>] __do_softirq+0xa28/0x11e4 kernel/softirq.c:587 | softirqs last disabled at (2417): [<ffffb7827c59015c>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] | softirqs last disabled at (2417): [<ffffb7827c59015c>] invoke_softirq kernel/softirq.c:439 [inline] | softirqs last disabled at (2417): [<ffffb7827c59015c>] __irq_exit_rcu kernel/softirq.c:636 [inline] | softirqs last disabled at (2417): [<ffffb7827c59015c>] irq_exit_rcu+0x53c/0x688 kernel/softirq.c:648 | ---[ end trace 0ca578534e7ca938 ]--- With or without DEBUG_VIRTUAL __pa() will fall back to __kimg_to_phys() for non-linear addresses, and will happen to do the right thing in this case, even with the warning. But we should not depend upon this, and to keep the warning useful we should fix this case. Fix this issue by using __pa_symbol(), which handles kernel image addresses (and checks its input is a kernel image address). This matches what we do elsewhere, e.g. in arch/arm64/include/asm/pgtable.h: | #define ZERO_PAGE(vaddr) phys_to_page(__pa_symbol(empty_zero_page)) Fixes: 3744b5280e67 ("arm64: kexec: install a copy of the linear-map") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/r/20211130121849.3319010-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: remove cpu-reset.hPasha Tatashin
This header contains only cpu_soft_restart() which is never used directly anymore. So, remove this header, and rename the helper to be cpu_soft_restart(). Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-15-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: remove the pre-kexec PoC maintenancePasha Tatashin
Now that kexec does its relocations with the MMU enabled, we no longer need to clean the relocation data to the PoC. Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-14-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: keep MMU enabled during kexec relocationPasha Tatashin
Now, that we have linear map page tables configured, keep MMU enabled to allow faster relocation of segments to final destination. Cavium ThunderX2: Kernel Image size: 38M Iniramfs size: 46M Total relocation size: 84M MMU-disabled: relocation 7.489539915s MMU-enabled: relocation 0.03946095s Broadcom Stingray: The performance data: for a moderate size kernel + initramfs: 25M the relocation was taking 0.382s, with enabled MMU it now takes 0.019s only or x20 improvement. The time is proportional to the size of relocation, therefore if initramfs is larger, 100M it could take over a second. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Tested-by: Pingfan Liu <piliu@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-13-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: install a copy of the linear-mapPasha Tatashin
To perform the kexec relocation with the MMU enabled, we need a copy of the linear map. Create one, and install it from the relocation code. This has to be done from the assembly code as it will be idmapped with TTBR0. The kernel runs in TTRB1, so can't use the break-before-make sequence on the mapping it is executing from. The makes no difference yet as the relocation code runs with the MMU disabled. Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-12-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: use ld script for relocation functionPasha Tatashin
Currently, relocation code declares start and end variables which are used to compute its size. The better way to do this is to use ld script, and put relocation function in its own section. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-11-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: relocate in EL1 modePasha Tatashin
Since we are going to keep MMU enabled during relocation, we need to keep EL1 mode throughout the relocation. Keep EL1 enabled, and switch EL2 only before entering the new world. Suggested-by: James Morse <james.morse@arm.com> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-10-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: configure EL2 vectors for kexecPasha Tatashin
If we have a EL2 mode without VHE, the EL2 vectors are needed in order to switch to EL2 and jump to new world with hypervisor privileges. In preparation to MMU enabled relocation, configure our EL2 table now. Kexec uses #HVC_SOFT_RESTART to branch to the new world, so extend el1_sync vector that is provided by trans_pgd_copy_el2_vectors() to support this case. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-9-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: pass kimage as the only argument to relocation functionPasha Tatashin
Currently, kexec relocation function (arm64_relocate_new_kernel) accepts the following arguments: head: start of array that contains relocation information. entry: entry point for new kernel or purgatory. dtb_mem: first and only argument to entry. The number of arguments cannot be easily expended, because this function is also called from HVC_SOFT_RESTART, which preserves only three arguments. And, also arm64_relocate_new_kernel is written in assembly but called without stack, thus no place to move extra arguments to free registers. Soon, we will need to pass more arguments: once we enable MMU we will need to pass information about page tables. Pass kimage to arm64_relocate_new_kernel, and teach it to get the required fields from kimage. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-8-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: skip relocation code for inplace kexecPasha Tatashin
In case of kdump or when segments are already in place the relocation is not needed, therefore the setup of relocation function and call to it can be skipped. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Suggested-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-6-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: flush image and lists during kexec load timePasha Tatashin
Currently, during kexec load we are copying relocation function and flushing it. However, we can also flush kexec relocation buffers and if new kernel image is already in place (i.e. crash kernel), we can also flush the new kernel image itself. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-5-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-07-08set_memory: allow querying whether set_direct_map_*() is actually enabledMike Rapoport
On arm64, set_direct_map_*() functions may return 0 without actually changing the linear map. This behaviour can be controlled using kernel parameters, so we need a way to determine at runtime whether calls to set_direct_map_invalid_noflush() and set_direct_map_default_noflush() have any effect. Extend set_memory API with can_set_direct_map() function that allows checking if calling set_direct_map_*() will actually change the page table, replace several occurrences of open coded checks in arm64 with the new function and provide a generic stub for architectures that always modify page tables upon calls to set_direct_map APIs. [arnd@arndb.de: arm64: kfence: fix header inclusion ] Link: https://lkml.kernel.org/r/20210518072034.31572-4-rppt@kernel.org Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christopher Lameter <cl@linux.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Bottomley <jejb@linux.ibm.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tycho Andersen <tycho@tycho.ws> Cc: Will Deacon <will@kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-25arm64: Rename arm64-internal cache maintenance functionsFuad Tabba
Although naming across the codebase isn't that consistent, it tends to follow certain patterns. Moreover, the term "flush" isn't defined in the Arm Architecture reference manual, and might be interpreted to mean clean, invalidate, or both for a cache. Rename arm64-internal functions to make the naming internally consistent, as well as making it consistent with the Arm ARM, by specifying whether it applies to the instruction, data, or both caches, whether the operation is a clean, invalidate, or both. Also specify which point the operation applies to, i.e., to the point of unification (PoU), coherency (PoC), or persistence (PoP). This commit applies the following sed transformation to all files under arch/arm64: "s/\b__flush_cache_range\b/caches_clean_inval_pou_macro/g;"\ "s/\b__flush_icache_range\b/caches_clean_inval_pou/g;"\ "s/\binvalidate_icache_range\b/icache_inval_pou/g;"\ "s/\b__flush_dcache_area\b/dcache_clean_inval_poc/g;"\ "s/\b__inval_dcache_area\b/dcache_inval_poc/g;"\ "s/__clean_dcache_area_poc\b/dcache_clean_poc/g;"\ "s/\b__clean_dcache_area_pop\b/dcache_clean_pop/g;"\ "s/\b__clean_dcache_area_pou\b/dcache_clean_pou/g;"\ "s/\b__flush_cache_user_range\b/caches_clean_inval_user_pou/g;"\ "s/\b__flush_icache_all\b/icache_inval_all_pou/g;" Note that __clean_dcache_area_poc is deliberately missing a word boundary check at the beginning in order to match the efistub symbols in image-vars.h. Also note that, despite its name, __flush_icache_range operates on both instruction and data caches. The name change here reflects that. No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-19-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __flush_dcache_area to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-13-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Downgrade flush_icache_range to invalidateFuad Tabba
Since __flush_dcache_area is called right before, invalidate_icache_range is sufficient in this case. Rewrite the comment to better explain the rationale behind the cache maintenance operations used here. No functional change intended. Possible performance impact due to invalidating only the icache rather than invalidating and cleaning both caches. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/linux-arch/20200511110014.lb9PEahJ4hVOYrbwIb_qUHXyNy9KQzNFdb_I3YlzY6A@z/ Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-7-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-27arm64: kexec: call kexec_image_info only oncePavel Tatashin
Currently, kexec_image_info() is called during load time, and right before kernel is being kexec'ed. There is no need to do both. So, call it only once when segments are loaded and the physical location of page with copy of arm64_relocate_new_kernel is known. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20210125191923.1060122-11-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-27arm64: kexec: move relocation function setupPavel Tatashin
Currently, kernel relocation function is configured in machine_kexec() at the time of kexec reboot by using control_code_page. This operation, however, is more logical to be done during kexec_load, and thus remove from reboot time. Move, setup of this function to newly added machine_kexec_post_load(). Because once MMU is enabled, kexec control page will contain more than relocation kernel, but also vector table, add pointer to the actual function within this page arch.kern_reloc. Currently, it equals to the beginning of page, we will add offsets later, when vector table is added. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20210125191923.1060122-10-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-27arm64: kexec: make dtb_mem always enabledPavel Tatashin
Currently, dtb_mem is enabled only when CONFIG_KEXEC_FILE is enabled. This adds ugly ifdefs to c files. Always enabled dtb_mem, when it is not used, it is NULL. Change the dtb_mem to phys_addr_t, as it is a physical address. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20210125191923.1060122-2-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-11arm64: fix the flush_icache_range arguments in machine_kexecChristoph Hellwig
The second argument is the end "pointer", not the length. Fixes: d28f6df1305a ("arm64/kexec: Add core kexec support") Cc: <stable@vger.kernel.org> # 4.8.x- Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-01-10Revert "arm64: kexec: make dtb_mem always enabled"Will Deacon
Adding crash dump support to 'kexec_file' is going to extend 'struct kimage_arch' with more 'kexec_file'-specific members. The cleanup here then starts to get in the way, so revert it. This reverts commit 621516789ee6e285cb2088fe4706eedd030d38bf. Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-08arm64: kexec: make dtb_mem always enabledPavel Tatashin
Currently, dtb_mem is enabled only when CONFIG_KEXEC_FILE is enabled. This adds ugly ifdefs to c files. Always enabled dtb_mem, when it is not used, it is NULL. Change the dtb_mem to phys_addr_t, as it is a physical address. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-08arm64: kexec: remove unnecessary debug printsPavel Tatashin
The kexec_image_info() outputs all the necessary information about the upcoming kexec. The extra debug printfs in machine_kexec() are not needed. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-05arm64: kdump: no need to mark crashkernel pages manually PG_reservedDavid Hildenbrand
The crashkernel is reserved via memblock_reserve(). memblock_free_all() will call free_low_memory_core_early(), which will go over all reserved memblocks, marking the pages as PG_reserved. So manually marking pages as PG_reserved is not necessary, they are already in the desired state (otherwise they would have been handed over to the buddy as free pages and bad things would happen). Link: http://lkml.kernel.org/r/20190114125903.24845-8-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Matthias Brugger <mbrugger@suse.com> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Stefan Agner <stefan@agner.ch> Cc: Laura Abbott <labbott@redhat.com> Cc: Greg Hackmann <ghackmann@android.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kristina Martsenko <kristina.martsenko@arm.com> Cc: CHANDAN VN <chandan.vn@samsung.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05arm64: kexec: no need to ClearPageReserved()David Hildenbrand
This will be done by free_reserved_page(). Link: http://lkml.kernel.org/r/20190114125903.24845-7-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: James Morse <james.morse@arm.com> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-06arm64: kexec_file: invoke the kernel without purgatoryAKASHI Takahiro
On arm64, purgatory would do almost nothing. So just invoke secondary kernel directly by jumping into its entry code. While, in this case, cpu_soft_restart() must be called with dtb address in the fifth argument, the behavior still stays compatible with kexec_load case as long as the argument is null. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-09-11arm64: kernel: arch_crash_save_vmcoreinfo() should depend on CONFIG_CRASH_COREJames Morse
Since commit 23c85094fe18 ("proc/kcore: add vmcoreinfo note to /proc/kcore") the kernel has exported the vmcoreinfo PT_NOTE on /proc/kcore as well as /proc/vmcore. arm64 only exposes it's additional arch information via arch_crash_save_vmcoreinfo() if built with CONFIG_KEXEC, as kdump was previously the only user of vmcoreinfo. Move this weak function to a separate file that is built at the same time as its caller in kernel/crash_core.c. This ensures values like 'kimage_voffset' are always present in the vmcoreinfo PT_NOTE. CC: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64: kexec: Add comment to explain use of __flush_icache_range()Will Deacon
Now that we understand the deadlock arising from flush_icache_range() on the kexec crash kernel path, add a comment to justify the use of __flush_icache_range() here. Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31arm64, kaslr: export offset in VMCOREINFO ELF notesBhupesh Sharma
Include KASLR offset in arm64 VMCOREINFO ELF notes to assist in debugging. vmcore parsing in user-space already expects this value in the notes and we are providing it for portability of those existing tools with x86. Ideally we would like core code to do this (so that way this information won't be missed when an architecture adds KASLR support), but mips has CONFIG_RANDOMIZE_BASE, and doesn't provide kaslr_offset(), so I am not sure if this is needed for mips (and other such similar arch cases in future). So, lets keep this architecture specific for now. As an example of a user-space use-case, consider the makedumpfile user-space utility which will need fixup to use this KASLR offset to work with cases where we need to find a way to translate symbol address from vmlinux to kernel run time address in case of KASLR boot on arm64. I have already submitted the makedumpfile user-space patch upstream and the maintainer has suggested to wait for the kernel changes to be included (see [0]). I tested this on my qualcomm amberwing board both for KASLR and non-KASLR boot cases: Without this patch: # cat > scrub.conf << EOF [vmlinux] erase jiffies erase init_task.utime for tsk in init_task.tasks.next within task_struct:tasks erase tsk.utime endfor EOF # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3} readpage_elf: Attempt to read non-existent page at 0xffffa8a5bf180000. readmem: type_addr: 1, addr:ffffa8a5bf180000, size:8 vaddr_to_paddr_arm64: Can't read pgd readmem: Can't convert a virtual address(ffff0000092a542c) to physical address. readmem: type_addr: 0, addr:ffff0000092a542c, size:390 check_release: Can't get the address of system_utsname After this patch check_release() is ok, and also we are able to erase symbol from vmcore (I checked this with kernel 4.18.0-rc4+): # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3} The kernel version is not supported. The makedumpfile operation may be incomplete. Checking for memory holes : [100.0 %] \ Checking for memory holes : [100.0 %] | Checking foExcluding unnecessary pages : [100.0 %] \ Excluding unnecessary pages : [100.0 %] \ The dumpfiles are saved to dumpfile_1, dumpfile_2, and dumpfile_3. makedumpfile Completed. [0] https://www.spinics.net/lists/kexec/msg21195.html Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-30arm64: kexec: machine_kexec should call __flush_icache_rangeDave Kleikamp
machine_kexec flushes the reboot_code_buffer from the icache after stopping the other cpus. Commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") added an IPI call to flush_icache_range, which causes a hang here, so replace the call with __flush_icache_range Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-04arm64: kexec: always reset to EL2 if presentMark Rutland
Currently machine_kexec() doesn't reset to EL2 in the case of a crashdump kernel. This leaves potentially dodgy state active at EL2, and means that if the crashdump kernel attempts to online secondary CPUs, these will be booted as mismatched ELs. Let's reset to EL2, as we do in all other cases, and simplify things. If EL2 state is corrupt, things are already sufficiently bad that kdump is unlikely to work, and it's best-effort regardless. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-11-02arm64: explicitly mask all exceptionsJames Morse
There are a few places where we want to mask all exceptions. Today we do this in a piecemeal fashion, typically we expect the caller to have masked irqs and the arch code masks debug exceptions, ignoring serror which is probably masked. Make it clear that 'mask all exceptions' is the intention by adding helpers to do exactly that. This will let us unmask SError without having to add 'oh and SError' to these paths. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Julien Thierry <julien.thierry@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-08-21arm64: kexec: have own crash_smp_send_stop() for crash dump for nonpanic coresHoeun Ryu
Commit 0ee5941 : (x86/panic: replace smp_send_stop() with kdump friendly version in panic path) introduced crash_smp_send_stop() which is a weak function and can be overridden by architecture codes to fix the side effect caused by commit f06e515 : (kernel/panic.c: add "crash_kexec_post_ notifiers" option). ARM64 architecture uses the weak version function and the problem is that the weak function simply calls smp_send_stop() which makes other CPUs offline and takes away the chance to save crash information for nonpanic CPUs in machine_crash_shutdown() when crash_kexec_post_notifiers kernel option is enabled. Calling smp_send_crash_stop() in machine_crash_shutdown() is useless because all nonpanic CPUs are already offline by smp_send_stop() in this case and smp_send_crash_stop() only works against online CPUs. The result is that secondary CPUs registers are not saved by crash_save_cpu() and the vmcore file misreports these CPUs as being offline. crash_smp_send_stop() is implemented to fix this problem by replacing the existing smp_send_crash_stop() and adding a check for multiple calling to the function. The function (strong symbol version) saves crash information for nonpanic CPUs and machine_crash_shutdown() tries to save crash information for nonpanic CPUs only when crash_kexec_post_notifiers kernel option is disabled. * crash_kexec_post_notifiers : false panic() __crash_kexec() machine_crash_shutdown() crash_smp_send_stop() <= save crash dump for nonpanic cores * crash_kexec_post_notifiers : true panic() crash_smp_send_stop() <= save crash dump for nonpanic cores __crash_kexec() machine_crash_shutdown() crash_smp_send_stop() <= just return. Signed-off-by: Hoeun Ryu <hoeun.ryu@gmail.com> Reviewed-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-04-05arm64: kdump: add VMCOREINFO's for user-space toolsAKASHI Takahiro
In addition to common VMCOREINFO's defined in crash_save_vmcoreinfo_init(), we need to know, for crash utility, - kimage_voffset - PHYS_OFFSET to examine the contents of a dump file (/proc/vmcore) correctly due to the introduction of KASLR (CONFIG_RANDOMIZE_BASE) in v4.6. - VA_BITS is also required for makedumpfile command. arch_crash_save_vmcoreinfo() appends them to the dump file. More VMCOREINFO's may be added later. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-04-05arm64: kdump: implement machine_crash_shutdown()AKASHI Takahiro
Primary kernel calls machine_crash_shutdown() to shut down non-boot cpus and save registers' status in per-cpu ELF notes before starting crash dump kernel. See kernel_kexec(). Even if not all secondary cpus have shut down, we do kdump anyway. As we don't have to make non-boot(crashed) cpus offline (to preserve correct status of cpus at crash dump) before shutting down, this patch also adds a variant of smp_send_stop(). Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-04-05arm64: hibernate: preserve kdump image around hibernationAKASHI Takahiro
Since arch_kexec_protect_crashkres() removes a mapping for crash dump kernel image, the loaded data won't be preserved around hibernation. In this patch, helper functions, crash_prepare_suspend()/ crash_post_resume(), are additionally called before/after hibernation so that the relevant memory segments will be mapped again and preserved just as the others are. In addition, to minimize the size of hibernation image, crash_is_nosave() is added to pfn_is_nosave() in order to recognize only the pages that hold loaded crash dump kernel image as saveable. Hibernation excludes any pages that are marked as Reserved and yet "nosave." Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-04-05arm64: kdump: protect crash dump kernel memoryTakahiro Akashi
arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres() are meant to be called by kexec_load() in order to protect the memory allocated for crash dump kernel once the image is loaded. The protection is implemented by unmapping the relevant segments in crash dump kernel memory, rather than making it read-only as other archs do, to prevent coherency issues due to potential cache aliasing (with mismatched attributes). Page-level mappings are consistently used here so that we can change the attributes of segments in page granularity as well as shrink the region also in page granularity through /sys/kernel/kexec_crash_size, putting the freed memory back to buddy system. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-06-27arm64/kexec: Add pr_debug outputGeoff Levand
To aid in debugging kexec problems or when adding new functionality to kexec add a new routine kexec_image_info() and several inline pr_debug statements. Signed-off-by: Geoff Levand <geoff@infradead.org> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-06-27arm64/kexec: Add core kexec supportGeoff Levand
Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the arm64 architecture that add support for the kexec re-boot mechanism (CONFIG_KEXEC) on arm64 platforms. Signed-off-by: Geoff Levand <geoff@infradead.org> Reviewed-by: James Morse <james.morse@arm.com> [catalin.marinas@arm.com: removed dead code following James Morse's comments] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>