Age  Commit message  Author
2021-02-05  ubsan: implement __ubsan_handle_alignment_assumption  Nathan Chancellor
When building ARCH=mips 32r2el_defconfig with CONFIG_UBSAN_ALIGNMENT: ld.lld: error: undefined symbol: __ubsan_handle_alignment_assumption referenced by slab.h:557 (include/linux/slab.h:557) main.o:(do_initcalls) in archive init/built-in.a referenced by slab.h:448 (include/linux/slab.h:448) do_mounts_rd.o:(rd_load_image) in archive init/built-in.a referenced by slab.h:448 (include/linux/slab.h:448) do_mounts_rd.o:(identify_ramdisk_image) in archive init/built-in.a referenced 1579 more times Implement this for the kernel based on LLVM's handleAlignmentAssumptionImpl because the kernel is not linked against the compiler runtime. Link: https://github.com/ClangBuiltLinux/linux/issues/1245 Link: https://github.com/llvm/llvm-project/blob/llvmorg-11.0.1/compiler-rt/lib/ubsan/ubsan_handlers.cpp#L151-L190 Link: https://lkml.kernel.org/r/20210127224451.2587372-1-nathan@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  kasan: make addr_has_metadata() return true for valid addresses  Vincenzo Frascino
Currently, addr_has_metadata() returns true for every address. An invalid address (e.g. NULL) passed to the function when KASAN_HW_TAGS is enabled leads to a kernel panic. Make addr_has_metadata() return true for valid addresses only. Note: KASAN_HW_TAGS support for vmalloc will be added with a future patch. Link: https://lkml.kernel.org/r/20210126134409.47894-3-vincenzo.frascino@arm.com Fixes: 2e903b91479782b7 ("kasan, arm64: implement HW_TAGS runtime") Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
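The precondition described above can be sketched in plain userspace C. This is only a model under stated assumptions: the region bounds and the `model_` names are hypothetical and are not the kernel's implementation; they only illustrate "consult metadata only when the address is valid".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of the memory region that carries metadata. */
#define MODEL_REGION_START ((uintptr_t)0x1000)
#define MODEL_REGION_END   ((uintptr_t)0x9000)

/* Model: only addresses inside the tracked region have metadata. */
static bool model_addr_has_metadata(const void *addr)
{
	uintptr_t a = (uintptr_t)addr;

	return a >= MODEL_REGION_START && a < MODEL_REGION_END;
}

/* Returns true if metadata would be consulted, false if the lookup is skipped. */
static bool model_report(const void *addr)
{
	if (!model_addr_has_metadata(addr))
		return false; /* e.g. NULL: never touch metadata */
	/* ... metadata for addr would be read here ... */
	return true;
}
```

With this shape, a NULL pointer is rejected before any metadata access instead of faulting inside the lookup.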
2021-02-05  kasan: add explicit preconditions to kasan_report()  Vincenzo Frascino
Patch series "kasan: Fix metadata detection for KASAN_HW_TAGS", v5. With the introduction of KASAN_HW_TAGS, kasan_report() currently assumes that every location in memory has valid metadata associated. This is due to the fact that addr_has_metadata() returns always true. As a consequence of this, an invalid address (e.g. NULL pointer address) passed to kasan_report() when KASAN_HW_TAGS is enabled, leads to a kernel panic. Example below, based on arm64: BUG: KASAN: invalid-access in 0x0 Read at addr 0000000000000000 by task swapper/0/1 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x96000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000004 CM = 0, WnR = 0 ... Call trace: mte_get_mem_tag+0x24/0x40 kasan_report+0x1a4/0x410 alsa_sound_last_init+0x8c/0xa4 do_one_initcall+0x50/0x1b0 kernel_init_freeable+0x1d4/0x23c kernel_init+0x14/0x118 ret_from_fork+0x10/0x34 Code: d65f03c0 9000f021 f9428021 b6cfff61 (d9600000) ---[ end trace 377c8bb45bdd3a1a ]--- hrtimer: interrupt took 48694256 ns note: swapper/0[1] exited with preempt_count 1 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b SMP: stopping secondary CPUs Kernel Offset: 0x35abaf140000 from 0xffff800010000000 PHYS_OFFSET: 0x40000000 CPU features: 0x0a7e0152,61c0a030 Memory Limit: none ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- This series fixes the behavior of addr_has_metadata() that now returns true only when the address is valid. This patch (of 2): With the introduction of KASAN_HW_TAGS, kasan_report() accesses the metadata only when addr_has_metadata() succeeds. Add a comment to make sure that the preconditions to the function are explicitly clarified. 
Link: https://lkml.kernel.org/r/20210126134409.47894-1-vincenzo.frascino@arm.com Link: https://lkml.kernel.org/r/20210126134409.47894-2-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()  Waiman Long
Commit 3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API") introduced a bug in __add_to_page_cache_locked() causing the following splat: page dumped because: VM_BUG_ON_PAGE(page_memcg(page)) pages's memcg:ffff8889a4116000 ------------[ cut here ]------------ kernel BUG at mm/memcontrol.c:2924! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 35 PID: 12345 Comm: cat Tainted: G S W I 5.11.0-rc4-debug+ #1 Hardware name: HP HP Z8 G4 Workstation/81C7, BIOS P60 v01.25 12/06/2017 RIP: commit_charge+0xf4/0x130 Call Trace: mem_cgroup_charge+0x175/0x770 __add_to_page_cache_locked+0x712/0xad0 add_to_page_cache_lru+0xc5/0x1f0 cachefiles_read_or_alloc_pages+0x895/0x2e10 [cachefiles] __fscache_read_or_alloc_pages+0x6c0/0xa00 [fscache] __nfs_readpages_from_fscache+0x16d/0x630 [nfs] nfs_readpages+0x24e/0x540 [nfs] read_pages+0x5b1/0xc40 page_cache_ra_unbounded+0x460/0x750 generic_file_buffered_read_get_pages+0x290/0x1710 generic_file_buffered_read+0x2a9/0xc30 nfs_file_read+0x13f/0x230 [nfs] new_sync_read+0x3af/0x610 vfs_read+0x339/0x4b0 ksys_read+0xf1/0x1c0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Before that commit, there was a try_charge() and commit_charge() in __add_to_page_cache_locked(). These two separated charge functions were replaced by a single mem_cgroup_charge(). However, it forgot to add a matching mem_cgroup_uncharge() when the xarray insertion failed with the page released back to the pool. Fix this by adding a mem_cgroup_uncharge() call when insertion error happens. 
Link: https://lkml.kernel.org/r/20210125042441.20030-1-longman@redhat.com Fixes: 3fea5a499d57 ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API") Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <smuchun@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
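The charge/uncharge pairing described in this commit can be illustrated with a small userspace model. All names below (`model_cgroup`, `model_charge`, `model_insert`, and so on) are hypothetical stand-ins, not kernel APIs; the sketch only shows that a failed insertion must undo the earlier charge.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-cgroup charge counter. */
struct model_cgroup {
	long charged_pages;
};

static int model_charge(struct model_cgroup *cg)
{
	cg->charged_pages++;
	return 0;
}

static void model_uncharge(struct model_cgroup *cg)
{
	cg->charged_pages--;
}

/* Hypothetical insertion step that can fail, like a store into a tree. */
static int model_insert(bool should_fail)
{
	return should_fail ? -1 : 0;
}

static int model_add_to_cache(struct model_cgroup *cg, bool insert_fails)
{
	int err = model_charge(cg);

	if (err)
		return err;

	err = model_insert(insert_fails);
	if (err) {
		model_uncharge(cg); /* undo the charge on the error path */
		return err;
	}
	return 0;
}
```

Without the `model_uncharge()` call, the counter would stay elevated after a failed insertion, which is the kind of leftover state the splat above complains about.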
2021-02-05  mailmap: add entries for Manivannan Sadhasivam  Manivannan Sadhasivam
Map my personal and work addresses to korg mail address. Link: https://lkml.kernel.org/r/20210201104640.108556-1-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mailmap: fix name/email for Viresh Kumar  Viresh Kumar
For some of the patches the email id was misspelled to linaro.com instead of linaro.org and for others Viresh Kumar was written as "viresh kumar" (all small). Fix both with help of mailmap entries. Link: https://lkml.kernel.org/r/d6b80b210d7fe0ddc1d4d0b22eff9708c72ef8b3.1612178938.git.viresh.kumar@linaro.org Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  memblock: do not start bottom-up allocations with kernel_end  Roman Gushchin
With kaslr the kernel image is placed at a random place, so starting the bottom-up allocation with the kernel_end can result in an allocation failure and a warning like this one: hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node ------------[ cut here ]------------ memblock: bottom-up allocation failed, memory hotremove may be affected WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:memblock_find_in_range_node+0x178/0x25a Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000 RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046 RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000 R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000 FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0 Call Trace: memblock_alloc_range_nid+0x8d/0x11e cma_declare_contiguous_nid+0x2c4/0x38c hugetlb_cma_reserve+0xdc/0x128 flush_tlb_one_kernel+0xc/0x20 native_set_fixmap+0x82/0xd0 flat_get_apic_id+0x5/0x10 register_lapic_address+0x8e/0x97 setup_arch+0x8a5/0xc3f start_kernel+0x66/0x547 load_ucode_bsp+0x4c/0xcd secondary_startup_64_no_verify+0xb0/0xbb random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0 ---[ end trace f151227d0b39be70 ]--- At the same time, the kernel image is protected with memblock_reserve(), so we can just start searching at 
PAGE_SIZE. In this case the bottom-up allocation has the same chance of success as a top-down allocation, so there is no reason to fall back in the case of a failure. Altogether, this simplifies the logic. Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory") Signed-off-by: Roman Gushchin <guro@fb.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Wonhyuk Yang <vvghjk1234@gmail.com> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm: thp: fix MADV_REMOVE deadlock on shmem THP  Hugh Dickins
Sergey reported deadlock between kswapd correctly doing its usual lock_page(page) followed by down_read(page->mapping->i_mmap_rwsem), and madvise(MADV_REMOVE) on an madvise(MADV_HUGEPAGE) area doing down_write(page->mapping->i_mmap_rwsem) followed by lock_page(page). This happened when shmem_fallocate(punch hole)'s unmap_mapping_range() reaches zap_pmd_range()'s call to __split_huge_pmd(). The same deadlock could occur when partially truncating a mapped huge tmpfs file, or using fallocate(FALLOC_FL_PUNCH_HOLE) on it. __split_huge_pmd()'s page lock was added in 5.8, to make sure that any concurrent use of reuse_swap_page() (holding page lock) could not catch the anon THP's mapcounts and swapcounts while they were being split. Fortunately, reuse_swap_page() is never applied to a shmem or file THP (not even by khugepaged, which checks PageSwapCache before calling), and anonymous THPs are never created in shmem or file areas: so that __split_huge_pmd()'s page lock can only be necessary for anonymous THPs, on which there is no risk of deadlock with i_mmap_rwsem. Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2101161409470.2022@eggly.anvils Fixes: c444eb564fb1 ("mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov  Johannes Berg
On ARCH=um, loading a module doesn't result in its constructors getting called, which breaks module gcov since the debugfs files are never registered. On the other hand, in-kernel constructors have already been called by the dynamic linker, so we can't call them again. Get out of this conundrum by allowing CONFIG_CONSTRUCTORS to be selected, but avoiding the in-kernel constructor calls. Also remove the "if !UML" from GCOV selecting CONSTRUCTORS now, since we really do want CONSTRUCTORS, just not kernel binary ones. Link: https://lkml.kernel.org/r/20210120172041.c246a2cac2fb.I1358f584b76f1898373adfed77f4462c8705b736@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm/vmalloc: separate put pages and flush VM flags  Rick Edgecombe
When VM_MAP_PUT_PAGES was added, it was defined with the same value as VM_FLUSH_RESET_PERMS. This doesn't seem like it will cause any big functional problems other than some excess flushing for VM_MAP_PUT_PAGES allocations. Redefine VM_MAP_PUT_PAGES to have its own value. Also, rearrange things so flags are less likely to be missed in the future. Link: https://lkml.kernel.org/r/20210122233706.9304-1-rick.p.edgecombe@intel.com Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap") Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
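A short sketch of why two flags sharing one bit cause "excess flushing", and how a compile-time check can catch such a collision. The `MODEL_` names and the numeric values are illustrative assumptions, not copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real definitions live in the kernel headers. */
#define MODEL_VM_FLUSH_RESET_PERMS 0x00000100u
#define MODEL_VM_MAP_PUT_PAGES_OLD 0x00000100u /* collides with the flag above */
#define MODEL_VM_MAP_PUT_PAGES_NEW 0x00000200u /* has its own bit */

/* Compile-time guard: the redefined flag must not overlap the flush flag. */
_Static_assert((MODEL_VM_FLUSH_RESET_PERMS & MODEL_VM_MAP_PUT_PAGES_NEW) == 0,
	       "VM flags must not share bits");

/* Model of the free path deciding whether a flush is required. */
static bool model_needs_flush(unsigned int flags)
{
	return (flags & MODEL_VM_FLUSH_RESET_PERMS) != 0;
}
```

With the old value, an allocation that only asked for "put pages" behaviour is indistinguishable from one that asked for a permission-reset flush; with a separate bit the two requests stay independent.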
2021-02-05  mm, compaction: move high_pfn to the for loop scope  Rokudo Yan
In fast_isolate_freepages(), high_pfn is used if a preferred page (i.e. PFN >= low_pfn) is not found. But high_pfn is not reset before searching a free area, so when it is used as freepage, it may come from another free area searched earlier. As a result, move_freelist_head(freelist, freepage) will have unexpected behavior (e.g. corrupt the MOVABLE freelist): Unable to handle kernel paging request at virtual address dead000000000200 Mem abort info: ESR = 0x96000044 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000044 CM = 0, WnR = 1 [dead000000000200] address between user and kernel address ranges -000|list_cut_before(inline) -000|move_freelist_head(inline) -000|fast_isolate_freepages(inline) -000|isolate_freepages(inline) -000|compaction_alloc(?, ?) -001|unmap_and_move(inline) -001|migrate_pages([NSD:0xFFFFFF80088CBBD0] from = 0xFFFFFF80088CBD88, [NSD:0xFFFFFF80088CBBC8] get_new_p -002|__read_once_size(inline) -002|static_key_count(inline) -002|static_key_false(inline) -002|trace_mm_compaction_migratepages(inline) -002|compact_zone(?, [NSD:0xFFFFFF80088CBCB0] capc = 0x0) -003|kcompactd_do_work(inline) -003|kcompactd([X19] p = 0xFFFFFF93227FBC40) -004|kthread([X20] _create = 0xFFFFFFE1AFB26380) -005|ret_from_fork(asm) The issue was reported on a smartphone product with 6GB RAM and a 3GB zram swap device. This patch fixes the issue by resetting high_pfn before searching each free area, which ensures that freepage and freelist match when move_freelist_head() is called in fast_isolate_freepages().
Link: http://lkml.kernel.org/r/20190118175136.31341-12-mgorman@techsingularity.net Link: https://lkml.kernel.org/r/20210112094720.1238444-1-wu-yan@tcl.com Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target") Signed-off-by: Rokudo Yan <wu-yan@tcl.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
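The scoping fix can be modelled outside the kernel. The sketch below is a simplified assumption of the search, not the real fast_isolate_freepages(): each "list" is a fixed-size array, and the fallback candidate is declared inside the loop so it is reset for every list. All `model_` names are hypothetical.

```c
#include <assert.h>

#define MODEL_LIST_LEN 3

/*
 * For each list, record the largest value >= minimum, or -1 if the list
 * has no such value. Because `high` is declared inside the outer loop,
 * a candidate found in one list can never be reported for another list.
 */
static void model_pick_fallbacks(const long lists[][MODEL_LIST_LEN],
				 int nlists, long minimum, long *out)
{
	for (int i = 0; i < nlists; i++) {
		long high = -1; /* loop scope: reset before every list */

		for (int j = 0; j < MODEL_LIST_LEN; j++) {
			long v = lists[i][j];

			if (v >= minimum && v > high)
				high = v;
		}
		out[i] = high;
	}
}
```

If `high` were declared before the outer loop instead, the second list in a run such as `{50, 60, 10}` followed by `{1, 2, 3}` (minimum 20) would inherit 60 from the first list, a value that does not belong to it; that mismatch between candidate and list is the situation the commit describes.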
2021-02-05  mm: migrate: do not migrate HugeTLB page whose refcount is one  Muchun Song
All pages isolated for the migration have an elevated reference count, so seeing a reference count equal to 1 means that the last user of the page has dropped its reference and the page has become unused; it does not make much sense to migrate it anymore. This has been done for regular pages and this patch does the same for hugetlb pages. Although the likelihood of the race is rather small for hugetlb pages, it makes sense to keep the two code paths in sync. Link: https://lkml.kernel.org/r/20210115124942.46403-2-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Yang Shi <shy828301@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active  Muchun Song
page_huge_active() can be called from scan_movable_pages(), which does not hold a reference count on the HugeTLB page. So when we call page_huge_active() from scan_movable_pages(), the HugeTLB page can be freed in parallel. Then we will trigger the BUG_ON in page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the VM_BUG_ON_PAGE. Link: https://lkml.kernel.org/r/20210115124942.46403-6-songmuchun@bytedance.com Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm: hugetlb: fix a race between isolating and freeing page  Muchun Song
There is a race between isolate_huge_page() and __free_huge_page(). CPU0: CPU1: if (PageHuge(page)) put_page(page) __free_huge_page(page) spin_lock(&hugetlb_lock) update_and_free_page(page) set_compound_page_dtor(page, NULL_COMPOUND_DTOR) spin_unlock(&hugetlb_lock) isolate_huge_page(page) // trigger BUG_ON VM_BUG_ON_PAGE(!PageHead(page), page) spin_lock(&hugetlb_lock) page_huge_active(page) // trigger BUG_ON VM_BUG_ON_PAGE(!PageHuge(page), page) spin_unlock(&hugetlb_lock) When we isolate a HugeTLB page on CPU0, we may meanwhile free it to the buddy allocator on CPU1. Then we can trigger a BUG_ON on CPU0, because the page has already been freed to the buddy allocator. Link: https://lkml.kernel.org/r/20210115124942.46403-5-songmuchun@bytedance.com Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm: hugetlb: fix a race between freeing and dissolving the page  Muchun Song
There is a race condition between __free_huge_page() and dissolve_free_huge_page(). CPU0: CPU1: // page_count(page) == 1 put_page(page) __free_huge_page(page) dissolve_free_huge_page(page) spin_lock(&hugetlb_lock) // PageHuge(page) && !page_count(page) update_and_free_page(page) // page is freed to the buddy spin_unlock(&hugetlb_lock) spin_lock(&hugetlb_lock) clear_page_huge_active(page) enqueue_huge_page(page) // It is wrong, the page is already freed spin_unlock(&hugetlb_lock) The race window is between put_page() and dissolve_free_huge_page(). We should make sure that the page is already on the free list when it is dissolved. As a result __free_huge_page would corrupt page(s) already in the buddy allocator. Link: https://lkml.kernel.org/r/20210115124942.46403-4-songmuchun@bytedance.com Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page  Muchun Song
If a new hugetlb page is allocated during fallocate, it will not be marked as active (set_page_huge_active), which will result in a later isolate_huge_page failure when the page migration code would like to move that page. Such a failure would be unexpected and wrong. Only export set_page_huge_active and leave clear_page_huge_active static, because it has no external users. Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate()) Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Yang Shi <shy828301@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05  Merge tag 'nfsd-5.11-3' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: "Fix non-page-aligned NFS READs" * tag 'nfsd-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Fix NFS READs that start at non-page-aligned offsets
2021-02-05  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  Linus Torvalds
Pull KVM fixes from Paolo Bonzini: "x86 has lots of small bugfixes, mostly one liners. It's quite late in 5.11-rc but none of them are related to this merge window; it's just bugs coming in at the wrong time. Of note among the others is "KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off" that fixes a live migration failure seen on distros that hadn't switched to tsx=off right away. ARM: - Avoid clobbering extra registers on initialisation" [ Sean Christopherson notes that commit 943dea8af21b ("KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode") should have had authorship credited to Jonny Barker, not to him. - Linus ] * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Set so called 'reserved CR3 bits in LM mask' at vCPU reset KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEs KVM: x86: cleanup CR3 reserved bits checks KVM: SVM: Treat SVM as unsupported when running as an SEV guest KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode KVM: x86: Supplement __cr4_reserved_bits() with X86_FEATURE_PCID check KVM/x86: assign hva with the right value to vm_munmap the pages KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off Fix unsynchronized access to sev members through svm_register_enc_region KVM: Documentation: Fix documentation for nested. KVM: x86: fix CPUID entries returned by KVM_GET_CPUID2 ioctl KVM: arm64: Don't clobber x4 in __do_hyp_init
2021-02-05  Merge tag 'iommu-fixes-v5.11-rc6' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fix from Joerg Roedel: "Fix a possible NULL-ptr dereference in dev_iommu_priv_get() which is too easy to accidentally trigger from IOMMU drivers. In the current case the AMD IOMMU driver triggered it on some machines in the IO-page-fault path, so fix it once and for all" * tag 'iommu-fixes-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Check dev->iommu in dev_iommu_priv_get() before dereferencing it
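The shape of such a guarded accessor can be sketched as follows. The structures and the `model_` names are hypothetical simplifications, not the kernel's definitions; the only point is that the intermediate pointer is checked before it is dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the device and its IOMMU data. */
struct model_iommu {
	void *priv;
};

struct model_device {
	struct model_iommu *iommu; /* may be NULL for devices without IOMMU data */
};

/* Return the private data, or NULL when the device has no IOMMU data. */
static void *model_iommu_priv_get(const struct model_device *dev)
{
	if (dev->iommu)
		return dev->iommu->priv;
	return NULL;
}
```

Callers then only need to handle a NULL result, rather than each driver remembering to test the intermediate pointer itself.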
2021-02-05  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost  Linus Torvalds
Pull vdpa fix from Michael Tsirkin: "A bugfix in the mlx driver I got at the last minute" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa/mlx5: Restore the hardware used index after change map
2021-02-05  Merge tag 'mmc-v5.11-rc6' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Limit retries when analyse of SDIO tuples fails MMC host: - sdhci: Fix linking err for sdhci-brcmstb" * tag 'mmc-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-pltfm: Fix linking err for sdhci-brcmstb mmc: core: Limit retries when analyse of SDIO tuples fails
2021-02-05  Merge tag 'drm-fixes-2021-02-05-1' of git://anongit.freedesktop.org/drm/drm  Linus Torvalds
Pull drm fixes from Dave Airlie: "Fixes for rc7, bit bigger than I'd like at this stage, but most of the i915 stuff and some amdgpu is destined for staging and I'd rather not hold it up, the i915 changes also pulled in a few precursor code movement patches to make things cleaner, but nothing seems that horrible, and I've checked over all of it. Otherwise there is a nouveau dma-api warning regression, and a ttm page allocation warning fix, and some fixes for a bridge chip, ttm: - fix huge page warning regression i915: - Skip vswing programming for TBT - Power up combo PHY lanes for HDMI - Fix double YUV range correction on HDR planes - Fix the MST PBN divider calculation - Fix LTTPR vswing/pre-emp setting in non-transparent mode - Move the breadcrumb to the signaler if completed upon cancel - Close race between enable_breadcrumbs and cancel_breadcrumbs - Drop lru bumping on display unpinning amdgpu: - Fix retry in gem create - Vangogh fixes - Fix for display from shared buffers - Various display fixes amdkfd: - Fix regression in buffer free nouveau: - fix DMA API warning regression drm/bridge/lontium-lt9611uxc: - EDID fixes - Don't handle hotplug events in IRQ handler" * tag 'drm-fixes-2021-02-05-1' of git://anongit.freedesktop.org/drm/drm: (29 commits) drm/nouveau: fix dma syncing warning with debugging on.
drm/amd/display: Decrement refcount of dc_sink before reassignment drm/amd/display: Free atomic state after drm_atomic_commit drm/amd/display: Fix dc_sink kref count in emulated_link_detect drm/amd/display: Release DSC before acquiring drm/amd/display: Revert "Fix EDID parsing after resume from suspend" drm/amd/display: Add more Clock Sources to DCN2.1 drm/amd/display: reuse current context instead of recreating one drm/amd/display: Fix DPCD translation for LTTPR AUX_RD_INTERVAL drm/amdgpu: enable freesync for A+A configs drm/amd/pm: fill in the data member of v2 gpu metrics table for vangogh drm/amdgpu/gfx10: update CGTS_TCC_DISABLE and CGTS_USER_TCC_DISABLE register offsets for VGH drm/amdkfd: fix null pointer panic while free buffer in kfd drm/amdgpu: fix the issue that retry constantly once the buffer is oversize drm/i915/dp: Fix LTTPR vswing/pre-emp setting in non-transparent mode drm/i915/dp: Move intel_dp_set_signal_levels() to intel_dp_link_training.c drm/i915: Fix the MST PBN divider calculation drm/dp/mst: Export drm_dp_get_vc_payload_bw() drm/i915/gem: Drop lru bumping on display unpinning drm/i915/gt: Close race between enable_breadcrumbs and cancel_breadcrumbs ...
2021-02-05  ntp: Use freezable workqueue for RTC synchronization  Geert Uytterhoeven
The bug fixed by commit e3fab2f3de081e98 ("ntp: Fix RTC synchronization on 32-bit platforms") revealed an underlying issue: RTC synchronization may happen anytime, even while the system is partially suspended. On systems where the RTC is connected to an I2C bus, the I2C bus controller may already or still be suspended, triggering a WARNING during suspend or resume from s2ram: WARNING: CPU: 0 PID: 124 at drivers/i2c/i2c-core.h:54 __i2c_transfer+0x634/0x680 i2c i2c-6: Transfer while suspended [...] Workqueue: events_power_efficient sync_hw_clock [...] (__i2c_transfer) (i2c_transfer) (regmap_i2c_read) ... (da9063_rtc_set_time) (rtc_set_time) (sync_hw_clock) (process_one_work) Fix this race condition by using the freezable instead of the normal power-efficient workqueue. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20210125143039.1051912-1-geert+renesas@glider.be
2021-02-05  vdpa/mlx5: Restore the hardware used index after change map  Eli Cohen
When a change of memory map occurs, the hardware resources are destroyed and then re-created again with the new memory map. In such case, we need to restore the hardware available and used indices. The driver failed to restore the used index which is added here. Also, since the driver also fails to reset the available and used indices upon device reset, fix this here to avoid regression caused by the fact that used index may not be zero upon device reset. Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210204073618.36336-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
2021-02-05  smb3: fix crediting for compounding when only one request in flight  Pavel Shilovsky
Currently we try to guess if a compound request is going to succeed waiting for credits or not based on the number of requests in flight. This approach doesn't work correctly all the time because there may be only one request in flight which is going to bring multiple credits satisfying the compound request. Change the behavior to fail a request only if there are no requests in flight at all and proceed waiting for credits otherwise. Cc: <stable@vger.kernel.org> # 5.1+ Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
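The changed decision rule can be sketched as a small function. This is a hedged model of the behaviour described in the message, not the CIFS client code: the enum, the function name, and the parameters are all hypothetical.

```c
#include <assert.h>

/* Hypothetical outcome of asking for credits for a compound request. */
enum model_credit_decision {
	MODEL_PROCEED, /* enough credits are already available */
	MODEL_WAIT,    /* in-flight requests may still return credits */
	MODEL_FAIL,    /* nothing in flight, so no credits can arrive */
};

static enum model_credit_decision
model_credit_decision(int credits_available, int credits_needed, int in_flight)
{
	if (credits_available >= credits_needed)
		return MODEL_PROCEED;
	/*
	 * Fail only when there are no requests in flight at all: even a
	 * single in-flight request may bring back several credits.
	 */
	if (in_flight == 0)
		return MODEL_FAIL;
	return MODEL_WAIT;
}
```

The design choice is to stop guessing from the number of in-flight requests: any non-zero count is treated as a reason to keep waiting.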
2021-02-05  dma-mapping: benchmark: use u8 for reserved field in uAPI structure  Barry Song
The original code put five u32 fields before a u64 expansion[10] array. Five is an odd count, which will cause trouble when the structure is extended with new features. This patch uses u8 for the reserved field to avoid future alignment risk. Meanwhile, it also clears the memory of struct map_benchmark in tools; otherwise, if users run an old version of the tool on a newer kernel, the random expansion value will cause side effects on the newer kernel. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
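The alignment concern can be made concrete with `offsetof`. The two structures below are hypothetical reductions of the layout described in the message (five 32-bit fields followed by a reserved area); the field names and the byte count 84 are assumptions chosen for illustration, not copied from the uAPI header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Old shape: an odd number of u32 fields followed by a u64 array. */
struct model_old {
	uint32_t params[5];
	uint64_t expansion[10]; /* the compiler may insert padding before this */
};

/* New shape: a byte array starts immediately after the u32 fields. */
struct model_new {
	uint32_t params[5];
	uint8_t expansion[84];  /* assumed size, for illustration only */
};

/* Zero the whole request so reserved bytes never carry stale data. */
static void model_prepare(struct model_new *req)
{
	memset(req, 0, sizeof(*req));
}
```

On ABIs where a 64-bit integer is 8-byte aligned, `expansion` in the old shape cannot start at byte 20, so hidden padding appears; the byte array in the new shape always starts at byte 20, and new fields can later be carved out of it without moving anything.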
2021-02-05  ARM: kexec: fix oops after TLB are invalidated  Russell King
Giancarlo Ferrari reports the following oops while trying to use kexec: Unable to handle kernel paging request at virtual address 80112f38 pgd = fd7ef03e [80112f38] *pgd=0001141e(bad) Internal error: Oops: 80d [#1] PREEMPT SMP ARM ... This is caused by machine_kexec() trying to set the kernel text to be read/write, so it can poke values into the relocation code before copying it - and an interrupt occurring which changes the page tables. The subsequent writes then hit read-only sections that trigger a data abort resulting in the above oops. Fix this by copying the relocation code, and then writing the variables into the destination, thereby avoiding the need to make the kernel text read/write. Reported-by: Giancarlo Ferrari <giancarlo.ferrari89@gmail.com> Tested-by: Giancarlo Ferrari <giancarlo.ferrari89@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-02-05ARM: ensure the signal page contains defined contentsRussell King
Ensure that the signal page contains our poison instruction to increase the protection against ROP attacks and also contains well defined contents. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-02-05usb: dwc2: Fix endpoint direction check in ep_from_windexHeiko Stuebner
dwc2_hsotg_process_req_status uses ep_from_windex() to retrieve the endpoint for the index provided in the wIndex request param. In a test-case with a rndis gadget running and sending a malformed packet to it like: dev.ctrl_transfer( 0x82, # bmRequestType 0x00, # bRequest 0x0000, # wValue 0x0001, # wIndex 0x00 # wLength ) it is possible to cause a crash: [ 217.533022] dwc2 ff300000.usb: dwc2_hsotg_process_req_status: USB_REQ_GET_STATUS [ 217.559003] Unable to handle kernel read from unreadable memory at virtual address 0000000000000088 ... [ 218.313189] Call trace: [ 218.330217] ep_from_windex+0x3c/0x54 [ 218.348565] usb_gadget_giveback_request+0x10/0x20 [ 218.368056] dwc2_hsotg_complete_request+0x144/0x184 This happens because ep_from_windex wants to compare the endpoint direction even if index_to_ep() didn't return an endpoint due to the direction not matching. The fix is easy insofar that the actual direction check is already happening when calling index_to_ep() which will return NULL if there is no endpoint for the targeted direction, so the offending check can go away completely. Fixes: c6f5c050e2a7 ("usb: dwc2: gadget: add bi-directional endpoint support") Cc: stable@vger.kernel.org Reported-by: Gerhard Klostermeier <gerhard.klostermeier@syss.de> Signed-off-by: Heiko Stuebner <heiko.stuebner@theobroma-systems.com> Link: https://lore.kernel.org/r/20210127103919.58215-1-heiko@sntech.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
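The shape of the fix can be sketched as below. This is a simplified model, not the driver code: the struct layouts, the `_sketch` names and the fixed endpoint count are assumptions made for the example. The point is that the lookup helper already returns NULL for a direction mismatch, so the caller must not dereference the result to compare directions again.

```c
#include <stddef.h>

#define NUM_EPS_SKETCH 4	/* hypothetical endpoint count */

struct ep_sketch {
	int dir_in;
};

struct hsotg_sketch {
	struct ep_sketch *eps_in[NUM_EPS_SKETCH];
	struct ep_sketch *eps_out[NUM_EPS_SKETCH];
};

/* Returns NULL when no endpoint exists for this index and direction. */
static struct ep_sketch *index_to_ep_sketch(struct hsotg_sketch *hs,
					    unsigned int idx, int dir_in)
{
	if (idx >= NUM_EPS_SKETCH)
		return NULL;
	return dir_in ? hs->eps_in[idx] : hs->eps_out[idx];
}

/*
 * Decode wIndex (bit 7 is the direction, the low bits the endpoint number)
 * and return the lookup result directly. No ep->dir_in comparison follows,
 * because that is the dereference that crashed on a NULL endpoint.
 */
static struct ep_sketch *ep_from_windex_sketch(struct hsotg_sketch *hs,
					       unsigned int windex)
{
	int dir_in = (windex & 0x80) ? 1 : 0;
	unsigned int idx = windex & 0x7f;

	if (windex >= 0x100)
		return NULL;
	return index_to_ep_sketch(hs, idx, dir_in);
}
```

In this model, a request with wIndex 0x0001 on a device that only has an IN endpoint 1 yields NULL instead of a crash, and the caller can reject the request.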
2021-02-05usb: dwc3: fix clock issue during resume in OTG modeGary Bisson
Commit fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3 core") introduced clock support and a new function named dwc3_core_init_for_resume() which enables the clock before calling dwc3_core_init() during resume as clocks get disabled during suspend. Unfortunately in this commit the DWC3_GCTL_PRTCAP_OTG case was forgotten and therefore during resume, a platform could call dwc3_core_init() without re-enabling the clocks first, preventing it from resuming properly. So update the resume path to call dwc3_core_init_for_resume() as it should. Fixes: fe8abf332b8f ("usb: dwc3: support clocks and resets for DWC3 core") Cc: stable@vger.kernel.org Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com> Link: https://lore.kernel.org/r/20210125161934.527820-1-gary.bisson@boundarydevices.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-05kallsyms: fix nonconverging kallsyms table with lldArnd Bergmann
ARM randconfig builds with lld sometimes show a build failure from kallsyms: Inconsistent kallsyms data Try make KALLSYMS_EXTRA_PASS=1 as a workaround The problem is the veneers/thunks getting added by the linker extend the symbol table, which in turn leads to more veneers being needed, so it may take a few extra iterations to converge. This bug has been fixed multiple times before, but comes back every time a new symbol name is used. lld uses a different set of identifiers from ld.bfd, so the additional ones need to be added as well. I looked through the sources and found that arm64 and mips define similar prefixes, so I'm adding those as well, aside from the ones I observed. I'm not sure about powerpc64, which seems to already be handled through a section match, but if it comes back, the "__long_branch_" and "__plt_" prefixes would have to get added as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
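The filtering idea is a plain prefix match on symbol names. The sketch below is not the kallsyms source. The function name is hypothetical, and the table lists only the two prefixes named in the message (the powerpc64 ones the message says are not added yet), purely because they are the only ones spelled out in the text.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true for symbols that the linker inserts on its own (veneers,
 * thunks, PLT stubs). Skipping them keeps the symbol table stable from
 * one link pass to the next.
 */
static bool is_linker_inserted_symbol(const char *name)
{
	/* Illustrative list only; a real table would be longer. */
	static const char *const ignored_prefixes[] = {
		"__long_branch_",
		"__plt_",
	};
	size_t i;

	for (i = 0; i < sizeof(ignored_prefixes) / sizeof(ignored_prefixes[0]); i++) {
		const char *prefix = ignored_prefixes[i];

		if (strncmp(name, prefix, strlen(prefix)) == 0)
			return true;
	}
	return false;
}
```

Every new linker naming scheme needs another table entry, which is why the message notes that the bug comes back whenever a new symbol name is used.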
2021-02-05kbuild: fix duplicated flags in DEBUG_CFLAGSMasahiro Yamada
Sedat Dilek noticed duplicated flags in DEBUG_CFLAGS when building deb-pkg with CONFIG_DEBUG_INFO. For example, 'make CC=clang bindeb-pkg' reproduces the issue. Kbuild recurses to the top Makefile for some targets such as package builds. With commit 121c5d08d53c ("kbuild: Only add -fno-var-tracking-assignments for old GCC versions") applied, DEBUG_CFLAGS is now reset only when CONFIG_CC_IS_GCC=y. Fix it to reset DEBUG_CFLAGS all the time. Fixes: 121c5d08d53c ("kbuild: Only add -fno-var-tracking-assignments for old GCC versions") Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Mark Wielaard <mark@klomp.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2021-02-05Merge tag 'drm-intel-fixes-2021-02-04' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.11-rc7: - Skip vswing programming for TBT - Power up combo PHY lanes for HDMI - Fix double YUV range correction on HDR planes - Fix the MST PBN divider calculation - Fix LTTPR vswing/pre-emp setting in non-transparent mode - Move the breadcrumb to the signaler if completed upon cancel - Close race between enable_breadcrumbs and cancel_breadcrumbs - Drop lru bumping on display unpinning Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87bld0f36b.fsf@intel.com
2021-02-04Merge tag 'pci-v5.11-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Revert ASPM suspend/resume fix that regressed NVMe devices (Bjorn Helgaas)" * tag 'pci-v5.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI/ASPM: Save/restore L1SS Capability for suspend/resume"
2021-02-05Merge tag 'amd-drm-fixes-5.11-2021-02-03' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.11-2021-02-03: amdgpu: - Fix retry in gem create - Vangogh fixes - Fix for display from shared buffers - Various display fixes amdkfd: - Fix regression in buffer free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210204041300.4425-1-alexander.deucher@amd.com
2021-02-04io_uring: drop mm/files between task_work_submitPavel Begunkov
Since SQPOLL task can be shared and so task_work entries can be a mix of them, we need to drop mm and files before trying to issue next request. Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-04x86/apic: Add extra serialization for non-serializing MSRsDave Hansen
Jan Kiszka reported that the x2apic_wrmsr_fence() function uses a plain MFENCE while the Intel SDM (10.12.3 MSR Access in x2APIC Mode) calls for MFENCE; LFENCE. Short summary: we have special MSRs that have weaker ordering than all the rest. Add fencing consistent with current SDM recommendations. This is not known to cause any issues in practice, only in theory. Longer story below: The reason the kernel uses a different semantic is that the SDM changed (roughly in late 2017). The SDM changed because folks at Intel were auditing all of the recommended fences in the SDM and realized that the x2apic fences were insufficient. Why was the plain MFENCE judged insufficient? WRMSR itself is normally a serializing instruction. No fences are needed because the instruction itself serializes everything. But, there are explicit exceptions for this serializing behavior written into the WRMSR instruction documentation for two classes of MSRs: IA32_TSC_DEADLINE and the X2APIC MSRs. Back to x2apic: WRMSR is *not* serializing in this specific case. But why is MFENCE insufficient? MFENCE makes writes visible, but only affects load/store instructions. WRMSR is unfortunately not a load/store instruction and is unaffected by MFENCE. This means that a non-serializing WRMSR could be reordered by the CPU to execute before the writes made visible by the MFENCE have even occurred in the first place. This means that an x2apic IPI could theoretically be triggered before there is any (visible) data to process. Does this affect anything in practice? I honestly don't know. It seems quite possible that by the time an interrupt gets to consume the (not yet) MFENCE'd data, it has become visible, mostly by accident. To be safe, add the SDM-recommended fences for all x2apic WRMSRs. This also leaves open the question of the _other_ weakly-ordered WRMSR: MSR_IA32_TSC_DEADLINE. While it has the same ordering architecture as the x2APIC MSRs, it seems substantially less likely to be a problem in practice.
While writes to the in-memory Local Vector Table (LVT) might theoretically be reordered with respect to a weakly-ordered WRMSR like TSC_DEADLINE, the SDM has this to say: In x2APIC mode, the WRMSR instruction is used to write to the LVT entry. The processor ensures the ordering of this write and any subsequent WRMSR to the deadline; no fencing is required. But, that might still leave xAPIC exposed. The safest thing to do for now is to add the extra, recommended LFENCE. [ bp: Massage commit message, fix typos, drop accidentally added newline to tools/arch/x86/include/asm/barrier.h. ] Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200305174708.F77040DD@viggo.jf.intel.com
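The recommended sequence is short enough to show as a sketch. The code below is a userspace illustration, not the kernel helper. The function and struct names are hypothetical, and the `doorbell` store merely stands in for the non-serializing WRMSR, which cannot be issued from user mode. On non-x86 builds it falls back to a full compiler-provided barrier so that the example still compiles.

```c
/* MFENCE; LFENCE as called for by the SDM for weakly-ordered MSR writes. */
static inline void weak_wrmsr_fence_sketch(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("mfence; lfence" : : : "memory");
#else
	__sync_synchronize();	/* fallback so the sketch builds elsewhere */
#endif
}

/* Hypothetical mailbox: data for the receiver plus a notification flag. */
struct mailbox_sketch {
	int payload;
	volatile int doorbell;
};

static void post_message_sketch(struct mailbox_sketch *mb, int value)
{
	mb->payload = value;		/* ordinary store the receiver reads */
	weak_wrmsr_fence_sketch();	/* make it visible before notifying */
	mb->doorbell = 1;		/* stand-in for the x2APIC WRMSR */
}
```

The MFENCE makes the earlier store globally visible. The LFENCE then keeps the following instruction from starting until the MFENCE has completed, which is the step a plain MFENCE does not provide for WRMSR.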
2021-02-04Revert "x86/setup: don't remove E820_TYPE_RAM for pfn 0"Mike Rapoport
This reverts commit bde9cfa3afe4324ec251e4af80ebf9b7afaf7afe. Changing the first memory page type from E820_TYPE_RESERVED to E820_TYPE_RAM makes it a part of "System RAM" resource rather than a reserved resource and this in turn causes devmem_is_allowed() to treat it as an area that can be accessed but it is filled with zeroes instead of the actual data as previously. The change in /dev/mem output causes lilo to fail as was reported at the Slackware users forum, and probably other legacy applications will experience similar problems. Link: https://www.linuxquestions.org/questions/slackware-14/slackware-current-lilo-vesa-warnings-after-recent-updates-4175689617/#post6214439 Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-04Merge tag 'acpi-5.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Address recent regression causing battery devices to be never bound to a driver on some systems (Hans de Goede)" * tag 'acpi-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: scan: Fix battery devices sometimes never binding
2021-02-04Merge tag 'ovl-fixes-5.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: - Fix capability conversion and minor overlayfs bugs that are related to the unprivileged overlay mounts introduced in this cycle. - Fix two recent (v5.10) and one old (v4.10) bug. - Clean up security xattr copy-up (related to a SELinux regression). * tag 'ovl-fixes-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: implement volatile-specific fsync error behaviour ovl: skip getxattr of security labels ovl: fix dentry leak in ovl_get_redirect ovl: avoid deadlock on directory ioctl cap: fix conversions on getxattr ovl: perform vfs_getxattr() with mounter creds ovl: add warning on user_ns mismatch
2021-02-04KVM: x86: Set so called 'reserved CR3 bits in LM mask' at vCPU resetSean Christopherson
Set cr3_lm_rsvd_bits, which is effectively an invalid GPA mask, at vCPU reset. The reserved bits check needs to be done even if userspace never configures the guest's CPUID model. Cc: stable@vger.kernel.org Fixes: 0107973a80ad ("KVM: x86: Introduce cr3_lm_rsvd_bits in kvm_vcpu_arch") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210204000117.3303214-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
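How such a mask works can be sketched as follows. This is an illustration under assumptions, not the KVM code: the `_sketch` function names are hypothetical, and the physical address width is passed in by the caller rather than read from guest CPUID.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with bits s..e (inclusive) set, for 0 <= s <= e <= 63. */
static uint64_t rsvd_bits_sketch(unsigned int s, unsigned int e)
{
	if (e < s)
		return 0;
	return ((2ULL << (e - s)) - 1) << s;
}

/*
 * In long mode, CR3 bits at and above the guest's physical address width
 * are reserved. If the mask is never initialised it stays zero and this
 * check accepts every value, which is why it has to be set at reset.
 */
static bool cr3_is_legal_sketch(uint64_t cr3, unsigned int maxphyaddr)
{
	uint64_t cr3_lm_rsvd_bits = rsvd_bits_sketch(maxphyaddr, 63);

	return (cr3 & cr3_lm_rsvd_bits) == 0;
}
```

With a 52-bit physical address width the mask is 0xFFF0000000000000, so any CR3 value with a bit set at position 52 or above is rejected.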
2021-02-04Merge branch 'nvme-5.11' of git://git.infradead.org/nvme into block-5.11Jens Axboe
Pull NVMe fixes from Christoph. * 'nvme-5.11' of git://git.infradead.org/nvme: nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs update the email address for Keith Bush nvme-pci: ignore the subsysem NQN on Phison E16 nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
2021-02-04io_uring: don't modify identity's files uncess identity is cowedXiaoguang Wang
Abaci Robot reported the following panic: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 800000010ef3f067 P4D 800000010ef3f067 PUD 10d9df067 PMD 0 Oops: 0002 [#1] SMP PTI CPU: 0 PID: 1869 Comm: io_wqe_worker-0 Not tainted 5.11.0-rc3+ #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:put_files_struct+0x1b/0x120 Code: 24 18 c7 00 f4 ff ff ff e9 4d fd ff ff 66 90 0f 1f 44 00 00 41 57 41 56 49 89 fe 41 55 41 54 55 53 48 83 ec 08 e8 b5 6b db ff 41 ff 0e 74 13 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 9c RSP: 0000:ffffc90002147d48 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff88810d9a5300 RCX: 0000000000000000 RDX: ffff88810d87c280 RSI: ffffffff8144ba6b RDI: 0000000000000000 RBP: 0000000000000080 R08: 0000000000000001 R09: ffffffff81431500 R10: ffff8881001be000 R11: 0000000000000000 R12: ffff88810ac2f800 R13: ffff88810af38a00 R14: 0000000000000000 R15: ffff8881057130c0 FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010dbaa002 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __io_clean_op+0x10c/0x2a0 io_dismantle_req+0x3c7/0x600 __io_free_req+0x34/0x280 io_put_req+0x63/0xb0 io_worker_handle_work+0x60e/0x830 ? io_wqe_worker+0x135/0x520 io_wqe_worker+0x158/0x520 ? __kthread_parkme+0x96/0xc0 ? io_worker_handle_work+0x830/0x830 kthread+0x134/0x180 ? kthread_create_worker_on_cpu+0x90/0x90 ret_from_fork+0x1f/0x30 Modules linked in: CR2: 0000000000000000 ---[ end trace c358ca86af95b1e7 ]--- I guess the case below can trigger the above panic: two threads operate different io_uring ctxs and share the same sqthread identity; later one thread exits, and io_uring_cancel_task_requests() clears task->io_uring->identity->files to NULL in sqpoll mode, so the other ctx that uses the same identity panics.
Indeed, we don't need to clear task->io_uring->identity->files here: io_grab_identity() should handle identity->files changes well, and if task->io_uring->identity->files is not equal to current->files, io_cow_identity() should handle this change well. Cc: stable@vger.kernel.org # 5.5+ Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-04KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEsBen Gardon
There is a bug in the TDP MMU function that zaps SPTEs which could be replaced with a larger mapping: the bug prevents the function from doing anything. Fix this by correctly zapping the last level SPTEs. Cc: stable@vger.kernel.org Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU") Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-11-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-04Merge tag 'drm-misc-fixes-2021-02-02' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes * drm/bridge/lontium-lt9611uxc: EDID fixes; Don't handle hotplug events in IRQ handler * drm/ttm: Use _GFP_NOWARN for huge pages Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YBlHU4sc/5GHpXpg@linux-uq9g
2021-02-04drm/nouveau: fix dma syncing warning with debugging on.Dave Airlie
Since I wrote the patch below, if you run a debug kernel you can get a DMA debug warning like: nouveau 0000:1f:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000016e012000] [size=4096 bytes] The old nouveau code didn't consolidate the pages like the ttm code does, but dma-debug expects the sync code to give it the same base/range pairs as the allocator. Fix the nouveau sync code to consolidate pages before calling the sync code. Fixes: bd549d35b4be0 ("nouveau: use ttm populate mapping functions. (v2)") Reported-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/417588/
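Consolidation here means turning a per-page address list into base/length ranges before syncing. The sketch below shows one way to do that and is not the nouveau code: the names are hypothetical, the 4096-byte page size is an assumption, and using DMA address contiguity as the merge criterion is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096u	/* assumed page size */

/* Hypothetical callback standing in for the DMA sync call. */
typedef void (*sync_fn_sketch)(uint64_t base, size_t len, void *ctx);

/*
 * Walk the per-page address array, merge runs of contiguous pages and
 * issue one sync per run. Returns the number of runs found.
 */
static size_t sync_consolidated_sketch(const uint64_t *dma_addr,
				       size_t num_pages,
				       sync_fn_sketch sync, void *ctx)
{
	size_t i = 0, runs = 0;

	while (i < num_pages) {
		size_t j = i + 1;

		/* Extend the run while the next page directly follows. */
		while (j < num_pages &&
		       dma_addr[j] == dma_addr[j - 1] + PAGE_SIZE_SKETCH)
			j++;

		if (sync)
			sync(dma_addr[i], (j - i) * (size_t)PAGE_SIZE_SKETCH, ctx);
		runs++;
		i = j;
	}
	return runs;
}
```

If the allocator mapped three contiguous pages as one 12 KiB range, the sync side then passes the same base and length, which is what the message says dma-debug expects.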
2021-02-03Merge tag 'for-linus-5.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fixes from Richard Weinberger: - Make sure to set a default console, otherwise ttynull is selected - Revert initial ARCH_HAS_SET_MEMORY support, this needs more work - Fix a regression due to ubd refactoring - Various small fixes * tag 'for-linus-5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: time: fix initialization in time-travel mode um: fix os_idle_sleep() to not hang Revert "um: support some of ARCH_HAS_SET_MEMORY" Revert "um: allocate a guard page to helper threads" um: virtio: free vu_dev only with the contained struct device um: kmsg_dumper: always dump when not tty console um: stdio_console: Make preferred console um: return error from ioremap() um: ubd: fix command line handling of ubd
2021-02-03Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: "Fix the arm64 linear map range detection for tagged addresses and replace the bitwise operations with subtract (virt_addr_valid(), __is_lm_address(), __lm_to_phys())" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Use simpler arithmetics for the linear map macros arm64: Do not pass tagged addresses to __is_lm_address()
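The subtraction-based range check mentioned in the pull message can be illustrated as below. This is a sketch under assumptions, not the arm64 macros: the function name is hypothetical and the two constants assume a 48-bit virtual address layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed linear-map bounds for a 48-bit VA configuration. */
#define PAGE_OFFSET_SKETCH 0xffff000000000000ULL
#define PAGE_END_SKETCH    0xffff800000000000ULL

/*
 * One unsigned subtraction and one comparison cover both bounds:
 * addresses below PAGE_OFFSET wrap around to a huge value and fail the
 * test, just like addresses at or above PAGE_END.
 */
static bool is_lm_address_sketch(uint64_t addr)
{
	return (addr - PAGE_OFFSET_SKETCH) <
	       (PAGE_END_SKETCH - PAGE_OFFSET_SKETCH);
}
```

An address whose top byte carries a tag also fails this check, which matches the second patch in the pull: callers have to strip the tag before asking whether an address is in the linear map.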
2021-02-03Merge tag 'trace-v5.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Initialize tracing-graph-pause at task creation, not start of function tracing, to avoid corrupting the pause counter. - Set "pause-on-trace" for latency tracers as that option breaks their output (regression). - Fix the wrong error return for setting kretprobes on future modules (before they are loaded). - Fix re-registering the same kretprobe. - Add missing value check for added RCU variable reload. * tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracepoint: Fix race between tracing and removing tracepoint kretprobe: Avoid re-registration of the same kretprobe earlier tracing/kprobe: Fix to support kretprobe events on unloaded modules tracing: Use pause-on-trace with the latency tracers fgraph: Initialize tracing_graph_pause at task creation
2021-02-03Merge tag 'arm-soc-fixes-v5.11-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "The code fixes in this round are all for the Texas Instruments OMAP platform, addressing several regressions related to the ti-sysc interconnect changes that were merged in linux-5.11 and one recently introduced RCU usage warning. Tero Kristo updates his maintainer file entries as he is changing to a new employer. The other changes are for devicetree files across eight different platforms: TI OMAP: - multiple gpio related one-line fixes Allwinner/sunxi: - ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode - soc: sunxi: mbus: Remove DE2 display engine compatibles NXP lpc32xx: - ARM: dts: lpc32xx: Revert set default clock rate of HCLK PLL STMicroelectronics stm32 - multiple minor fixes for DHCOM/DHCOR boards NXP Layerscape: - Fix DCFG address range on LS1046A SoC Amlogic meson: - fix reboot issue on odroid C4 - revert an ethernet change that caused a regression - meson-g12: Set FL-adj property value Rockchip: - multiple minor fixes on 64-bit rockchip machines Qualcomm: - Regression fixes for Lenovo Yoga touchpad and for interconnect configuration - Boot fixes for 'LPASS' clock configuration on two machines" * tag 'arm-soc-fixes-v5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (31 commits) ARM: dts: lpc32xx: Revert set default clock rate of HCLK PLL ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode arm64: dts: ls1046a: fix dcfg address range soc: sunxi: mbus: Remove DE2 display engine compatibles arm64: dts: meson: switch TFLASH_VDD_EN pin to open drain on Odroid-C4 Revert "arm64: dts: amlogic: add missing ethernet reset ID" arm64: dts: rockchip: Disable display for NanoPi R2S ARM: dts: omap4-droid4: Fix lost keypad slide interrupts for droid4 arm64: dts: rockchip: remove interrupt-names property from rk3399 vdec node drivers: bus: simple-pm-bus: Fix compatibility with simple-bus for auxdata ARM: OMAP2+: Fix booting for am335x after moving to
simple-pm-bus ARM: OMAP2+: Fix suspcious RCU usage splats for omap_enter_idle_coupled ARM: dts: stm32: Fix GPIO hog flags on DHCOM DRC02 ARM: dts: stm32: Fix GPIO hog flags on DHCOM PicoITX ARM: dts: stm32: Fix GPIO hog names on DHCOM ARM: dts: stm32: Disable optional TSC2004 on DRC02 board ARM: dts: stm32: Disable WP on DHCOM uSD slot ARM: dts: stm32: Connect card-detect signal on DHCOM ARM: dts: stm32: Fix polarity of the DH DRC02 uSD card detect arm64: dts: qcom: sdm845: Reserve LPASS clocks in gcc ...