summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2021-02-21Merge tag 'core-mm-2021-02-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull tlb gather updates from Ingo Molnar: "Theses fix MM (soft-)dirty bit management in the procfs code & clean up the TLB gather API" * tag 'core-mm-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Use tlb_gather_mmu_fullmm() when freeing LDT page-tables tlb: arch: Remove empty __tlb_remove_tlb_entry() stubs tlb: mmu_gather: Remove start/end arguments from tlb_gather_mmu() tlb: mmu_gather: Introduce tlb_gather_mmu_fullmm() tlb: mmu_gather: Remove unused start/end arguments from tlb_finish_mmu() mm: proc: Invalidate TLB after clearing soft-dirty page state
2021-02-21Merge tag 'for-5.12/io_uring-2021-02-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring updates from Jens Axboe: "Highlights from this cycles are things like request recycling and task_work optimizations, which net us anywhere from 10-20% of speedups on workloads that mostly are inline. This work was originally done to put io_uring under memcg, which adds considerable overhead. But it's a really nice win as well. Also worth highlighting is the LOOKUP_CACHED work in the VFS, and using it in io_uring. Greatly speeds up the fast path for file opens. Summary: - Put io_uring under memcg protection. We accounted just the rings themselves under rlimit memlock before, now we account everything. - Request cache recycling, persistent across invocations (Pavel, me) - First part of a cleanup/improvement to buffer registration (Bijan) - SQPOLL fixes (Hao) - File registration NULL pointer fixup (Dan) - LOOKUP_CACHED support for io_uring - Disable /proc/thread-self/ for io_uring, like we do for /proc/self - Add Pavel to the io_uring MAINTAINERS entry - Tons of code cleanups and optimizations (Pavel) - Support for skip entries in file registration (Noah)" * tag 'for-5.12/io_uring-2021-02-17' of git://git.kernel.dk/linux-block: (103 commits) io_uring: tctx->task_lock should be IRQ safe proc: don't allow async path resolution of /proc/thread-self components io_uring: kill cached requests from exiting task closing the ring io_uring: add helper to free all request caches io_uring: allow task match to be passed to io_req_cache_free() io-wq: clear out worker ->fs and ->files io_uring: optimise io_init_req() flags setting io_uring: clean io_req_find_next() fast check io_uring: don't check PF_EXITING from syscall io_uring: don't split out consume out of SQE get io_uring: save ctx put/get for task_work submit io_uring: don't duplicate io_req_task_queue() io_uring: optimise SQPOLL mm/files grabbing io_uring: optimise out unlikely link queue io_uring: take compl state from submit state io_uring: inline io_complete_rw_common() io_uring: move res check out of io_rw_reissue() io_uring: simplify iopoll reissuing io_uring: clean up io_req_free_batch_finish() io_uring: move submit side state closer in the ring ...
2021-02-21Merge tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull core block updates from Jens Axboe: "Another nice round of removing more code than what is added, mostly due to Christoph's relentless pursuit of tech debt removal/cleanups. This pull request contains: - Two series of BFQ improvements (Paolo, Jan, Jia) - Block iov_iter improvements (Pavel) - bsg error path fix (Pan) - blk-mq scheduler improvements (Jan) - -EBUSY discard fix (Jan) - bvec allocation improvements (Ming, Christoph) - bio allocation and init improvements (Christoph) - Store bdev pointer in bio instead of gendisk + partno (Christoph) - Block trace point cleanups (Christoph) - hard read-only vs read-only split (Christoph) - Block based swap cleanups (Christoph) - Zoned write granularity support (Damien) - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)" * tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits) mm: simplify swapdev_block sd_zbc: clear zone resources for non-zoned case block: introduce blk_queue_clear_zone_settings() zonefs: use zone write granularity as block size block: introduce zone_write_granularity limit block: use blk_queue_set_zoned in add_partition() nullb: use blk_queue_set_zoned() to setup zoned devices nvme: cleanup zone information initialization block: document zone_append_max_bytes attribute block: use bi_max_vecs to find the bvec pool md/raid10: remove dead code in reshape_request block: mark the bio as cloned in bio_iov_bvec_set block: set BIO_NO_PAGE_REF in bio_iov_bvec_set block: remove a layer of indentation in bio_iov_iter_get_pages block: turn the nr_iovecs argument to bio_alloc* into an unsigned short block: remove the 1 and 4 vec bvec_slabs entries block: streamline bvec_alloc block: factor out a bvec_alloc_gfp helper block: move struct biovec_slab to bio.c block: reuse BIO_INLINE_VECS for integrity bvecs ...
2021-02-21Merge tag 'oprofile-removal-5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux Pull oprofile and dcookies removal from Viresh Kumar: "Remove oprofile and dcookies support The 'oprofile' user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. The dcookies stuff is only used by the oprofile code. Now that oprofile's support is getting removed from the kernel, there is no need for dcookies as well. Remove kernel's old oprofile and dcookies support" * tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux: fs: Remove dcookies support drivers: Remove CONFIG_OPROFILE support arch: xtensa: Remove CONFIG_OPROFILE support arch: x86: Remove CONFIG_OPROFILE support arch: sparc: Remove CONFIG_OPROFILE support arch: sh: Remove CONFIG_OPROFILE support arch: s390: Remove CONFIG_OPROFILE support arch: powerpc: Remove oprofile arch: powerpc: Stop building and using oprofile arch: parisc: Remove CONFIG_OPROFILE support arch: mips: Remove CONFIG_OPROFILE support arch: microblaze: Remove CONFIG_OPROFILE support arch: ia64: Remove rest of perfmon support arch: ia64: Remove CONFIG_OPROFILE support arch: hexagon: Don't select HAVE_OPROFILE arch: arc: Remove CONFIG_OPROFILE support arch: arm: Remove CONFIG_OPROFILE support arch: alpha: Remove CONFIG_OPROFILE support
2021-02-21Merge tag 'xfs-5.12-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Darrick Wong: "There's a lot going on this time, which seems about right for this drama-filled year. Community developers added some code to speed up freezing when read-only workloads are still running, refactored the logging code, added checks to prevent file extent counter overflow, reduced iolock cycling to speed up fsync and gc scans, and started the slow march towards supporting filesystem shrinking. There's a huge refactoring of the internal speculative preallocation garbage collection code which fixes a bunch of bugs, makes the gc scheduling per-AG and hence multithreaded, and standardizes the retry logic when we try to reserve space or quota, can't, and want to trigger a gc scan. We also enable multithreaded quotacheck to reduce mount times further. This is also preparation for background file gc, which may or may not land for 5.13. We also fixed some deadlocks in the rename code, fixed a quota accounting leak when FSSETXATTR fails, restored the behavior that write faults to an mmap'd region actually cause a SIGBUS, fixed a bug where sgid directory inheritance wasn't quite working properly, and fixed a bug where symlinks weren't working properly in ecryptfs. We also now advertise the inode btree counters feature that was introduced two cycles ago. Summary: - Fix an ABBA deadlock when renaming files on overlayfs. - Make sure that we can't overflow the inode extent counters when adding to or removing extents from a file. - Make directory sgid inheritance work the same way as all the other filesystems. - Don't drain the buffer cache on freeze and ro remount, which should reduce the amount of time if read-only workloads are continuing during the freeze. - Fix a bug where symlink size isn't reported to the vfs in ecryptfs. - Disentangle log cleaning from log covering. This refactoring sets us up for future changes to the log, though for now it simply means that we can use covering for freezes, and cleaning becomes something we only do at unmount. - Speed up file fsyncs by reducing iolock cycling. - Fix delalloc blocks leaking when changing the project id fails because of input validation errors in FSSETXATTR. - Fix oversized quota reservation when converting unwritten extents during a DAX write. - Create a transaction allocation helper function to standardize the idiom of allocating a transaction, reserving blocks, locking inodes, and reserving quota. Replace all the open-coded logic for file creation, file ownership changes, and file modifications to use them. - Actually shut down the fs if the incore quota reservations get corrupted. - Fix background block garbage collection scans to not block and to actually clean out CoW staging extents properly. - Run block gc scans when we run low on project quota. - Use the standardized transaction allocation helpers to make it so that ENOSPC and EDQUOT errors during reservation will back out, invoke the block gc scanner, and try again. This is preparation for introducing background inode garbage collection in the next cycle. - Combine speculative post-EOF block garbage collection with speculative copy on write block garbage collection. - Enable multithreaded quotacheck. - Allow sysadmins to tweak the CPU affinities and maximum concurrency levels of quotacheck and background blockgc worker pools. - Expose the inode btree counter feature in the fs geometry ioctl. - Cleanups of the growfs code in preparation for starting work on filesystem shrinking. - Fix all the bloody gcc warnings that the maintainer knows about. :P - Fix a RST syntax error. - Don't trigger bmbt corruption assertions after the fs shuts down. - Restore behavior of forcing SIGBUS on a shut down filesystem when someone triggers a mmap write fault (or really, any buffered write)" * tag 'xfs-5.12-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (85 commits) xfs: consider shutdown in bmapbt cursor delete assert xfs: fix boolreturn.cocci warnings xfs: restore shutdown check in mapped write fault path xfs: fix rst syntax error in admin guide xfs: fix incorrect root dquot corruption error when switching group/project quota types xfs: get rid of xfs_growfs_{data,log}_t xfs: rename `new' to `delta' in xfs_growfs_data_private() libxfs: expose inobtcount in xfs geometry xfs: don't bounce the iolock between free_{eof,cow}blocks xfs: expose the blockgc workqueue knobs publicly xfs: parallelize block preallocation garbage collection xfs: rename block gc start and stop functions xfs: only walk the incore inode tree once per blockgc scan xfs: consolidate the eofblocks and cowblocks workers xfs: consolidate incore inode radix tree posteof/cowblocks tags xfs: remove trivial eof/cowblocks functions xfs: hide xfs_icache_free_cowblocks xfs: hide xfs_icache_free_eofblocks xfs: relocate the eofb/cowb workqueue functions xfs: set WQ_SYSFS on all workqueues in debug mode ...
2021-02-21Merge tag 'iomap-5.12-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull iomap updates from Darrick Wong: "The big change in this cycle is some new code to make it possible for XFS to try unaligned directio overwrites without taking locks. If the block is fully written and within EOF (i.e. doesn't require any further fs intervention) then we can let the unlocked write proceed. If not, we fall back to synchronizing direct writes. Summary: - Adjust the final parameter of iomap_dio_rw. - Add a new flag to request that iomap directio writes return EAGAIN if the write is not a pure overwrite within EOF; this will be used to reduce lock contention with unaligned direct writes on XFS. - Amend XFS' directio code to eliminate exclusive locking for unaligned direct writes if the circumstances permit" * tag 'iomap-5.12-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: reduce exclusive locking on unaligned dio xfs: split the unaligned DIO write code out xfs: improve the reflink_bounce_dio_write tracepoint xfs: simplify the read/write tracepoints xfs: remove the buffered I/O fallback assert xfs: cleanup the read/write helper naming xfs: make xfs_file_aio_write_checks IOCB_NOWAIT-aware xfs: factor out a xfs_ilock_iocb helper iomap: add a IOMAP_DIO_OVERWRITE_ONLY flag iomap: pass a flags argument to iomap_dio_rw iomap: rename the flags variable in __iomap_dio_rw
2021-02-21Merge tag 'pstore-v5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore fix from Kees Cook: "Fix a CONFIG typo (Jiri Bohac)" * tag 'pstore-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore: Fix typo in compression option name
2021-02-21Merge tag 'fsverity-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fsverity updates from Eric Biggers: "Add an ioctl which allows reading fs-verity metadata from a file. This is useful when a file with fs-verity enabled needs to be served somewhere, and the other end wants to do its own fs-verity compatible verification of the file. See the commit messages for details. This new ioctl has been tested using new xfstests I've written for it" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fs-verity: support reading signature with ioctl fs-verity: support reading descriptor with ioctl fs-verity: support reading Merkle tree with ioctl fs-verity: add FS_IOC_READ_VERITY_METADATA ioctl fs-verity: don't pass whole descriptor to fsverity_verify_signature() fs-verity: factor out fsverity_get_descriptor()
2021-02-21Merge tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds
Pull nfsd updates from Chuck Lever: - Update NFSv2 and NFSv3 XDR decoding functions - Further improve support for re-exporting NFS mounts - Convert NFSD stats to per-CPU counters - Add batch Receive posting to the server's RPC/RDMA transport * tag 'nfsd-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (65 commits) nfsd: skip some unnecessary stats in the v4 case nfs: use change attribute for NFS re-exports NFSv4_2: SSC helper should use its own config. nfsd: cstate->session->se_client -> cstate->clp nfsd: simplify nfsd4_check_open_reclaim nfsd: remove unused set_client argument nfsd: find_cpntf_state cleanup nfsd: refactor set_client nfsd: rename lookup_clientid->set_client nfsd: simplify nfsd_renew nfsd: simplify process_lock nfsd4: simplify process_lookup1 SUNRPC: Correct a comment svcrdma: DMA-sync the receive buffer in svc_rdma_recvfrom() svcrdma: Reduce Receive doorbell rate svcrdma: Deprecate stat variables that are no longer used svcrdma: Restore read and write stats svcrdma: Convert rdma_stat_sq_starve to a per-CPU counter svcrdma: Convert rdma_stat_recv to a per-CPU counter svcrdma: Refactor svc_rdma_init() and svc_rdma_clean_up() ...
2021-02-21Merge tag 'erofs-for-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "This contains a somewhat important but rarely reproduced fix reported month ago for platforms which have weak memory model (e.g. arm64). The root cause is that test_bit/set_bit atomic operations are actually implemented in relaxed forms, and uninitialized fields governed by an atomic bit could be observed in advance due to memory reordering thus memory barrier pairs should be used. There is also a trivial fix of crafted blkszbits generated by syzkaller. Summary: - fix shift-out-of-bounds of crafted blkszbits generated by syzkaller - ensure initialized fields can only be observed after bit is set" * tag 'erofs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: initialized fields can only be observed after bit is set erofs: fix shift-out-of-bounds of blkszbits
2021-02-21Merge tag 'f2fs-for-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "We've added two major features: 1) compression level and 2) checkpoint_merge, in this round. Compression level expands 'compress_algorithm' mount option to accept parameter as format of <algorithm>:<level>, by this way, it gives a way to allow user to do more specified config on lz4 and zstd compression level, then f2fs compression can provide higher compress ratio. checkpoint_merge creates a kernel daemon and makes it to merge concurrent checkpoint requests as much as possible to eliminate redundant checkpoint issues. Plus, we can eliminate the sluggish issue caused by slow checkpoint operation when the checkpoint is done in a process context in a cgroup having low i/o budget and cpu shares. Enhancements: - add compress level for lz4 and zstd in mount option - checkpoint_merge mount option - deprecate f2fs_trace_io Bug fixes: - flush data when enabling checkpoint back - handle corner cases of mount options - missing ACL update and lock for I_LINKABLE flag - attach FIEMAP_EXTENT_MERGED in f2fs_fiemap - fix potential deadlock in compression flow - fix wrong submit_io condition As usual, we've cleaned up many code flows and fixed minor bugs" * tag 'f2fs-for-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits) Documentation: f2fs: fix typo s/automaic/automatic f2fs: give a warning only for readonly partition f2fs: don't grab superblock freeze for flush/ckpt thread f2fs: add ckpt_thread_ioprio sysfs node f2fs: introduce checkpoint_merge mount option f2fs: relocate inline conversion from mmap() to mkwrite() f2fs: fix a wrong condition in __submit_bio f2fs: remove unnecessary initialization in xattr.c f2fs: fix to avoid inconsistent quota data f2fs: flush data when enabling checkpoint back f2fs: deprecate f2fs_trace_io f2fs: Remove readahead collision detection f2fs: remove unused stat_{inc, dec}_atomic_write f2fs: introduce sb_status sysfs node f2fs: fix to use per-inode maxbytes f2fs: compress: fix potential deadlock libfs: unexport generic_ci_d_compare() and generic_ci_d_hash() f2fs: fix to set/clear I_LINKABLE under i_lock f2fs: fix null page reference in redirty_blocks f2fs: clean up post-read processing ...
2021-02-21Merge tag 'for-5.12-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "This brings updates of space handling, performance improvements or bug fixes. The subpage block size and zoned mode features have reached state where they're usable but with limitations. Performance or related: - do not block on deleted block group mutex in the cleaner, avoids some long stalls - improved flushing: make it work better with ticket space reservations and avoid excessive transaction commits in some scenarios, slightly improves throughput for random write load - preemptive background flushing: separate the logic from ticket reservations, improve the accounting and decisions when to flush in low space conditions - less lock contention related to running delayed refs, let just one thread do the flushing when there are many inside transaction commit - dbench workload improvements: avoid unnecessary work when logging inodes, fewer fallbacks to transaction commit and thus less waiting for it (+7% throughput, -20% latency) Core: - subpage block size - currently read-only support - refactor and generalize code where sectorsize is assumed to be page size, add the subpage handling everywhere - the read-write support is on the way, page sizes are still limited to 4K or 64K - zoned mode, first working version but with limitations - SMR/ZBC/ZNS friendly allocation mode, utilizing the "no fixed location for structures" and chunked allocation - superblock as the only fixed data structure needs special handling, uses 2 consecutive zones as a ring buffer - tree-log support with a dedicated block group to avoid unordered writes - emulated zones on non-zoned devices - not yet working - all non-single block group profiles, requires more zone write pointer synchronization between the multiple block groups - fitrim due to dependency on space cache, can be implemented Fixes: - ref-verify: proper tree owner and node level tracking - fix pinned byte accounting, causing some early ENOSPC now more likely due to other changes in delayed refs Other: - error handling fixes and improvements - more error injection points - more function documentation - more and updated tracepoints - subset of W=1 checked by default - update comments to allow more automatic kdoc parameter checks" * tag 'for-5.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (144 commits) btrfs: zoned: enable to mount ZONED incompat flag btrfs: zoned: deal with holes writing out tree-log pages btrfs: zoned: reorder log node allocation on zoned filesystem btrfs: zoned: serialize log transaction on zoned filesystems btrfs: zoned: extend zoned allocator to use dedicated tree-log block group btrfs: split alloc_log_tree() btrfs: zoned: relocate block group to repair IO failure in zoned filesystems btrfs: zoned: enable relocation on a zoned filesystem btrfs: zoned: support dev-replace in zoned filesystems btrfs: zoned: implement copying for zoned device-replace btrfs: zoned: implement cloning for zoned device-replace btrfs: zoned: mark block groups to copy for device-replace btrfs: zoned: do not use async metadata checksum on zoned filesystems btrfs: zoned: wait for existing extents before truncating btrfs: zoned: serialize metadata IO btrfs: zoned: introduce dedicated data write path for zoned filesystems btrfs: zoned: enable zone append writing for direct IO btrfs: zoned: use ZONE_APPEND write for zoned mode btrfs: save irq flags when looking up an ordered extent btrfs: zoned: cache if block group is on a sequential zone ...
2021-02-21Merge tag 'affs-for-5.12-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull AFFS fix from David Sterba: "One minor fix for error handling in rename exchange" * tag 'affs-for-5.12-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: fs/affs: release old buffer head on error path
2021-02-21Merge tag 'jfs-5.12' of git://github.com/kleikamp/linux-shaggyLinus Torvalds
Pull jfs updates from David Kleikamp: "A few jfs fixes" * tag 'jfs-5.12' of git://github.com/kleikamp/linux-shaggy: fs/jfs: fix potential integer overflow on shift of a int jfs: turn diLog(), dataLog() and txLog() into void functions JFS: more checks for invalid superblock
2021-02-21Merge tag 'locks-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull fcntl fix from Jeff Layton. * tag 'locks-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fcntl: make F_GETOWN(EX) return 0 on dead owner task
2021-02-21Merge branch 'work.namei' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull namei updates from Al Viro: "Most of that pile is LOOKUP_CACHED series; the rest is a couple of misc cleanups in the general area... There's a minor bisect hazard in the end of series, and normally I would've just folded the fix into the previous commit, but this branch is shared with Jens' tree, with stuff on top of it in there, so that would've required rebases outside of vfs.git" * 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy* fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED fs: add support for LOOKUP_CACHED saner calling conventions for unlazy_child() fs: make unlazy_walk() error handling consistent fs/namei.c: Remove unlikely of status being -ECHILD in lookup_fast() do_tmpfile(): don't mess with finish_open()
2021-02-21Merge branch 'work.elf-compat' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull ELF compat updates from Al Viro: "Sanitizing ELF compat support, especially for triarch architectures: - X32 handling cleaned up - MIPS64 uses compat_binfmt_elf.c both for O32 and N32 now - Kconfig side of things regularized Eventually I hope to have compat_binfmt_elf.c killed, with both native and compat built from fs/binfmt_elf.c, with -DELF_BITS={64,32} passed by kbuild, but that's a separate story - not included here" * 'work.elf-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: get rid of COMPAT_ELF_EXEC_PAGESIZE compat_binfmt_elf: don't bother with undef of ELF_ARCH Kconfig: regularize selection of CONFIG_BINFMT_ELF mips compat: switch to compat_binfmt_elf.c mips: don't bother with ELF_CORE_EFLAGS mips compat: don't bother with ELF_ET_DYN_BASE mips: KVM_GUEST makes no sense for 64bit builds... mips: kill unused definitions in binfmt_elf[on]32.c mips binfmt_elf*32.c: use elfcore-compat.h x32: make X32, !IA32_EMULATION setups able to execute x32 binaries [amd64] clean PRSTATUS_SIZE/SET_PR_FPVALID up properly elf_prstatus: collect the common part (everything before pr_reg) into a struct binfmt_elf: partially sanitize PRSTATUS_SIZE and SET_PR_FPVALID
2021-02-21Merge branch 'work.sendfile' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull sendfile updates from Al Viro: "Make sendfile() to pipe destination do the right thing, should make 'fs/pipe: allow sendfile() to pipe again' redundant" * 'work.sendfile' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: teach sendfile(2) to handle send-to-pipe directly take the guts of file-to-pipe splice into a helper function do_splice_to(): move the logics for limiting the read length in
2021-02-20Merge tag 'arm-platform-removal-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC platform removals from Arnd Bergmann: "There are a lot of platforms that have not seen any interesting code changes in the past five years or more. I made a list and asked around which ones are no longer in use, and received confirmation about six ARM platforms and the TI C6x architecture that have all reached the end of their life upstream, with no known users remaining: - efm32 - added in 2011, first Cortex-M, no notable changes after 2013 - picoxcell - added in 2011, abandoned after 2012 acquisition - prima2 - added in 20111, no notable changes since 2015 - tango - added in 2015, sporadic changes until 2017, but abandoned - u300 - added in 2009, no notable changes since 2013 - zx - added in 2015 for both 32, 2017 for 64 bit, no notable changes - arch/c6x - added in 2011, but work stalled soon after that A number of other platforms on the original list turned out to still have users. In some cases there are out-of-tree patches and users that plan to contribute them in the future, in other cases the code is complete and works reliably" Link: https://lore.kernel.org/lkml/CAK8P3a2DZ8xQp7R=H=wewHnT2=a_=M53QsZOueMVEf7tOZLKNg@mail.gmail.com/ * tag 'arm-platform-removal-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: remove u300 platform ARM: remove tango platform ARM: remove zte zx platform ARM: remove sirf prima2/atlas platforms c6x: remove architecture MAINTAINERS: Remove deleted platform efm32 ARM: drop efm32 platform ARM: Remove PicoXcell platform support ARM: dts: Remove PicoXcell platforms
2021-02-20fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy*Al Viro
After switching to non-RCU mode, we want nd->depth to match the number of entries in nd->stack[] that need eventual path_put(). legitimize_links() takes care of that on failures; unfortunately, failure exits added for LOOKUP_CACHED do not. We could add the logics for that into those failure exits, both in try_to_unlazy() and in try_to_unlazy_next(), but since both checks are immediately followed by legitimize_links() and there's no calls of legitimize_links() other than those two... It's easier to move the check (and required handling of nd->depth on failure) into legitimize_links() itself. [caught by Jens: ... and since we are zeroing ->depth here, we need to do drop_links() first] Fixes: 6c6ec2b0a3e0 "fs: add support for LOOKUP_CACHED" Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-02-18pstore: Fix typo in compression option nameJiri Bohac
Both pstore_compress() and decompress_record() use a mistyped config option name ("PSTORE_COMPRESSION" instead of "PSTORE_COMPRESS"). As a result compression and decompression of pstore records was always disabled. Use the correct config option name. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Fixes: fd49e03280e5 ("pstore: Fix linking when crypto API disabled") Acked-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210218111547.johvp5klpv3xrpnn@dwarf.suse.cz
2021-02-16io_uring: tctx->task_lock should be IRQ safeJens Axboe
We add task_work from any context, hence we need to ensure that we can tolerate it being from IRQ context as well. Fixes: 7cbf1722d5fc ("io_uring: provide FIFO ordering for task_work") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-15proc: don't allow async path resolution of /proc/thread-self componentsJens Axboe
If this is attempted by an io-wq kthread, then return -EOPNOTSUPP as we don't currently support that. Once we can get task_pid_ptr() doing the right thing, then this can go away again. Use PF_IO_WORKER for this to speciically target the io_uring workers. Modify the /proc/self/ check to use PF_IO_WORKER as well. Cc: stable@vger.kernel.org Fixes: 8d4c3e76e3be ("proc: don't allow async path resolution of /proc/self components") Reported-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-13Merge tag 'for-5.11-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A regression fix caused by a refactoring in 5.11. A corrupted superblock wouldn't be detected by checksum verification due to wrongly placed initialization of the checksum length, thus making memcmp always work" * tag 'for-5.11-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: initialize fs_info::csum_size earlier in open_ctree
2021-02-13io_uring: kill cached requests from exiting task closing the ringJens Axboe
Be nice and prune these upfront, in case the ring is being shared and one of the tasks is going away. This is a bit more important now that we account the allocations. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-13io_uring: add helper to free all request cachesJens Axboe
We have three different ones, put it in a helper for easy calling. This is in preparation for doing it outside of ring freeing as well. With that in mind, also ensure that we do the proper locking for safe calling from a context where the ring it still live. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-13io_uring: allow task match to be passed to io_req_cache_free()Jens Axboe
No changes in this patch, just allows a caller to pass in a targeted task that we must match for freeing requests in the cache. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12Merge tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernelLinus Torvalds
Pull cifs fixes from Steve French: "Four small smb3 fixes to the new mount API (including a particularly important one for DFS links). These were found in testing this week of additional DFS scenarios, and a user testing of an apache container problem" * tag '5.11-rc7-smb3-github' of git://github.com/smfrench/smb3-kernel: cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath. cifs: In the new mount api we get the full devname as source= cifs: do not disable noperm if multiuser mount option is not provided cifs: fix dfs-links
2021-02-12f2fs: give a warning only for readonly partitionJaegeuk Kim
Let's allow mounting readonly partition. We're able to recovery later once we have it as read-write back. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-02-12io-wq: clear out worker ->fs and ->filesJens Axboe
By default, kernel threads have init_fs and init_files assigned. In the past, this has triggered security problems, as commands that don't ask for (and hence don't get assigned) fs/files from the originating task can then attempt path resolution etc with access to parts of the system they should not be able to. Rather than add checks in the fs code for misuse, just set these to NULL. If we do attempt to use them, then the resulting code will oops rather than provide access to something that it should not permit. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12Merge tag 'io_uring-5.11-2021-02-12' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fix from Jens Axboe: "Revert of a patch from this release that caused a regression" * tag 'io_uring-5.11-2021-02-12' of git://git.kernel.dk/linux-block: Revert "io_uring: don't take fs for recvmsg/sendmsg"
2021-02-12io_uring: optimise io_init_req() flags settingPavel Begunkov
Invalid req->flags are tolerated by free/put well, avoid this dancing needlessly presetting it to zero, and then not even resetting but modifying it, i.e. "|=". Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: clean io_req_find_next() fast checkPavel Begunkov
Indirectly io_req_find_next() is called for every request, optimise the check by testing flags as it was long before -- __io_req_find_next() tolerates false-positives well (i.e. link==NULL), and those should be really rare. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: don't check PF_EXITING from syscallPavel Begunkov
io_sq_thread_acquire_mm_files() can find a PF_EXITING task only when it's called from task_work context. Don't check it in all other cases, that are when we're in io_uring_enter(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12btrfs: initialize fs_info::csum_size earlier in open_ctreeSu Yue
User reported that btrfs-progs misc-tests/028-superblock-recover fails: [TEST/misc] 028-superblock-recover unexpected success: mounted fs with corrupted superblock test failed for case 028-superblock-recover The test case expects that a broken image with bad superblock will be rejected to be mounted. However, the test image just passed csum check of superblock and was successfully mounted. Commit 55fc29bed8dd ("btrfs: use cached value of fs_info::csum_size everywhere") replaces all calls to btrfs_super_csum_size by fs_info::csum_size. The calls include the place where fs_info->csum_size is not initialized. So btrfs_check_super_csum() passes because memcmp() with len 0 always returns 0. Fix it by caching csum size in btrfs_fs_info::csum_size once we know the csum type in superblock is valid in open_ctree(). Link: https://github.com/kdave/btrfs-progs/issues/250 Fixes: 55fc29bed8dd ("btrfs: use cached value of fs_info::csum_size everywhere") Signed-off-by: Su Yue <l@damenly.su> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2021-02-12io_uring: don't split out consume out of SQE getPavel Begunkov
Remove io_consume_sqe() and inline it back into io_get_sqe(). It requires req dealloc on error, but in exchange we get cleaner io_submit_sqes() and better locality for cached_sq_head. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: save ctx put/get for task_work submitPavel Begunkov
Do a little trick in io_ring_ctx_free() briefly taking uring_lock, that will wait for everyone currently holding it, so we can skip pinning ctx with ctx->refs for __io_req_task_submit(), which is executed and loses its refs/reqs while holding the lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: don't duplicate io_req_task_queue()Pavel Begunkov
Don't hand code io_req_task_queue() inside of io_async_buf_func(), just call it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: optimise SQPOLL mm/files grabbingPavel Begunkov
There are two reasons for this. First is to optimise io_sq_thread_acquire_mm_files() for non-SQPOLL case, which currently do too many checks and function calls in the hot path, e.g. in io_init_req(). The second is to not grab mm/files when there are not needed. As __io_queue_sqe() issues only one request now, we can reuse io_sq_thread_acquire_mm_files() instead of unconditional acquire mm/files. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: optimise out unlikely link queuePavel Begunkov
__io_queue_sqe() tries to issue as much requests of a link as it can, and uses io_put_req_find_next() to extract a next one, targeting inline completed requests. As now __io_queue_sqe() is always used together with struct io_comp_state, it leaves next propagation only a small window and only for async reqs, that doesn't justify its existence. Remove it, make __io_queue_sqe() to issue only a head request. It simplifies the code and will allow other optimisations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-12io_uring: take compl state from submit statePavel Begunkov
Completion and submission states are now coupled together, it's weird to get one from argument and another from ctx, do it consistently for io_req_free_batch(). It's also faster as we already have @state cached in registers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11io_uring: inline io_complete_rw_common()Pavel Begunkov
__io_complete_rw() casts request to kiocb for it to be immediately container_of()'ed by io_complete_rw_common(). And the last function's name doesn't do a great job of illuminating its purposes, so just inline it in its only user. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11io_uring: move res check out of io_rw_reissue()Pavel Begunkov
We pass return code into io_rw_reissue() only to be able to check if it's -EAGAIN. That's not the cleanest approach and may prevent inlining of the non-EAGAIN fast path, so do it at call sites. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11io_uring: simplify iopoll reissuingPavel Begunkov
Don't stash -EAGAIN'ed iopoll requests into a list to reissue it later, do it eagerly. It removes overhead on keeping and checking that list, and allows in case of failure for these requests to be completed through normal iopoll completion path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11io_uring: clean up io_req_free_batch_finish()Pavel Begunkov
io_req_free_batch_finish() is final and does not permit struct req_batch to be reused without re-init. To be more consistent don't clear ->task there. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11io_uring: move submit side state closer in the ringJens Axboe
We recently added the submit side req cache, but it was placed at the end of the struct. Move it near the other submission state for better memory placement, and reshuffle a few other members at the same time. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-11fs/jfs: fix potential integer overflow on shift of a intColin Ian King
The left shift of int 32 bit integer constant 1 is evaluated using 32 bit arithmetic and then assigned to a signed 64 bit integer. In the case where l2nb is 32 or more this can lead to an overflow. Avoid this by shifting the value 1LL instead. Addresses-Coverity: ("Uninitentional integer overflow") Fixes: b40c2e665cd5 ("fs/jfs: TRIM support for JFS Filesystem") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2021-02-11cifs: Set CIFS_MOUNT_USE_PREFIX_PATH flag on setting cifs_sb->prepath.Shyam Prasad N
While debugging another issue today, Steve and I noticed that if a subdir for a file share is already mounted on the client, any new mount of any other subdir (or the file share root) of the same share results in sharing the cifs superblock, which e.g. can result in incorrect device name. While setting prefix path for the root of a cifs_sb, CIFS_MOUNT_USE_PREFIX_PATH flag should also be set. Without it, prepath is not even considered in some places, and output of "mount" and various /proc/<>/*mount* related options can be missing part of the device name. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-11cifs: In the new mount api we get the full devname as source=Ronnie Sahlberg
so we no longer need to handle or parse the UNC= and prefixpath= options that mount.cifs are generating. This also fixes a bug in the mount command option where the devname would be truncated into just //server/share because we were looking at the truncated UNC value and not the full path. I.e. in the mount command output the devive //server/share/path would show up as just //server/share Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-11xfs: consider shutdown in bmapbt cursor delete assertBrian Foster
The assert in xfs_btree_del_cursor() checks that the bmapbt block allocation field has been handled correctly before the cursor is freed. This field is used for accurate calculation of indirect block reservation requirements (for delayed allocations), for example. generic/019 reproduces a scenario where this assert fails because the filesystem has shutdown while in the middle of a bmbt record insertion. This occurs after a bmbt block has been allocated via the cursor but before the higher level bmap function (i.e. xfs_bmap_add_extent_hole_real()) completes and resets the field. Update the assert to accommodate the transient state if the filesystem has shutdown. While here, clean up the indentation and comments in the function. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>