summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2020-06-02mm: remove the prot argument from vm_map_ramChristoph Hellwig
This is always PAGE_KERNEL - for long term mappings with other properties vmap should be used. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Gao Xiang <xiang@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Kelley <mikelley@microsoft.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Wei Liu <wei.liu@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-19-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02/proc/PID/smaps: Add PMD migration entry parsingHuang Ying
Now, when reading /proc/PID/smaps, the PMD migration entry in page table is simply ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing is added. To test the patch, we run pmbench to eat 400 MB memory in background, then run /usr/bin/migratepages and `cat /proc/PID/smaps` every second. The issue as follows can be reproduced within 60 seconds. Before the patch, for the fully populated 400 MB anonymous VMA, some THP pages under migration may be lost as below. 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0 Size: 409600 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Rss: 407552 kB Pss: 407552 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 407552 kB Referenced: 301056 kB Anonymous: 407552 kB LazyFree: 0 kB AnonHugePages: 405504 kB ShmemPmdMapped: 0 kB FilePmdMapped: 0 kB Shared_Hugetlb: 0 kB Private_Hugetlb: 0 kB Swap: 0 kB SwapPss: 0 kB Locked: 0 kB THPeligible: 1 VmFlags: rd wr mr mw me ac After the patch, it will be always, 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0 Size: 409600 kB KernelPageSize: 4 kB MMUPageSize: 4 kB Rss: 409600 kB Pss: 409600 kB Shared_Clean: 0 kB Shared_Dirty: 0 kB Private_Clean: 0 kB Private_Dirty: 409600 kB Referenced: 294912 kB Anonymous: 409600 kB LazyFree: 0 kB AnonHugePages: 407552 kB ShmemPmdMapped: 0 kB FilePmdMapped: 0 kB Shared_Hugetlb: 0 kB Private_Hugetlb: 0 kB Swap: 0 kB SwapPss: 0 kB Locked: 0 kB THPeligible: 1 VmFlags: rd wr mr mw me ac Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Link: http://lkml.kernel.org/r/20200403123059.1846960-1-ying.huang@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK insteadNeilBrown
After an NFS page has been written it is considered "unstable" until a COMMIT request succeeds. If the COMMIT fails, the page will be re-written. These "unstable" pages are currently accounted as "reclaimable", either in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a 'reclaimable' count. This might have made sense when sending the COMMIT required a separate action by the VFS/MM (e.g. releasepage() used to send a COMMIT). However now that all writes generated by ->writepages() will automatically be followed by a COMMIT (since commit 919e3bd9a875 ("NFS: Ensure we commit after writeback is complete")) it makes more sense to treat them as writeback pages. So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in NR_WRITEBACK and WB_WRITEBACK. A particular effect of this change is that when wb_check_background_flush() calls wb_over_bg_threshold(), the latter will report 'true' a lot less often as the 'unstable' pages are no longer considered 'dirty' (as there is nothing that writeback can do about them anyway). Currently wb_check_background_flush() will trigger writeback to NFS even when there are relatively few dirty pages (if there are lots of unstable pages), this can result in small writes going to the server (10s of Kilobytes rather than a Megabyte) which hurts throughput. With this patch, there are fewer writes which are each larger on average. Where the NR_UNSTABLE_NFS count was included in statistics virtual-files, the entry is retained, but the value is hard-coded as zero. static trace points and warning printks which mentioned this counter no longer report it. [akpm@linux-foundation.org: re-layout comment] [akpm@linux-foundation.org: fix printk warning] Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com> Acked-by: Michal Hocko <mhocko@suse.com> [mm] Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Link: http://lkml.kernel.org/r/87d06j7gqa.fsf@notabene.neil.brown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLENeilBrown
PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a daemon needs to write to one bdi (the final bdi) in order to free up writes queued to another bdi (the client bdi). The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty pages, so that it can still dirty pages after other processses have been throttled. The purpose of this is to avoid deadlock that happen when the PF_LESS_THROTTLE process must write for any dirty pages to be freed, but it is being thottled and cannot write. This approach was designed when all threads were blocked equally, independently on which device they were writing to, or how fast it was. Since that time the writeback algorithm has changed substantially with different threads getting different allowances based on non-trivial heuristics. This means the simple "add 25%" heuristic is no longer reliable. The important issue is not that the daemon needs a *larger* dirty page allowance, but that it needs a *private* dirty page allowance, so that dirty pages for the "client" bdi that it is helping to clear (the bdi for an NFS filesystem or loop block device etc) do not affect the throttling of the daemon writing to the "final" bdi. This patch changes the heuristic so that the task is not throttled when the bdi it is writing to has a dirty page count below below (or equal to) the free-run threshold for that bdi. This ensures it will always be able to have some pages in flight, and so will not deadlock. In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might still be throttled by global threshold, but that is acceptable as it is only the deadlock state that is interesting for this flag. This approach of "only throttle when target bdi is busy" is consistent with the other use of PF_LESS_THROTTLE in current_may_throttle(), were it causes attention to be focussed only on the target bdi. So this patch - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE, - removes the 25% bonus that that flag gives, and - If PF_LOCAL_THROTTLE is set, don't delay at all unless the global and the local free-run thresholds are exceeded. Note that previously realtime threads were treated the same as PF_LESS_THROTTLE threads. This patch does *not* change the behvaiour for real-time threads, so it is now different from the behaviour of nfsd and loop tasks. I don't know what is wanted for realtime. [akpm@linux-foundation.org: coding style fixes] Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Chuck Lever <chuck.lever@oracle.com> [nfsd] Cc: Christoph Hellwig <hch@lst.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Link: http://lkml.kernel.org/r/87ftbf7gs3.fsf@notabene.neil.brown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02orangefs: use attach/detach_page_privateGuoqing Jiang
Since the new pair function is introduced, we can call them to clean the code in orangefs. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Mike Marshall <hubcap@omnibond.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Martin Brandenburg <martin@omnibond.com> Link: http://lkml.kernel.org/r/20200517214718.468-9-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02ntfs: replace attach_page_buffers with attach_page_privateGuoqing Jiang
Call the new function since attach_page_buffers will be removed. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Altaparmakov <anton@tuxera.com> Link: http://lkml.kernel.org/r/20200517214718.468-8-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02iomap: use attach/detach_page_privateGuoqing Jiang
Since the new pair function is introduced, we can call them to clean the code in iomap. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Link: http://lkml.kernel.org/r/20200517214718.468-7-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02f2fs: use attach/detach_page_privateGuoqing Jiang
Since the new pair function is introduced, we can call them to clean the code in f2fs.h. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Chao Yu <yuchao0@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Link: http://lkml.kernel.org/r/20200517214718.468-6-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02fs/buffer.c: use attach/detach_page_privateGuoqing Jiang
Since the new pair function is introduced, we can call them to clean the code in buffer.c. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200517214718.468-5-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02btrfs: use attach/detach_page_privateGuoqing Jiang
Since the new pair function is introduced, we can call them to clean the code in btrfs. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <josef@toxicpanda.com> Link: http://lkml.kernel.org/r/20200517214718.468-4-guoqing.jiang@cloud.ionos.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02iomap: convert from readpages to readaheadMatthew Wilcox (Oracle)
Use the new readahead operation in iomap. Convert XFS and ZoneFS to use it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-26-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02fuse: convert from readpages to readaheadMatthew Wilcox (Oracle)
Implement the new readahead operation in fuse by using __readahead_batch() to fill the array of pages in fuse_args_pages directly. This lets us inline fuse_readpages_fill() into fuse_readahead(). [willy@infradead.org: build fix] Link: http://lkml.kernel.org/r/20200415025938.GB5820@bombadil.infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Miklos Szeredi <mszeredi@redhat.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: http://lkml.kernel.org/r/20200414150233.24495-25-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02f2fs: pass the inode to f2fs_mpage_readpagesMatthew Wilcox (Oracle)
This function now only uses the mapping argument to look up the inode, and both callers already have the inode, so just pass the inode instead of the mapping. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-24-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02f2fs: convert from readpages to readaheadMatthew Wilcox (Oracle)
Use the new readahead operation in f2fs Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-23-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02ext4: pass the inode to ext4_mpage_readpagesMatthew Wilcox (Oracle)
This function now only uses the mapping argument to look up the inode, and both callers already have the inode, so just pass the inode instead of the mapping. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-22-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02ext4: convert from readpages to readaheadMatthew Wilcox (Oracle)
Use the new readahead operation in ext4 Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-21-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02erofs: convert compressed files from readpages to readaheadMatthew Wilcox (Oracle)
Use the new readahead operation in erofs. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Gao Xiang <gaoxiang25@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-20-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02erofs: convert uncompressed files from readpages to readaheadMatthew Wilcox (Oracle)
Use the new readahead operation in erofs Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Acked-by: Gao Xiang <gaoxiang25@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-19-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02btrfs: convert from readpages to readaheadMatthew Wilcox (Oracle)
Implement the new readahead method in btrfs using the new readahead_page_batch() function. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-18-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02fs: convert mpage_readpages to mpage_readaheadMatthew Wilcox (Oracle)
Implement the new readahead aop and convert all callers (block_dev, exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6, reiserfs & udf). The callers are all trivial except for GFS2 & OCFS2. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> # ocfs2 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> # ocfs2 Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-17-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02mm: add page_cache_readahead_unboundedMatthew Wilcox (Oracle)
ext4 and f2fs have duplicated the guts of the readahead code so they can read past i_size. Instead, separate out the guts of the readahead code so they can call it directly. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Cc: Chao Yu <yuchao0@huawei.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Gao Xiang <gaoxiang25@huawei.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Link: http://lkml.kernel.org/r/20200414150233.24495-14-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02fs/buffer.c: record blockdev write errors in super_block that it backsJeff Layton
When syncing out a block device (a'la __sync_blockdev), any error encountered will only be recorded in the bd_inode's mapping. When the blockdev contains a filesystem however, we'd like to also record the error in the super_block that's stored there. Make mark_buffer_write_io_error also record the error in the corresponding super_block when a writeback error occurs and the block device contains a mounted superblock. Since superblocks are RCU freed, hold the rcu_read_lock to ensure that the superblock doesn't go away while we're marking it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andres Freund <andres@anarazel.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Link: http://lkml.kernel.org/r/20200428135155.19223-3-jlayton@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02vfs: track per-sb writeback errors and report them to syncfsJeff Layton
Patch series "vfs: have syncfs() return error when there are writeback errors", v6. Currently, syncfs does not return errors when one of the inodes fails to be written back. It will return errors based on the legacy AS_EIO and AS_ENOSPC flags when syncing out the block device fails, but that's not particularly helpful for filesystems that aren't backed by a blockdev. It's also possible for a stray sync to lose those errors. The basic idea in this set is to track writeback errors at the superblock level, so that we can quickly and easily check whether something bad happened without having to fsync each file individually. syncfs is then changed to reliably report writeback errors after they occur, much in the same fashion as fsync does now. This patch (of 2): Usually we suggest that applications call fsync when they want to ensure that all data written to the file has made it to the backing store, but that can be inefficient when there are a lot of open files. Calling syncfs on the filesystem can be more efficient in some situations, but the error reporting doesn't currently work the way most people expect. If a single inode on a filesystem reports a writeback error, syncfs won't necessarily return an error. syncfs only returns an error if __sync_blockdev fails, and on some filesystems that's a no-op. It would be better if syncfs reported an error if there were any writeback failures. Then applications could call syncfs to see if there are any errors on any open files, and could then call fsync on all of the other descriptors to figure out which one failed. This patch adds a new errseq_t to struct super_block, and has mapping_set_error also record writeback errors there. To report those errors, we also need to keep an errseq_t in struct file to act as a cursor. This patch adds a dedicated field for that purpose, which slots nicely into 4 bytes of padding at the end of struct file on x86_64. An earlier version of this patch used an O_PATH file descriptor to cue the kernel that the open file should track the superblock error and not the inode's writeback error. I think that API is just too weird though. This is simpler and should make syncfs error reporting "just work" even if someone is multiplexing fsync and syncfs on the same fds. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andres Freund <andres@anarazel.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: David Howells <dhowells@redhat.com> Link: http://lkml.kernel.org/r/20200428135155.19223-1-jlayton@kernel.org Link: http://lkml.kernel.org/r/20200428135155.19223-2-jlayton@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02ocfs2: mount shared volume without ha stackGang He
Usually we create and use a ocfs2 shared volume on the top of ha stack. For pcmk based ha stack, which includes DLM, corosync and pacemaker services. The customers complained they could not mount existent ocfs2 volume in the single node without ha stack, e.g. single node backup/restore scenario. Like this case, the customers just want to access the data from the existent ocfs2 volume quickly, but do not want to restart or setup ha stack. Then, I'd like to add a mount option "nocluster", if the users use this option to mount a ocfs2 shared volume, the whole mount will not depend on the ha related services. the command will mount the existent ocfs2 volume directly (like local mount), for avoiding setup the ha stack. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200423053300.22661-1-ghe@suse.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02ocfs2: add missing annotation for dlm_empty_lockres()Jules Irenge
Sparse reports a warning at dlm_empty_lockres() warning: context imbalance in dlm_purge_lockres() - unexpected unlock The root cause is the missing annotation at dlm_purge_lockres() Add the missing __must_hold(&dlm->spinlock) Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200403160505.2832-4-jbi.octave@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02squashfs: migrate from ll_rw_block usage to BIOPhilippe Liard
ll_rw_block() function has been deprecated in favor of BIO which appears to come with large performance improvements. This patch decreases boot time by close to 40% when using squashfs for the root file-system. This is observed at least in the context of starting an Android VM on Chrome OS using crosvm. The patch was tested on 4.19 as well as master. This patch is largely based on Adrien Schildknecht's patch that was originally sent as https://lkml.org/lkml/2017/9/22/814 though with some significant changes and simplifications while also taking Phillip Lougher's feedback into account, around preserving support for FILE_CACHE in particular. [akpm@linux-foundation.org: fix build error reported by Randy] Link: http://lkml.kernel.org/r/319997c2-5fc8-f889-2ea3-d913308a7c1f@infradead.org Signed-off-by: Philippe Liard <pliard@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Adrien Schildknecht <adrien+dev@schischi.me> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Guenter Roeck <groeck@chromium.org> Cc: Daniel Rosenberg <drosen@google.com> Link: https://chromium.googlesource.com/chromiumos/platform/crosvm Link: http://lkml.kernel.org/r/20191106074238.186023-1-pliard@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-01Merge tag 'fsverity-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fsverity updates from Eric Biggers: "Fix kerneldoc warnings and some coding style inconsistencies. This mirrors the similar cleanups being done in fs/crypto/" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fs-verity: remove unnecessary extern keywords fs-verity: fix all kerneldoc warnings
2020-06-01Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds
Pull fscrypt updates from Eric Biggers: - Add the IV_INO_LBLK_32 encryption policy flag which modifies the encryption to be optimized for eMMC inline encryption hardware. - Make the test_dummy_encryption mount option for ext4 and f2fs support v2 encryption policies. - Fix kerneldoc warnings and some coding style inconsistencies. * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: add support for IV_INO_LBLK_32 policies fscrypt: make test_dummy_encryption use v2 by default fscrypt: support test_dummy_encryption=v2 fscrypt: add fscrypt_add_test_dummy_key() linux/parser.h: add include guards fscrypt: remove unnecessary extern keywords fscrypt: name all function parameters fscrypt: fix all kerneldoc warnings
2020-06-01Merge tag 'pstore-v5.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore updates from Kees Cook: "Fixes and new features for pstore. This is a pretty big set of changes (relative to past pstore pulls), but it has been in -next for a while. The biggest change here is the ability to support a block device as a pstore backend, which has been desired for a while. A lot of additional fixes and refactorings are also included, mostly in support of the new features. - refactor pstore locking for safer module unloading (Kees Cook) - remove orphaned records from pstorefs when backend unloaded (Kees Cook) - refactor dump_oops parameter into max_reason (Pavel Tatashin) - introduce pstore/zone for common code for contiguous storage (WeiXiong Liao) - introduce pstore/blk for block device backend (WeiXiong Liao) - introduce mtd backend (WeiXiong Liao)" * tag 'pstore-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (35 commits) mtd: Support kmsg dumper based on pstore/blk pstore/blk: Introduce "best_effort" mode pstore/blk: Support non-block storage devices pstore/blk: Provide way to query pstore configuration pstore/zone: Provide way to skip "broken" zone for MTD devices Documentation: Add details for pstore/blk pstore/zone,blk: Add ftrace frontend support pstore/zone,blk: Add console frontend support pstore/zone,blk: Add support for pmsg frontend pstore/blk: Introduce backend for block devices pstore/zone: Introduce common layer to manage storage zones ramoops: Add "max-reason" optional field to ramoops DT node pstore/ram: Introduce max_reason and convert dump_oops pstore/platform: Pass max_reason to kmesg dump printk: Introduce kmsg_dump_reason_str() printk: honor the max_reason field in kmsg_dumper printk: Collapse shutdown types into a single dump reason pstore/ftrace: Provide ftrace log merging routine pstore/ram: Refactor ftrace buffer merging pstore/ram: Refactor DT size parsing ...
2020-06-01Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Introduce crypto_shash_tfm_digest() and use it wherever possible. - Fix use-after-free and race in crypto_spawn_alg. - Add support for parallel and batch requests to crypto_engine. Algorithms: - Update jitter RNG for SP800-90B compliance. - Always use jitter RNG as seed in drbg. Drivers: - Add Arm CryptoCell driver cctrng. - Add support for SEV-ES to the PSP driver in ccp" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (114 commits) crypto: hisilicon - fix driver compatibility issue with different versions of devices crypto: engine - do not requeue in case of fatal error crypto: cavium/nitrox - Fix a typo in a comment crypto: hisilicon/qm - change debugfs file name from qm_regs to regs crypto: hisilicon/qm - add DebugFS for xQC and xQE dump crypto: hisilicon/zip - add debugfs for Hisilicon ZIP crypto: hisilicon/hpre - add debugfs for Hisilicon HPRE crypto: hisilicon/sec2 - add debugfs for Hisilicon SEC crypto: hisilicon/qm - add debugfs to the QM state machine crypto: hisilicon/qm - add debugfs for QM crypto: stm32/crc32 - protect from concurrent accesses crypto: stm32/crc32 - don't sleep in runtime pm crypto: stm32/crc32 - fix multi-instance crypto: stm32/crc32 - fix run-time self test issue. crypto: stm32/crc32 - fix ext4 chksum BUG_ON() crypto: hisilicon/zip - Use temporary sqe when doing work crypto: hisilicon - add device error report through abnormal irq crypto: hisilicon - remove codes of directly report device errors through MSI crypto: hisilicon - QM memory management optimization crypto: hisilicon - unify initial value assignment into QM ...
2020-05-31pstore/blk: Introduce "best_effort" modeKees Cook
In order to use arbitrary block devices as a pstore backend, provide a new module param named "best_effort", which will allow using any block device, even if it has not provided a panic_write callback. Link: https://lore.kernel.org/lkml/20200511233229.27745-12-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-31pstore/blk: Support non-block storage devicesWeiXiong Liao
Add support for non-block devices (e.g. MTD). A non-block driver calls pstore_blk_register_device() to register iself. In addition, pstore/zone is updated to handle non-block devices, where an erase must be done before a write. Without this, there is no way to remove records stored to an MTD. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-10-keescook@chromium.org/ Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-31pstore/blk: Provide way to query pstore configurationWeiXiong Liao
In order to configure itself, the MTD backend needs to be able to query the current pstore configuration. Introduce pstore_blk_get_config() for this purpose. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-9-keescook@chromium.org/ Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-31pstore/zone: Provide way to skip "broken" zone for MTD devicesWeiXiong Liao
One requirement to support MTD devices in pstore/zone is having a way to declare certain regions as broken. Add this support to pstore/zone. The MTD driver should return -ENOMSG when encountering a bad region, which tells pstore/zone to skip and try the next one. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-8-keescook@chromium.org/ Co-developed-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: //lore.kernel.org/lkml/20200512173801.222666-1-colin.king@canonical.com Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30Documentation: Add details for pstore/blkWeiXiong Liao
Add details on using pstore/blk, the new backend of pstore to record dumps to block devices, in Documentation/admin-guide/pstore-blk.rst Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-7-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/zone,blk: Add ftrace frontend supportWeiXiong Liao
Support backend for ftrace. To enable ftrace backend, just make ftrace_size be greater than 0 and a multiple of 4096. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-6-keescook@chromium.org/ Co-developed-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/lkml/20200512170719.221514-1-colin.king@canonical.com Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/zone,blk: Add console frontend supportWeiXiong Liao
Support backend for console. To enable console backend, just make console_size be greater than 0 and a multiple of 4096. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-5-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/zone,blk: Add support for pmsg frontendWeiXiong Liao
Add pmsg support to pstore/blk (through pstore/zone). To enable, pmsg_size must be greater than 0 and a multiple of 4096. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-4-keescook@chromium.org/ Co-developed-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/lkml/20200512171932.222102-1-colin.king@canonical.com Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/blk: Introduce backend for block devicesWeiXiong Liao
pstore/blk is similar to pstore/ram, but uses a block device as the storage rather than persistent ram. The pstore/blk backend solves two common use-cases that used to preclude using pstore/ram: - not all devices have a battery that could be used to persist regular RAM across power failures. - most embedded intelligent equipment have no persistent ram, which increases costs, instead preferring cheaper solutions, like block devices. pstore/blk provides separate configurations for the end user and for the block drivers. User configuration determines how pstore/blk operates, such as record sizes, max kmsg dump reasons, etc. These can be set by Kconfig and/or module parameters, but module parameter have priority over Kconfig. Driver configuration covers all the details about the target block device, such as total size of the device and how to perform read/write operations. These are provided by block drivers, calling pstore_register_blkdev(), including an optional panic_write callback used to bypass regular IO APIs in an effort to avoid potentially destabilized kernel code during a panic. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-3-keescook@chromium.org/ Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/zone: Introduce common layer to manage storage zonesWeiXiong Liao
Implement a common set of APIs needed to support pstore storage zones, based on how ramoops is designed. This will be used by pstore/blk with the intention of migrating pstore/ram in the future. Signed-off-by: WeiXiong Liao <liaoweixiong@allwinnertech.com> Link: https://lore.kernel.org/lkml/20200511233229.27745-2-keescook@chromium.org/ Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/ram: Introduce max_reason and convert dump_oopsKees Cook
Now that pstore_register() can correctly pass max_reason to the kmesg dump facility, introduce a new "max_reason" module parameter and "max-reason" Device Tree field. The "dump_oops" module parameter and "dump-oops" Device Tree field are now considered deprecated, but are now automatically converted to their corresponding max_reason values when present, though the new max_reason setting has precedence. For struct ramoops_platform_data, the "dump_oops" member is entirely replaced by a new "max_reason" member, with the only existing user updated in place. Additionally remove the "reason" filter logic from ramoops_pstore_write(), as that is not specifically needed anymore, though technically this is a change in behavior for any ramoops users also setting the printk.always_kmsg_dump boot param, which will cause ramoops to behave as if max_reason was set to KMSG_DUMP_MAX. Co-developed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/lkml/20200515184434.8470-6-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/platform: Pass max_reason to kmesg dumpPavel Tatashin
Add a new member to struct pstore_info for passing information about kmesg dump maximum reason. This allows a finer control of what kmesg dumps are sent to pstore storage backends. Those backends that do not explicitly set this field (keeping it equal to 0), get the default behavior: store only Oopses and Panics, or everything if the printk.always_kmsg_dump boot param is set. Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Link: https://lore.kernel.org/lkml/20200515184434.8470-5-keescook@chromium.org/ Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30printk: Introduce kmsg_dump_reason_str()Kees Cook
The pstore subsystem already had a private version of this function. With the coming addition of the pstore/zone driver, this needs to be shared. As it really should live with printk, move it there instead. Link: https://lore.kernel.org/lkml/20200515184434.8470-4-keescook@chromium.org/ Acked-by: Petr Mladek <pmladek@suse.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30printk: Collapse shutdown types into a single dump reasonKees Cook
To turn the KMSG_DUMP_* reasons into a more ordered list, collapse the redundant KMSG_DUMP_(RESTART|HALT|POWEROFF) reasons into KMSG_DUMP_SHUTDOWN. The current users already don't meaningfully distinguish between them, so there's no need to, as discussed here: https://lore.kernel.org/lkml/CA+CK2bAPv5u1ih5y9t5FUnTyximtFCtDYXJCpuyjOyHNOkRdqw@mail.gmail.com/ Link: https://lore.kernel.org/lkml/20200515184434.8470-2-keescook@chromium.org/ Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/ftrace: Provide ftrace log merging routineKees Cook
Move the ftrace log merging logic out of pstore/ram into pstore/ftrace so other backends can use it, like pstore/zone. Link: https://lore.kernel.org/lkml/20200510202436.63222-7-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/ram: Refactor ftrace buffer mergingKees Cook
This changes the ftrace record merging code to be agnostic of pstore/ram, as the first step to making it available as a generic routine for other backends to use, such as pstore/zone. Link: https://lore.kernel.org/lkml/20200510202436.63222-6-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/ram: Refactor DT size parsingKees Cook
Refactor device tree size parsing routines to be able to pass a non-zero default value for providing a configurable default for the coming "max_reason" field. Also rename the helpers, since we're not always parsing a size -- we're parsing a u32 and making sure it's not greater than INT_MAX. Link: https://lore.kernel.org/lkml/20200506211523.15077-4-keescook@chromium.org/ Link: https://lore.kernel.org/lkml/20200521205223.175957-1-tyhicks@linux.microsoft.com Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/ram: Adjust module param permissions to reflect realityKees Cook
A couple module parameters had 0600 permissions, but changing them would have no impact on ramoops, so switch these to 0400 to reflect reality. Link: https://lore.kernel.org/lkml/20200506211523.15077-7-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/platform: Move module params after declarationsKees Cook
It is easier to see how module params are used if they're near the variables they use. Link: https://lore.kernel.org/lkml/20200510202436.63222-4-keescook@chromium.org/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-05-30pstore/platform: Use backend name for console registrationKees Cook
If the pstore backend changes, there's no indication in the logs what the console is (it always says "pstore"). Instead, pass through the active backend's name. (Also adjust the selftest to match.) Link: https://lore.kernel.org/lkml/20200510202436.63222-5-keescook@chromium.org/ Link: https://lore.kernel.org/lkml/20200526135429.GQ12456@shao2-debian Signed-off-by: Kees Cook <keescook@chromium.org>