summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2017-02-09scripts/spelling.txt: add "an user" pattern and fix typo instancesMasahiro Yamada
Fix typos and add the following to the scripts/spelling.txt: an user||a user an userspace||a userspace I also added "userspace" to the list since it is a common word in Linux. I found some instances for "an userfaultfd", but I did not add it to the list. I felt it is endless to find words that start with "user" such as "userland" etc., so must draw a line somewhere. Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-09Merge branch 'akpm-current/current'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'extable/extable'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'userns/for-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'cgroup/for-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'driver-core/driver-core-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'rcu/rcu/next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'ftrace/for-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'tip/auto-latest'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'audit/next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'iommu/next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'selinux/next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'security/next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'kgdb/kgdb-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'block/for-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'modules/modules-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'kspp/for-next/kspp'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'net-next/master'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'printk/for-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'vfs/for-next'Stephen Rothwell
2017-02-09Merge remote-tracking branch 'arm/for-next'Stephen Rothwell
2017-02-09config: android-base: enable hardened usercopy and kernel ASLRAmit Pundir
Enable CONFIG_HARDENED_USERCOPY and CONFIG_RANDOMIZE_BASE in Android base config fragment. Reviewed at https://android-review.googlesource.com/#/c/283659/ Reviewed at https://android-review.googlesource.com/#/c/278133/ Link: http://lkml.kernel.org/r/1481113148-29204-2-git-send-email-amit.pundir@linaro.org Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Cc: Rob Herring <rob.herring@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-09config: android-recommended: disable aio supportDaniel Micay
The aio interface adds substantial attack surface for a feature that's not being exposed by Android at all. It's unlikely that anyone is using the kernel feature directly either. This feature is rarely used even on servers. The glibc POSIX aio calls really use thread pools. The lack of widespread usage also means this is relatively poorly audited/tested. The kernel's aio rarely provides performance benefits over using a thread pool and is quite incomplete in terms of system call coverage along with having edge cases where blocking can occur. Part of the performance issue is the fact that it only supports direct io, not buffered io. The existing API is considered fundamentally flawed and it's unlikely it will be expanded, but rather replaced: https://marc.info/?l=linux-aio&m=145255815216051&w=2 Since ext4 encryption means no direct io support, kernel aio isn't even going to work properly on Android devices using file-based encryption. Reviewed-at: https://android-review.googlesource.com/#/c/292158/ Link: http://lkml.kernel.org/r/1481113148-29204-1-git-send-email-amit.pundir@linaro.org Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Cc: Rob Herring <rob.herring@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-09sigaltstack: support SS_AUTODISARM for CONFIG_COMPATStas Sergeev
Currently SS_AUTODISARM is not supported in compatibility mode, but does not return -EINVAL either. This makes dosemu built with -m32 on x86_64 to crash. Also the kernel's sigaltstack selftest fails if compiled with -m32. This patch adds the needed support. Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net> Cc: Milosz Tanski <milosz@adfin.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Waiman Long <Waiman.Long@hpe.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08bpf, lpm: fix overflows in trie_alloc checksDaniel Borkmann
Cap the maximum (total) value size and bail out if larger than KMALLOC_MAX_SIZE as otherwise it doesn't make any sense to proceed further, since we're guaranteed to fail to allocate elements anyway in lpm_trie_node_alloc(); likleyhood of failure is still high for large values, though, similarly as with htab case in non-prealloc. Next, make sure that cost vars are really u64 instead of size_t, so that we don't overflow on 32 bit and charge only tiny map.pages against memlock while allowing huge max_entries; cap also the max cost like we do with other map types. Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map implementation") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-08printk: drop call_console_drivers() unused paramSergey Senozhatsky
We do suppress_message_printing() check before we call call_console_drivers() now, so `level' param is not needed anymore. Link: http://lkml.kernel.org/r/20161224140902.1962-2-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-08printk: convert the rest to printk-safeSergey Senozhatsky
This patch converts the rest of logbuf users (which are out of printk recursion case, but can deadlock in printk). To make printk-safe usage easier the patch introduces 4 helper macros: - logbuf_lock_irq()/logbuf_unlock_irq() lock/unlock the logbuf lock and disable/enable local IRQ - logbuf_lock_irqsave(flags)/logbuf_unlock_irqrestore(flags) lock/unlock the logbuf lock and saves/restores local IRQ state Link: http://lkml.kernel.org/r/20161227141611.940-9-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-08printk: remove zap_locks() functionSergey Senozhatsky
We use printk-safe now which makes printk-recursion detection code in vprintk_emit() unreachable. The tricky thing here is that, apart from detecting and reporting printk recursions, that code also used to zap_locks() in case of panic() from the same CPU. However, zap_locks() does not look to be needed anymore: 1) Since commit 08d78658f393 ("panic: release stale console lock to always get the logbuf printed out") panic flushing of `logbuf' to console ignores the state of `console_sem' by doing panic() console_trylock(); console_unlock(); 2) Since commit cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic") panic attempts to zap the `logbuf_lock' spin_lock to successfully flush nmi messages to `logbuf'. Basically, it seems that we either already do what zap_locks() used to do but in other places or we ignore the state of the lock. The only reaming difference is that we don't re-init the console semaphore in printk_safe_flush_on_panic(), but this is not necessary because we don't call console drivers from printk_safe_flush_on_panic() due to the fact that we are using a deferred printk() version (as was suggested by Petr Mladek). Link: http://lkml.kernel.org/r/20161227141611.940-8-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-08printk: use printk_safe buffers in printkSergey Senozhatsky
Use printk_safe per-CPU buffers in printk recursion-prone blocks: -- around logbuf_lock protected sections in vprintk_emit() and console_unlock() -- around down_trylock_console_sem() and up_console_sem() Note that this solution addresses deadlocks caused by printk() recursive calls only. That is vprintk_emit() and console_unlock(). The rest will be converted in a followup patch. Another thing to note is that we now keep lockdep enabled in printk, because we are protected against the printk recursion caused by lockdep in vprintk_emit() by the printk-safe mechanism - we first switch to per-CPU buffers and only then access the deadlock-prone locks. Examples: 1) printk() from logbuf_lock spin_lock section Assume the following code: printk() raw_spin_lock(&logbuf_lock); WARN_ON(1); raw_spin_unlock(&logbuf_lock); which now produces: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 366 at kernel/printk/printk.c:1811 vprintk_emit CPU: 0 PID: 366 Comm: bash Call Trace: warn_slowpath_null+0x1d/0x1f vprintk_emit+0x1cd/0x438 vprintk_default+0x1d/0x1f printk+0x48/0x50 [..] 2) printk() from semaphore sem->lock spin_lock section Assume the following code printk() console_trylock() down_trylock() raw_spin_lock_irqsave(&sem->lock, flags); WARN_ON(1); raw_spin_unlock_irqrestore(&sem->lock, flags); which now produces: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 363 at kernel/locking/semaphore.c:141 down_trylock CPU: 1 PID: 363 Comm: bash Call Trace: warn_slowpath_null+0x1d/0x1f down_trylock+0x3d/0x62 ? vprintk_emit+0x3f9/0x414 console_trylock+0x31/0xeb vprintk_emit+0x3f9/0x414 vprintk_default+0x1d/0x1f printk+0x48/0x50 [..] 3) printk() from console_unlock() Assume the following code: printk() console_unlock() raw_spin_lock(&logbuf_lock); WARN_ON(1); raw_spin_unlock(&logbuf_lock); which now produces: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 329 at kernel/printk/printk.c:2384 console_unlock CPU: 1 PID: 329 Comm: bash Call Trace: warn_slowpath_null+0x18/0x1a console_unlock+0x12d/0x559 ? trace_hardirqs_on_caller+0x16d/0x189 ? trace_hardirqs_on+0xd/0xf vprintk_emit+0x363/0x374 vprintk_default+0x18/0x1a printk+0x43/0x4b [..] 4) printk() from try_to_wake_up() Assume the following code: printk() console_unlock() up() try_to_wake_up() raw_spin_lock_irqsave(&p->pi_lock, flags); WARN_ON(1); raw_spin_unlock_irqrestore(&p->pi_lock, flags); which now produces: ------------[ cut here ]------------ WARNING: CPU: 3 PID: 363 at kernel/sched/core.c:2028 try_to_wake_up CPU: 3 PID: 363 Comm: bash Call Trace: warn_slowpath_null+0x1d/0x1f try_to_wake_up+0x7f/0x4f7 wake_up_process+0x15/0x17 __up.isra.0+0x56/0x63 up+0x32/0x42 __up_console_sem+0x37/0x55 console_unlock+0x21e/0x4c2 vprintk_emit+0x41c/0x462 vprintk_default+0x1d/0x1f printk+0x48/0x50 [..] 5) printk() from call_console_drivers() Assume the following code: printk() console_unlock() call_console_drivers() ... WARN_ON(1); which now produces: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 305 at kernel/printk/printk.c:1604 call_console_drivers CPU: 2 PID: 305 Comm: bash Call Trace: warn_slowpath_null+0x18/0x1a call_console_drivers.isra.6.constprop.16+0x3a/0xb0 console_unlock+0x471/0x48e vprintk_emit+0x1f4/0x206 vprintk_default+0x18/0x1a vprintk_func+0x6e/0x70 printk+0x3e/0x46 [..] 6) unsupported placeholder in printk() format now prints an actual warning from vscnprintf(), instead of 'BUG: recent printk recursion!'. ------------[ cut here ]------------ WARNING: CPU: 5 PID: 337 at lib/vsprintf.c:1900 format_decode Please remove unsupported % in format string CPU: 5 PID: 337 Comm: bash Call Trace: dump_stack+0x4f/0x65 __warn+0xc2/0xdd warn_slowpath_fmt+0x4b/0x53 format_decode+0x22c/0x308 vsnprintf+0x89/0x3b7 vscnprintf+0xd/0x26 vprintk_emit+0xb4/0x238 vprintk_default+0x1d/0x1f vprintk_func+0x6c/0x73 printk+0x43/0x4b [..] Link: http://lkml.kernel.org/r/20161227141611.940-7-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-08printk: report lost messages in printk safe/nmi contextsSergey Senozhatsky
Account lost messages in pritk-safe and printk-safe-nmi contexts and report those numbers during printk_safe_flush(). The patch also moves lost message counter to struct `printk_safe_seq_buf' instead of having dedicated static counters - this simplifies the code. Link: http://lkml.kernel.org/r/20161227141611.940-6-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-08printk: always use deferred printk when flush printk_safe linesSergey Senozhatsky
Always use printk_deferred() in printk_safe_flush_line(). Flushing can be done from NMI or printk_safe contexts (when we are in panic), so we can't call console drivers, yet still want to store the messages in the logbuf buffer. Therefore we use a deferred printk version. Link: http://lkml.kernel.org/r/20170206164253.GA463@tigerII.localdomain Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-08printk: introduce per-cpu safe_print seq bufferSergey Senozhatsky
This patch extends the idea of NMI per-cpu buffers to regions that may cause recursive printk() calls and possible deadlocks. Namely, printk() can't handle printk calls from schedule code or printk() calls from lock debugging code (spin_dump() for instance); because those may be called with `sem->lock' already taken or any other `critical' locks (p->pi_lock, etc.). An example of deadlock can be vprintk_emit() console_unlock() up() << raw_spin_lock_irqsave(&sem->lock, flags); wake_up_process() try_to_wake_up() ttwu_queue() ttwu_activate() activate_task() enqueue_task() enqueue_task_fair() cfs_rq_of() task_of() WARN_ON_ONCE(!entity_is_task(se)) vprintk_emit() console_trylock() down_trylock() raw_spin_lock_irqsave(&sem->lock, flags) ^^^^ deadlock and some other cases. Just like in NMI implementation, the solution uses a per-cpu `printk_func' pointer to 'redirect' printk() calls to a 'safe' callback, that store messages in a per-cpu buffer and flushes them back to logbuf buffer later. Usage example: printk() printk_safe_enter_irqsave(flags) // // any printk() call from here will endup in vprintk_safe(), // that stores messages in a special per-CPU buffer. // printk_safe_exit_irqrestore(flags) The 'redirection' mechanism, though, has been reworked, as suggested by Petr Mladek. Instead of using a per-cpu @print_func callback we now keep a per-cpu printk-context variable and call either default or nmi vprintk function depending on its value. printk_nmi_entrer/exit and printk_safe_enter/exit, thus, just set/celar corresponding bits in printk-context functions. The patch only adds printk_safe support, we don't use it yet. Link: http://lkml.kernel.org/r/20161227141611.940-4-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-02-08printk: rename nmi.c and exported apiSergey Senozhatsky
A preparation patch for printk_safe work. No functional change. - rename nmi.c to print_safe.c - add `printk_safe' prefix to some (which used both by printk-safe and printk-nmi) of the exported functions. Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-08printk: use vprintk_func in vprintk()Sergey Senozhatsky
vprintk(), just like printk(), better be using per-cpu printk_func instead of direct vprintk_emit() call. Just in case if vprintk() will ever be called from NMI, or from any other context that can deadlock in printk(). Link: http://lkml.kernel.org/r/20161227141611.940-2-sergey.senozhatsky@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Calvin Owens <calvinowens@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-02-08kernel/notifier.c: simplify expressionViresh Kumar
NOTIFY_STOP_MASK (0x8000) has only one bit set and there is no need to compare output of "ret & NOTIFY_STOP_MASK" to NOTIFY_STOP_MASK. We just need to make sure the output is non-zero, that's it. Link: http://lkml.kernel.org/r/88ee58264a2bfab1c97ffc8ac753e25f55f57c10.1483593065.git.viresh.kumar@linaro.org Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08userfaultfd: non-cooperative: add event for exit() notificationMike Rapoport
Allow userfaultfd monitor track termination of the processes that have memory backed by the uffd. Link: http://lkml.kernel.org/r/1485542673-24387-4-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08mm, uprobes: convert __replace_page() to use page_vma_mapped_walk()Kirill A. Shutemov
For consistency, it worth converting all page_check_address() to page_vma_mapped_walk(), so we could drop the former. Link: http://lkml.kernel.org/r/20170129173858.45174-10-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08uprobes: split THPs before trying to replace themKirill A. Shutemov
Patch series "Fix few rmap-related THP bugs", v3. The patchset fixes handing PTE-mapped THPs in page_referenced() and page_idle_clear_pte_refs(). To achieve that I've intrdocued new helper -- page_vma_mapped_walk() -- which replaces all page_check_address{,_transhuge}() and covers all THP cases. Patchset overview: - First patch fixes one uprobe bug (unrelated to the rest of the patchset, just spotted it at the same time); - Patches 2-5 fix handling PTE-mapped THPs in page_referenced(), page_idle_clear_pte_refs() and rmap core; - Patches 6-12 convert all page_check_address{,_transhuge}() users (plus remove_migration_pte()) to page_vma_mapped_walk() and drop unused helpers. I think the fixes are not critical enough for stable@ as they don't lead to crashes or hangs, only suboptimal behaviour. This patch (of 12): For THPs page_check_address() always fails. It leads to endless loop in uprobe_write_opcode(). Testcase with huge-tmpfs (uprobes cannot probe anonymous memory). mount -t debugfs none /sys/kernel/debug mount -t tmpfs -o huge=always none /mnt gcc -Wall -O2 -o /mnt/test -x c - <<EOF int main(void) { return 0; } /* Padding to map the code segment with huge pmd */ asm (".zero 2097152"); EOF echo 'p /mnt/test:0' > /sys/kernel/debug/tracing/uprobe_events echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable /mnt/test Let's split THPs before trying to replace. Link: http://lkml.kernel.org/r/20170129173858.45174-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmfDave Jiang
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08mm: enable section-unaligned devm_memremap_pages()Dan Williams
Teach devm_memremap_pages() about the new sub-section capabilities of arch_{add,remove}_memory(). Link: http://lkml.kernel.org/r/148486366055.19694.17199008017867229383.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Stephen Bates <stephen.bates@microsemi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08mm: introduce common definitions for the size and mask of a sectionDan Williams
Up-level the local section size and mask from kernel/memremap.c to global definitions. These will be used by the new sub-section hotplug support. Link: http://lkml.kernel.org/r/148486362333.19694.14241079169560958896.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Stephen Bates <stephen.bates@microsemi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08mm, devm_memremap_pages: use multi-order radix for ZONE_DEVICE lookupsDan Williams
devm_memremap_pages() records mapped ranges in pgmap_radix with an entry per section's worth of memory (128MB). The key for each of those entries is a section number. This leads to false positives when devm_memremap_pages() is passed a section-unaligned range as lookups in the misalignment fail to return NULL. We can close this hole by using the pfn as the key for entries in the tree. The number of entries required to describe a remapped range is reduced by leveraging multi-order entries. In practice this approach usually yields just one entry in the tree if the size and starting address are of the same power-of-2 alignment. Previously we always needed nr_entries = mapping_size / 128MB. Link: https://lists.01.org/pipermail/linux-nvdimm/2016-August/006666.html Link: http://lkml.kernel.org/r/148486361197.19694.17322218559382955416.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08userfaultfd: non-cooperative: Add fork() eventPavel Emelyanov
When the mm with uffd-ed vmas fork()-s the respective vmas notify their uffds with the event which contains a descriptor with new uffd. This new descriptor can then be used to get events from the child and populate its mm with data. Note, that there can be different uffd-s controlling different vmas within one mm, so first we should collect all those uffds (and ctx-s) in a list and then notify them all one by one but only once per fork(). The context is created at fork() time but the descriptor, file struct and anon inode object is created at event read time. So some trickery is added to the userfaultfd_ctx_read() to handle the ctx queues' locking vs file creation. Another thing worth noticing is that the task that fork()-s waits for the uffd event to get processed WITHOUT the mmap sem. Link: http://lkml.kernel.org/r/20161216144821.5183-9-aarcange@redhat.com Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michael Rapoport <RAPOPORT@il.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08kernel/watchdog.c: do not hardcode CPU 0 as the initial threadPrarit Bhargava
When CONFIG_BOOTPARAM_HOTPLUG_CPU0 is enabled, the socket containing the boot cpu can be replaced. During the hot add event, the message NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter. is output implying that the NMI watchdog was disabled at some point. This is not the case and the message has caused confusion for users of systems that support the removal of the boot cpu socket. The watchdog code is coded to assume that cpu 0 is always the first cpu to initialize the watchdog, and the last to stop its watchdog thread. That is not the case for initializing if cpu 0 has been removed and added. The removal case has never been correct because the smpboot code will remove the watchdog threads starting with the lowest cpu number. This patch adds watchdog_cpus to track the number of cpus with active NMI watchdog threads so that the first and last thread can be used to set and clear the value of firstcpu_err. firstcpu_err is set when the first watchdog thread is enabled, and cleared when the last watchdog thread is disabled. Link: http://lkml.kernel.org/r/1480425321-32296-1-git-send-email-prarit@redhat.com Signed-off-by: Prarit Bhargava <prarit@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Borislav Petkov <bp@suse.de> Cc: Tejun Heo <tj@kernel.org> Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Joshua Hunt <johunt@akamai.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Babu Moger <babu.moger@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08tracing: add __print_flags_u64()Ross Zwisler
Patch series "DAX tracepoints, mm argument simplification", v4. This contains both my DAX tracepoint code and Dave Jiang's MM argument simplifications. Dave's code was written with my tracepoint code as a baseline, so it seemed simplest to keep them together in a single series. This patch (of 7): Add __print_flags_u64() and the helper trace_print_flags_seq_u64() in the same spirit as __print_symbolic_u64() and trace_print_symbols_seq_u64(). These functions allow us to print symbols associated with flags that are 64 bits wide even on 32 bit machines. These will be used by the DAX code so that we can print the flags set in a pfn_t such as PFN_SG_CHAIN, PFN_SG_LAST, PFN_DEV and PFN_MAP. Without this new function I was getting errors like the following when compiling for i386: ./include/linux/pfn_t.h:13:22: warning: large integer implicitly truncated to unsigned type [-Woverflow] #define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1)) ^ Link: http://lkml.kernel.org/r/1484085142-2297-2-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-08kernel/ucount.c: mark user_header with kmemleak_ignore()Luis R. Rodriguez
The user_header gets caught by kmemleak with the following splat as missing a free: unreferenced object 0xffff99667a733d80 (size 96): comm "swapper/0", pid 1, jiffies 4294892317 (age 62191.468s) hex dump (first 32 bytes): a0 b6 92 b4 ff ff ff ff 00 00 00 00 01 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffb3a5f7ea>] kmemleak_alloc+0x4a/0xa0 [<ffffffffb36050d4>] __kmalloc+0x144/0x260 [<ffffffffb36a7144>] __register_sysctl_table+0x54/0x5e0 [<ffffffffb36a76eb>] register_sysctl+0x1b/0x20 [<ffffffffb416fe17>] user_namespace_sysctl_init+0x17/0x34 [<ffffffffb3402192>] do_one_initcall+0x52/0x1a0 [<ffffffffb414d1bd>] kernel_init_freeable+0x173/0x200 [<ffffffffb3a5c23e>] kernel_init+0xe/0x100 [<ffffffffb3a6b57c>] ret_from_fork+0x2c/0x40 [<ffffffffffffffff>] 0xffffffffffffffff The BUG_ON()s are intended to crash so no need to clean up after ourselves on error there. This is also a kernel/ subsys_init() we don't need a respective exit call here as this is never modular, so just white list it. Link: http://lkml.kernel.org/r/20170203211404.31458-1-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Cc: Serge Hallyn <serge@hallyn.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2017-02-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The conflict was an interaction between a bug fix in the netvsc driver in 'net' and an optimization of the RX path in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-07arch: Rename CONFIG_DEBUG_RODATA and CONFIG_DEBUG_MODULE_RONXLaura Abbott
Both of these options are poorly named. The features they provide are necessary for system security and should not be considered debug only. Change the names to CONFIG_STRICT_KERNEL_RWX and CONFIG_STRICT_MODULE_RWX to better describe what these options do. Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Jessica Yu <jeyu@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2017-02-06bpf: enable verifier to add 0 to packet ptrWilliam Tu
The patch fixes the case when adding a zero value to the packet pointer. The zero value could come from src_reg equals type BPF_K or CONST_IMM. The patch fixes both, otherwise the verifer reports the following error: [...] R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=4) R2=pkt_end R3=fp-12 R4=imm4,min_value=4,max_value=4 R5=pkt(id=0,off=4,r=4) 269: (bf) r2 = r0 // r2 becomes imm0 270: (77) r2 >>= 3 271: (bf) r4 = r1 // r4 becomes pkt ptr 272: (0f) r4 += r2 // r4 += 0 addition of negative constant to packet pointer is not allowed Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Mihai Budiu <mbudiu@vmware.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-06Merge branch 'master' into for-nextJens Axboe