summaryrefslogtreecommitdiffstats
path: root/lib
AgeCommit message (Collapse)Author
2020-04-17xarray: Fix early termination of xas_for_each_markedMatthew Wilcox (Oracle)
commit 7e934cf5ace1dceeb804f7493fa28bb697ed3c52 upstream. xas_for_each_marked() is using entry == NULL as a termination condition of the iteration. When xas_for_each_marked() is used protected only by RCU, this can however race with xas_store(xas, NULL) in the following way: TASK1 TASK2 page_cache_delete() find_get_pages_range_tag() xas_for_each_marked() xas_find_marked() off = xas_find_chunk() xas_store(&xas, NULL) xas_init_marks(&xas); ... rcu_assign_pointer(*slot, NULL); entry = xa_entry(off); And thus xas_for_each_marked() terminates prematurely possibly leading to missed entries in the iteration (translating to missing writeback of some pages or a similar problem). If we find a NULL entry that has been marked, skip it (unless we're trying to allocate an entry). Reported-by: Jan Kara <jack@suse.cz> CC: stable@vger.kernel.org Fixes: ef8e5717db01 ("page cache: Convert delete_batch to XArray") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17XArray: Fix xas_pause for large multi-index entriesMatthew Wilcox (Oracle)
commit c36d451ad386b34f452fc3c8621ff14b9eaa31a6 upstream. Inspired by the recent Coverity report, I looked for other places where the offset wasn't being converted to an unsigned long before being shifted, and I found one in xas_pause() when the entry being paused is of order >32. Fixes: b803b42823d0 ("xarray: Add XArray iterators") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13uapi: rename ext2_swab() to swab() and share globally in swab.hYury Norov
commit d5767057c9a76a29f073dad66b7fa12a90e8c748 upstream. ext2_swab() is defined locally in lib/find_bit.c However it is not specific to ext2, neither to bitmaps. There are many potential users of it, so rename it to just swab() and move to include/uapi/linux/swab.h ABI guarantees that size of unsigned long corresponds to BITS_PER_LONG, therefore drop unneeded cast. Link: http://lkml.kernel.org/r/20200103202846.21616-1-yury.norov@gmail.com Signed-off-by: Yury Norov <yury.norov@gmail.com> Cc: Allison Randal <allison@lohutok.net> Cc: Joe Perches <joe@perches.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-08XArray: Fix xa_find_next for large multi-index entriesMatthew Wilcox (Oracle)
[ Upstream commit bd40b17ca49d7d110adf456e647701ce74de2241 ] Coverity pointed out that xas_sibling() was shifting xa_offset without promoting it to an unsigned long first, so the shift could cause an overflow and we'd get the wrong answer. The fix is obvious, and the new test-case provokes UBSAN to report an error: runtime error: shift exponent 60 is too large for 32-bit type 'int' Fixes: 19c30f4dd092 ("XArray: Fix xa_find_after with multi-index entries") Reported-by: Bjorn Helgaas <bhelgaas@google.com> Reported-by: Kees Cook <keescook@chromium.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-05kbuild: move headers_check rule to usr/include/MakefileMasahiro Yamada
commit 7ecaf069da52e472d393f03e79d721aabd724166 upstream. Currently, some sanity checks for uapi headers are done by scripts/headers_check.pl, which is wired up to the 'headers_check' target in the top Makefile. It is true compiling headers has better test coverage, but there are still several headers excluded from the compile test. I like to keep headers_check.pl for a while, but we can delete a lot of code by moving the build rule to usr/include/Makefile. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28lib/stackdepot.c: fix global out-of-bounds in stack_slabsAlexander Potapenko
commit 305e519ce48e935702c32241f07d393c3c8fed3e upstream. Walter Wu has reported a potential case in which init_stack_slab() is called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been initialized. In that case init_stack_slab() will overwrite stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in a memory corruption. Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com Fixes: cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Walter Wu <walter-zh.wu@mediatek.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24lib/scatterlist.c: adjust indentation in __sg_alloc_tableNathan Chancellor
[ Upstream commit 4e456fee215677584cafa7f67298a76917e89c64 ] Clang warns: ../lib/scatterlist.c:314:5: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] return -ENOMEM; ^ ../lib/scatterlist.c:311:4: note: previous statement is here if (prv) ^ 1 warning generated. This warning occurs because there is a space before the tab on this line. Remove it so that the indentation is consistent with the Linux kernel coding style and clang no longer warns. Link: http://lkml.kernel.org/r/20191218033606.11942-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/830 Fixes: edce6820a9fd ("scatterlist: prevent invalid free when alloc fails") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24debugobjects: Fix various data racesMarco Elver
[ Upstream commit 35fd7a637c42bb54ba4608f4d40ae6e55fc88781 ] The counters obj_pool_free, and obj_nr_tofree, and the flag obj_freeing are read locklessly outside the pool_lock critical sections. If read with plain accesses, this would result in data races. This is addressed as follows: * reads outside critical sections become READ_ONCE()s (pairing with WRITE_ONCE()s added); * writes become WRITE_ONCE()s (pairing with READ_ONCE()s added); since writes happen inside critical sections, only the write and not the read of RMWs needs to be atomic, thus WRITE_ONCE(var, var +/- X) is sufficient. The data races were reported by KCSAN: BUG: KCSAN: data-race in __free_object / fill_pool write to 0xffffffff8beb04f8 of 4 bytes by interrupt on cpu 1: __free_object+0x1ee/0x8e0 lib/debugobjects.c:404 __debug_check_no_obj_freed+0x199/0x330 lib/debugobjects.c:969 debug_check_no_obj_freed+0x3c/0x44 lib/debugobjects.c:994 slab_free_hook mm/slub.c:1422 [inline] read to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 2: fill_pool+0x3d/0x520 lib/debugobjects.c:135 __debug_object_init+0x3c/0x810 lib/debugobjects.c:536 debug_object_init lib/debugobjects.c:591 [inline] debug_object_activate+0x228/0x320 lib/debugobjects.c:677 debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline] BUG: KCSAN: data-race in __debug_object_init / fill_pool read to 0xffffffff8beb04f8 of 4 bytes by task 10 on cpu 6: fill_pool+0x3d/0x520 lib/debugobjects.c:135 __debug_object_init+0x3c/0x810 lib/debugobjects.c:536 debug_object_init_on_stack+0x39/0x50 lib/debugobjects.c:606 init_timer_on_stack_key kernel/time/timer.c:742 [inline] write to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 3: alloc_object lib/debugobjects.c:258 [inline] __debug_object_init+0x717/0x810 lib/debugobjects.c:544 debug_object_init lib/debugobjects.c:591 [inline] debug_object_activate+0x228/0x320 lib/debugobjects.c:677 debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline] BUG: KCSAN: data-race in free_obj_work / free_object read to 0xffffffff9140c190 of 4 bytes by task 10 on cpu 6: free_object+0x4b/0xd0 lib/debugobjects.c:426 debug_object_free+0x190/0x210 lib/debugobjects.c:824 destroy_timer_on_stack kernel/time/timer.c:749 [inline] write to 0xffffffff9140c190 of 4 bytes by task 93 on cpu 1: free_obj_work+0x24f/0x480 lib/debugobjects.c:313 process_one_work+0x454/0x8d0 kernel/workqueue.c:2264 worker_thread+0x9a/0x780 kernel/workqueue.c:2410 Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200116185529.11026-1-elver@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24raid6/test: fix a compilation warningZhengyuan Liu
[ Upstream commit 5e5ac01c2b8802921fee680518a986011cb59820 ] The compilation warning is redefination showed as following: In file included from tables.c:2: ../../../include/linux/export.h:180: warning: "EXPORT_SYMBOL" redefined #define EXPORT_SYMBOL(sym) __EXPORT_SYMBOL(sym, "") In file included from tables.c:1: ../../../include/linux/raid/pq.h:61: note: this is the location of the previous definition #define EXPORT_SYMBOL(sym) Fixes: 69a94abb82ee ("export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols") Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-11lib/test_kasan.c: fix memory leak in kmalloc_oob_krealloc_more()Gustavo A. R. Silva
commit 3e21d9a501bf99aee2e5835d7f34d8c823f115b5 upstream. In case memory resources for _ptr2_ were allocated, release them before return. Notice that in case _ptr1_ happens to be NULL, krealloc() behaves exactly like kmalloc(). Addresses-Coverity-ID: 1490594 ("Resource leak") Link: http://lkml.kernel.org/r/20200123160115.GA4202@embeddedor Fixes: 3f15801cdc23 ("lib: add kasan test module") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-05XArray: Fix xas_pause at ULONG_MAXMatthew Wilcox (Oracle)
[ Upstream commit 82a22311b7a68a78709699dc8c098953b70e4fd2 ] If we were unlucky enough to call xas_pause() when the index was at ULONG_MAX (or a multi-slot entry which ends at ULONG_MAX), we would wrap the index back around to 0 and restart the iteration from the beginning. Use the XAS_BOUNDS state to indicate that we should just stop the iteration. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-29lib: Reduce user_access_begin() boundaries in strncpy_from_user() and ↵Christophe Leroy
strnlen_user() commit ab10ae1c3bef56c29bac61e1201c752221b87b41 upstream. The range passed to user_access_begin() by strncpy_from_user() and strnlen_user() starts at 'src' and goes up to the limit of userspace although reads will be limited by the 'count' param. On 32 bits powerpc (book3s/32) access has to be granted for each 256Mbytes segment and the cost increases with the number of segments to unlock. Limit the range with 'count' param. Fixes: 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29XArray: Fix xas_find returning too many entriesMatthew Wilcox (Oracle)
commit c44aa5e8ab58b5f4cf473970ec784c3333496a2e upstream. If you call xas_find() with the initial index > max, it should have returned NULL but was returning the entry at index. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29XArray: Fix xa_find_after with multi-index entriesMatthew Wilcox (Oracle)
commit 19c30f4dd0923ef191f35c652ee4058e91e89056 upstream. If the entry is of an order which is a multiple of XA_CHUNK_SIZE, the current detection of sibling entries does not work. Factor out an xas_sibling() function to make xa_find_after() a little more understandable, and write a new implementation that doesn't suffer from the same bug. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29XArray: Fix infinite loop with entry at ULONG_MAXMatthew Wilcox (Oracle)
commit 430f24f94c8a174d411a550d7b5529301922e67a upstream. If there is an entry at ULONG_MAX, xa_for_each() will overflow the 'index + 1' in xa_find_after() and wrap around to 0. Catch this case and terminate the loop by returning NULL. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-12sbitmap: only queue kyber's wait callback if not already activeDavid Jeffery
[ Upstream commit df034c93f15ee71df231ff9fe311d27ff08a2a52 ] Under heavy loads where the kyber I/O scheduler hits the token limits for its scheduling domains, kyber can become stuck. When active requests complete, kyber may not be woken up leaving the I/O requests in kyber stuck. This stuck state is due to a race condition with kyber and the sbitmap functions it uses to run a callback when enough requests have completed. The running of a sbt_wait callback can race with the attempt to insert the sbt_wait. Since sbitmap_del_wait_queue removes the sbt_wait from the list first then sets the sbq field to NULL, kyber can see the item as not on a list but the call to sbitmap_add_wait_queue will see sbq as non-NULL. This results in the sbt_wait being inserted onto the wait list but ws_active doesn't get incremented. So the sbitmap queue does not know there is a waiter on a wait list. Since sbitmap doesn't think there is a waiter, kyber may never be informed that there are domain tokens available and the I/O never advances. With the sbt_wait on a wait list, kyber believes it has an active waiter so cannot insert a new waiter when reaching the domain's full state. This race can be fixed by only adding the sbt_wait to the queue if the sbq field is NULL. If sbq is not NULL, there is already an action active which will trigger the re-running of kyber. Let it run and add the sbt_wait to the wait list if still needing to wait. Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Jeffery <djeffery@redhat.com> Reported-by: John Pittman <jpittman@redhat.com> Tested-by: John Pittman <jpittman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09lib/ubsan: don't serialize UBSAN reportJulien Grall
[ Upstream commit ce5c31db3645b649a31044a4d8b6057f6c723702 ] At the moment, UBSAN report will be serialized using a spin_lock(). On RT-systems, spinlocks are turned to rt_spin_lock and may sleep. This will result to the following splat if the undefined behavior is in a context that can sleep: BUG: sleeping function called from invalid context at /src/linux/kernel/locking/rtmutex.c:968 in_atomic(): 1, irqs_disabled(): 128, pid: 3447, name: make 1 lock held by make/3447: #0: 000000009a966332 (&mm->mmap_sem){++++}, at: do_page_fault+0x140/0x4f8 irq event stamp: 6284 hardirqs last enabled at (6283): [<ffff000011326520>] _raw_spin_unlock_irqrestore+0x90/0xa0 hardirqs last disabled at (6284): [<ffff0000113262b0>] _raw_spin_lock_irqsave+0x30/0x78 softirqs last enabled at (2430): [<ffff000010088ef8>] fpsimd_restore_current_state+0x60/0xe8 softirqs last disabled at (2427): [<ffff000010088ec0>] fpsimd_restore_current_state+0x28/0xe8 Preemption disabled at: [<ffff000011324a4c>] rt_mutex_futex_unlock+0x4c/0xb0 CPU: 3 PID: 3447 Comm: make Tainted: G W 5.2.14-rt7-01890-ge6e057589653 #911 Call trace: dump_backtrace+0x0/0x148 show_stack+0x14/0x20 dump_stack+0xbc/0x104 ___might_sleep+0x154/0x210 rt_spin_lock+0x68/0xa0 ubsan_prologue+0x30/0x68 handle_overflow+0x64/0xe0 __ubsan_handle_add_overflow+0x10/0x18 __lock_acquire+0x1c28/0x2a28 lock_acquire+0xf0/0x370 _raw_spin_lock_irqsave+0x58/0x78 rt_mutex_futex_unlock+0x4c/0xb0 rt_spin_unlock+0x28/0x70 get_page_from_freelist+0x428/0x2b60 __alloc_pages_nodemask+0x174/0x1708 alloc_pages_vma+0x1ac/0x238 __handle_mm_fault+0x4ac/0x10b0 handle_mm_fault+0x1d8/0x3b0 do_page_fault+0x1c8/0x4f8 do_translation_fault+0xb8/0xe0 do_mem_abort+0x3c/0x98 el0_da+0x20/0x24 The spin_lock() will protect against multiple CPUs to output a report together, I guess to prevent them from being interleaved. However, they can still interleave with other messages (and even splat from __might_sleep). So the lock usefulness seems pretty limited. Rather than trying to accomodate RT-system by switching to a raw_spin_lock(), the lock is now completely dropped. Link: http://lkml.kernel.org/r/20190920100835.14999-1-julien.grall@arm.com Signed-off-by: Julien Grall <julien.grall@arm.com> Reported-by: Andre Przywara <andre.przywara@arm.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31ubsan, x86: Annotate and allow __ubsan_handle_shift_out_of_bounds() in ↵Peter Zijlstra
uaccess regions [ Upstream commit 9a50dcaf0416a43e1fe411dc61a99c8333c90119 ] The new check_zeroed_user() function uses variable shifts inside of a user_access_begin()/user_access_end() section and that results in GCC emitting __ubsan_handle_shift_out_of_bounds() calls, even though through value range analysis it would be able to see that the UB in question is impossible. Annotate and whitelist this UBSAN function; continued use of user_access_begin()/user_access_end() will undoubtedly result in further uses of function. Reported-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: cyphar@cyphar.com Cc: keescook@chromium.org Cc: linux@rasmusvillemoes.dk Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper") Link: https://lkml.kernel.org/r/20191021131149.GA19358@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17lib: raid6: fix awk build warningsGreg Kroah-Hartman
commit 702600eef73033ddd4eafcefcbb6560f3e3a90f7 upstream. Newer versions of awk spit out these fun warnings: awk: ../lib/raid6/unroll.awk:16: warning: regexp escape sequence `\#' is not a known regexp operator As commit 700c1018b86d ("x86/insn: Fix awk regexp warnings") showed, it turns out that there are a number of awk strings that do not need to be escaped and newer versions of awk now warn about this. Fix the string up so that no warning is produced. The exact same kernel module gets created before and after this patch, showing that it wasn't needed. Link: https://lore.kernel.org/r/20191206152600.GA75093@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-15lib/xz: fix XZ_DYNALLOC to avoid useless memory reallocationsLasse Collin
s->dict.allocated was initialized to 0 but never set after a successful allocation, thus the code always thought that the dictionary buffer has to be reallocated. Link: http://lkml.kernel.org/r/20191104185107.3b6330df@tukaani.org Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Reported-by: Yu Sun <yusun2@cisco.com> Acked-by: Daniel Walker <danielwa@cisco.com> Cc: "Yixia Si (yisi)" <yisi@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-10lib: Remove select of inexistant GENERIC_IOCorentin Labbe
config option GENERIC_IO was removed but still selected by lib/kconfig This patch finish the cleaning. Fixes: 9de8da47742b ("kconfig: kill off GENERIC_IO option") Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-08Merge tag 'xarray-5.4' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds
Pull XArray fixes from Matthew Wilcox: "These all fix various bugs, some of which people have tripped over and some of which have been caught by automatic tools" * tag 'xarray-5.4' of git://git.infradead.org/users/willy/linux-dax: idr: Fix idr_alloc_u32 on 32-bit systems idr: Fix integer overflow in idr_for_each_entry radix tree: Remove radix_tree_iter_find idr: Fix idr_get_next_ul race with idr_remove XArray: Fix xas_next() with a single entry at 0
2019-11-06dump_stack: avoid the livelock of the dump_lockKevin Hao
In the current code, we use the atomic_cmpxchg() to serialize the output of the dump_stack(), but this implementation suffers the thundering herd problem. We have observed such kind of livelock on a Marvell cn96xx board(24 cpus) when heavily using the dump_stack() in a kprobe handler. Actually we can let the competitors to wait for the releasing of the lock before jumping to atomic_cmpxchg(). This will definitely mitigate the thundering herd problem. Thanks Linus for the suggestion. [akpm@linux-foundation.org: fix comment] Link: http://lkml.kernel.org/r/20191030031637.6025-1-haokexin@gmail.com Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()") Signed-off-by: Kevin Hao <haokexin@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-03idr: Fix idr_alloc_u32 on 32-bit systemsMatthew Wilcox (Oracle)
Attempting to allocate an entry at 0xffffffff when one is already present would succeed in allocating one at 2^32, which would confuse everything. Return -ENOSPC in this case, as expected. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2019-11-01idr: Fix idr_get_next_ul race with idr_removeMatthew Wilcox (Oracle)
Commit 5c089fd0c734 ("idr: Fix idr_get_next race with idr_remove") neglected to fix idr_get_next_ul(). As far as I can tell, nobody's actually using this interface under the RCU read lock, but fix it now before anybody decides to use it. Fixes: 5c089fd0c734 ("idr: Fix idr_get_next race with idr_remove") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2019-10-23lib/vdso: Make clock_getres() POSIX compliant againThomas Gleixner
A recent commit removed the NULL pointer check from the clock_getres() implementation causing a test case to fault. POSIX requires an explicit NULL pointer check for clock_getres() aside of the validity check of the clock_id argument for obscure reasons. Add it back for both 32bit and 64bit. Note, this is only a partial revert of the offending commit which does not bring back the broken fallback invocation in the the 32bit compat implementations of clock_getres() and clock_gettime(). Fixes: a9446a906f52 ("lib/vdso/32: Remove inconsistent NULL pointer checks") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1910211202260.1904@nanos.tec.linutronix.de
2019-10-18Merge tag 'copy-struct-from-user-v5.4-rc4' of ↵Linus Torvalds
gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull usercopy test fixlets from Christian Brauner: "This contains two improvements for the copy_struct_from_user() tests: - a coding style change to get rid of the ugly "if ((ret |= test()))" pointed out when pulling the original patchset. - avoid a soft lockups when running the usercopy tests on machines with large page sizes by scanning only a 1024 byte region" * tag 'copy-struct-from-user-v5.4-rc4' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: usercopy: Avoid soft lockups in test_check_nonzero_user() lib: test_user_copy: style cleanup
2019-10-16usercopy: Avoid soft lockups in test_check_nonzero_user()Michael Ellerman
On a machine with a 64K PAGE_SIZE, the nested for loops in test_check_nonzero_user() can lead to soft lockups, eg: watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611] Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4 CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151 ... NIP __might_sleep+0x20/0xc0 LR __might_fault+0x40/0x60 Call Trace: check_zeroed_user+0x12c/0x200 test_user_copy_init+0x67c/0x1210 [test_user_copy] do_one_initcall+0x60/0x340 do_init_module+0x7c/0x2f0 load_module+0x2d94/0x30e0 __do_sys_finit_module+0xc8/0x150 system_call+0x5c/0x68 Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead tweak it to only scan a 1024 byte region, but make it cross the page boundary. Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper") Suggested-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191016122732.13467-1-mpe@ellerman.id.au Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-10-14lib/test_meminit: add a kmem_cache_alloc_bulk() testAlexander Potapenko
Make sure allocations from kmem_cache_alloc_bulk() and kmem_cache_free_bulk() are properly initialized. Link: http://lkml.kernel.org/r/20191007091605.30530-2-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Christoph Lameter <cl@linux.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Thibaut Sautereau <thibaut@sautereau.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14lib/generic-radix-tree.c: add kmemleak annotationsEric Biggers
Kmemleak is falsely reporting a leak of the slab allocation in sctp_stream_init_ext(): BUG: memory leak unreferenced object 0xffff8881114f5d80 (size 96): comm "syz-executor934", pid 7160, jiffies 4294993058 (age 31.950s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000ce7a1326>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000ce7a1326>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000ce7a1326>] slab_alloc mm/slab.c:3326 [inline] [<00000000ce7a1326>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<000000007abb7ac9>] kmalloc include/linux/slab.h:547 [inline] [<000000007abb7ac9>] kzalloc include/linux/slab.h:742 [inline] [<000000007abb7ac9>] sctp_stream_init_ext+0x2b/0xa0 net/sctp/stream.c:157 [<0000000048ecb9c1>] sctp_sendmsg_to_asoc+0x946/0xa00 net/sctp/socket.c:1882 [<000000004483ca2b>] sctp_sendmsg+0x2a8/0x990 net/sctp/socket.c:2102 [...] But it's freed later. Kmemleak misses the allocation because its pointer is stored in the generic radix tree sctp_stream::out, and the generic radix tree uses raw pages which aren't tracked by kmemleak. Fix this by adding the kmemleak hooks to the generic radix tree code. Link: http://lkml.kernel.org/r/20191004065039.727564-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Reported-by: <syzbot+7f3b6b106be8dcdcdeec@syzkaller.appspotmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-12Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A handful of fixes: a kexec linking fix, an AMD MWAITX fix, a vmware guest support fix when built under Clang, and new CPU model number definitions" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Comet Lake to the Intel CPU models header lib/string: Make memzero_explicit() inline instead of external x86/cpu/vmware: Use the full form of INL in VMWARE_PORT x86/asm: Fix MWAITX C-state hint value
2019-10-09Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A larger-than-usual batch of arm64 fixes for -rc3. The bulk of the fixes are dealing with a bunch of issues with the build system from the compat vDSO, which unfortunately led to some significant Makefile rework to manage the horrible combinations of toolchains that we can end up needing to drive simultaneously. We came close to disabling the thing entirely, but Vincenzo was quick to spin up some patches and I ended up picking up most of the bits that were left [*]. Future work will look at disentangling the header files properly. Other than that, we have some important fixes all over, including one papering over the miscompilation fallout from forcing CONFIG_OPTIMIZE_INLINING=y, which I'm still unhappy about. Harumph. We've still got a couple of open issues, so I'm expecting to have some more fixes later this cycle. Summary: - Numerous fixes to the compat vDSO build system, especially when combining gcc and clang - Fix parsing of PAR_EL1 in spurious kernel fault detection - Partial workaround for Neoverse-N1 erratum #1542419 - Fix IRQ priority masking on entry from compat syscalls - Fix advertisment of FRINT HWCAP to userspace - Attempt to workaround inlining breakage with '__always_inline' - Fix accidental freeing of parent SVE state on fork() error path - Add some missing NULL pointer checks in instruction emulation init - Some formatting and comment fixes" [*] Will's final fixes were Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> but they were already in linux-next by then and he didn't rebase just to add those. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (21 commits) arm64: armv8_deprecated: Checking return value for memory allocation arm64: Kconfig: Make CONFIG_COMPAT_VDSO a proper Kconfig option arm64: vdso32: Rename COMPATCC to CC_COMPAT arm64: vdso32: Pass '--target' option to clang via VDSO_CAFLAGS arm64: vdso32: Don't use KBUILD_CPPFLAGS unconditionally arm64: vdso32: Move definition of COMPATCC into vdso32/Makefile arm64: Default to building compat vDSO with clang when CONFIG_CC_IS_CLANG lib: vdso: Remove CROSS_COMPILE_COMPAT_VDSO arm64: vdso32: Remove jump label config option in Makefile arm64: vdso32: Detect binutils support for dmb ishld arm64: vdso: Remove stale files from old assembly implementation arm64: vdso32: Fix broken compat vDSO build warnings arm64: mm: fix spurious fault detection arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419 arm64: Fix incorrect irqflag restore for priority masking for compat arm64: mm: avoid virt_to_phys(init_mm.pgd) arm64: cpufeature: Effectively expose FRINT capability to userspace arm64: Mark functions using explicit register variables as '__always_inline' docs: arm64: Fix indentation and doc formatting arm64/sve: Fix wrong free for task->thread.sve_state ...
2019-10-08lib/string: Make memzero_explicit() inline instead of externalArvind Sankar
With the use of the barrier implied by barrier_data(), there is no need for memzero_explicit() to be extern. Making it inline saves the overhead of a function call, and allows the code to be reused in arch/*/purgatory without having to duplicate the implementation. Tested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephan Mueller <smueller@chronox.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-crypto@vger.kernel.org Cc: linux-s390@vger.kernel.org Fixes: 906a4bb97f5d ("crypto: sha256 - Use get/put_unaligned_be32 to get input, memzero_explicit") Link: https://lkml.kernel.org/r/20191007220000.GA408752@rani.riverdale.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-10-07lib: vdso: Remove CROSS_COMPILE_COMPAT_VDSOVincenzo Frascino
arm64 was the last architecture using CROSS_COMPILE_COMPAT_VDSO config option. With this patch series the dependency in the architecture has been removed. Remove CROSS_COMPILE_COMPAT_VDSO from the Unified vDSO library code. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-07lib: test_user_copy: style cleanupAleksa Sarai
While writing the tests for copy_struct_from_user(), I used a construct that Linus doesn't appear to be too fond of: On 2019-10-04, Linus Torvalds <torvalds@linux-foundation.org> wrote: > Hmm. That code is ugly, both before and after the fix. > > This just doesn't make sense for so many reasons: > > if ((ret |= test(umem_src == NULL, "kmalloc failed"))) > > where the insanity comes from > > - why "|=" when you know that "ret" was zero before (and it had to > be, for the test to make sense) > > - why do this as a single line anyway? > > - don't do the stupid "double parenthesis" to hide a warning. Make it > use an actual comparison if you add a layer of parentheses. So instead, use a bog-standard check that isn't nearly as ugly. Fixes: 341115822f88 ("usercopy: Add parentheses around assignment in test_copy_struct_from_user") Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper") Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191005233028.18566-1-cyphar@cyphar.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold. 2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric Dumazet. 3) txq null deref in mac80211, from Miaoqing Pan. 4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann. 5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso. 6) Avoid division by zero in taprio scheduler, from Vladimir Oltean. 7) Various xgmac fixes in stmmac driver from Jose Abreu. 8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from Michal Kubecek. 9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij. 10) Fix sleep while atomic in sja1105, from Vladimir Oltean. 11) Suspend/resume deadlock in stmmac, from Thierry Reding. 12) Various UDP GSO fixes from Josh Hunt. 13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric Dumazet. 14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern. 15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) selftests/net: add nettest to .gitignore net: qlogic: Fix memory leak in ql_alloc_large_buffers nfc: fix memory leak in llcp_sock_bind() sch_dsmark: fix potential NULL deref in dsmark_init() net: phy: at803x: use operating parameters from PHY-specific status net: phy: extract pause mode net: phy: extract link partner advertisement reading net: phy: fix write to mii-ctrl1000 register ipv6: Handle missing host route in __ipv6_ifa_notify net: phy: allow for reset line to be tied to a sleepy GPIO controller net: ipv4: avoid mixed n_redirects and rate_tokens usage r8152: Set macpassthru in reset_resume callback cxgb4:Fix out-of-bounds MSI-X info array access Revert "ipv6: Handle race in addrconf_dad_work" net: make sock_prot_memory_pressure() return "const char *" rxrpc: Fix rxrpc_recvmsg tracepoint qmi_wwan: add support for Cinterion CLS8 devices tcp: fix slab-out-of-bounds in tcp_zerocopy_receive() lib: textsearch: fix escapes in example code udp: only do GSO if # of segs > 1 ...
2019-10-03usercopy: Add parentheses around assignment in test_copy_struct_from_userNathan Chancellor
Clang warns: lib/test_user_copy.c:96:10: warning: using the result of an assignment as a condition without parentheses [-Wparentheses] if (ret |= test(umem_src == NULL, "kmalloc failed")) ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ lib/test_user_copy.c:96:10: note: place parentheses around the assignment to silence this warning if (ret |= test(umem_src == NULL, "kmalloc failed")) ^ ( ) lib/test_user_copy.c:96:10: note: use '!=' to turn this compound assignment into an inequality comparison if (ret |= test(umem_src == NULL, "kmalloc failed")) ^~ != Add the parentheses as it suggests because this is intentional. Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper") Link: https://github.com/ClangBuiltLinux/linux/issues/731 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Aleksa Sarai <cyphar@cyphar.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191003171121.2723619-1-natechancellor@gmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-10-03lib: textsearch: fix escapes in example codeRandy Dunlap
This textsearch code example does not need the '\' escapes and they can be misleading to someone reading the example. Also, gcc and sparse warn that the "\%d" is an unknown escape sequence. Fixes: 5968a70d7af5 ("textsearch: fix kernel-doc warnings and add kernel-api section") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-01lib: introduce copy_struct_from_user() helperAleksa Sarai
A common pattern for syscall extensions is increasing the size of a struct passed from userspace, such that the zero-value of the new fields result in the old kernel behaviour (allowing for a mix of userspace and kernel vintages to operate on one another in most cases). While this interface exists for communication in both directions, only one interface is straightforward to have reasonable semantics for (userspace passing a struct to the kernel). For kernel returns to userspace, what the correct semantics are (whether there should be an error if userspace is unaware of a new extension) is very syscall-dependent and thus probably cannot be unified between syscalls (a good example of this problem is [1]). Previously there was no common lib/ function that implemented the necessary extension-checking semantics (and different syscalls implemented them slightly differently or incompletely[2]). Future patches replace common uses of this pattern to make use of copy_struct_from_user(). Some in-kernel selftests that insure that the handling of alignment and various byte patterns are all handled identically to memchr_inv() usage. [1]: commit 1251201c0d34 ("sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code") [2]: For instance {sched_setattr,perf_event_open,clone3}(2) all do do similar checks to copy_struct_from_user() while rt_sigprocmask(2) always rejects differently-sized struct arguments. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191001011055.19283-2-cyphar@cyphar.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2019-09-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Sanity check URB networking device parameters to avoid divide by zero, from Oliver Neukum. 2) Disable global multicast filter in NCSI, otherwise LLDP and IPV6 don't work properly. Longer term this needs a better fix tho. From Vijay Khemka. 3) Small fixes to selftests (use ping when ping6 is not present, etc.) from David Ahern. 4) Bring back rt_uses_gateway member of struct rtable, it's semantics were not well understood and trying to remove it broke things. From David Ahern. 5) Move usbnet snaity checking, ignore endpoints with invalid wMaxPacketSize. From Bjørn Mork. 6) Missing Kconfig deps for sja1105 driver, from Mao Wenan. 7) Various small fixes to the mlx5 DR steering code, from Alaa Hleihel, Alex Vesker, and Yevgeny Kliteynik 8) Missing CAP_NET_RAW checks in various places, from Ori Nimron. 9) Fix crash when removing sch_cbs entry while offloading is enabled, from Vinicius Costa Gomes. 10) Signedness bug fixes, generally in looking at the result given by of_get_phy_mode() and friends. From Dan Crapenter. 11) Disable preemption around BPF_PROG_RUN() calls, from Eric Dumazet. 12) Don't create VRF ipv6 rules if ipv6 is disabled, from David Ahern. 13) Fix quantization code in tcp_bbr, from Kevin Yang. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (127 commits) net: tap: clean up an indentation issue nfp: abm: fix memory leak in nfp_abm_u32_knode_replace tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state sk_buff: drop all skb extensions on free and skb scrubbing tcp_bbr: fix quantization code to not raise cwnd if not probing bandwidth mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actions Documentation: Clarify trap's description mlxsw: spectrum: Clear VLAN filters during port initialization net: ena: clean up indentation issue NFC: st95hf: clean up indentation issue net: phy: micrel: add Asym Pause workaround for KSZ9021 net: socionext: ave: Avoid using netdev_err() before calling register_netdev() ptp: correctly disable flags on old ioctls lib: dimlib: fix help text typos net: dsa: microchip: Always set regmap stride to 1 nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs net/sched: Set default of CONFIG_NET_TC_SKB_EXT to N vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled net: sched: sch_sfb: don't call qdisc_put() while holding tree lock ...
2019-09-27lib: dimlib: fix help text typosRandy Dunlap
Fix help text typos for DIMLIB. Fixes: 4f75da3666c0 ("linux/dim: Move implementation to .c files") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Uwe Kleine-König <uwe@kleine-koenig.org> Cc: Tal Gilboa <talgi@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-27dimlib: make DIMLIB a hidden symbolUwe Kleine-König
According to Tal Gilboa the only benefit from DIM comes from a driver that uses it. So it doesn't make sense to make this symbol user visible, instead all drivers that use it should select it (as is already the case AFAICT). Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-25lib: untag user pointers in strn*_userAndrey Konovalov
Patch series "arm64: untag user pointers passed to the kernel", v19. === Overview arm64 has a feature called Top Byte Ignore, which allows to embed pointer tags into the top byte of each pointer. Userspace programs (such as HWASan, a memory debugging tool [1]) might use this feature and pass tagged user pointers to the kernel through syscalls or other interfaces. Right now the kernel is already able to handle user faults with tagged pointers, due to these patches: 1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a tagged pointer") 2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged pointers") 3. 276e9327 ("arm64: entry: improve data abort handling of tagged pointers") This patchset extends tagged pointer support to syscall arguments. As per the proposed ABI change [3], tagged pointers are only allowed to be passed to syscalls when they point to memory ranges obtained by anonymous mmap() or sbrk() (see the patchset [3] for more details). For non-memory syscalls this is done by untaging user pointers when the kernel performs pointer checking to find out whether the pointer comes from userspace (most notably in access_ok). The untagging is done only when the pointer is being checked, the tag is preserved as the pointer makes its way through the kernel and stays tagged when the kernel dereferences the pointer when perfoming user memory accesses. The mmap and mremap (only new_addr) syscalls do not currently accept tagged addresses. Architectures may interpret the tag as a background colour for the corresponding vma. Other memory syscalls (mprotect, etc.) don't do user memory accesses but rather deal with memory ranges, and untagged pointers are better suited to describe memory ranges internally. Thus for memory syscalls we untag pointers completely when they enter the kernel. === Other approaches One of the alternative approaches to untagging that was considered is to completely strip the pointer tag as the pointer enters the kernel with some kind of a syscall wrapper, but that won't work with the countless number of different ioctl calls. With this approach we would need a custom wrapper for each ioctl variation, which doesn't seem practical. An alternative approach to untagging pointers in memory syscalls prologues is to inspead allow tagged pointers to be passed to find_vma() (and other vma related functions) and untag them there. Unfortunately, a lot of find_vma() callers then compare or subtract the returned vma start and end fields against the pointer that was being searched. Thus this approach would still require changing all find_vma() callers. === Testing The following testing approaches has been taken to find potential issues with user pointer untagging: 1. Static testing (with sparse [2] and separately with a custom static analyzer based on Clang) to track casts of __user pointers to integer types to find places where untagging needs to be done. 2. Static testing with grep to find parts of the kernel that call find_vma() (and other similar functions) or directly compare against vm_start/vm_end fields of vma. 3. Static testing with grep to find parts of the kernel that compare user pointers with TASK_SIZE or other similar consts and macros. 4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running a modified syzkaller version that passes tagged pointers to the kernel. Based on the results of the testing the requried patches have been added to the patchset. === Notes This patchset is meant to be merged together with "arm64 relaxed ABI" [3]. This patchset is a prerequisite for ARM's memory tagging hardware feature support [4]. This patchset has been merged into the Pixel 2 & 3 kernel trees and is now being used to enable testing of Pixel phones with HWASan. Thanks! [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html [2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060e0145f292 [3] https://lkml.org/lkml/2019/6/12/745 [4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture-2018-developments-armv85a This patch (of 11) This patch is a part of a series that extends kernel ABI to allow to pass tagged user pointers (with the top byte set to something else other than 0x00) as syscall arguments. strncpy_from_user and strnlen_user accept user addresses as arguments, and do not go through the same path as copy_from_user and others, so here we need to handle the case of tagged user addresses separately. Untag user pointers passed to these functions. Note, that this patch only temporarily untags the pointers to perform validity checks, but then uses them as is to perform user memory accesses. [andreyknvl@google.com: fix sparc4 build] Link: http://lkml.kernel.org/r/CAAeHK+yx4a-P0sDrXTUxMvO2H0CJZUFPffBrg_cU7oJOZyC7ew@mail.gmail.com Link: http://lkml.kernel.org/r/c5a78bcad3e94d6cda71fcaa60a423231ae71e4c.1563904656.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25lib/lzo/lzo1x_compress.c: fix alignment bug in lzo-rleDave Rodgman
Fix an unaligned access which breaks on platforms where this is not permitted (e.g., Sparc). Link: http://lkml.kernel.org/r/20190912145502.35229-1-dave.rodgman@arm.com Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Cc: Dave Rodgman <dave.rodgman@arm.com> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25bug: move WARN_ON() "cut here" into exception handlerKees Cook
The original clean up of "cut here" missed the WARN_ON() case (that does not have a printk message), which was fixed recently by adding an explicit printk of "cut here". This had the downside of adding a printk() to every WARN_ON() caller, which reduces the utility of using an instruction exception to streamline the resulting code. By making this a new BUGFLAG, all of these can be removed and "cut here" can be handled by the exception handler. This was very pronounced on PowerPC, but the effect can be seen on x86 as well. The resulting text size of a defconfig build shows some small savings from this patch: text data bss dec hex filename 19691167 5134320 1646664 26472151 193eed7 vmlinux.before 19676362 5134260 1663048 26473670 193f4c6 vmlinux.after This change also opens the door for creating something like BUG_MSG(), where a custom printk() before issuing BUG(), without confusing the "cut here" line. Link: http://lkml.kernel.org/r/201908200943.601DD59DCE@keescook Fixes: 6b15f678fb7d ("include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures") Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Drew Davenport <ddavenport@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Feng Tang <feng.tang@intel.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25compiler: enable CONFIG_OPTIMIZE_INLINING forciblyMasahiro Yamada
Commit 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING") allowed all architectures to enable this option. A couple of build errors were reported by randconfig, but all of them have been ironed out. Towards the goal of removing CONFIG_OPTIMIZE_INLINING entirely (and it will simplify the 'inline' macro in compiler_types.h), this commit changes it to always-on option. Going forward, the compiler will always be allowed to not inline functions marked 'inline'. This is not a problem for x86 since it has been long used by arch/x86/configs/{x86_64,i386}_defconfig. I am keeping the config option just in case any problem crops up for other architectures. The code clean-up will be done after confirming this is solid. Link: http://lkml.kernel.org/r/20190830034304.24259-1-yamada.masahiro@socionext.com Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25lib/hexdump: make print_hex_dump_bytes() a nop on !DEBUG buildsStephen Boyd
I'm seeing a bunch of debug prints from a user of print_hex_dump_bytes() in my kernel logs, but I don't have CONFIG_DYNAMIC_DEBUG enabled nor do I have DEBUG defined in my build. The problem is that print_hex_dump_bytes() calls a wrapper function in lib/hexdump.c that calls print_hex_dump() with KERN_DEBUG level. There are three cases to consider here 1. CONFIG_DYNAMIC_DEBUG=y --> call dynamic_hex_dum() 2. CONFIG_DYNAMIC_DEBUG=n && DEBUG --> call print_hex_dump() 3. CONFIG_DYNAMIC_DEBUG=n && !DEBUG --> stub it out Right now, that last case isn't detected and we still call print_hex_dump() from the stub wrapper. Let's make print_hex_dump_bytes() only call print_hex_dump_debug() so that it works properly in all cases. Case #1, print_hex_dump_debug() calls dynamic_hex_dump() and we get same behavior. Case #2, print_hex_dump_debug() calls print_hex_dump() with KERN_DEBUG and we get the same behavior. Case #3, print_hex_dump_debug() is a nop, changing behavior to what we want, i.e. print nothing. Link: http://lkml.kernel.org/r/20190816235624.115280-1-swboyd@chromium.org Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25lib/extable.c: add missing prototypesValdis Kletnieks
When building with W=1, a number of warnings are issued: CC lib/extable.o lib/extable.c:63:6: warning: no previous prototype for 'sort_extable' [-Wmissing-prototypes] 63 | void sort_extable(struct exception_table_entry *start, | ^~~~~~~~~~~~ lib/extable.c:75:6: warning: no previous prototype for 'trim_init_extable' [-Wmissing-prototypes] 75 | void trim_init_extable(struct module *m) | ^~~~~~~~~~~~~~~~~ lib/extable.c:115:1: warning: no previous prototype for 'search_extable' [-Wmissing-prototypes] 115 | search_extable(const struct exception_table_entry *base, | ^~~~~~~~~~~~~~ Add the missing #include for the prototypes. Link: http://lkml.kernel.org/r/45574.1565235784@turing-police Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25lib/generic-radix-tree.c: make 2 functions static inlineValdis Kletnieks
When building with W=1, we get some warnings: l CC lib/generic-radix-tree.o lib/generic-radix-tree.c:39:10: warning: no previous prototype for 'genradix_root_to_depth' [-Wmissing-prototypes] 39 | unsigned genradix_root_to_depth(struct genradix_root *r) | ^~~~~~~~~~~~~~~~~~~~~~ lib/generic-radix-tree.c:44:23: warning: no previous prototype for 'genradix_root_to_node' [-Wmissing-prototypes] 44 | struct genradix_node *genradix_root_to_node(struct genradix_root *r) | ^~~~~~~~~~~~~~~~~~~~~ They're not used anywhere else, so make them static inline. Link: http://lkml.kernel.org/r/46923.1565236485@turing-police Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25strscpy: reject buffer sizes larger than INT_MAXKees Cook
As already done for snprintf(), add a check in strscpy() for giant (i.e. likely negative and/or miscalculated) copy sizes, WARN, and error out. Link: http://lkml.kernel.org/r/201907260928.23DE35406@keescook Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Joe Perches <joe@perches.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Yann Droneaud <ydroneaud@opteya.com> Cc: David Laight <David.Laight@aculab.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Stephen Kitt <steve@sk2.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>