summaryrefslogtreecommitdiffstats
path: root/init
AgeCommit message (Collapse)Author
2020-05-20x86: Fix early boot crash on gcc-10, third tryBorislav Petkov
commit a9a3ed1eff3601b63aea4fb462d8b3b92c7c1e7e upstream. ... or the odyssey of trying to disable the stack protector for the function which generates the stack canary value. The whole story started with Sergei reporting a boot crash with a kernel built with gcc-10: Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013 Call Trace: dump_stack panic ? start_secondary __stack_chk_fail start_secondary secondary_startup_64 -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary This happens because gcc-10 tail-call optimizes the last function call in start_secondary() - cpu_startup_entry() - and thus emits a stack canary check which fails because the canary value changes after the boot_init_stack_canary() call. To fix that, the initial attempt was to mark the one function which generates the stack canary with: __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused) however, using the optimize attribute doesn't work cumulatively as the attribute does not add to but rather replaces previously supplied optimization options - roughly all -fxxx options. The key one among them being -fno-omit-frame-pointer and thus leading to not present frame pointer - frame pointer which the kernel needs. The next attempt to prevent compilers from tail-call optimizing the last function call cpu_startup_entry(), shy of carving out start_secondary() into a separate compilation unit and building it with -fno-stack-protector, was to add an empty asm(""). This current solution was short and sweet, and reportedly, is supported by both compilers but we didn't get very far this time: future (LTO?) optimization passes could potentially eliminate this, which leads us to the third attempt: having an actual memory barrier there which the compiler cannot ignore or move around etc. That should hold for a long time, but hey we said that about the other two solutions too so... Reported-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kalle Valo <kvalo@codeaurora.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20gcc-10: mark more functions __init to avoid section mismatch warningsLinus Torvalds
commit e99332e7b4cda6e60f5b5916cf9943a79dbef902 upstream. It seems that for whatever reason, gcc-10 ends up not inlining a couple of functions that used to be inlined before. Even if they only have one single callsite - it looks like gcc may have decided that the code was unlikely, and not worth inlining. The code generation difference is harmless, but caused a few new section mismatch errors, since the (now no longer inlined) function wasn't in the __init section, but called other init functions: Section mismatch in reference from the function kexec_free_initrd() to the function .init.text:free_initrd_mem() Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memremap() Section mismatch in reference from the function tpm2_calc_event_log_size() to the function .init.text:early_memunmap() So add the appropriate __init annotation to make modpost not complain. In both cases there were trivially just a single callsite from another __init function. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-20Stop the ad-hoc games with -Wno-maybe-initializedLinus Torvalds
commit 78a5255ffb6a1af189a83e493d916ba1c54d8c75 upstream. We have some rather random rules about when we accept the "maybe-initialized" warnings, and when we don't. For example, we consider it unreliable for gcc versions < 4.9, but also if -O3 is enabled, or if optimizing for size. And then various kernel config options disabled it, because they know that they trigger that warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES). And now gcc-10 seems to be introducing a lot of those warnings too, so it falls under the same heading as 4.9 did. At the same time, we have a very straightforward way to _enable_ that warning when wanted: use "W=2" to enable more warnings. So stop playing these ad-hoc games, and just disable that warning by default, with the known and straight-forward "if you want to work on the extra compiler warnings, use W=123". Would it be great to have code that is always so obvious that it never confuses the compiler whether a variable is used initialized or not? Yes, it would. In a perfect world, the compilers would be smarter, and our source code would be simpler. That's currently not the world we live in, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-02printk: queue wake_up_klogd irq_work only if per-CPU areas are readySergey Senozhatsky
commit ab6f762f0f53162d41497708b33c9a3236d3609e upstream. printk_deferred(), similarly to printk_safe/printk_nmi, does not immediately attempt to print a new message on the consoles, avoiding calls into non-reentrant kernel paths, e.g. scheduler or timekeeping, which potentially can deadlock the system. Those printk() flavors, instead, rely on per-CPU flush irq_work to print messages from safer contexts. For same reasons (recursive scheduler or timekeeping calls) printk() uses per-CPU irq_work in order to wake up user space syslog/kmsg readers. However, only printk_safe/printk_nmi do make sure that per-CPU areas have been initialised and that it's safe to modify per-CPU irq_work. This means that, for instance, should printk_deferred() be invoked "too early", that is before per-CPU areas are initialised, printk_deferred() will perform illegal per-CPU access. Lech Perczak [0] reports that after commit 1b710b1b10ef ("char/random: silence a lockdep splat with printk()") user-space syslog/kmsg readers are not able to read new kernel messages. The reason is printk_deferred() being called too early (as was pointed out by Petr and John). Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU areas are initialized. Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/ Reported-by: Lech Perczak <l.perczak@camlintechnologies.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Jann Horn <jannh@google.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: John Ogness <john.ogness@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-05kbuild: remove header compile testMasahiro Yamada
commit fcbb8461fd2376ba3782b5b8bd440c929b8e4980 upstream. There are both positive and negative options about this feature. At first, I thought it was a good idea, but actually Linus stated a negative opinion (https://lkml.org/lkml/2019/9/29/227). I admit it is ugly and annoying. The baseline I'd like to keep is the compile-test of uapi headers. (Otherwise, kernel developers have no way to ensure the correctness of the exported headers.) I will maintain a small build rule in usr/include/Makefile. Remove the other header test functionality. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> [ added to 5.4.y due to start of build warnings from backported patches because of this feature - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-01Revert "um: Enable CONFIG_CONSTRUCTORS"Johannes Berg
commit 87c9366e17259040a9118e06b6dc8de986e5d3d1 upstream. This reverts commit 786b2384bf1c ("um: Enable CONFIG_CONSTRUCTORS"). There are two issues with this commit, uncovered by Anton in tests on some (Debian) systems: 1) I completely forgot to call any constructors if CONFIG_CONSTRUCTORS isn't set. Don't recall now if it just wasn't needed on my system, or if I never tested this case. 2) With that fixed, it works - with CONFIG_CONSTRUCTORS *unset*. If I set CONFIG_CONSTRUCTORS, it fails again, which isn't totally unexpected since whatever wanted to run is likely to have to run before the kernel init etc. that calls the constructors in this case. Basically, some constructors that gcc emits (libc has?) need to run very early during init; the failure mode otherwise was that the ptrace fork test already failed: ---------------------- $ ./linux mem=512M Core dump limits : soft - 0 hard - NONE Checking that ptrace can change system call numbers...check_ptrace : child exited with exitcode 6, while expecting 0; status 0x67f Aborted ---------------------- Thinking more about this, it's clear that we simply cannot support CONFIG_CONSTRUCTORS in UML. All the cases we need now (gcov, kasan) involve not use of the __attribute__((constructor)), but instead some constructor code/entry generated by gcc. Therefore, we cannot distinguish between kernel constructors and system constructors. Thus, revert this commit. Cc: stable@vger.kernel.org [5.4+] Fixes: 786b2384bf1c ("um: Enable CONFIG_CONSTRUCTORS") Reported-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Richard Weinberger <richard@nod.at>
2020-01-23mm, debug_pagealloc: don't rely on static keys too earlyVlastimil Babka
commit 8e57f8acbbd121ecfb0c9dc13b8b030f86c6bd3b upstream. Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging") has introduced a static key to reduce overhead when debug_pagealloc is compiled in but not enabled. It relied on the assumption that jump_label_init() is called before parse_early_param() as in start_kernel(), so when the "debug_pagealloc=on" option is parsed, it is safe to enable the static key. However, it turns out multiple architectures call parse_early_param() earlier from their setup_arch(). x86 also calls jump_label_init() even earlier, so no issue was found while testing the commit, but same is not true for e.g. ppc64 and s390 where the kernel would not boot with debug_pagealloc=on as found by our QA. To fix this without tricky changes to init code of multiple architectures, this patch partially reverts the static key conversion from 96a2b03f281d. Init-time and non-fastpath calls (such as in arch code) of debug_pagealloc_enabled() will again test a simple bool variable. Fastpath mm code is converted to a new debug_pagealloc_enabled_static() variant that relies on the static key, which is enabled in a well-defined point in mm_init() where it's guaranteed that jump_label_init() has been called, regardless of architecture. [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early] Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz Fixes: 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-28Merge branch 'next-lockdown' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-09-27Merge branch 'next-integrity' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "The major feature in this time is IMA support for measuring and appraising appended file signatures. In addition are a couple of bug fixes and code cleanup to use struct_size(). In addition to the PE/COFF and IMA xattr signatures, the kexec kernel image may be signed with an appended signature, using the same scripts/sign-file tool that is used to sign kernel modules. Similarly, the initramfs may contain an appended signature. This contained a lot of refactoring of the existing appended signature verification code, so that IMA could retain the existing framework of calculating the file hash once, storing it in the IMA measurement list and extending the TPM, verifying the file's integrity based on a file hash or signature (eg. xattrs), and adding an audit record containing the file hash, all based on policy. (The IMA support for appended signatures patch set was posted and reviewed 11 times.) The support for appended signature paves the way for adding other signature verification methods, such as fs-verity, based on a single system-wide policy. The file hash used for verifying the signature and the signature, itself, can be included in the IMA measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: ima_api: Use struct_size() in kzalloc() ima: use struct_size() in kzalloc() sefltest/ima: support appended signatures (modsig) ima: Fix use after free in ima_read_modsig() MODSIGN: make new include file self contained ima: fix freeing ongoing ahash_request ima: always return negative code for error ima: Store the measurement again when appraising a modsig ima: Define ima-modsig template ima: Collect modsig ima: Implement support for module-style appended signatures ima: Factor xattr_verify() out of ima_appraise_measurement() ima: Add modsig appraise_type option for module-style appended signatures integrity: Select CONFIG_KEYS instead of depending on it PKCS#7: Introduce pkcs7_get_digest() PKCS#7: Refactor verify_pkcs7_signature() MODSIGN: Export module signature definitions ima: initialize the "template" field with the default template
2019-09-24mm: use CPU_BITS_NONE to initialize init_mm.cpu_bitmaskMike Rapoport
Replace open-coded bitmap array initialization of init_mm.cpu_bitmask with neat CPU_BITS_NONE macro. And, since init_mm.cpu_bitmask is statically set to zero, there is no way to clear it again in start_kernel(). Link: http://lkml.kernel.org/r/1565703815-8584-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24mm: consolidate pgtable_cache_init() and pgd_cache_init()Mike Rapoport
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy. Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures. Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init(). Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> [arm64] Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24mm: kmemleak: use the memory pool for early allocationsCatalin Marinas
Currently kmemleak uses a static early_log buffer to trace all memory allocation/freeing before the slab allocator is initialised. Such early log is replayed during kmemleak_init() to properly initialise the kmemleak metadata for objects allocated up that point. With a memory pool that does not rely on the slab allocator, it is possible to skip this early log entirely. In order to remove the early logging, consider kmemleak_enabled == 1 by default while the kmem_cache availability is checked directly on the object_cache and scan_area_cache variables. The RCU callback is only invoked after object_cache has been initialised as we wouldn't have any concurrent list traversal before this. In order to reduce the number of callbacks before kmemleak is fully initialised, move the kmemleak_init() call to mm_init(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: remove WARN_ON(), per Catalin] Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-22Merge tag 'modules-for-v5.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: "The main bulk of this pull request introduces a new exported symbol namespaces feature. The number of exported symbols is increasingly growing with each release (we're at about 31k exports as of 5.3-rc7) and we currently have no way of visualizing how these symbols are "clustered" or making sense of this huge export surface. Namespacing exported symbols allows kernel developers to more explicitly partition and categorize exported symbols, as well as more easily limiting the availability of namespaced symbols to other parts of the kernel. For starters, we have introduced the USB_STORAGE namespace to demonstrate the API's usage. I have briefly summarized the feature and its main motivations in the tag below. Summary: - Introduce exported symbol namespaces. This new feature allows subsystem maintainers to partition and categorize their exported symbols into explicit namespaces. Module authors are now required to import the namespaces they need. Some of the main motivations of this feature include: allowing kernel developers to better manage the export surface, allow subsystem maintainers to explicitly state that usage of some exported symbols should only be limited to certain users (think: inter-module or inter-driver symbols, debugging symbols, etc), as well as more easily limiting the availability of namespaced symbols to other parts of the kernel. With the module import requirement, it is also easier to spot the misuse of exported symbols during patch review. Two new macros are introduced: EXPORT_SYMBOL_NS() and EXPORT_SYMBOL_NS_GPL(). The API is thoroughly documented in Documentation/kbuild/namespaces.rst. - Some small code and kbuild cleanups here and there" * tag 'modules-for-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: Remove leftover '#undef' from export header module: remove unneeded casts in cmp_name() module: move CONFIG_UNUSED_SYMBOLS to the sub-menu of MODULES module: remove redundant 'depends on MODULES' module: Fix link failure due to invalid relocation on namespace offset usb-storage: export symbols in USB_STORAGE namespace usb-storage: remove single-use define for debugging docs: Add documentation for Symbol Namespaces scripts: Coccinelle script for namespace dependencies. modpost: add support for generating namespace dependencies export: allow definition default namespaces in Makefiles or sources module: add config option MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS modpost: add support for symbol namespaces module: add support for symbol namespaces. export: explicitly align struct kernel_symbol module: support reading multiple values per modinfo tag
2019-09-21Merge tag 'for-linus-5.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML updates from Richard Weinberger: - virtio support - fixes for our new time travel mode - various improvements to make lockdep and kasan work better - SPDX header updates * tag 'for-linus-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: (25 commits) um: irq: Fix LAST_IRQ usage in init_IRQ() um: Add SPDX headers for files in arch/um/include um: Add SPDX headers for files in arch/um/os-Linux um: Add SPDX headers to files in arch/um/kernel/ um: Add SPDX headers for files in arch/um/drivers um: virtio: Implement VHOST_USER_PROTOCOL_F_REPLY_ACK um: virtio: Implement VHOST_USER_PROTOCOL_F_SLAVE_REQ um: drivers: Add virtio vhost-user driver um: Use real DMA barriers um: Don't use generic barrier.h um: time-travel: Restrict time update in IRQ handler um: time-travel: Fix periodic timers um: Enable CONFIG_CONSTRUCTORS um: Place (soft)irq text with macros um: Fix VDSO compiler warning um: Implement TRACE_IRQFLAGS_SUPPORT um: Remove misleading #define ARCh_IRQ_ENABLED um: Avoid using uninitialized regs um: Remove sig_info[SIGALRM] um: Error handling fixes in vector drivers ...
2019-09-21Merge tag 'compiler-attributes-for-linus-v5.4' of git://github.com/ojeda/linuxLinus Torvalds
Pull asm inline support from Miguel Ojeda: "Make use of gcc 9's "asm inline()" (Rasmus Villemoes): gcc 9+ (and gcc 8.3, 7.5) provides a way to override the otherwise crude heuristic that gcc uses to estimate the size of the code represented by an asm() statement. From the gcc docs If you use 'asm inline' instead of just 'asm', then for inlining purposes the size of the asm is taken as the minimum size, ignoring how many instructions GCC thinks it is. For compatibility with older compilers, we obviously want a #if [understands asm inline] #define asm_inline asm inline #else #define asm_inline asm #endif But since we #define the identifier inline to attach some attributes, we have to use an alternate spelling of that keyword. gcc provides both __inline__ and __inline, and we currently #define both to inline, so they all have the same semantics. We have to free up one of __inline__ and __inline, and the latter is by far the easiest. The two x86 changes cause smaller code gen differences than I'd expect, but I think we do want the asm_inline thing available sooner or later, so this is just to get the ball rolling" * tag 'compiler-attributes-for-linus-v5.4' of git://github.com/ojeda/linux: x86: bug.h: use asm_inline in _BUG_FLAGS definitions x86: alternative.h: use asm_inline for all alternative variants compiler-types.h: add asm_inline definition compiler_types.h: don't #define __inline lib/zstd/mem.h: replace __inline by inline staging: rtl8723bs: replace __inline by inline
2019-09-20Merge tag 'kbuild-v5.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - add modpost warn exported symbols marked as 'static' because 'static' and EXPORT_SYMBOL is an odd combination - break the build early if gold linker is used - optimize the Bison rule to produce .c and .h files by a single pattern rule - handle PREEMPT_RT in the module vermagic and UTS_VERSION - warn CONFIG options leaked to the user-space except existing ones - make single targets work properly - rebuild modules when module linker scripts are updated - split the module final link stage into scripts/Makefile.modfinal - fix the missed error code in merge_config.sh - improve the error message displayed on the attempt of the O= build in unclean source tree - remove 'clean-dirs' syntax - disable -Wimplicit-fallthrough warning for Clang - add CONFIG_CC_OPTIMIZE_FOR_SIZE_O3 for ARC - remove ARCH_{CPP,A,C}FLAGS variables - add $(BASH) to run bash scripts - change *CFLAGS_<basetarget>.o to take the relative path to $(obj) instead of the basename - stop suppressing Clang's -Wunused-function warnings when W=1 - fix linux/export.h to avoid genksyms calculating CRC of trimmed exported symbols - misc cleanups * tag 'kbuild-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (63 commits) genksyms: convert to SPDX License Identifier for lex.l and parse.y modpost: use __section in the output to *.mod.c modpost: use MODULE_INFO() for __module_depends export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols export.h: remove defined(__KERNEL__), which is no longer needed kbuild: allow Clang to find unused static inline functions for W=1 build kbuild: rename KBUILD_ENABLE_EXTRA_GCC_CHECKS to KBUILD_EXTRA_WARN kbuild: refactor scripts/Makefile.extrawarn merge_config.sh: ignore unwanted grep errors kbuild: change *FLAGS_<basetarget>.o to take the path relative to $(obj) modpost: add NOFAIL to strndup modpost: add guid_t type definition kbuild: add $(BASH) to run scripts with bash-extension kbuild: remove ARCH_{CPP,A,C}FLAGS kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARC kbuild: Do not enable -Wimplicit-fallthrough for clang for now kbuild: clean up subdir-ymn calculation in Makefile.clean kbuild: remove unneeded '+' marker from cmd_clean kbuild: remove clean-dirs syntax kbuild: check clean srctree even earlier ...
2019-09-19Merge branch 'work.mount2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc mount API conversions from Al Viro: "Conversions to new API for shmem and friends and for mount_mtd()-using filesystems. As for the rest of the mount API conversions in -next, some of them belong in the individual trees (e.g. binderfs one should definitely go through android folks, after getting redone on top of their changes). I'm going to drop those and send the rest (trivial ones + stuff ACKed by maintainers) in a separate series - by that point they are independent from each other. Some stuff has already migrated into individual trees (NFS conversion, for example, or FUSE stuff, etc.); those presumably will go through the regular merges from corresponding trees." * 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Make fs_parse() handle fs_param_is_fd-type params better vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API shmem_parse_one(): switch to use of fs_parse() shmem_parse_options(): take handling a single option into a helper shmem_parse_options(): don't bother with mpol in separate variable shmem_parse_options(): use a separate structure to keep the results make shmem_fill_super() static make ramfs_fill_super() static devtmpfs: don't mix {ramfs,shmem}_fill_super() with mount_single() vfs: Convert squashfs to use the new mount API mtd: Kill mount_mtd() vfs: Convert jffs2 to use the new mount API vfs: Convert cramfs to use the new mount API vfs: Convert romfs to use the new mount API vfs: Add a single-or-reconfig keying to vfs_get_super()
2019-09-17Merge branch 'timers-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core timer updates from Thomas Gleixner: "Timers and timekeeping updates: - A large overhaul of the posix CPU timer code which is a preparation for moving the CPU timer expiry out into task work so it can be properly accounted on the task/process. An update to the bogus permission checks will come later during the merge window as feedback was not complete before heading of for travel. - Switch the timerqueue code to use cached rbtrees and get rid of the homebrewn caching of the leftmost node. - Consolidate hrtimer_init() + hrtimer_init_sleeper() calls into a single function - Implement the separation of hrtimers to be forced to expire in hard interrupt context even when PREEMPT_RT is enabled and mark the affected timers accordingly. - Implement a mechanism for hrtimers and the timer wheel to protect RT against priority inversion and live lock issues when a (hr)timer which should be canceled is currently executing the callback. Instead of infinitely spinning, the task which tries to cancel the timer blocks on a per cpu base expiry lock which is held and released by the (hr)timer expiry code. - Enable the Hyper-V TSC page based sched_clock for Hyper-V guests resulting in faster access to timekeeping functions. - Updates to various clocksource/clockevent drivers and their device tree bindings. - The usual small improvements all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits) posix-cpu-timers: Fix permission check regression posix-cpu-timers: Always clear head pointer on dequeue hrtimer: Add a missing bracket and hide `migration_base' on !SMP posix-cpu-timers: Make expiry_active check actually work correctly posix-timers: Unbreak CONFIG_POSIX_TIMERS=n build tick: Mark sched_timer to expire in hard interrupt context hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARD x86/hyperv: Hide pv_ops access for CONFIG_PARAVIRT=n posix-cpu-timers: Utilize timerqueue for storage posix-cpu-timers: Move state tracking to struct posix_cputimers posix-cpu-timers: Deduplicate rlimit handling posix-cpu-timers: Remove pointless comparisons posix-cpu-timers: Get rid of 64bit divisions posix-cpu-timers: Consolidate timer expiry further posix-cpu-timers: Get rid of zero checks rlimit: Rewrite non-sensical RLIMIT_CPU comment posix-cpu-timers: Respect INFINITY for hard RTTIME limit posix-cpu-timers: Switch thread group sampling to array posix-cpu-timers: Restructure expiry array posix-cpu-timers: Remove cputime_expires ...
2019-09-16Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - MAINTAINERS: Add Mark Rutland as perf submaintainer, Juri Lelli and Vincent Guittot as scheduler submaintainers. Add Dietmar Eggemann, Steven Rostedt, Ben Segall and Mel Gorman as scheduler reviewers. As perf and the scheduler is getting bigger and more complex, document the status quo of current responsibilities and interests, and spread the review pain^H^H^H^H fun via an increase in the Cc: linecount generated by scripts/get_maintainer.pl. :-) - Add another series of patches that brings the -rt (PREEMPT_RT) tree closer to mainline: split the monolithic CONFIG_PREEMPT dependencies into a new CONFIG_PREEMPTION category that will allow the eventual introduction of CONFIG_PREEMPT_RT. Still a few more hundred patches to go though. - Extend the CPU cgroup controller with uclamp.min and uclamp.max to allow the finer shaping of CPU bandwidth usage. - Micro-optimize energy-aware wake-ups from O(CPUS^2) to O(CPUS). - Improve the behavior of high CPU count, high thread count applications running under cpu.cfs_quota_us constraints. - Improve balancing with SCHED_IDLE (SCHED_BATCH) tasks present. - Improve CPU isolation housekeeping CPU allocation NUMA locality. - Fix deadline scheduler bandwidth calculations and logic when cpusets rebuilds the topology, or when it gets deadline-throttled while it's being offlined. - Convert the cpuset_mutex to percpu_rwsem, to allow it to be used from setscheduler() system calls without creating global serialization. Add new synchronization between cpuset topology-changing events and the deadline acceptance tests in setscheduler(), which were broken before. - Rework the active_mm state machine to be less confusing and more optimal. - Rework (simplify) the pick_next_task() slowpath. - Improve load-balancing on AMD EPYC systems. - ... and misc cleanups, smaller fixes and improvements - please see the Git log for more details. * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) sched/psi: Correct overly pessimistic size calculation sched/fair: Speed-up energy-aware wake-ups sched/uclamp: Always use 'enum uclamp_id' for clamp_id values sched/uclamp: Update CPU's refcount on TG's clamp changes sched/uclamp: Use TG's clamps to restrict TASK's clamps sched/uclamp: Propagate system defaults to the root group sched/uclamp: Propagate parent clamps sched/uclamp: Extend CPU's cgroup controller sched/topology: Improve load balancing on AMD EPYC systems arch, ia64: Make NUMA select SMP sched, perf: MAINTAINERS update, add submaintainers and reviewers sched/fair: Use rq_lock/unlock in online_fair_sched_group cpufreq: schedutil: fix equation in comment sched: Rework pick_next_task() slow-path sched: Allow put_prev_task() to drop rq->lock sched/fair: Expose newidle_balance() sched: Add task_struct pointer to sched_class::set_curr_task sched: Rework CPU hotplug task selection sched/{rt,deadline}: Fix set_next_task vs pick_next_task sched: Fix kerneldoc comment for ia64_set_curr_task ...
2019-09-16Merge branch 'sched/rt' into sched/core, to pick up -rt changesIngo Molnar
Pick up the first couple of patches working towards PREEMPT_RT. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-15um: Enable CONFIG_CONSTRUCTORSJohannes Berg
We do need to call the constructors for *modules*, and at least for KASAN in the future, we must call even the kernel constructors only later when the kernel has been initialized. Instead of relying on libc to call them, emit an empty section for libc and let the kernel's CONSTRUCTORS code do the rest of the job. Tested that it indeed doesn't work in modules, and does work after the fixes in both, with a few functions with __attribute__((constructor)) in both dynamic and static builds. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2019-09-15compiler-types.h: add asm_inline definitionRasmus Villemoes
This adds an asm_inline macro which expands to "asm inline" [1] when the compiler supports it. This is currently gcc 9.1+, gcc 8.3 and (once released) gcc 7.5 [2]. It expands to just "asm" for other compilers. Using asm inline("foo") instead of asm("foo") overrules gcc's heuristic estimate of the size of the code represented by the asm() statement, and makes gcc use the minimum possible size instead. That can in turn affect gcc's inlining decisions. I wasn't sure whether to make this a function-like macro or not - this way, it can be combined with volatile as asm_inline volatile() but perhaps we'd prefer to spell that asm_inline_volatile() anyway. The Kconfig logic is taken from an RFC patch by Masahiro Yamada [3]. [1] Technically, asm __inline, since both inline and __inline__ are macros that attach various attributes, making gcc barf if one literally does "asm inline()". However, the third spelling __inline is available for referring to the bare keyword. [2] https://lore.kernel.org/lkml/20190907001411.GG9749@gate.crashing.org/ [3] https://lore.kernel.org/lkml/1544695154-15250-1-git-send-email-yamada.masahiro@socionext.com/ Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-09-12vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount APIDavid Howells
Convert the ramfs, shmem, tmpfs, devtmpfs and rootfs filesystems to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Note that tmpfs is slightly tricky as it can contain embedded commas, so it can't be trivially split up using strsep() to break on commas in generic_parse_monolithic(). Instead, tmpfs has to supply its own generic parser. However, if tmpfs changes, then devtmpfs and rootfs, which are wrappers around tmpfs or ramfs, must change too - and thus so must ramfs, so these had to be converted also. [AV: rewritten] Signed-off-by: David Howells <dhowells@redhat.com> cc: Hugh Dickins <hughd@google.com> cc: linux-mm@kvack.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-11module: move CONFIG_UNUSED_SYMBOLS to the sub-menu of MODULESMasahiro Yamada
When CONFIG_MODULES is disabled, CONFIG_UNUSED_SYMBOLS is pointless, thus it should be invisible. Instead of adding "depends on MODULES", I moved it to the sub-menu "Enable loadable module support", which is a better fit. I put it close to TRIM_UNUSED_KSYMS because it depends on !UNUSED_SYMBOLS. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-11module: remove redundant 'depends on MODULES'Masahiro Yamada
These are located in the 'if MODULES' ... 'endif' block. Remove the redundant dependencies. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-10module: add config option MODULE_ALLOW_MISSING_NAMESPACE_IMPORTSMatthias Maennich
If MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is enabled (default=n), the requirement for modules to import all namespaces that are used by the module is relaxed. Enabling this option effectively allows (invalid) modules to be loaded while only a warning is emitted. Disabling this option keeps the enforcement at module loading time and loading is denied if the module's imports are not satisfactory. Reviewed-by: Martijn Coenen <maco@android.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-05make shmem_fill_super() staticAl Viro
... have callers use shmem_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05make ramfs_fill_super() staticAl Viro
all users should just call ramfs_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-04kbuild,arc: add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3 for ARCMasahiro Yamada
arch/arc/Makefile overrides -O2 with -O3. This is the only user of ARCH_CFLAGS. There is no user of ARCH_CPPFLAGS or ARCH_AFLAGS. My plan is to remove ARCH_{CPP,A,C}FLAGS after refactoring the ARC Makefile. Currently, ARC has no way to enable -Wmaybe-uninitialized because both -O3 and -Os disable it. Enabling it will be useful for compile-testing. This commit allows allmodconfig (, which defaults to -O2) to enable it. Add CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3=y to all the defconfig files in arch/arc/configs/ in order to keep the current config settings. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Vineet Gupta <vgupta@synopsys.com>
2019-09-03sched/uclamp: Extend CPU's cgroup controllerPatrick Bellasi
The cgroup CPU bandwidth controller allows to assign a specified (maximum) bandwidth to the tasks of a group. However this bandwidth is defined and enforced only on a temporal base, without considering the actual frequency a CPU is running on. Thus, the amount of computation completed by a task within an allocated bandwidth can be very different depending on the actual frequency the CPU is running that task. The amount of computation can be affected also by the specific CPU a task is running on, especially when running on asymmetric capacity systems like Arm's big.LITTLE. With the availability of schedutil, the scheduler is now able to drive frequency selections based on actual task utilization. Moreover, the utilization clamping support provides a mechanism to bias the frequency selection operated by schedutil depending on constraints assigned to the tasks currently RUNNABLE on a CPU. Giving the mechanisms described above, it is now possible to extend the cpu controller to specify the minimum (or maximum) utilization which should be considered for tasks RUNNABLE on a cpu. This makes it possible to better defined the actual computational power assigned to task groups, thus improving the cgroup CPU bandwidth controller which is currently based just on time constraints. Extend the CPU controller with a couple of new attributes uclamp.{min,max} which allow to enforce utilization boosting and capping for all the tasks in a group. Specifically: - uclamp.min: defines the minimum utilization which should be considered i.e. the RUNNABLE tasks of this group will run at least at a minimum frequency which corresponds to the uclamp.min utilization - uclamp.max: defines the maximum utilization which should be considered i.e. the RUNNABLE tasks of this group will run up to a maximum frequency which corresponds to the uclamp.max utilization These attributes: a) are available only for non-root nodes, both on default and legacy hierarchies, while system wide clamps are defined by a generic interface which does not depends on cgroups. This system wide interface enforces constraints on tasks in the root node. b) enforce effective constraints at each level of the hierarchy which are a restriction of the group requests considering its parent's effective constraints. Root group effective constraints are defined by the system wide interface. This mechanism allows each (non-root) level of the hierarchy to: - request whatever clamp values it would like to get - effectively get only up to the maximum amount allowed by its parent c) have higher priority than task-specific clamps, defined via sched_setattr(), thus allowing to control and restrict task requests. Add two new attributes to the cpu controller to collect "requested" clamp values. Allow that at each non-root level of the hierarchy. Keep it simple by not caring now about "effective" values computation and propagation along the hierarchy. Update sysctl_sched_uclamp_handler() to use the newly introduced uclamp_mutex so that we serialize system default updates with cgroup relate updates. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-2-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-29init/Kconfig: rework help of CONFIG_CC_OPTIMIZE_FOR_SIZEMasahiro Yamada
CONFIG_CC_OPTIMIZE_FOR_SIZE was originally an independent boolean option, but commit 877417e6ffb9 ("Kbuild: change CC_OPTIMIZE_FOR_SIZE definition") turned it into a choice between _PERFORMANCE and _SIZE. The phrase "If unsure, say N." sounds like an independent option. Reword the help text to make it appropriate for the choice menu. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-08-28posix-cpu-timers: Move state tracking to struct posix_cputimersThomas Gleixner
Put it where it belongs and clean up the ifdeffery in fork completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821192922.743229404@linutronix.de
2019-08-22kbuild: add CONFIG_ASM_MODVERSIONSMasahiro Yamada
Add CONFIG_ASM_MODVERSIONS. This allows to remove one if-conditional nesting in scripts/Makefile.build. scripts/Makefile.build is run every time Kbuild descends into a sub-directory. So, I want to avoid $(wildcard ...) evaluation where possible although computing $(wildcard ...) is so cheap that it may not make measurable performance difference. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2019-08-20Revert "init/Kconfig: Fix infinite Kconfig recursion on PPC"Will Deacon
This reverts commit 71c67a31f09fa8fdd1495dffd96a5f0d4cef2ede. Commit 117acf5c29dd ("powerpc/Makefile: Always pass --synthetic to nm if supported") removed the only conditional definition of $(NM), so we can revert our temporary bodge to avoid Kconfig recursion and go back to passing $(NM) through to the 'tools-support-relr.sh' when detecting support for RELR relocations. Signed-off-by: Will Deacon <will@kernel.org>
2019-08-19lockdown: Enforce module signatures if the kernel is locked downDavid Howells
If the kernel is locked down, require that all modules have valid signatures that we can verify. I have adjusted the errors generated: (1) If there's no signature (ENODATA) or we can't check it (ENOPKG, ENOKEY), then: (a) If signatures are enforced then EKEYREJECTED is returned. (b) If there's no signature or we can't check it, but the kernel is locked down then EPERM is returned (this is then consistent with other lockdown cases). (2) If the signature is unparseable (EBADMSG, EINVAL), the signature fails the check (EKEYREJECTED) or a system error occurs (eg. ENOMEM), we return the error we got. Note that the X.509 code doesn't check for key expiry as the RTC might not be valid or might not have been transferred to the kernel's clock yet. [Modified by Matthew Garrett to remove the IMA integration. This will be replaced with integration with the IMA architecture policy patchset.] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Support early LSMsMatthew Garrett
The lockdown module is intended to allow for kernels to be locked down early in boot - sufficiently early that we don't have the ability to kmalloc() yet. Add support for early initialisation of some LSMs, and then add them to the list of names when we do full initialisation later. Early LSMs are initialised in link order and cannot be overridden via boot parameters, and cannot make use of kmalloc() (since the allocator isn't initialised yet). (Fixed by Stephen Rothwell to include a stub to fix builds when !CONFIG_SECURITY) Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-14Kbuild: Handle PREEMPT_RT for version string and magicThomas Gleixner
Update the build scripts and the version magic to reflect when CONFIG_PREEMPT_RT is enabled in the same way as CONFIG_PREEMPT is treated. The resulting version strings: Linux m 5.3.0-rc1+ #100 SMP Fri Jul 26 ... Linux m 5.3.0-rc1+ #101 SMP PREEMPT Fri Jul 26 ... Linux m 5.3.0-rc1+ #102 SMP PREEMPT_RT Fri Jul 26 ... The module vermagic: 5.3.0-rc1+ SMP mod_unload modversions 5.3.0-rc1+ SMP preempt mod_unload modversions 5.3.0-rc1+ SMP preempt_rt mod_unload modversions Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-08-07init/Kconfig: Fix infinite Kconfig recursion on PPCWill Deacon
Commit 5cf896fb6be3 ("arm64: Add support for relocating the kernel with RELR relocations") introduced CONFIG_TOOLS_SUPPORT_RELR, which checks for RELR support in the toolchain as part of the kernel configuration. During this procedure, "$(NM)" is invoked to see if it supports the new relocation format, however PowerPC conditionally overrides this variable in the architecture Makefile in order to pass '--synthetic' when targetting PPC64. This conditional override causes Kconfig to recurse forever, since CONFIG_TOOLS_SUPPORT_RELR cannot be determined without $(NM) being defined, but that in turn depends on CONFIG_PPC64: $ make ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu- scripts/kconfig/conf --syncconfig Kconfig scripts/kconfig/conf --syncconfig Kconfig scripts/kconfig/conf --syncconfig Kconfig [...] In this particular case, it looks like PowerPC may be able to pass '--synthetic' unconditionally to nm or even drop it altogether. While that is being resolved, let's just bodge the RELR check by picking up $(NM) directly from the environment in whatever state it happens to be in. Cc: Peter Collingbourne <pcc@google.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05MODSIGN: Export module signature definitionsThiago Jung Bauermann
IMA will use the module_signature format for append signatures, so export the relevant definitions and factor out the code which verifies that the appended signature trailer is valid. Also, create a CONFIG_MODULE_SIG_FORMAT option so that IMA can select it and be able to use mod_check_sig() without having to depend on either CONFIG_MODULE_SIG or CONFIG_MODULES. s390 duplicated the definition of struct module_signature so now they can use the new <linux/module_signature.h> header instead. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Acked-by: Jessica Yu <jeyu@kernel.org> Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-08-05arm64: Add support for relocating the kernel with RELR relocationsPeter Collingbourne
RELR is a relocation packing format for relative relocations. The format is described in a generic-abi proposal: https://groups.google.com/d/topic/generic-abi/bX460iggiKg/discussion The LLD linker can be instructed to pack relocations in the RELR format by passing the flag --pack-dyn-relocs=relr. This patch adds a new config option, CONFIG_RELR. Enabling this option instructs the linker to pack vmlinux's relative relocations in the RELR format, and causes the kernel to apply the relocations at startup along with the RELA relocations. RELA relocations still need to be applied because the linker will emit RELA relative relocations if they are unrepresentable in the RELR format (i.e. address not a multiple of 2). Enabling CONFIG_RELR reduces the size of a defconfig kernel image with CONFIG_RANDOMIZE_BASE by 3.5MB/16% uncompressed, or 550KB/5% compressed (lz4). Signed-off-by: Peter Collingbourne <pcc@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-07-31sched/preempt: Use CONFIG_PREEMPTION where appropriateThomas Gleixner
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality which today depends on CONFIG_PREEMPT. Switch the preemption code, scheduler and init task over to use CONFIG_PREEMPTION. That's the first step towards RT in that area. The more complex changes are coming separately. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20190726212124.117528401@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-19Merge branch 'work.mount0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount updates from Al Viro: "The first part of mount updates. Convert filesystems to use the new mount API" * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) mnt_init(): call shmem_init() unconditionally constify ksys_mount() string arguments don't bother with registering rootfs init_rootfs(): don't bother with init_ramfs_fs() vfs: Convert smackfs to use the new mount API vfs: Convert selinuxfs to use the new mount API vfs: Convert securityfs to use the new mount API vfs: Convert apparmorfs to use the new mount API vfs: Convert openpromfs to use the new mount API vfs: Convert xenfs to use the new mount API vfs: Convert gadgetfs to use the new mount API vfs: Convert oprofilefs to use the new mount API vfs: Convert ibmasmfs to use the new mount API vfs: Convert qib_fs/ipathfs to use the new mount API vfs: Convert efivarfs to use the new mount API vfs: Convert configfs to use the new mount API vfs: Convert binfmt_misc to use the new mount API convenience helper: get_tree_single() convenience helper get_tree_nodev() vfs: Kill sget_userns() ...
2019-07-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: "VM: - z3fold fixes and enhancements by Henry Burns and Vitaly Wool - more accurate reclaimed slab caches calculations by Yafang Shao - fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by Christoph Hellwig - !CONFIG_MMU fixes by Christoph Hellwig - new novmcoredd parameter to omit device dumps from vmcore, by Kairui Song - new test_meminit module for testing heap and pagealloc initialization, by Alexander Potapenko - ioremap improvements for huge mappings, by Anshuman Khandual - generalize kprobe page fault handling, by Anshuman Khandual - device-dax hotplug fixes and improvements, by Pavel Tatashin - enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V - add pte_devmap() support for arm64, by Robin Murphy - unify locked_vm accounting with a helper, by Daniel Jordan - several misc fixes core/lib: - new typeof_member() macro including some users, by Alexey Dobriyan - make BIT() and GENMASK() available in asm, by Masahiro Yamada - changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better code generation, by Alexey Dobriyan - rbtree code size optimizations, by Michel Lespinasse - convert struct pid count to refcount_t, by Joel Fernandes get_maintainer.pl: - add --no-moderated switch to skip moderated ML's, by Joe Perches misc: - ptrace PTRACE_GET_SYSCALL_INFO interface - coda updates - gdb scripts, various" [ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ] * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (100 commits) fs/select.c: use struct_size() in kmalloc() mm: add account_locked_vm utility function arm64: mm: implement pte_devmap support mm: introduce ARCH_HAS_PTE_DEVMAP mm: clean up is_device_*_page() definitions mm/mmap: move common defines to mman-common.h mm: move MAP_SYNC to asm-generic/mman-common.h device-dax: "Hotremove" persistent memory that is used like normal RAM mm/hotplug: make remove_memory() interface usable device-dax: fix memory and resource leak if hotplug fails include/linux/lz4.h: fix spelling and copy-paste errors in documentation ipc/mqueue.c: only perform resource calculation if user valid include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures scripts/gdb: add helpers to find and list devices scripts/gdb: add lx-genpd-summary command drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl kernel/pid.c: convert struct pid count to refcount_t drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining() select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR ...
2019-07-16init/Kconfig: fix neighboring typosKees Cook
This fixes a couple typos I noticed in the slab Kconfig: sacrifies -> sacrifices accellerate -> accelerate Seeing as no other instances of these typos are found elsewhere in the kernel and that I originally added one of the two, I can only assume working on slab must have caused damage to the spelling centers of my brain. Link: http://lkml.kernel.org/r/201905292203.CD000546EB@keescook Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-15docs: cgroup-v1: add it to the admin-guide bookMauro Carvalho Chehab
Those files belong to the admin guide, so add them. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: accounting: convert to ReSTMauro Carvalho Chehab
Rename the accounting documentation files to ReST, add an index for them and adjust in order to produce a nice html output via the Sphinx build system. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-12Merge tag 'kbuild-v5.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - remove headers_{install,check}_all targets - remove unreasonable 'depends on !UML' from CONFIG_SAMPLES - re-implement 'make headers_install' more cleanly - add new header-test-y syntax to compile-test headers - compile-test exported headers to ensure they are compilable in user-space - compile-test headers under include/ to ensure they are self-contained - remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value flags - add -Werror=unknown-warning-option for Clang - add 128-bit built-in types support to genksyms - fix missed rebuild of modules.builtin - propagate 'No space left on device' error in fixdep to Make - allow Clang to use its integrated assembler - improve some coccinelle scripts - add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute path for $(srctree). - do not ignore errors when compression utility is missing - misc cleanups * tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (49 commits) kbuild: use -- separater intead of $(filter-out ...) for cc-cross-prefix kbuild: Inform user to pass ARCH= for make mrproper kbuild: fix compression errors getting ignored kbuild: add a flag to force absolute path for srctree kbuild: replace KBUILD_SRCTREE with boolean building_out_of_srctree kbuild: remove src and obj from the top Makefile scripts/tags.sh: remove unused environment variables from comments scripts/tags.sh: drop SUBARCH support for ARM kbuild: compile-test kernel headers to ensure they are self-contained kheaders: include only headers into kheaders_data.tar.xz kheaders: remove meaningless -R option of 'ls' kbuild: support header-test-pattern-y kbuild: do not create wrappers for header-test-y kbuild: compile-test exported headers to ensure they are self-contained init/Kconfig: add CONFIG_CC_CAN_LINK kallsyms: exclude kasan local symbols on s390 kbuild: add more hints about SUBDIRS replacement coccinelle: api/stream_open: treat all wait_.*() calls as blocking coccinelle: put_device: Add a cast to an expression for an assignment coccinelle: put_device: Adjust a message construction ...
2019-07-12mm: init: report memory auto-initialization features at boot timeAlexander Potapenko
Print the currently enabled stack and heap initialization modes. Stack initialization is enabled by a config flag, while heap initialization is configured at boot time with defaults being set in the config. It's more convenient for the user to have all information about these hardening measures in one place at boot, so the user can reason about the expected behavior of the running system. The possible options for stack are: - "all" for CONFIG_INIT_STACK_ALL; - "byref_all" for CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL; - "byref" for CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF; - "__user" for CONFIG_GCC_PLUGIN_STRUCTLEAK_USER; - "off" otherwise. Depending on the values of init_on_alloc and init_on_free boottime options we also report "heap alloc" and "heap free" as "on"/"off". In the init_on_free mode initializing pages at boot time may take a while, so print a notice about that as well. This depends on how much memory is installed, the memory bandwidth, etc. On a relatively modern x86 system, it takes about 0.75s/GB to wipe all memory: [ 0.418722] mem auto-init: stack:byref_all, heap alloc:off, heap free:on [ 0.419765] mem auto-init: clearing system memory may take some time... [ 12.376605] Memory: 16408564K/16776672K available (14339K kernel code, 1397K rwdata, 3756K rodata, 1636K init, 11460K bss, 368108K reserved, 0K cma-reserved) Link: http://lkml.kernel.org/r/20190617151050.92663-3-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Sandeep Patil <sspatil@android.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Marco Elver <elver@google.com> Cc: Kaiwan N Billimoria <kaiwan@kaiwantech.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-09Merge tag 'docs-5.3' of git://git.lwn.net/linuxLinus Torvalds
Pull Documentation updates from Jonathan Corbet: "It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc" * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits) docs: automarkup.py: ignore exceptions when seeking for xrefs docs: Move binderfs to admin-guide Disable Sphinx SmartyPants in HTML output doc: RCU callback locks need only _bh, not necessarily _irq docs: format kernel-parameters -- as code Doc : doc-guide : Fix a typo platform: x86: get rid of a non-existent document Add the RCU docs to the core-api manual Documentation: RCU: Add TOC tree hooks Documentation: RCU: Rename txt files to rst Documentation: RCU: Convert RCU UP systems to reST Documentation: RCU: Convert RCU linked list to reST Documentation: RCU: Convert RCU basic concepts to reST docs: filesystems: Remove uneeded .rst extension on toctables scripts/sphinx-pre-install: fix out-of-tree build docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/ Documentation: PGP: update for newer HW devices Documentation: Add section about CPU vulnerabilities for Spectre Documentation: platform: Delete x86-laptop-drivers.txt docs: Note that :c:func: should no longer be used ...
2019-07-09Merge tag 'for-5.3/block-20190708' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: "This is the main block updates for 5.3. Nothing earth shattering or major in here, just fixes, additions, and improvements all over the map. This contains: - Series of documentation fixes (Bart) - Optimization of the blk-mq ctx get/put (Bart) - null_blk removal race condition fix (Bob) - req/bio_op() cleanups (Chaitanya) - Series cleaning up the segment accounting, and request/bio mapping (Christoph) - Series cleaning up the page getting/putting for bios (Christoph) - block cgroup cleanups and moving it to where it is used (Christoph) - block cgroup fixes (Tejun) - Series of fixes and improvements to bcache, most notably a write deadlock fix (Coly) - blk-iolatency STS_AGAIN and accounting fixes (Dennis) - Series of improvements and fixes to BFQ (Douglas, Paolo) - debugfs_create() return value check removal for drbd (Greg) - Use struct_size(), where appropriate (Gustavo) - Two lighnvm fixes (Heiner, Geert) - MD fixes, including a read balance and corruption fix (Guoqing, Marcos, Xiao, Yufen) - block opal shadow mbr additions (Jonas, Revanth) - sbitmap compare-and-exhange improvemnts (Pavel) - Fix for potential bio->bi_size overflow (Ming) - NVMe pull requests: - improved PCIe suspent support (Keith Busch) - error injection support for the admin queue (Akinobu Mita) - Fibre Channel discovery improvements (James Smart) - tracing improvements including nvmetc tracing support (Minwoo Im) - misc fixes and cleanups (Anton Eidelman, Minwoo Im, Chaitanya Kulkarni)" - Various little fixes and improvements to drivers and core" * tag 'for-5.3/block-20190708' of git://git.kernel.dk/linux-block: (153 commits) blk-iolatency: fix STS_AGAIN handling block: nr_phys_segments needs to be zero for REQ_OP_WRITE_ZEROES blk-mq: simplify blk_mq_make_request() blk-mq: remove blk_mq_put_ctx() sbitmap: Replace cmpxchg with xchg block: fix .bi_size overflow block: sed-opal: check size of shadow mbr block: sed-opal: ioctl for writing to shadow mbr block: sed-opal: add ioctl for done-mark of shadow mbr block: never take page references for ITER_BVEC direct-io: use bio_release_pages in dio_bio_complete block_dev: use bio_release_pages in bio_unmap_user block_dev: use bio_release_pages in blkdev_bio_end_io iomap: use bio_release_pages in iomap_dio_bio_end_io block: use bio_release_pages in bio_map_user_iov block: use bio_release_pages in bio_unmap_user block: optionally mark pages dirty in bio_release_pages block: move the BIO_NO_PAGE_REF check into bio_release_pages block: skd_main.c: Remove call to memset after dma_alloc_coherent block: mtip32xx: Remove call to memset after dma_alloc_coherent ...