aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/module.c
AgeCommit message (Collapse)Author
2020-10-29module: statically initialize init section freeing dataDaniel Jordan
[ Upstream commit fdf09ab887829cd1b671e45d9549f8ec1ffda0fa ] Corentin hit the following workqueue warning when running with CRYPTO_MANAGER_EXTRA_TESTS: WARNING: CPU: 2 PID: 147 at kernel/workqueue.c:1473 __queue_work+0x3b8/0x3d0 Modules linked in: ghash_generic CPU: 2 PID: 147 Comm: modprobe Not tainted 5.6.0-rc1-next-20200214-00068-g166c9264f0b1-dirty #545 Hardware name: Pine H64 model A (DT) pc : __queue_work+0x3b8/0x3d0 Call trace: __queue_work+0x3b8/0x3d0 queue_work_on+0x6c/0x90 do_init_module+0x188/0x1f0 load_module+0x1d00/0x22b0 I wasn't able to reproduce on x86 or rpi 3b+. This is WARN_ON(!list_empty(&work->entry)) from __queue_work(), and it happens because the init_free_wq work item isn't initialized in time for a crypto test that requests the gcm module. Some crypto tests were recently moved earlier in boot as explained in commit c4741b230597 ("crypto: run initcalls for generic implementations earlier"), which went into mainline less than two weeks before the Fixes commit. Avoid the warning by statically initializing init_free_wq and the corresponding llist. Link: https://lore.kernel.org/lkml/20200217204803.GA13479@Red/ Fixes: 1a7b7d922081 ("modules: Use vmalloc special flag") Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com> Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com> Tested-on: sun50i-h6-pine-h64 Tested-on: imx8mn-ddr4-evk Tested-on: sun50i-a64-bananapi-m64 Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21module: Correctly truncate sysfs sections outputKees Cook
commit 11990a5bd7e558e9203c1070fc52fb6f0488e75b upstream. The only-root-readable /sys/module/$module/sections/$section files did not truncate their output to the available buffer size. While most paths into the kernfs read handlers end up using PAGE_SIZE buffers, it's possible to get there through other paths (e.g. splice, sendfile). Actually limit the output to the "count" passed into the read function, and report it back correctly. *sigh* Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/20200805002015.GE23458@shao2-debian Fixes: ed66f991bb19 ("module: Refactor section attr into bin attribute") Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-09Merge tag 'kallsyms_show_value-v5.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kallsyms fix from Kees Cook: "Refactor kallsyms_show_value() users for correct cred. I'm not delighted by the timing of getting these changes to you, but it does fix a handful of kernel address exposures, and no one has screamed yet at the patches. Several users of kallsyms_show_value() were performing checks not during "open". Refactor everything needed to gain proper checks against file->f_cred for modules, kprobes, and bpf" * tag 'kallsyms_show_value-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests: kmod: Add module address visibility test bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok() kprobes: Do not expose probe addresses to non-CAP_SYSLOG module: Do not expose section addresses to non-CAP_SYSLOG module: Refactor section attr into bin attribute kallsyms: Refactor kallsyms_show_value() to take cred
2020-07-08module: Do not expose section addresses to non-CAP_SYSLOGKees Cook
The printing of section addresses in /sys/module/*/sections/* was not using the correct credentials to evaluate visibility. Before: # cat /sys/module/*/sections/.*text 0xffffffffc0458000 ... # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text" 0xffffffffc0458000 ... After: # cat /sys/module/*/sections/*.text 0xffffffffc0458000 ... # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text" 0x0000000000000000 ... Additionally replaces the existing (safe) /proc/modules check with file->f_cred for consistency. Reported-by: Dominik Czarnota <dominik.czarnota@trailofbits.com> Fixes: be71eda5383f ("module: Fix display of wrong module .text address") Cc: stable@vger.kernel.org Tested-by: Jessica Yu <jeyu@kernel.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08module: Refactor section attr into bin attributeKees Cook
In order to gain access to the open file's f_cred for kallsym visibility permission checks, refactor the module section attributes to use the bin_attribute instead of attribute interface. Additionally removes the redundant "name" struct member. Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Tested-by: Jessica Yu <jeyu@kernel.org> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-08kallsyms: Refactor kallsyms_show_value() to take credKees Cook
In order to perform future tests against the cred saved during open(), switch kallsyms_show_value() to operate on a cred, and have all current callers pass current_cred(). This makes it very obvious where callers are checking the wrong credential in their "read" contexts. These will be fixed in the coming patches. Additionally switch return value to bool, since it is always used as a direct permission check, not a 0-on-success, negative-on-error style function return. Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-03vmalloc: fix the owner argument for the new __vmalloc_node_range callersChristoph Hellwig
Fix the recently added new __vmalloc_node_range callers to pass the correct values as the owner for display in /proc/vmallocinfo. Fixes: 800e26b81311 ("x86/hyperv: allocate the hypercall page with only read and execute bits") Fixes: 10d5e97c1bf8 ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page") Fixes: 7a0e27b2a0ce ("mm: remove vmalloc_exec") Reported-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200627075649.2455097-1-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-26mm: remove vmalloc_execChristoph Hellwig
Merge vmalloc_exec into its only caller. Note that for !CONFIG_MMU __vmalloc_node_range maps to __vmalloc, which directly clears the __GFP_HIGHMEM added by the vmalloc_exec stub anyway. Link: http://lkml.kernel.org/r/20200618064307.32739-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-08module: move the set_fs hack for flush_icache_range to m68kChristoph Hellwig
flush_icache_range generally operates on kernel addresses, but for some reason m68k needed a set_fs override. Move that into the m68k code insted of keeping it in the module loader. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Jessica Yu <jeyu@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-30-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-05Merge tag 'modules-for-v5.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module updates from Jessica Yu: - Harden CONFIG_STRICT_MODULE_RWX by rejecting any module that has SHF_WRITE|SHF_EXECINSTR sections - Remove and clean up nested #ifdefs, as it makes code hard to read * tag 'modules-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: Harden STRICT_MODULE_RWX module: break nested ARCH_HAS_STRICT_MODULE_RWX and STRICT_MODULE_RWX #ifdefs
2020-06-04Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching updates from Jiri Kosina: - simplifications and improvements for issues Peter Ziljstra found during his previous work on W^X cleanups. This allows us to remove livepatch arch-specific .klp.arch sections and add proper support for jump labels in patched code. Also, this patchset removes the last module_disable_ro() usage in the tree. Patches from Josh Poimboeuf and Peter Zijlstra - a few other minor cleanups * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: MAINTAINERS: add lib/livepatch to LIVE PATCHING livepatch: add arch-specific headers to MAINTAINERS livepatch: Make klp_apply_object_relocs static MAINTAINERS: adjust to livepatch .klp.arch removal module: Make module_enable_ro() static again x86/module: Use text_mutex in apply_relocate_add() module: Remove module_disable_ro() livepatch: Remove module_disable_ro() usage x86/module: Use text_poke() for late relocations s390/module: Use s390_kernel_write() for late relocations s390: Change s390_kernel_write() return type to match memcpy() livepatch: Prevent module-specific KLP rela sections from referencing vmlinux symbols livepatch: Remove .klp.arch livepatch: Apply vmlinux-specific KLP relocations early livepatch: Disallow vmlinux.ko
2020-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Allow setting bluetooth L2CAP modes via socket option, from Luiz Augusto von Dentz. 2) Add GSO partial support to igc, from Sasha Neftin. 3) Several cleanups and improvements to r8169 from Heiner Kallweit. 4) Add IF_OPER_TESTING link state and use it when ethtool triggers a device self-test. From Andrew Lunn. 5) Start moving away from custom driver versions, use the globally defined kernel version instead, from Leon Romanovsky. 6) Support GRO vis gro_cells in DSA layer, from Alexander Lobakin. 7) Allow hard IRQ deferral during NAPI, from Eric Dumazet. 8) Add sriov and vf support to hinic, from Luo bin. 9) Support Media Redundancy Protocol (MRP) in the bridging code, from Horatiu Vultur. 10) Support netmap in the nft_nat code, from Pablo Neira Ayuso. 11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina Dubroca. Also add ipv6 support for espintcp. 12) Lots of ReST conversions of the networking documentation, from Mauro Carvalho Chehab. 13) Support configuration of ethtool rxnfc flows in bcmgenet driver, from Doug Berger. 14) Allow to dump cgroup id and filter by it in inet_diag code, from Dmitry Yakunin. 15) Add infrastructure to export netlink attribute policies to userspace, from Johannes Berg. 16) Several optimizations to sch_fq scheduler, from Eric Dumazet. 17) Fallback to the default qdisc if qdisc init fails because otherwise a packet scheduler init failure will make a device inoperative. From Jesper Dangaard Brouer. 18) Several RISCV bpf jit optimizations, from Luke Nelson. 19) Correct the return type of the ->ndo_start_xmit() method in several drivers, it's netdev_tx_t but many drivers were using 'int'. From Yunjian Wang. 20) Add an ethtool interface for PHY master/slave config, from Oleksij Rempel. 21) Add BPF iterators, from Yonghang Song. 22) Add cable test infrastructure, including ethool interfaces, from Andrew Lunn. Marvell PHY driver is the first to support this facility. 23) Remove zero-length arrays all over, from Gustavo A. R. Silva. 24) Calculate and maintain an explicit frame size in XDP, from Jesper Dangaard Brouer. 25) Add CAP_BPF, from Alexei Starovoitov. 26) Support terse dumps in the packet scheduler, from Vlad Buslov. 27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei. 28) Add devm_register_netdev(), from Bartosz Golaszewski. 29) Minimize qdisc resets, from Cong Wang. 30) Get rid of kernel_getsockopt and kernel_setsockopt in order to eliminate set_fs/get_fs calls. From Christoph Hellwig. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits) selftests: net: ip_defrag: ignore EPERM net_failover: fixed rollback in net_failover_open() Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv" Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv" vmxnet3: allow rx flow hash ops only when rss is enabled hinic: add set_channels ethtool_ops support selftests/bpf: Add a default $(CXX) value tools/bpf: Don't use $(COMPILE.c) bpf, selftests: Use bpf_probe_read_kernel s390/bpf: Use bcr 0,%0 as tail call nop filler s390/bpf: Maintain 8-byte stack alignment selftests/bpf: Fix verifier test selftests/bpf: Fix sample_cnt shared between two threads bpf, selftests: Adapt cls_redirect to call csum_level helper bpf: Add csum_level helper for fixing up csum levels bpf: Fix up bpf_skb_adjust_room helper's skb csum setting sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf() crypto/chtls: IPv6 support for inline TLS Crypto/chcr: Fixes a coccinile check error Crypto/chcr: Fixes compilations warnings ...
2020-06-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge updates from Andrew Morton: "A few little subsystems and a start of a lot of MM patches. Subsystems affected by this patch series: squashfs, ocfs2, parisc, vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup, swap, memcg, pagemap, memory-failure, vmalloc, kasan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits) kasan: move kasan_report() into report.c mm/mm_init.c: report kasan-tag information stored in page->flags ubsan: entirely disable alignment checks under UBSAN_TRAP kasan: fix clang compilation warning due to stack protector x86/mm: remove vmalloc faulting mm: remove vmalloc_sync_(un)mappings() x86/mm/32: implement arch_sync_kernel_mappings() x86/mm/64: implement arch_sync_kernel_mappings() mm/ioremap: track which page-table levels were modified mm/vmalloc: track which page-table levels were modified mm: add functions to track page directory modifications s390: use __vmalloc_node in stack_alloc powerpc: use __vmalloc_node in alloc_vm_stack arm64: use __vmalloc_node in arch_alloc_vmap_stack mm: remove vmalloc_user_node_flags mm: switch the test_vmalloc module to use __vmalloc_node mm: remove __vmalloc_node_flags_caller mm: remove both instances of __vmalloc_node_flags mm: remove the prot argument to __vmalloc_node mm: remove the pgprot argument to __vmalloc ...
2020-06-02mm: remove the pgprot argument to __vmallocChristoph Hellwig
The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com> [hyperv] Acked-by: Gao Xiang <xiang@kernel.org> [erofs] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Wei Liu <wei.liu@kernel.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-01Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: - remove a now unnecessary usage of the KERNEL_DS for sys_oabi_epoll_ctl() - update my email address in a number of drivers - decompressor EFI updates from Ard Biesheuvel - module unwind section handling updates - sparsemem Kconfig cleanups - make act_mm macro respect THREAD_SIZE * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8980/1: Allow either FLATMEM or SPARSEMEM on the multiplatform build ARM: 8979/1: Remove redundant ARCH_SPARSEMEM_DEFAULT setting ARM: 8978/1: mm: make act_mm() respect THREAD_SIZE ARM: decompressor: run decompressor in place if loaded via UEFI ARM: decompressor: move GOT into .data for EFI enabled builds ARM: decompressor: defer loading of the contents of the LC0 structure ARM: decompressor: split off _edata and stack base into separate object ARM: decompressor: move headroom variable out of LC0 ARM: 8976/1: module: allow arch overrides for .init section names ARM: 8975/1: module: fix handling of unwind init sections ARM: 8974/1: use SPARSMEM_STATIC when SPARSEMEM is enabled ARM: 8971/1: replace the sole use of a symbol with its definition ARM: 8969/1: decompressor: simplify libfdt builds Update rmk's email address in various drivers ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()
2020-05-19kprobes: Prevent probes in .noinstr.text sectionThomas Gleixner
Instrumentation is forbidden in the .noinstr.text section. Make kprobes respect this. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lkml.kernel.org/r/20200505134100.179862032@linutronix.de
2020-05-19ARM: 8976/1: module: allow arch overrides for .init section namesVincent Whitchurch
ARM stores unwind information for .init.text in sections named .ARM.extab.init.text and .ARM.exidx.init.text. Since those aren't currently recognized as init sections, they're allocated along with the core section, and relocation fails if the core and the init section are allocated from different regions and can't reach other. final section addresses: ... 0x7f800000 .init.text .. 0xcbb54078 .ARM.exidx.init.text .. section 16 reloc 0 sym '': relocation 42 out of range (0xcbb54078 -> 0x7f800000) Allow architectures to override the section name so that ARM can fix this. Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2020-05-12kprobes: Support NOKPROBE_SYMBOL() in modulesMasami Hiramatsu
Support NOKPROBE_SYMBOL() in modules. NOKPROBE_SYMBOL() records only symbol address in "_kprobe_blacklist" section in the module. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134059.771170126@linutronix.de
2020-05-12kprobes: Support __kprobes blacklist in modulesMasami Hiramatsu
Support __kprobes attribute for blacklist functions in modules. The __kprobes attribute functions are stored in .kprobes.text section. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200505134059.678201813@linutronix.de
2020-05-08module: Make module_enable_ro() static againJosh Poimboeuf
Now that module_enable_ro() has no more external users, make it static again. Suggested-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-05-08module: Remove module_disable_ro()Josh Poimboeuf
module_disable_ro() has no more users. Remove it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-05-08livepatch: Apply vmlinux-specific KLP relocations earlyJosh Poimboeuf
KLP relocations are livepatch-specific relocations which are applied to a KLP module's text or data. They exist for two reasons: 1) Unexported symbols: replacement functions often need to access unexported symbols (e.g. static functions), which "normal" relocations don't allow. 2) Late module patching: this is the ability for a KLP module to bypass normal module dependencies, such that the KLP module can be loaded *before* a to-be-patched module. This means that relocations which need to access symbols in the to-be-patched module might need to be applied to the KLP module well after it has been loaded. Non-late-patched KLP relocations are applied from the KLP module's init function. That usually works fine, unless the patched code wants to use alternatives, paravirt patching, jump tables, or some other special section which needs relocations. Then we run into ordering issues and crashes. In order for those special sections to work properly, the KLP relocations should be applied *before* the special section init code runs, such as apply_paravirt(), apply_alternatives(), or jump_label_apply_nops(). You might think the obvious solution would be to move the KLP relocation initialization earlier, but it's not necessarily that simple. The problem is the above-mentioned late module patching, for which KLP relocations can get applied well after the KLP module is loaded. To "fix" this issue in the past, we created .klp.arch sections: .klp.arch.{module}..altinstructions .klp.arch.{module}..parainstructions Those sections allow KLP late module patching code to call apply_paravirt() and apply_alternatives() after the module-specific KLP relocations (.klp.rela.{module}.{section}) have been applied. But that has a lot of drawbacks, including code complexity, the need for arch-specific code, and the (per-arch) danger that we missed some special section -- for example the __jump_table section which is used for jump labels. It turns out there's a simpler and more functional approach. There are two kinds of KLP relocation sections: 1) vmlinux-specific KLP relocation sections .klp.rela.vmlinux.{sec} These are relocations (applied to the KLP module) which reference unexported vmlinux symbols. 2) module-specific KLP relocation sections .klp.rela.{module}.{sec}: These are relocations (applied to the KLP module) which reference unexported or exported module symbols. Up until now, these have been treated the same. However, they're inherently different. Because of late module patching, module-specific KLP relocations can be applied very late, thus they can create the ordering headaches described above. But vmlinux-specific KLP relocations don't have that problem. There's nothing to prevent them from being applied earlier. So apply them at the same time as normal relocations, when the KLP module is being loaded. This means that for vmlinux-specific KLP relocations, we no longer have any ordering issues. vmlinux-referencing jump labels, alternatives, and paravirt patching will work automatically, without the need for the .klp.arch hacks. All that said, for module-specific KLP relocations, the ordering problems still exist and we *do* still need .klp.arch. Or do we? Stay tuned. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-04-21kernel/module: Hide vermagic header file from general useLeon Romanovsky
VERMAGIC* definitions are not supposed to be used by the drivers, see this [1] bug report, so introduce special define to guard inclusion of this header file and define it in kernel/modules.h and in internal script that generates *.mod.c files. In-tree module build: ➜ kernel git:(vermagic) ✗ make clean ➜ kernel git:(vermagic) ✗ make M=drivers/infiniband/hw/mlx5 ➜ kernel git:(vermagic) ✗ modinfo drivers/infiniband/hw/mlx5/mlx5_ib.ko filename: /images/leonro/src/kernel/drivers/infiniband/hw/mlx5/mlx5_ib.ko <...> vermagic: 5.6.0+ SMP mod_unload modversions Out-of-tree module build: ➜ mlx5 make -C /images/leonro/src/kernel clean M=/tmp/mlx5 ➜ mlx5 make -C /images/leonro/src/kernel M=/tmp/mlx5 ➜ mlx5 modinfo /tmp/mlx5/mlx5_ib.ko filename: /tmp/mlx5/mlx5_ib.ko <...> vermagic: 5.6.0+ SMP mod_unload modversions [1] https://lore.kernel.org/lkml/20200411155623.GA22175@zn.tnic Reported-by: Borislav Petkov <bp@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Jessica Yu <jeyu@kernel.org> Co-developed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21module: Harden STRICT_MODULE_RWXPeter Zijlstra
We're very close to enforcing W^X memory, refuse to load modules that violate this principle per construction. [jeyu: move module_enforce_rwx_sections under STRICT_MODULE_RWX as per discussion] Link: http://lore.kernel.org/r/20200403171303.GK20760@hirez.programming.kicks-ass.net Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-04-17module: break nested ARCH_HAS_STRICT_MODULE_RWX and STRICT_MODULE_RWX #ifdefsJessica Yu
Various frob_* and module_{enable,disable}_* functions are defined in a CONFIG_ARCH_HAS_STRICT_MODULE_RWX ifdef block which also has a nested CONFIG_STRICT_MODULE_RWX ifdef block within it. This is unecessary and makes things hard to read. Not only that, this construction requires redundant empty stubs for module_enable_nx(). I suspect this was originally done for cosmetic reasons - to keep all the frob_* functions in the same place, and all the module_{enable,disable}_* functions right after, but as a result it made the code harder to read. Make this more readable by unnesting the ifdef blocks and getting rid of the redundant empty stubs. Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-04-09Merge tag 'modules-for-v5.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module updates from Jessica Yu: "Only a small cleanup this time around: a trivial conversion of zero-length arrays to flexible arrays" * tag 'modules-for-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: kernel: module: Replace zero-length array with flexible-array member
2020-04-07proc: faster open/read/close with "permanent" filesAlexey Dobriyan
Now that "struct proc_ops" exist we can start putting there stuff which could not fly with VFS "struct file_operations"... Most of fs/proc/inode.c file is dedicated to make open/read/.../close reliable in the event of disappearing /proc entries which usually happens if module is getting removed. Files like /proc/cpuinfo which never disappear simply do not need such protection. Save 2 atomic ops, 1 allocation, 1 free per open/read/close sequence for such "permanent" files. Enable "permanent" flag for /proc/cpuinfo /proc/kmsg /proc/modules /proc/slabinfo /proc/stat /proc/sysvipc/* /proc/swaps More will come once I figure out foolproof way to prevent out module authors from marking their stuff "permanent" for performance reasons when it is not. This should help with scalability: benchmark is "read /proc/cpuinfo R times by N threads scattered over the system". N R t, s (before) t, s (after) ----------------------------------------------------- 64 4096 1.582458 1.530502 -3.2% 256 4096 6.371926 6.125168 -3.9% 1024 4096 25.64888 24.47528 -4.6% Benchmark source: #include <chrono> #include <iostream> #include <thread> #include <vector> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> const int NR_CPUS = sysconf(_SC_NPROCESSORS_ONLN); int N; const char *filename; int R; int xxx = 0; int glue(int n) { cpu_set_t m; CPU_ZERO(&m); CPU_SET(n, &m); return sched_setaffinity(0, sizeof(cpu_set_t), &m); } void f(int n) { glue(n % NR_CPUS); while (*(volatile int *)&xxx == 0) { } for (int i = 0; i < R; i++) { int fd = open(filename, O_RDONLY); char buf[4096]; ssize_t rv = read(fd, buf, sizeof(buf)); asm volatile ("" :: "g" (rv)); close(fd); } } int main(int argc, char *argv[]) { if (argc < 4) { std::cerr << "usage: " << argv[0] << ' ' << "N /proc/filename R "; return 1; } N = atoi(argv[1]); filename = argv[2]; R = atoi(argv[3]); for (int i = 0; i < NR_CPUS; i++) { if (glue(i) == 0) break; } std::vector<std::thread> T; T.reserve(N); for (int i = 0; i < N; i++) { T.emplace_back(f, i); } auto t0 = std::chrono::system_clock::now(); { *(volatile int *)&xxx = 1; for (auto& t: T) { t.join(); } } auto t1 = std::chrono::system_clock::now(); std::chrono::duration<double> dt = t1 - t0; std::cout << dt.count() << ' '; return 0; } P.S.: Explicit randomization marker is added because adding non-function pointer will silently disable structure layout randomization. [akpm@linux-foundation.org: coding style fixes] Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Joe Perches <joe@perches.com> Link: http://lkml.kernel.org/r/20200222201539.GA22576@avx2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-17kernel: module: Replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-02-04proc: convert everything to "struct proc_ops"Alexey Dobriyan
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in seq_file.h. Conversion rule is: llseek => proc_lseek unlocked_ioctl => proc_ioctl xxx => proc_xxx delete ".owner = THIS_MODULE" line [akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c] [sfr@canb.auug.org.au: fix kernel/sched/psi.c] Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31Merge tag 'modules-for-v5.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module updates from Jessica Yu: "Summary of modules changes for the 5.6 merge window: - Add "MS" (SHF_MERGE|SHF_STRINGS) section flags to __ksymtab_strings to indicate to the linker that it can perform string deduplication (i.e., duplicate strings are reduced to a single copy in the string table). This means any repeated namespace string would be merged to just one entry in __ksymtab_strings. - Various code cleanups and small fixes (fix small memleak in error path, improve moduleparam docs, silence rcu warnings, improve error logging)" * tag 'modules-for-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module.h: Annotate mod_kallsyms with __rcu module: avoid setting info->name early in case we can fall back to info->mod->name modsign: print module name along with error message kernel/module: Fix memleak in module_add_modinfo_attrs() export.h: reduce __ksymtab_strings string duplication by using "MS" section flags moduleparam: fix kerneldoc modules: lockdep: Suppress suspicious RCU usage warning
2020-01-20module: avoid setting info->name early in case we can fall back to ↵Jessica Yu
info->mod->name In setup_load_info(), info->name (which contains the name of the module, mostly used for early logging purposes before the module gets set up) gets unconditionally assigned if .modinfo is missing despite the fact that there is an if (!info->name) check near the end of the function. Avoid assigning a placeholder string to info->name if .modinfo doesn't exist, so that we can fall back to info->mod->name later on. Fixes: 5fdc7db6448a ("module: setup load info before module_sig_check()") Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-01-15modsign: print module name along with error messageJessica Yu
It is useful to know which module failed signature verification, so print the module name along with the error message. Signed-off-by: Jessica Yu <jeyu@kernel.org>
2020-01-08kernel/module: Fix memleak in module_add_modinfo_attrs()YueHaibing
In module_add_modinfo_attrs() if sysfs_create_file() fails on the first iteration of the loop (so i = 0), we forget to free the modinfo_attrs. Fixes: bc6f2a757d52 ("kernel/module: Fix mem leak in module_add_modinfo_attrs") Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-12-25Merge branch 'core/kprobes' into perf/core, to pick up a completed branchIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-12-11Merge tag 'trace-v5.5-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Remove code I accidentally applied when doing a minor fix up to a patch, and then using "git commit -a --amend", which pulled in some other changes I was playing with. - Remove an used variable in trace_events_inject code - Fix function graph tracer when it traces a ftrace direct function. It will now ignore tracing a function that has a ftrace direct tramploine attached. This is needed for eBPF to use the ftrace direct code. * tag 'trace-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Fix function_graph tracer interaction with BPF trampoline tracing: remove set but not used variable 'buffer' module: Remove accidental change of module_enable_x()
2019-12-10module: Remove accidental change of module_enable_x()Steven Rostedt (VMware)
When pulling in Divya Indi's patch, I made a minor fix to remove unneeded braces. I commited my fix up via "git commit -a --amend". Unfortunately, I didn't realize I had some changes I was testing in the module code, and those changes were applied to Divya's patch as well. This reverts the accidental updates to the module code. Cc: Jessica Yu <jeyu@kernel.org> Cc: Divya Indi <divya.indi@oracle.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Fixes: e585e6469d6f ("tracing: Verify if trace array exists before destroying it.") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-12-10Merge tag 'v5.5-rc1' into core/kprobes, to resolve conflictsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-12-09modules: lockdep: Suppress suspicious RCU usage warningMasami Hiramatsu
While running kprobe module test, find_module_all() caused a suspicious RCU usage warning. ----- ============================= WARNING: suspicious RCU usage 5.4.0-next-20191202+ #63 Not tainted ----------------------------- kernel/module.c:619 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by rmmod/642: #0: ffffffff8227da80 (module_mutex){+.+.}, at: __x64_sys_delete_module+0x9a/0x230 stack backtrace: CPU: 0 PID: 642 Comm: rmmod Not tainted 5.4.0-next-20191202+ #63 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x71/0xa0 find_module_all+0xc1/0xd0 __x64_sys_delete_module+0xac/0x230 ? do_syscall_64+0x12/0x1f0 do_syscall_64+0x50/0x1f0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4b6d49 ----- This is because list_for_each_entry_rcu(modules) is called without rcu_read_lock(). This is safe because the module_mutex is locked. Pass lockdep_is_held(&module_mutex) to the list_for_each_entry_rcu() to suppress this warning, This also fixes similar issue in mod_find() and each_symbol_section(). Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-12-05Merge tag 'modules-for-v5.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: "Summary of modules changes for the 5.5 merge window: - Refactor include/linux/export.h and remove code duplication between EXPORT_SYMBOL and EXPORT_SYMBOL_NS to make it more readable. The most notable change is that no namespace is represented by an empty string "" rather than NULL. - Fix a module load/unload race where waiter(s) trying to load the same module weren't being woken up when a module finally goes away" * tag 'modules-for-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: kernel/module.c: wakeup processes in module_wq on module unload moduleparam: fix parameter description mismatch export: avoid code duplication in include/linux/export.h
2019-11-27Merge tag 'trace-v5.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New tracing features: - New PERMANENT flag to ftrace_ops when attaching a callback to a function. As /proc/sys/kernel/ftrace_enabled when set to zero will disable all attached callbacks in ftrace, this has a detrimental impact on live kernel tracing, as it disables all that it patched. If a ftrace_ops is registered to ftrace with the PERMANENT flag set, it will prevent ftrace_enabled from being disabled, and if ftrace_enabled is already disabled, it will prevent a ftrace_ops with PREMANENT flag set from being registered. - New register_ftrace_direct(). As eBPF would like to register its own trampolines to be called by the ftrace nop locations directly, without going through the ftrace trampoline, this function has been added. This allows for eBPF trampolines to live along side of ftrace, perf, kprobe and live patching. It also utilizes the ftrace enabled_functions file that keeps track of functions that have been modified in the kernel, to allow for security auditing. - Allow for kernel internal use of ftrace instances. Subsystems in the kernel can now create and destroy their own tracing instances which allows them to have their own tracing buffer, and be able to record events without worrying about other users from writing over their data. - New seq_buf_hex_dump() that lets users use the hex_dump() in their seq_buf usage. - Notifications now added to tracing_max_latency to allow user space to know when a new max latency is hit by one of the latency tracers. - Wider spread use of generic compare operations for use of bsearch and friends. - More synthetic event fields may be defined (32 up from 16) - Use of xarray for architectures with sparse system calls, for the system call trace events. This along with small clean ups and fixes" * tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (51 commits) tracing: Enable syscall optimization for MIPS tracing: Use xarray for syscall trace events tracing: Sample module to demonstrate kernel access to Ftrace instances. tracing: Adding new functions for kernel access to Ftrace instances tracing: Fix Kconfig indentation ring-buffer: Fix typos in function ring_buffer_producer ftrace: Use BIT() macro ftrace: Return ENOTSUPP when DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not configured ftrace: Rename ftrace_graph_stub to ftrace_stub_graph ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimization ftrace: Add helper find_direct_entry() to consolidate code ftrace: Add another check for match in register_ftrace_direct() ftrace: Fix accounting bug with direct->count in register_ftrace_direct() ftrace/selftests: Fix spelling mistake "wakeing" -> "waking" tracing: Increase SYNTH_FIELDS_MAX for synthetic_events ftrace/samples: Add a sample module that implements modify_ftrace_direct() ftrace: Add modify_ftrace_direct() tracing: Add missing "inline" in stub function of latency_fsnotify() tracing: Remove stray tab in TRACE_EVAL_MAP_FILE's help text tracing: Use seq_buf_hex_dump() to dump buffers ...
2019-11-27module: Remove set_all_modules_text_*()Peter Zijlstra
Now that there are no users of set_all_modules_text_*() left, remove it. While it appears nds32 uses it, it does not have STRICT_MODULE_RWX and therefore ends up with the NOP stubs. Tested-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vincent Chen <deanbo422@gmail.com> Link: https://lkml.kernel.org/r/20191111132458.284298307@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-15kernel/module.c: wakeup processes in module_wq on module unloadKonstantin Khorenko
Fix the race between load and unload a kernel module. sys_delete_module() try_stop_module() mod->state = _GOING add_unformed_module() old = find_module_all() (old->state == _GOING => wait_event_interruptible()) During pre-condition finished_loading() rets 0 schedule() (never gets waken up later) free_module() mod->state = _UNFORMED list_del_rcu(&mod->list) (dels mod from "modules" list) return The race above leads to modprobe hanging forever on loading a module. Error paths on loading module call wake_up_all(&module_wq) after freeing module, so let's do the same on straight module unload. Fixes: 6e6de3dee51a ("kernel/module.c: Only return -EEXIST for modules that have finished loading") Reviewed-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-11-13tracing: Verify if trace array exists before destroying it.Divya Indi
A trace array can be destroyed from userspace or kernel. Verify if the trace array exists before proceeding to destroy/remove it. Link: http://lkml.kernel.org/r/1565805327-579-3-git-send-email-divya.indi@oracle.com Reviewed-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Signed-off-by: Divya Indi <divya.indi@oracle.com> [ Removed unneeded braces ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-06module/ftrace: handle patchable-function-entryMark Rutland
When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-28export: avoid code duplication in include/linux/export.hMasahiro Yamada
include/linux/export.h has lots of code duplication between EXPORT_SYMBOL and EXPORT_SYMBOL_NS. To improve the maintainability and readability, unify the implementation. When the symbol has no namespace, pass the empty string "" to the 'ns' parameter. The drawback of this change is, it grows the code size. When the symbol has no namespace, sym->namespace was previously NULL, but it is now an empty string "". So, it increases 1 byte for every no namespace EXPORT_SYMBOL. A typical kernel configuration has 10K exported symbols, so it increases 10KB in rough estimation. I did not come up with a good idea to refactor it without increasing the code size. I am not sure how big a deal it is, but at least include/linux/export.h looks nicer. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> [maennich: rebase on top of 3 fixes for the namespace feature] Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-28Merge branch 'next-lockdown' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-09-27Merge branch 'next-integrity' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "The major feature in this time is IMA support for measuring and appraising appended file signatures. In addition are a couple of bug fixes and code cleanup to use struct_size(). In addition to the PE/COFF and IMA xattr signatures, the kexec kernel image may be signed with an appended signature, using the same scripts/sign-file tool that is used to sign kernel modules. Similarly, the initramfs may contain an appended signature. This contained a lot of refactoring of the existing appended signature verification code, so that IMA could retain the existing framework of calculating the file hash once, storing it in the IMA measurement list and extending the TPM, verifying the file's integrity based on a file hash or signature (eg. xattrs), and adding an audit record containing the file hash, all based on policy. (The IMA support for appended signatures patch set was posted and reviewed 11 times.) The support for appended signature paves the way for adding other signature verification methods, such as fs-verity, based on a single system-wide policy. The file hash used for verifying the signature and the signature, itself, can be included in the IMA measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: ima_api: Use struct_size() in kzalloc() ima: use struct_size() in kzalloc() sefltest/ima: support appended signatures (modsig) ima: Fix use after free in ima_read_modsig() MODSIGN: make new include file self contained ima: fix freeing ongoing ahash_request ima: always return negative code for error ima: Store the measurement again when appraising a modsig ima: Define ima-modsig template ima: Collect modsig ima: Implement support for module-style appended signatures ima: Factor xattr_verify() out of ima_appraise_measurement() ima: Add modsig appraise_type option for module-style appended signatures integrity: Select CONFIG_KEYS instead of depending on it PKCS#7: Introduce pkcs7_get_digest() PKCS#7: Refactor verify_pkcs7_signature() MODSIGN: Export module signature definitions ima: initialize the "template" field with the default template
2019-09-11module: remove unneeded casts in cmp_name()Masahiro Yamada
You can pass opaque pointers directly. I also renamed 'va' and 'vb' into more meaningful arguments. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-11module: Fix link failure due to invalid relocation on namespace offsetWill Deacon
Commit 8651ec01daed ("module: add support for symbol namespaces.") broke linking for arm64 defconfig: | lib/crypto/arc4.o: In function `__ksymtab_arc4_setkey': | arc4.c:(___ksymtab+arc4_setkey+0x8): undefined reference to `no symbol' | lib/crypto/arc4.o: In function `__ksymtab_arc4_crypt': | arc4.c:(___ksymtab+arc4_crypt+0x8): undefined reference to `no symbol' This is because the dummy initialisation of the 'namespace_offset' field in 'struct kernel_symbol' when using EXPORT_SYMBOL on architectures with support for PREL32 locations uses an offset from an absolute address (0) in an effort to trick 'offset_to_pointer' into behaving as a NOP, allowing non-namespaced symbols to be treated in the same way as those belonging to a namespace. Unfortunately, place-relative relocations require a symbol reference rather than an absolute value and, although x86 appears to get away with this due to placing the kernel text at the top of the address space, it almost certainly results in a runtime failure if the kernel is relocated dynamically as a result of KASLR. Rework 'namespace_offset' so that a value of 0, which cannot occur for a valid namespaced symbol, indicates that the corresponding symbol does not belong to a namespace. Cc: Matthias Maennich <maennich@google.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Fixes: 8651ec01daed ("module: add support for symbol namespaces.") Reported-by: kbuild test robot <lkp@intel.com> Tested-by: Matthias Maennich <maennich@google.com> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Matthias Maennich <maennich@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-10module: add config option MODULE_ALLOW_MISSING_NAMESPACE_IMPORTSMatthias Maennich
If MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is enabled (default=n), the requirement for modules to import all namespaces that are used by the module is relaxed. Enabling this option effectively allows (invalid) modules to be loaded while only a warning is emitted. Disabling this option keeps the enforcement at module loading time and loading is denied if the module's imports are not satisfactory. Reviewed-by: Martijn Coenen <maco@android.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>