summaryrefslogtreecommitdiffstats
path: root/arch/s390/include
AgeCommit message (Collapse)Author
2020-10-01mm/gup: fix gup_fast with dynamic page table foldingVasily Gorbik
commit d3f7b1bb204099f2f7306318896223e8599bb6a2 upstream. Currently to make sure that every page table entry is read just once gup_fast walks perform READ_ONCE and pass pXd value down to the next gup_pXd_range function by value e.g.: static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) ... pudp = pud_offset(&p4d, addr); This function passes a reference on that local value copy to pXd_offset, and might get the very same pointer in return. This happens when the level is folded (on most arches), and that pointer should not be iterated. On s390 due to the fact that each task might have different 5,4 or 3-level address translation and hence different levels folded the logic is more complex and non-iteratable pointer to a local copy leads to severe problems. Here is an example of what happens with gup_fast on s390, for a task with 3-level paging, crossing a 2 GB pud boundary: // addr = 0x1007ffff000, end = 0x10080001000 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page **pages, int *nr) { unsigned long next; pud_t *pudp; // pud_offset returns &p4d itself (a pointer to a value on stack) pudp = pud_offset(&p4d, addr); do { // on second iteratation reading "random" stack value pud_t pud = READ_ONCE(*pudp); // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390 next = pud_addr_end(addr, end); ... } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack return 1; } This happens since s390 moved to common gup code with commit d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") and commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code"). s390 tried to mimic static level folding by changing pXd_offset primitives to always calculate top level page table offset in pgd_offset and just return the value passed when pXd_offset has to act as folded. What is crucial for gup_fast and what has been overlooked is that PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly. And the latter is not possible with dynamic folding. To fix the issue in addition to pXd values pass original pXdp pointers down to gup_pXd_range functions. And introduce pXd_offset_lockless helpers, which take an additional pXd entry value parameter. This has already been discussed in https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1 Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.2+] Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-01s390: avoid misusing CALL_ON_STACK for task stack setupVasily Gorbik
[ Upstream commit 7bcaad1f9fac889f5fcd1a383acf7e00d006da41 ] CALL_ON_STACK is intended to be used for temporary stack switching with potential return to the caller. When CALL_ON_STACK is misused to switch from nodat stack to task stack back_chain information would later lead stack unwinder from task stack into (per cpu) nodat stack which is reused for other purposes. This would yield confusing unwinding result or errors. To avoid that introduce CALL_ON_STACK_NORETURN to be used instead. It makes sure that back_chain is zeroed and unwinder finishes gracefully ending up at task pt_regs. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09s390: don't trace preemption in percpu macrosSven Schnelle
[ Upstream commit 1196f12a2c960951d02262af25af0bb1775ebcc2 ] Since commit a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables") the lockdep code itself uses percpu variables. This leads to recursions because the percpu macros are calling preempt_enable() which might call trace_preempt_on(). Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03s390/numa: set node distance to LOCAL_DISTANCESasha Levin
[ Upstream commit 535e4fc623fab2e09a0653fc3a3e17f382ad0251 ] The node distance is hardcoded to 0, which causes a trouble for some user-level applications. In particular, "libnuma" expects the distance of a node to itself as LOCAL_DISTANCE. This update removes the offending node distance override. Cc: <stable@vger.kernel.org> # 4.4 Fixes: 3a368f742da1 ("s390/numa: add core infrastructure") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16s390: Change s390_kernel_write() return type to match memcpy()Josh Poimboeuf
[ Upstream commit cb2cceaefb4c4dc28fc27ff1f1b2d258bfc10353 ] s390_kernel_write()'s function type is almost identical to memcpy(). Change its return type to "void *" so they can be used interchangeably. Cc: linux-s390@vger.kernel.org Cc: heiko.carstens@de.ibm.com Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390 Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-16KVM: s390: reduce number of IO pins to 1Christian Borntraeger
[ Upstream commit 774911290c589e98e3638e73b24b0a4d4530e97c ] The current number of KVM_IRQCHIP_NUM_PINS results in an order 3 allocation (32kb) for each guest start/restart. This can result in OOM killer activity even with free swap when the memory is fragmented enough: kernel: qemu-system-s39 invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0 kernel: CPU: 1 PID: 357274 Comm: qemu-system-s39 Kdump: loaded Not tainted 5.4.0-29-generic #33-Ubuntu kernel: Hardware name: IBM 8562 T02 Z06 (LPAR) kernel: Call Trace: kernel: ([<00000001f848fe2a>] show_stack+0x7a/0xc0) kernel: [<00000001f8d3437a>] dump_stack+0x8a/0xc0 kernel: [<00000001f8687032>] dump_header+0x62/0x258 kernel: [<00000001f8686122>] oom_kill_process+0x172/0x180 kernel: [<00000001f8686abe>] out_of_memory+0xee/0x580 kernel: [<00000001f86e66b8>] __alloc_pages_slowpath+0xd18/0xe90 kernel: [<00000001f86e6ad4>] __alloc_pages_nodemask+0x2a4/0x320 kernel: [<00000001f86b1ab4>] kmalloc_order+0x34/0xb0 kernel: [<00000001f86b1b62>] kmalloc_order_trace+0x32/0xe0 kernel: [<00000001f84bb806>] kvm_set_irq_routing+0xa6/0x2e0 kernel: [<00000001f84c99a4>] kvm_arch_vm_ioctl+0x544/0x9e0 kernel: [<00000001f84b8936>] kvm_vm_ioctl+0x396/0x760 kernel: [<00000001f875df66>] do_vfs_ioctl+0x376/0x690 kernel: [<00000001f875e304>] ksys_ioctl+0x84/0xb0 kernel: [<00000001f875e39a>] __s390x_sys_ioctl+0x2a/0x40 kernel: [<00000001f8d55424>] system_call+0xd8/0x2c8 As far as I can tell s390x does not use the iopins as we bail our for anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to reduce the memory footprint. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30s390/vdso: fix vDSO clock_getres()Vincenzo Frascino
[ Upstream commit 478237a595120a18e9b52fd2c57a6e8b7a01e411 ] clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or at run time. Fix the s390 vdso implementation of clock_getres keeping a copy of hrtimer_resolution in vdso data and using that directly. Link: https://lkml.kernel.org/r/20200324121027.21665-1-vincenzo.frascino@arm.com Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [heiko.carstens@de.ibm.com: use llgf for proper zero extension] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-24s390: fix syscall_get_error for compat processesDmitry V. Levin
commit b3583fca5fb654af2cfc1c08259abb9728272538 upstream. If both the tracer and the tracee are compat processes, and gprs[2] is assigned a value by __poke_user_compat, then the higher 32 bits of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and syscall_get_error() always returns 0. Fix the implementation by sign-extending the value for compat processes the same way as x86 implementation does. The bug was exposed to user space by commit 201766a20e30f ("ptrace: add PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite. This change fixes strace syscall tampering on s390. Link: https://lkml.kernel.org/r/20200602180051.GA2427@altlinux.org Fixes: 753c4dd6a2fa2 ("[S390] ptrace changes") Cc: Elvira Khabirova <lineprinter@altlinux.org> Cc: stable@vger.kernel.org # v2.6.28+ Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27s390/pci: Fix s390_mmio_read/write with MIONiklas Schnelle
commit f058599e22d59e594e5aae1dc10560568d8f4a8b upstream. The s390_mmio_read/write syscalls are currently broken when running with MIO. The new pcistb_mio/pcstg_mio/pcilg_mio instructions are executed similiarly to normal load/store instructions and do address translation in the current address space. That means inside the kernel they are aware of mappings into kernel address space while outside the kernel they use user space mappings (usually created through mmap'ing a PCI device file). Now when existing user space applications use the s390_pci_mmio_write and s390_pci_mmio_read syscalls, they pass I/O addresses that are mapped into user space so as to be usable with the new instructions without needing a syscall. Accessing these addresses with the old instructions as done currently leads to a kernel panic. Also, for such a user space mapping there may not exist an equivalent kernel space mapping which means we can't just use the new instructions in kernel space. Instead of replicating user mappings in the kernel which then might collide with other mappings, we can conceptually execute the new instructions as if executed by the user space application using the secondary address space. This even allows us to directly store to the user pointer without the need for copy_to/from_user(). Cc: stable@vger.kernel.org Fixes: 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13s390: prevent leaking kernel address in BEARSven Schnelle
commit 0b38b5e1d0e2f361e418e05c179db05bb688bbd6 upstream. When userspace executes a syscall or gets interrupted, BEAR contains a kernel address when returning to userspace. This make it pretty easy to figure out where the kernel is mapped even with KASLR enabled. To fix this, add lpswe to lowcore and always execute it there, so userspace sees only the lowcore address of lpswe. For this we have to extend both critical_cleanup and the SWITCH_ASYNC macro to also check for lpswe addresses in lowcore. Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Cc: <stable@vger.kernel.org> # v5.2+ Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-12s390/mm: fix panic in gup_fast on large pudGerald Schaefer
commit 582b4e55403e053d8a48ff687a05174da9cc3fb0 upstream. On s390 there currently is no implementation of pud_write(). That was ok as long as we had our own implementation of get_user_pages_fast() which checked for pud protection by testing the bit directly w/o using pud_write(). The other callers of pud_write() are not reachable on s390. After commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") we use the generic get_user_pages_fast(), which does call pud_write() in pud_access_permitted() for FOLL_WRITE access on a large pud. Without an s390 specific pud_write(), the generic version is called, which contains a BUG() statement to remind us that we don't have a proper implementation. This results in a kernel panic. Fix this by providing an implementation of pud_write(). Cc: <stable@vger.kernel.org> # 5.2+ Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-12s390/qdio: fill SL with absolute addressesJulian Wiedmann
[ Upstream commit e9091ffd6a0aaced111b5d6ead5eaab5cd7101bc ] As the comment says, sl->sbal holds an absolute address. qeth currently solves this through wild casting, while zfcp doesn't care. Handle this properly in the code that actually builds the SL. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> [for qdio] Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28s390/mm: Explicitly compare PAGE_DEFAULT_KEY against zero in ↵Nathan Chancellor
storage_key_init_range commit 380324734956c64cd060e1db4304f3117ac15809 upstream. Clang warns: In file included from ../arch/s390/purgatory/purgatory.c:10: In file included from ../include/linux/kexec.h:18: In file included from ../include/linux/crash_core.h:6: In file included from ../include/linux/elfcore.h:5: In file included from ../include/linux/user.h:1: In file included from ../arch/s390/include/asm/user.h:11: ../arch/s390/include/asm/page.h:45:6: warning: converting the result of '<<' to a boolean always evaluates to false [-Wtautological-constant-compare] if (PAGE_DEFAULT_KEY) ^ ../arch/s390/include/asm/page.h:23:44: note: expanded from macro 'PAGE_DEFAULT_KEY' #define PAGE_DEFAULT_KEY (PAGE_DEFAULT_ACC << 4) ^ 1 warning generated. Explicitly compare this against zero to silence the warning as it is intended to be used in a boolean context. Fixes: de3fa841e429 ("s390/mm: fix compile for PAGE_DEFAULT_KEY != 0") Link: https://github.com/ClangBuiltLinux/linux/issues/860 Link: https://lkml.kernel.org/r/20200214064207.10381-1-natechancellor@gmail.com Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24s390/pci: Recover handle in clp_set_pci_fn()Niklas Schnelle
[ Upstream commit 17cdec960cf776b20b1fb08c622221babe591d51 ] When we try to recover a PCI function using echo 1 > /sys/bus/pci/devices/<id>/recover or manually with echo 1 > /sys/bus/pci/devices/<id>/remove echo 0 > /sys/bus/pci/slots/<slot>/power echo 1 > /sys/bus/pci/slots/<slot>/power clp_disable_fn() / clp_enable_fn() call clp_set_pci_fn() to first disable and then reenable the function. When the function is already in the requested state we may be left with an invalid function handle. To get a new valid handle we do a clp_list_pci() call. For this we need both the function ID and function handle in clp_set_pci_fn() so pass the zdev and get both. To simplify things also pull setting the refreshed function handle into clp_set_pci_fn() Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-19s390/time: Fix clk type in get_tod_clockNathan Chancellor
commit 0f8a206df7c920150d2aa45574fba0ab7ff6be4f upstream. Clang warns: In file included from ../arch/s390/boot/startup.c:3: In file included from ../include/linux/elf.h:5: In file included from ../arch/s390/include/asm/elf.h:132: In file included from ../include/linux/compat.h:10: In file included from ../include/linux/time.h:74: In file included from ../include/linux/time32.h:13: In file included from ../include/linux/timex.h:65: ../arch/s390/include/asm/timex.h:160:20: warning: passing 'unsigned char [16]' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign] get_tod_clock_ext(clk); ^~~ ../arch/s390/include/asm/timex.h:149:44: note: passing argument to parameter 'clk' here static inline void get_tod_clock_ext(char *clk) ^ Change clk's type to just be char so that it matches what happens in get_tod_clock_ext. Fixes: 57b28f66316d ("[S390] s390_hypfs: Add new attributes") Link: https://github.com/ClangBuiltLinux/linux/issues/861 Link: http://lkml.kernel.org/r/20200208140858.47970-1-natechancellor@gmail.com Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11s390/mm: fix dynamic pagetable upgrade for hugetlbfsGerald Schaefer
commit 5f490a520bcb393389a4d44bec90afcb332eb112 upstream. Commit ee71d16d22bb ("s390/mm: make TASK_SIZE independent from the number of page table levels") changed the logic of TASK_SIZE and also removed the arch_mmap_check() implementation for s390. This combination has a subtle effect on how get_unmapped_area() for hugetlbfs pages works. It is now possible that a user process establishes a hugetlbfs mapping at an address above 4 TB, without triggering a dynamic pagetable upgrade from 3 to 4 levels. This is because hugetlbfs mappings will not use mm->get_unmapped_area, but rather file->f_op->get_unmapped_area, which currently is the generic implementation of hugetlb_get_unmapped_area() that does not know about s390 dynamic pagetable upgrades, but with the new definition of TASK_SIZE, it will now allow mappings above 4 TB. Subsequent access to such a mapped address above 4 TB will result in a page fault loop, because the CPU cannot translate such a large address with 3 pagetable levels. The fault handler will try to map in a hugepage at the address, but due to the folded pagetable logic it will end up with creating entries in the 3 level pagetable, possibly overwriting existing mappings, and then it all repeats when the access is retried. Apart from the page fault loop, this can have various nasty effects, e.g. kernel panic from one of the BUG_ON() checks in memory management code, or even data loss if an existing mapping gets overwritten. Fix this by implementing HAVE_ARCH_HUGETLB_UNMAPPED_AREA support for s390, providing an s390 version for hugetlb_get_unmapped_area() with pagetable upgrade support similar to arch_get_unmapped_area(), which will then be used instead of the generic version. Fixes: ee71d16d22bb ("s390/mm: make TASK_SIZE independent from the number of page table levels") Cc: <stable@vger.kernel.org> # 4.12+ Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31s390/ftrace: fix endless recursion in function_graph tracerSven Schnelle
[ Upstream commit 6feeee8efc53035c3195b02068b58ae947538aa4 ] The following sequence triggers a kernel stack overflow on s390x: mount -t tracefs tracefs /sys/kernel/tracing cd /sys/kernel/tracing echo function_graph > current_tracer [crash] This is because preempt_count_{add,sub} are in the list of traced functions, which can be demonstrated by: echo preempt_count_add >set_ftrace_filter echo function_graph > current_tracer [crash] The stack overflow happens because get_tod_clock_monotonic() gets called by ftrace but itself calls preempt_{disable,enable}(), which leads to a endless recursion. Fix this by using preempt_{disable,enable}_notrace(). Fixes: 011620688a71 ("s390/time: ensure get_clock_monotonic() returns monotonic values") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31s390/mm: add mm_pxd_folded() checks to pxd_free()Gerald Schaefer
[ Upstream commit 2416cefc504ba8ae9b17e3e6b40afc72708f96be ] Unlike pxd_free_tlb(), the pxd_free() functions do not check for folded page tables. This is not an issue so far, as those functions will actually never be called, since no code will reach them when page tables are folded. In order to avoid future issues, and to make the s390 code more similar to other architectures, add mm_pxd_folded() checks, similar to how it is done in pxd_free_tlb(). This was found by testing a patch from from Anshuman Khandual, which is currently discussed on LKML ("mm/debug: Add tests validating architecture page table helpers"). Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31s390/time: ensure get_clock_monotonic() returns monotonic valuesHeiko Carstens
[ Upstream commit 011620688a71f2f1fe9901dbc2479a7c01053196 ] The current implementation of get_clock_monotonic() leaves it up to the caller to call the function with preemption disabled. The only core kernel caller (sched_clock) however does not disable preemption. In order to make sure that all callers of this function see monotonic values handle disabling preemption within the function itself. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17s390/mm: properly clear _PAGE_NOEXEC bit when it is not supportedGerald Schaefer
commit ab874f22d35a8058d8fdee5f13eb69d8867efeae upstream. On older HW or under a hypervisor, w/o the instruction-execution- protection (IEP) facility, and also w/o EDAT-1, a translation-specification exception may be recognized when bit 55 of a pte is one (_PAGE_NOEXEC). The current code tries to prevent setting _PAGE_NOEXEC in such cases, by removing it within set_pte_at(). However, ptep_set_access_flags() will modify a pte directly, w/o using set_pte_at(). There is at least one scenario where this can result in an active pte with _PAGE_NOEXEC set, which would then lead to a panic due to a translation-specification exception (write to swapped out page): do_swap_page pte = mk_pte (with _PAGE_NOEXEC bit) set_pte_at (will remove _PAGE_NOEXEC bit in page table, but keep it in local variable pte) vmf->orig_pte = pte (pte still contains _PAGE_NOEXEC bit) do_wp_page wp_page_reuse entry = vmf->orig_pte (still with _PAGE_NOEXEC bit) ptep_set_access_flags (writes entry with _PAGE_NOEXEC bit) Fix this by clearing _PAGE_NOEXEC already in mk_pte_phys(), where the pgprot value is applied, so that no pte with _PAGE_NOEXEC will ever be visible, if it is not supported. The check in set_pte_at() can then also be removed. Cc: <stable@vger.kernel.org> # 4.11+ Fixes: 57d7f939e7bd ("s390: add no-execute support") Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-31s390/unwind: fix mixing regs and spIlya Leoshkevich
unwind_for_each_frame stops after the first frame if regs->gprs[15] <= sp. The reason is that in case regs are specified, the first frame should be regs->psw.addr and the second frame should be sp->gprs[8]. However, currently the second frame is regs->gprs[15], which confuses outside_of_stack(). Fix by introducing a flag to distinguish this special case from unwinding the interrupt handler, for which the current behavior is appropriate. Fixes: 78c98f907413 ("s390/unwind: introduce stack unwind API") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-11s390/uaccess: avoid (false positive) compiler warningsChristian Borntraeger
Depending on inlining decisions by the compiler, __get/put_user_fn might become out of line. Then the compiler is no longer able to tell that size can only be 1,2,4 or 8 due to the check in __get/put_user resulting in false positives like ./arch/s390/include/asm/uaccess.h: In function ‘__put_user_fn’: ./arch/s390/include/asm/uaccess.h:113:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 113 | return rc; | ^~ ./arch/s390/include/asm/uaccess.h: In function ‘__get_user_fn’: ./arch/s390/include/asm/uaccess.h:143:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 143 | return rc; | ^~ These functions are supposed to be always inlined. Mark it as such. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-04s390/mm: mark function(s) __always_inlineHeiko Carstens
Always inline asm inlines with variable operands for "i" constraints, since they won't compile if the compiler would decide to not inline them. Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-04s390/jump_label: mark function(s) __always_inlineHeiko Carstens
Always inline asm inlines with variable operands for "i" constraints, since they won't compile if the compiler would decide to not inline them. Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-04s390/cpu_mf: mark function(s) __always_inlineHeiko Carstens
Always inline asm inlines with variable operands for "i" constraints, since they won't compile if the compiler would decide to not inline them. Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-04s390/atomic,bitops: mark function(s) __always_inlineHeiko Carstens
Always inline asm inlines with variable operands for "i" constraints, since they won't compile if the compiler would decide to not inline them. Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-04s390/mm: fix -Wunused-but-set-variable warningsQian Cai
Convert two functions to static inline to get ride of W=1 GCC warnings like, mm/gup.c: In function 'gup_pte_range': mm/gup.c:1816:16: warning: variable 'ptem' set but not used [-Wunused-but-set-variable] pte_t *ptep, *ptem; ^~~~ mm/mmap.c: In function 'acct_stack_growth': mm/mmap.c:2322:16: warning: variable 'new_start' set but not used [-Wunused-but-set-variable] unsigned long new_start; ^~~~~~~~~ Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/lkml/1570138596-11913-1-git-send-email-cai@lca.pw/ Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-04s390: mark __cpacf_query() as __always_inlineJiri Kosina
arch/s390/kvm/kvm-s390.c calls on several places __cpacf_query() directly, which makes it impossible to meet the "i" constraint for the asm operands (opcode in this case). As we are now force-enabling CONFIG_OPTIMIZE_INLINING on all architectures, this causes a build failure on s390: In file included from arch/s390/kvm/kvm-s390.c:44: ./arch/s390/include/asm/cpacf.h: In function '__cpacf_query': ./arch/s390/include/asm/cpacf.h:179:2: warning: asm operand 3 probably doesn't match constraints 179 | asm volatile( | ^~~ ./arch/s390/include/asm/cpacf.h:179:2: error: impossible constraint in 'asm' Mark __cpacf_query() as __always_inline in order to fix that, analogically how we fixes __cpacf_check_opcode(), cpacf_query_func() and scpacf_query() already. Reported-and-tested-by: Michal Kubecek <mkubecek@suse.cz> Fixes: d83623c5eab2 ("s390: mark __cpacf_check_opcode() and cpacf_query_func() as __always_inline") Fixes: e60fb8bf68d4 ("s390/cpacf: mark scpacf_query() as __always_inline") Fixes: ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING") Signed-off-by: Jiri Kosina <jkosina@suse.cz> Link: https://lore.kernel.org/lkml/nycvar.YFH.7.76.1910012203010.13160@cbobk.fhfr.pm Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-01s390/qdio: clarify size of the QIB parm areaJulian Wiedmann
The QIB parm area is 128 bytes long. Current code consistently misuses an _entirely unrelated_ QDIO constant, merely because it has the same value. Stop doing so. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-01s390/cpumsf: Check for CPU Measurement samplingThomas Richter
s390 IBM z15 introduces a check if the CPU Mesurement Facility sampling is temporarily unavailable. If this is the case return -EBUSY and abort the setup of CPU Measuement facility sampling. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-26Merge tag 's390-5.4-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Fix three kasan findings - Add PERF_EVENT_IOC_PERIOD ioctl support - Add Crypto Express7S support and extend sysfs attributes for pkey - Minor common I/O layer documentation corrections * tag 's390-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: exclude subchannels with no parent from pseudo check s390/cio: avoid calling strlen on null pointer s390/topology: avoid firing events before kobjs are created s390/cpumf: Remove mixed white space s390/cpum_sf: Support ioctl PERF_EVENT_IOC_PERIOD s390/zcrypt: CEX7S exploitation support s390/cio: fix intparm documentation s390/pkey: Add sysfs attributes to emit AES CIPHER key blobs
2019-09-24mm: consolidate pgtable_cache_init() and pgd_cache_init()Mike Rapoport
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy. Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures. Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init(). Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> [arm64] Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24mm: remove quicklist page table cachesNicholas Piggin
Patch series "mm: remove quicklist page table caches". A while ago Nicholas proposed to remove quicklist page table caches [1]. I've rebased his patch on the curren upstream and switched ia64 and sh to use generic versions of PTE allocation. [1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com This patch (of 3): Remove page table allocator "quicklists". These have been around for a long time, but have not got much traction in the last decade and are only used on ia64 and sh architectures. The numbers in the initial commit look interesting but probably don't apply anymore. If anybody wants to resurrect this it's in the git history, but it's unhelpful to have this code and divergent allocator behaviour for minor archs. Also it might be better to instead make more general improvements to page allocator if this is still so slow. Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-20Merge tag 'powerpc-5.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "This is a bit late, partly due to me travelling, and partly due to a power outage knocking out some of my test systems *while* I was travelling. - Initial support for running on a system with an Ultravisor, which is software that runs below the hypervisor and protects guests against some attacks by the hypervisor. - Support for building the kernel to run as a "Secure Virtual Machine", ie. as a guest capable of running on a system with an Ultravisor. - Some changes to our DMA code on bare metal, to allow devices with medium sized DMA masks (> 32 && < 59 bits) to use more than 2GB of DMA space. - Support for firmware assisted crash dumps on bare metal (powernv). - Two series fixing bugs in and refactoring our PCI EEH code. - A large series refactoring our exception entry code to use gas macros, both to make it more readable and also enable some future optimisations. As well as many cleanups and other minor features & fixups. Thanks to: Adam Zerella, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anshuman Khandual, Balbir Singh, Benjamin Herrenschmidt, Cédric Le Goater, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Gibson, David Hildenbrand, Desnes A. Nunes do Rosario, Ganesh Goudar, Gautham R. Shenoy, Greg Kurz, Guerney Hunt, Gustavo Romero, Halil Pasic, Hari Bathini, Joakim Tjernlund, Jonathan Neuschafer, Jordan Niethe, Leonardo Bras, Lianbo Jiang, Madhavan Srinivasan, Mahesh Salgaonkar, Mahesh Salgaonkar, Masahiro Yamada, Maxiwell S. Garcia, Michael Anderson, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Qian Cai, Ram Pai, Ravi Bangoria, Reza Arbab, Ryan Grimm, Sam Bobroff, Santosh Sivaraj, Segher Boessenkool, Sukadev Bhattiprolu, Thiago Bauermann, Thiago Jung Bauermann, Thomas Gleixner, Tom Lendacky, Vasant Hegde" * tag 'powerpc-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (264 commits) powerpc/mm/mce: Keep irqs disabled during lockless page table walk powerpc: Use ftrace_graph_ret_addr() when unwinding powerpc/ftrace: Enable HAVE_FUNCTION_GRAPH_RET_ADDR_PTR ftrace: Look up the address of return_to_handler() using helpers powerpc: dump kernel log before carrying out fadump or kdump docs: powerpc: Add missing documentation reference powerpc/xmon: Fix output of XIVE IPI powerpc/xmon: Improve output of XIVE interrupts powerpc/mm/radix: remove useless kernel messages powerpc/fadump: support holes in kernel boot memory area powerpc/fadump: remove RMA_START and RMA_END macros powerpc/fadump: update documentation about option to release opalcore powerpc/fadump: consider f/w load area powerpc/opalcore: provide an option to invalidate /sys/firmware/opal/core file powerpc/opalcore: export /sys/firmware/opal/core for analysing opal crashes powerpc/fadump: update documentation about CONFIG_PRESERVE_FA_DUMP powerpc/fadump: add support to preserve crash data on FADUMP disabled kernel powerpc/fadump: improve how crashed kernel's memory is reserved powerpc/fadump: consider reserved ranges while releasing memory powerpc/fadump: make crash memory ranges array allocation generic ...
2019-09-19s390/cpumf: Remove mixed white spaceThomas Richter
Remove blanks in comment and replace them by tabs. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-19s390/cpum_sf: Support ioctl PERF_EVENT_IOC_PERIODThomas Richter
A perf_event can be set up to deliver overflow notifications via SIGIO signal. The setup of the event is: 1. create event with perf_event_open() 2. assign it a signal for I/O notification with fcntl() 3. Install signal handler and consume samples The initial setup of perf_event_open() determines the period/frequency time span needed to elapse before each signal is delivered to the user process. While the event is active, system call ioctl(.., PERF_EVENT_IOC_PERIOD, value) can be used the change the frequency/period time span of the active event. The remaining signal handler invocations honour the new value. This does not work on s390. In fact the time span does not change regardless of ioctl's third argument 'value'. The call succeeds but the time span does not change. Support this behavior and make it common with other platforms. This is achieved by changing the interval value of the sampling control block accordingly and feed this new value every time the event is enabled using pmu_event_enable(). Before this change the interval value was set only once at pmu_event_add() and never changed. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-19s390/zcrypt: CEX7S exploitation supportHarald Freudenberger
This patch adds CEX7 exploitation support for the AP bus code, the zcrypt device driver zoo and the vfio device driver. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Support IPV6 RA Captive Portal Identifier, from Maciej Żenczykowski. 2) Use bio_vec in the networking instead of custom skb_frag_t, from Matthew Wilcox. 3) Make use of xmit_more in r8169 driver, from Heiner Kallweit. 4) Add devmap_hash to xdp, from Toke Høiland-Jørgensen. 5) Support all variants of 5750X bnxt_en chips, from Michael Chan. 6) More RTNL avoidance work in the core and mlx5 driver, from Vlad Buslov. 7) Add TCP syn cookies bpf helper, from Petar Penkov. 8) Add 'nettest' to selftests and use it, from David Ahern. 9) Add extack support to drop_monitor, add packet alert mode and support for HW drops, from Ido Schimmel. 10) Add VLAN offload to stmmac, from Jose Abreu. 11) Lots of devm_platform_ioremap_resource() conversions, from YueHaibing. 12) Add IONIC driver, from Shannon Nelson. 13) Several kTLS cleanups, from Jakub Kicinski. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1930 commits) mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer mlxsw: spectrum: Register CPU port with devlink mlxsw: spectrum_buffers: Prevent changing CPU port's configuration net: ena: fix incorrect update of intr_delay_resolution net: ena: fix retrieval of nonadaptive interrupt moderation intervals net: ena: fix update of interrupt moderation register net: ena: remove all old adaptive rx interrupt moderation code from ena_com net: ena: remove ena_restore_ethtool_params() and relevant fields net: ena: remove old adaptive interrupt moderation code from ena_netdev net: ena: remove code duplication in ena_com_update_nonadaptive_moderation_interval _*() net: ena: enable the interrupt_moderation in driver_supported_features net: ena: reimplement set/get_coalesce() net: ena: switch to dim algorithm for rx adaptive interrupt moderation net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable ethtool: implement Energy Detect Powerdown support via phy-tunable xen-netfront: do not assume sk_buff_head list is empty in error handling s390/ctcm: Delete unnecessary checks before the macro call “dev_kfree_skb” net: ena: don't wake up tx queue when down drop_monitor: Better sanitize notified packets ...
2019-09-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "s390: - ioctl hardening - selftests ARM: - ITS translation cache - support for 512 vCPUs - various cleanups and bugfixes PPC: - various minor fixes and preparation x86: - bugfixes all over the place (posted interrupts, SVM, emulation corner cases, blocked INIT) - some IPI optimizations" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (75 commits) KVM: X86: Use IPI shorthands in kvm guest when support KVM: x86: Fix INIT signal handling in various CPU states KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode KVM: VMX: Stop the preemption timer during vCPU reset KVM: LAPIC: Micro optimize IPI latency kvm: Nested KVM MMUs need PAE root too KVM: x86: set ctxt->have_exception in x86_decode_insn() KVM: x86: always stop emulation on page fault KVM: nVMX: trace nested VM-Enter failures detected by H/W KVM: nVMX: add tracepoint for failed nested VM-Enter x86: KVM: svm: Fix a check in nested_svm_vmrun() KVM: x86: Return to userspace with internal error on unexpected exit reason KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM code KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callers doc: kvm: Fix return description of KVM_SET_MSRS KVM: X86: Tune PLE Window tracepoint KVM: VMX: Change ple_window type to unsigned int KVM: X86: Remove tailing newline for tracepoints KVM: X86: Trace vcpu_id for vmexit KVM: x86: Manually calculate reserved bits when loading PDPTRS ...
2019-09-17Merge tag 's390-5.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for IBM z15 machines. - Add SHA3 and CCA AES cipher key support in zcrypt and pkey refactoring. - Move to arch_stack_walk infrastructure for the stack unwinder. - Various kasan fixes and improvements. - Various command line parsing fixes. - Improve decompressor phase debuggability. - Lift no bss usage restriction for the early code. - Use refcount_t for reference counters for couple of places in mm code. - Logging improvements and return code fix in vfio-ccw code. - Couple of zpci fixes and minor refactoring. - Remove some outdated documentation. - Fix secure boot detection. - Other various minor code clean ups. * tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits) s390: remove pointless drivers-y in drivers/s390/Makefile s390/cpum_sf: Fix line length and format string s390/pci: fix MSI message data s390: add support for IBM z15 machines s390/crypto: Support for SHA3 via CPACF (MSA6) s390/startup: add pgm check info printing s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding vfio-ccw: fix error return code in vfio_ccw_sch_init() s390: vfio-ap: fix warning reset not completed s390/base: remove unused s390_base_mcck_handler s390/sclp: Fix bit checked for has_sipl s390/zcrypt: fix wrong handling of cca cipher keygenflags s390/kasan: add kdump support s390/setup: avoid using strncmp with hardcoded length s390/sclp: avoid using strncmp with hardcoded length s390/module: avoid using strncmp with hardcoded length s390/pci: avoid using strncmp with hardcoded length s390/kaslr: reserve memory for kasan usage s390/mem_detect: provide single get_mem_detect_end s390/cmma: reuse kstrtobool for option value parsing ...
2019-09-13s390/crypto: Support for SHA3 via CPACF (MSA6)Joerg Schmidbauer
This patch introduces sha3 support for s390. - Rework the s390-specific SHA1 and SHA2 related code to provide the basis for SHA3. - Provide two new kernel modules sha3_256_s390 and sha3_512_s390 together with new kernel options. Signed-off-by: Joerg Schmidbauer <jschmidb@de.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-09-11Merge tag 'kvm-s390-next-5.4-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD * More selftests * Improved KVM_S390_MEM_OP ioctl input checking * Add kvm_valid_regs and kvm_dirty_regs invalid bit checking
2019-09-04KVM: s390: Disallow invalid bits in kvm_valid_regs and kvm_dirty_regsThomas Huth
If unknown bits are set in kvm_valid_regs or kvm_dirty_regs, this clearly indicates that something went wrong in the KVM userspace application. The x86 variant of KVM already contains a check for bad bits, so let's do the same on s390x now, too. Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Link: https://lore.kernel.org/lkml/20190904085200.29021-2-thuth@redhat.com/ Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2019-09-03s390/base: remove unused s390_base_mcck_handlerVasily Gorbik
s390_base_mcck_handler was used during system reset if diag308 set was not available. But after commit d485235b0054 ("s390: assume diag308 set always works") is a dead code and could be removed. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-26s390/mem_detect: provide single get_mem_detect_endVasily Gorbik
get_mem_detect_end is already used in couple of places with potential to be utilized in more cases. Provide single get_mem_detect_end implementation in asm/mem_detect.h to be used by kasan and startup code. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-24s390/qeth: add TX NAPI support for IQD devicesJulian Wiedmann
Due to their large MTU and potentially low utilization of TX buffers, IQD devices in particular require fast TX recycling. This makes them a prime candidate for a TX NAPI path in qeth. qeth_tx_poll() uses the recently introduced qdio_inspect_queue() helper to poll the TX queue for completed buffers. To avoid hogging the CPU for too long, we yield to the stack after completing an entire queue's worth of buffers. While IQD is expected to transfer its buffers synchronously (and thus doesn't support TX interrupts), a timer covers for the odd case where a TX buffer doesn't complete synchronously. Currently this timer should only ever fire for (1) the mcast queue, (2) the occasional race, where the NAPI poll code observes an update to queue->used_buffers while the TX doorbell hasn't been issued yet. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24s390/qdio: let drivers opt-out from Output Queue scanningJulian Wiedmann
If a driver wants to use the new Output Queue poll code, then the qdio layer must disable its internal Queue scanning. Let the driver select this mode by passing a special scan_threshold of 0. As the scan_threshold is the same for all Output Queues, also move it into the main qdio_irq struct. This allows for fast opt-out checking, a driver is expected to operate either _all_ or none of its Output Queues in polling mode. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-24s390/qdio: enable drivers to poll for Output completionsJulian Wiedmann
While commit d36deae75011 ("qdio: extend API to allow polling") enhanced the qdio layer so that drivers can poll their Input Queues, we don't have the corresponding infrastructure for Output Queues yet. Factor out a helper that scans a single QDIO Queue, so that qeth can implement TX NAPI on top of it. While doing so, remove the duplicated tracking of the next-to-scan index (q->first_to_check vs q->first_to_kick) in this code path. qdio_handle_aobs() needs to move slightly upwards in the code hierarchy, so that it's still called from the polling path. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-21s390/pkey: add CCA AES cipher key supportHarald Freudenberger
Introduce new ioctls and structs to be used with these new ioctls which are able to handle CCA AES secure keys and CCA AES cipher keys: PKEY_GENSECK2: Generate secure key, version 2. Generate either a CCA AES secure key or a CCA AES cipher key. PKEY_CLR2SECK2: Generate secure key from clear key value, version 2. Construct a CCA AES secure key or CCA AES cipher key from a given clear key value. PKEY_VERIFYKEY2: Verify the given secure key, version 2. Check for correct key type. If cardnr and domain are given, also check if this apqn is able to handle this type of key. If cardnr and domain are 0xFFFF, on return these values are filled with an apqn able to handle this key. The function also checks for the master key verification patterns of the key matching to the current or alternate mkvp of the apqn. CCA AES cipher keys are also checked for CPACF export allowed (CPRTCPAC flag). Currently CCA AES secure keys and CCA AES cipher keys are supported (may get extended in the future). PKEY_KBLOB2PROTK2: Transform a key blob (of any type) into a protected key, version 2. Difference to version 1 is only that this new ioctl has additional parameters to provide a list of apqns to be used for the transformation. PKEY_APQNS4K: Generate a list of APQNs based on the key blob given. Is able to find out which type of secure key is given (CCA AES secure key or CCA AES cipher key) and tries to find all matching crypto cards based on the MKVP and maybe other criterias (like CCA AES cipher keys need a CEX6C or higher). The list of APQNs is further filtered by the key's mkvp which needs to match to either the current mkvp or the alternate mkvp (which is the old mkvp on CCA adapters) of the apqns. The flags argument may be used to limit the matching apqns. If the PKEY_FLAGS_MATCH_CUR_MKVP is given, only the current mkvp of each apqn is compared. Likewise with the PKEY_FLAGS_MATCH_ALT_MKVP. If both are given it is assumed to return apqns where either the current or the alternate mkvp matches. If no matching APQN is found, the ioctl returns with 0 but the apqn_entries value is 0. PKEY_APQNS4KT: Generate a list of APQNs based on the key type given. Build a list of APQNs based on the given key type and maybe further restrict the list by given master key verification patterns. For different key types there may be different ways to match the master key verification patterns. For CCA keys (CCA data key and CCA cipher key) the first 8 bytes of cur_mkvp refer to the current mkvp value of the apqn and the first 8 bytes of the alt_mkvp refer to the old mkvp. The flags argument controls if the apqns current and/or alternate mkvp should match. If the PKEY_FLAGS_MATCH_CUR_MKVP is given, only the current mkvp of each apqn is compared. Likewise with the PKEY_FLAGS_MATCH_ALT_MKVP. If both are given, it is assumed to return apqns where either the current or the alternate mkvp matches. If no matching APQN is found, the ioctl returns with 0 but the apqn_entries value is 0. These new ioctls are now prepared for another new type of secure key blob which may come in the future. They all use a pointer to the key blob and a key blob length information instead of some hardcoded byte array. They all use the new enums pkey_key_type, pkey_key_size and pkey_key_info for getting/setting key type, key size and additional info about the key. All but the PKEY_VERIFY2 ioctl now work based on a list of apqns. This list is walked through trying to perform the operation on exactly this apqn without any further checking (like card type or online state). If the apqn fails, simple the next one in the list is tried until success (return 0) or the end of the list is reached (return -1 with errno ENODEV). All apqns in the list need to be exact apqns (0xFFFF as any card or domain is not allowed). There are two new ioctls which can be used to build a list of apqns based on a key or key type and maybe restricted by match to a current or alternate master key verifcation pattern. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21s390/pkey: pkey cleanup: narrow in-kernel API, fix some variable typesHarald Freudenberger
There are a lot of pkey functions exported as in-kernel callable API functions but not used at all. This patch narrows down the pkey in-kernel API to what is currently only used and exploited. Within the kernel just use u32 without any leading __u32. Also functions declared in a header file in arch/s390/include/asm don't need a comment 'In-kernel API', this is by definition, otherwise the header file would be in arch/s390/include/uapi/asm. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>