aboutsummaryrefslogtreecommitdiffstats
path: root/arch/arm64/mm
AgeCommit message (Collapse)Author
2020-04-17arm64/mm: Hold memory hotplug lock while walking for kernel page table dumpAnshuman Khandual
[ Upstream commit bf2b59f60ee1fefa768d62ec6e8f4b4d9e04c691 ] The arm64 page table dump code can race with concurrent modification of the kernel page tables. When a leaf entries are modified concurrently, the dump code may log stale or inconsistent information for a VA range, but this is otherwise not harmful. When intermediate levels of table are freed, the dump code will continue to use memory which has been freed and potentially reallocated for another purpose. In such cases, the dump code may dereference bogus addresses, leading to a number of potential problems. Intermediate levels of table may by freed during memory hot-remove, which will be enabled by a subsequent patch. To avoid racing with this, take the memory hotplug lock when walking the kernel page table. Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-02arm64: context: Fix ASID limit in boot messagesJean-Philippe Brucker
Since commit f88f42f853a8 ("arm64: context: Free up kernel ASIDs if KPTI is not in use"), the NUM_USER_ASIDS macro doesn't correspond to the effective number of ASIDs when KPTI is enabled. Get an accurate number of available ASIDs in an arch_initcall, once we've discovered all CPUs' capabilities and know if we still need to halve the ASID space for KPTI. Fixes: f88f42f853a8 ("arm64: context: Free up kernel ASIDs if KPTI is not in use") Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-02-04x86: mm: avoid allocating struct mm_struct on the stackSteven Price
struct mm_struct is quite large (~1664 bytes) and so allocating on the stack may cause problems as the kernel stack size is small. Since ptdump_walk_pgd_level_core() was only allocating the structure so that it could modify the pgd argument we can instead introduce a pgd override in struct mm_walk and pass this down the call stack to where it is needed. Since the correct mm_struct is now being passed down, it is now also unnecessary to take the mmap_sem semaphore because ptdump_walk_pgd() will now take the semaphore on the real mm. [steven.price@arm.com: restore missed arm64 changes] Link: http://lkml.kernel.org/r/20200108145710.34314-1-steven.price@arm.com Link: http://lkml.kernel.org/r/20200108145710.34314-1-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Zong Li <zong.li@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04mm: ptdump: reduce level numbers by 1 in note_page()Steven Price
Rather than having to increment the 'depth' number by 1 in ptdump_hole(), let's change the meaning of 'level' in note_page() since that makes the code simplier. Note that for x86, the level numbers were previously increased by 1 in commit 45dcd2091363 ("x86/mm/dump_pagetables: Fix printout of p4d level") and the comment "Bit 7 has a different meaning" was not updated, so this change also makes the code match the comment again. Link: http://lkml.kernel.org/r/20191218162402.45610-24-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Zong Li <zong.li@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04arm64: mm: display non-present entries in ptdumpSteven Price
Previously the /sys/kernel/debug/kernel_page_tables file would only show lines for entries present in the page tables. However it is useful to also show non-present entries as this makes the size and level of the holes more visible. This aligns the behaviour with x86 which also shows holes. Link: http://lkml.kernel.org/r/20191218162402.45610-23-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Zong Li <zong.li@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04arm64: mm: convert mm/dump.c to use walk_page_range()Steven Price
Now walk_page_range() can walk kernel page tables, we can switch the arm64 ptdump code over to using it, simplifying the code. Link: http://lkml.kernel.org/r/20191218162402.45610-22-steven.price@arm.com Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Burton <paul.burton@mips.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Zong Li <zong.li@sifive.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-27Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The changes are a real mixed bag this time around. The only scary looking one from the diffstat is the uapi change to asm-generic/mman-common.h, but this has been acked by Arnd and is actually just adding a pair of comments in an attempt to prevent allocation of some PROT values which tend to get used for arch-specific purposes. We'll be using them for Branch Target Identification (a CFI-like hardening feature), which is currently under review on the mailing list. New architecture features: - Support for Armv8.5 E0PD, which benefits KASLR in the same way as KPTI but without the overhead. This allows KPTI to be disabled on CPUs that are not affected by Meltdown, even is KASLR is enabled. - Initial support for the Armv8.5 RNG instructions, which claim to provide access to a high bandwidth, cryptographically secure hardware random number generator. As well as exposing these to userspace, we also use them as part of the KASLR seed and to seed the crng once all CPUs have come online. - Advertise a bunch of new instructions to userspace, including support for Data Gathering Hint, Matrix Multiply and 16-bit floating point. Kexec: - Cleanups in preparation for relocating with the MMU enabled - Support for loading crash dump kernels with kexec_file_load() Perf and PMU drivers: - Cleanups and non-critical fixes for a couple of system PMU drivers FPU-less (aka broken) CPU support: - Considerable fixes to support CPUs without the FP/SIMD extensions, including their presence in heterogeneous systems. Good luck finding a 64-bit userspace that handles this. Modern assembly function annotations: - Start migrating our use of ENTRY() and ENDPROC() over to the new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to aid debuggers Kbuild: - Cleanup detection of LSE support in the assembler by introducing 'as-instr' - Remove compressed Image files when building clean targets IP checksumming: - Implement optimised IPv4 checksumming routine when hardware offload is not in use. An IPv6 version is in the works, pending testing. Hardware errata: - Work around Cortex-A55 erratum #1530923 Shadow call stack: - Work around some issues with Clang's integrated assembler not liking our perfectly reasonable assembly code - Avoid allocating the X18 register, so that it can be used to hold the shadow call stack pointer in future ACPI: - Fix ID count checking in IORT code. This may regress broken firmware that happened to work with the old implementation, in which case we'll have to revert it and try something else - Fix DAIF corruption on return from GHES handler with pseudo-NMIs Miscellaneous: - Whitelist some CPUs that are unaffected by Spectre-v2 - Reduce frequency of ASID rollover when KPTI is compiled in but inactive - Reserve a couple of arch-specific PROT flags that are already used by Sparc and PowerPC and are planned for later use with BTI on arm64 - Preparatory cleanup of our entry assembly code in preparation for moving more of it into C later on - Refactoring and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (73 commits) arm64: acpi: fix DAIF manipulation with pNMI arm64: kconfig: Fix alignment of E0PD help text arm64: Use v8.5-RNG entropy for KASLR seed arm64: Implement archrandom.h for ARMv8.5-RNG arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' arm64: entry: Avoid empty alternatives entries arm64: Kconfig: select HAVE_FUTEX_CMPXCHG arm64: csum: Fix pathological zero-length calls arm64: entry: cleanup sp_el0 manipulation arm64: entry: cleanup el0 svc handler naming arm64: entry: mark all entry code as notrace arm64: assembler: remove smp_dmb macro arm64: assembler: remove inherit_daif macro ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map() mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use arm64: Use macros instead of hard-coded constants for MAIR_EL1 arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list arm64: kernel: avoid x18 in __cpu_soft_restart arm64: kvm: stop treating register x18 as caller save arm64/lib: copy_page: avoid x18 register in assembler code ...
2020-01-22Merge branch 'for-next/asm-annotations' into for-next/coreWill Deacon
* for-next/asm-annotations: (6 commits) arm64: kernel: Correct annotation of end of el0_sync ...
2020-01-22Merge branches 'for-next/acpi', 'for-next/cpufeatures', 'for-next/csum', ↵Will Deacon
'for-next/e0pd', 'for-next/entry', 'for-next/kbuild', 'for-next/kexec/cleanup', 'for-next/kexec/file-kdump', 'for-next/misc', 'for-next/nofpsimd', 'for-next/perf' and 'for-next/scs' into for-next/core * for-next/acpi: ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map() * for-next/cpufeatures: (2 commits) arm64: Introduce ID_ISAR6 CPU register ... * for-next/csum: (2 commits) arm64: csum: Fix pathological zero-length calls ... * for-next/e0pd: (7 commits) arm64: kconfig: Fix alignment of E0PD help text ... * for-next/entry: (5 commits) arm64: entry: cleanup sp_el0 manipulation ... * for-next/kbuild: (4 commits) arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' ... * for-next/kexec/cleanup: (11 commits) Revert "arm64: kexec: make dtb_mem always enabled" ... * for-next/kexec/file-kdump: (2 commits) arm64: kexec_file: add crash dump support ... * for-next/misc: (12 commits) arm64: entry: Avoid empty alternatives entries ... * for-next/nofpsimd: (7 commits) arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly ... * for-next/perf: (2 commits) perf/imx_ddr: Fix cpu hotplug state cleanup ... * for-next/scs: (6 commits) arm64: kernel: avoid x18 in __cpu_soft_restart ...
2020-01-17arm64: Use macros instead of hard-coded constants for MAIR_EL1Catalin Marinas
Currently, the arm64 __cpu_setup has hard-coded constants for the memory attributes that go into the MAIR_EL1 register. Define proper macros in asm/sysreg.h and make use of them in proc.S. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-16arm64: mm: avoid x18 in idmap_kpti_install_ng_mappingsSami Tolvanen
idmap_kpti_install_ng_mappings uses x18 as a temporary register, which will result in a conflict when x18 is reserved. Use x16 and x17 instead where needed. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-16arm64: context: Free up kernel ASIDs if KPTI is not in useVladimir Murzin
We can extend user ASID space if it turns out that system does not require KPTI. We start with kernel ASIDs reserved because CPU caps are not finalized yet and free them up lazily on the next rollover if we confirm than KPTI is not in use. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-08mm: change_memory_common: add spaces for `*` operatorPan Zhang
Leaves one space before and after a binary operator both, it may be more elegant. Signed-off-by: Pan Zhang <zhangpan26@huawei.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-08arm64: mm: Use modern annotations for assembly functionsMark Brown
In an effort to clarify and simplify the annotation of assembly functions in the kernel new macros have been introduced. These replace ENTRY and ENDPROC and also add a new annotation for static functions which previously had no ENTRY equivalent. Update the annotations in the mm code to the new macros. Even the functions called from non-standard environments like idmap have no special requirements on their environments so can be treated like regular functions. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-06arm64: Revert support for execute-only user mappingsCatalin Marinas
The ARMv8 64-bit architecture supports execute-only user permissions by clearing the PTE_USER and PTE_UXN bits, practically making it a mostly privileged mapping but from which user running at EL0 can still execute. The downside, however, is that the kernel at EL1 inadvertently reading such mapping would not trip over the PAN (privileged access never) protection. Revert the relevant bits from commit cab15ce604e5 ("arm64: Introduce execute-only page access permissions") so that PROT_EXEC implies PROT_READ (and therefore PTE_USER) until the architecture gains proper support for execute-only user mappings. Fixes: cab15ce604e5 ("arm64: Introduce execute-only page access permissions") Cc: <stable@vger.kernel.org> # 4.9.x- Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04mm/memory_hotplug: shrink zones when offlining memoryDavid Hildenbrand
We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer): BUG: unable to handle page fault for address: 000000000000353d #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:clear_zone_contiguous+0x5/0x10 Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840 RSP: 0018:ffffad2400043c98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000 RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40 RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680 FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __remove_pages+0x4b/0x640 arch_remove_memory+0x63/0x8d try_remove_memory+0xdb/0x130 __remove_memory+0xa/0x11 acpi_memory_device_remove+0x70/0x100 acpi_bus_trim+0x55/0x90 acpi_device_hotplug+0x227/0x3a0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x221/0x550 worker_thread+0x50/0x3b0 kthread+0x105/0x140 ret_from_fork+0x3a/0x50 Modules linked in: CR2: 000000000000353d Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone(() for that. We now properly shrink the zones, even if we have DIMMs whereby - Some memory blocks fall into no zone (never onlined) - Some memory blocks fall into multiple zones (offlined+re-onlined) - Multiple memory blocks that fall into different zones Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section(). Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: <stable@vger.kernel.org> [5.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-06Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - ZONE_DMA32 initialisation fix when memblocks fall entirely within the first GB (used by ZONE_DMA in 5.5 for Raspberry Pi 4). - Couple of ftrace fixes following the FTRACE_WITH_REGS patchset. - access_ok() fix for the Tagged Address ABI when called from from a kernel thread (asynchronous I/O): the kthread does not have the TIF flags of the mm owner, so untag the user address unconditionally. - KVM compute_layout() called before the alternatives code patching. - Minor clean-ups. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: entry: refine comment of stack overflow check arm64: ftrace: fix ifdeffery arm64: KVM: Invoke compute_layout() before alternatives are applied arm64: Validate tagged addresses in access_ok() called from kernel threads arm64: mm: Fix column alignment for UXN in kernel_page_tables arm64: insn: consistently handle exit text arm64: mm: Fix initialisation of DMA zones on non-NUMA systems
2019-12-04arm64: mm: Fix column alignment for UXN in kernel_page_tablesMark Brown
UXN is the only individual PTE bit other than the PTE_ATTRINDX_MASK ones which doesn't have both a set and a clear value provided, meaning that the columns in the table won't all be aligned. The PTE_ATTRINDX_MASK values are all both mutually exclusive and longer so are listed last to make a single final column for those values. Ensure everything is aligned by providing a clear value for UXN. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-12-04arm64: mm: Fix initialisation of DMA zones on non-NUMA systemsWill Deacon
John reports that the recently merged commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") breaks the boot on his DB845C board: | Booting Linux on physical CPU 0x0000000000 [0x517f803c] | Linux version 5.4.0-mainline-10675-g957a03b9e38f | Machine model: Thundercomm Dragonboard 845c | [...] | Built 1 zonelists, mobility grouping on. Total pages: -188245 | Kernel command line: earlycon | firmware_class.path=/vendor/firmware/ androidboot.hardware=db845c | init=/init androidboot.boot_devices=soc/1d84000.ufshc | printk.devkmsg=on buildvariant=userdebug root=/dev/sda2 | androidboot.bootdevice=1d84000.ufshc androidboot.serialno=c4e1189c | androidboot.baseband=sda | msm_drm.dsi_display0=dsi_lt9611_1080_video_display: | androidboot.slot_suffix=_a skip_initramfs rootwait ro init=/init | | <hangs indefinitely here> This is because, when CONFIG_NUMA=n, zone_sizes_init() fails to handle memblocks that fall entirely within the ZONE_DMA region and erroneously ends up trying to add a negatively-sized region into the following ZONE_DMA32, which is later interpreted as a large unsigned region by the core MM code. Rework the non-NUMA implementation of zone_sizes_init() so that the start address of the memblock being processed is adjusted according to the end of the previous zone, which is then range-checked before updating the hole information of subsequent zones. Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/lkml/CALAqxLVVcsmFrDKLRGRq7GewcW405yTOxG=KR3csVzQ6bXutkA@mail.gmail.com Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") Reported-by: John Stultz <john.stultz@linaro.org> Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-28Merge branch 'master' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux; tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - improve dma-debug scalability (Eric Dumazet) - tiny dma-debug cleanup (Dan Carpenter) - check for vmap memory in dma_map_single (Kees Cook) - check for dma_addr_t overflows in dma-direct when using DMA offsets (Nicolas Saenz Julienne) - switch the x86 sta2x11 SOC to use more generic DMA code (Nicolas Saenz Julienne) - fix arm-nommu dma-ranges handling (Vladimir Murzin) - use __initdata in CMA (Shyam Saini) - replace the bus dma mask with a limit (Nicolas Saenz Julienne) - merge the remapping helpers into the main dma-direct flow (me) - switch xtensa to the generic dma remap handling (me) - various cleanups around dma_capable (me) - remove unused dev arguments to various dma-noncoherent helpers (me) * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux: * tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping: (22 commits) dma-mapping: treat dev->bus_dma_mask as a DMA limit dma-direct: exclude dma_direct_map_resource from the min_low_pfn check dma-direct: don't check swiotlb=force in dma_direct_map_resource dma-debug: clean up put_hash_bucket() powerpc: remove support for NULL dev in __phys_to_dma / __dma_to_phys dma-direct: avoid a forward declaration for phys_to_dma dma-direct: unify the dma_capable definitions dma-mapping: drop the dev argument to arch_sync_dma_for_* x86/PCI: sta2x11: use default DMA address translation dma-direct: check for overflows on 32 bit DMA addresses dma-debug: increase HASH_SIZE dma-debug: reorder struct dma_debug_entry fields xtensa: use the generic uncached segment support dma-mapping: merge the generic remapping helpers into dma-direct dma-direct: provide mmap and get_sgtable method overrides dma-direct: remove the dma_handle argument to __dma_direct_alloc_pages dma-direct: remove __dma_direct_free_pages usb: core: Remove redundant vmap checks kernel: dma-contiguous: mark CMA parameters __initdata/__initconst dma-debug: add a schedule point in debug_dma_dump_mappings() ...
2019-11-26Merge tag 'acpi-5.5-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to upstream revision 20191018, add support for EFI specific purpose memory, update the ACPI EC driver to make it work on systems with hardware-reduced ACPI, improve ACPI-based device enumeration for some platforms, rework the lid blacklist handling in the button driver and add more lid quirks to it, unify ACPI _HID/_UID matching, fix assorted issues and clean up the code and documentation. Specifics: - Update the ACPICA code in the kernel to upstream revision 20191018 including: * Fixes for Clang warnings (Bob Moore) * Fix for possible overflow in get_tick_count() (Bob Moore) * Introduction of acpi_unload_table() (Bob Moore) * Debugger and utilities updates (Erik Schmauss) * Fix for unloading tables loaded via configfs (Nikolaus Voss) - Add support for EFI specific purpose memory to optionally allow either application-exclusive or core-kernel-mm managed access to differentiated memory (Dan Williams) - Fix and clean up processing of the HMAT table (Brice Goglin, Qian Cai, Tao Xu) - Update the ACPI EC driver to make it work on systems with hardware-reduced ACPI (Daniel Drake) - Always build in support for the Generic Event Device (GED) to allow one kernel binary to work both on systems with full hardware ACPI and hardware-reduced ACPI (Arjan van de Ven) - Fix the table unload mechanism to unregister platform devices created when the given table was loaded (Andy Shevchenko) - Rework the lid blacklist handling in the button driver and add more lid quirks to it (Hans de Goede) - Improve ACPI-based device enumeration for some platforms based on Intel BayTrail SoCs (Hans de Goede) - Add an OpRegion driver for the Cherry Trail Crystal Cove PMIC and prevent handlers from being registered for unhandled PMIC OpRegions (Hans de Goede) - Unify ACPI _HID/_UID matching (Andy Shevchenko) - Clean up documentation and comments (Cao jin, James Pack, Kacper Piwiński)" * tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits) ACPI: OSI: Shoot duplicate word ACPI: HMAT: use %u instead of %d to print u32 values ACPI: NUMA: HMAT: fix a section mismatch ACPI: HMAT: don't mix pxm and nid when setting memory target processor_pxm ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device ACPI: NUMA: HMAT: Register HMAT at device_initcall level device-dax: Add a driver for "hmem" devices dax: Fix alloc_dax_region() compile warning lib: Uplevel the pmem "region" ida to a global allocator x86/efi: Add efi_fake_mem support for EFI_MEMORY_SP arm/efi: EFI soft reservation to memblock x86/efi: EFI soft reservation to E820 enumeration efi: Common enable/disable infrastructure for EFI soft reservation x86/efi: Push EFI_MEMMAP check into leaf routines efi: Enumerate EFI_MEMORY_SP ACPI: NUMA: Establish a new drivers/acpi/numa/ directory ACPICA: Update version to 20191018 ACPICA: debugger: remove leading whitespaces when converting a string to a buffer ACPICA: acpiexec: initialize all simple types and field units from user input ACPICA: debugger: add field unit support for acpi_db_get_next_token ...
2019-11-21Merge branch 'for-next/zone-dma' of ↵Christoph Hellwig
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into dma-mapping-for-next Pull in a stable branch from the arm64 tree that adds the zone_dma_bits variable to avoid creating hard to resolve conflicts with that addition.
2019-11-20dma-mapping: drop the dev argument to arch_sync_dma_for_*Christoph Hellwig
These are pure cache maintainance routines, so drop the unused struct device argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-11-08Merge branches 'for-next/elf-hwcap-docs', 'for-next/smccc-conduit-cleanup', ↵Catalin Marinas
'for-next/zone-dma', 'for-next/relax-icc_pmr_el1-sync', 'for-next/double-page-fault', 'for-next/misc', 'for-next/kselftest-arm64-signal' and 'for-next/kaslr-diagnostics' into for-next/core * for-next/elf-hwcap-docs: : Update the arm64 ELF HWCAP documentation docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s] docs/arm64: cpu-feature-registers: Documents missing visible fields docs/arm64: elf_hwcaps: Document HWCAP_SB docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value * for-next/smccc-conduit-cleanup: : SMC calling convention conduit clean-up firmware: arm_sdei: use common SMCCC_CONDUIT_* firmware/psci: use common SMCCC_CONDUIT_* arm: spectre-v2: use arm_smccc_1_1_get_conduit() arm64: errata: use arm_smccc_1_1_get_conduit() arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit() * for-next/zone-dma: : Reintroduction of ZONE_DMA for Raspberry Pi 4 support arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 dma/direct: turn ARCH_ZONE_DMA_BITS into a variable arm64: Make arm64_dma32_phys_limit static arm64: mm: Fix unused variable warning in zone_sizes_init mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type' arm64: use both ZONE_DMA and ZONE_DMA32 arm64: rename variables used to calculate ZONE_DMA32's size arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys() * for-next/relax-icc_pmr_el1-sync: : Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear arm64: Document ICC_CTLR_EL3.PMHE setting requirements arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear * for-next/double-page-fault: : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag mm: fix double page fault on arm64 if PTE_AF is cleared x86/mm: implement arch_faults_on_old_pte() stub on x86 arm64: mm: implement arch_faults_on_old_pte() on arm64 arm64: cpufeature: introduce helper cpu_has_hw_af() * for-next/misc: : Various fixes and clean-ups arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist arm64: mm: Remove MAX_USER_VA_BITS definition arm64: mm: simplify the page end calculation in __create_pgd_mapping() arm64: print additional fault message when executing non-exec memory arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill() arm64: pgtable: Correct typo in comment arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1 arm64: cpufeature: Fix typos in comment arm64/mm: Poison initmem while freeing with free_reserved_area() arm64: use generic free_initrd_mem() arm64: simplify syscall wrapper ifdeffery * for-next/kselftest-arm64-signal: : arm64-specific kselftest support with signal-related test-cases kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile * for-next/kaslr-diagnostics: : Provide diagnostics on boot for KASLR arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot
2019-11-07arm/efi: EFI soft reservation to memblockDan Williams
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the interpretation of the EFI Memory Types as "reserved for a specific purpose". The proposed Linux behavior for specific purpose memory is that it is reserved for direct-access (device-dax) by default and not available for any kernel usage, not even as an OOM fallback. Later, through udev scripts or another init mechanism, these device-dax claimed ranges can be reconfigured and hot-added to the available System-RAM with a unique node identifier. This device-dax management scheme implements "soft" in the "soft reserved" designation by allowing some or all of the reservation to be recovered as typical memory. This policy can be disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with efi=nosoftreserve. For this patch, update the ARM paths that consider EFI_CONVENTIONAL_MEMORY to optionally take the EFI_MEMORY_SP attribute into account as a reservation indicator. Publish the soft reservation as IORES_DESC_SOFT_RESERVED memory, similar to x86. (Based on an original patch by Ard) Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-07arm64: mm: reserve CMA and crashkernel in ZONE_DMA32Nicolas Saenz Julienne
With the introduction of ZONE_DMA in arm64 we moved the default CMA and crashkernel reservation into that area. This caused a regression on big machines that need big CMA and crashkernel reservations. Note that ZONE_DMA is only 1GB big. Restore the previous behavior as the wide majority of devices are OK with reserving these in ZONE_DMA32. The ones that need them in ZONE_DMA will configure it explicitly. Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-06arm64: mm: simplify the page end calculation in __create_pgd_mapping()Masahiro Yamada
Calculate the page-aligned end address more simply. The local variable, "length" is unneeded. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-01dma/direct: turn ARCH_ZONE_DMA_BITS into a variableNicolas Saenz Julienne
Some architectures, notably ARM, are interested in tweaking this depending on their runtime DMA addressing limitations. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-29arm64: print additional fault message when executing non-exec memoryXiang Zheng
When attempting to executing non-executable memory, the fault message shows: Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000 This may confuse someone, so add a new fault message for instruction abort. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28Merge branch 'for-next/entry-s-to-c' into for-next/coreCatalin Marinas
Move the synchronous exception paths from entry.S into a C file to improve the code readability. * for-next/entry-s-to-c: arm64: entry-common: don't touch daif before bp-hardening arm64: Remove asmlinkage from updated functions arm64: entry: convert el0_sync to C arm64: entry: convert el1_sync to C arm64: add local_daif_inherit() arm64: Add prototypes for functions called by entry.S arm64: remove __exception annotations
2019-10-28arm64: Make arm64_dma32_phys_limit staticCatalin Marinas
This variable is only used in the arch/arm64/mm/init.c file for ZONE_DMA32 initialisation, no need to expose it. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28arm64: entry-common: don't touch daif before bp-hardeningJames Morse
The previous patches mechanically transformed the assembly version of entry.S to entry-common.c for synchronous exceptions. The C version of local_daif_restore() doesn't quite do the same thing as the assembly versions if pseudo-NMI is in use. In particular, | local_daif_restore(DAIF_PROCCTX_NOIRQ) will still allow pNMI to be delivered. This is not the behaviour do_el0_ia_bp_hardening() and do_sp_pc_abort() want as it should not be possible for the PMU handler to run as an NMI until the bp-hardening sequence has run. The bp-hardening calls were placed where they are because this was the first C code to run after the relevant exceptions. As we've now moved that point earlier, move the checks and calls earlier too. This makes it clearer that this stuff runs before any kind of exception, and saves modifying PSTATE twice. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28arm64: Remove asmlinkage from updated functionsJames Morse
Now that the callers of these functions have moved into C, they no longer need the asmlinkage annotation. Remove it. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28arm64: remove __exception annotationsJames Morse
Since commit 732674980139 ("arm64: unwind: reference pt_regs via embedded stack frame") arm64 has not used the __exception annotation to dump the pt_regs during stack tracing. in_exception_text() has no callers. This annotation is only used to blacklist kprobes, it means the same as __kprobes. Section annotations like this require the functions to be grouped together between the start/end markers, and placed according to the linker script. For kprobes we also have NOKPROBE_SYMBOL() which logs the symbol address in a section that kprobes parses and blacklists at boot. Using NOKPROBE_SYMBOL() instead lets kprobes publish the list of blacklisted symbols, and saves us from having an arm64 specific spelling of __kprobes. do_debug_exception() already has a NOKPROBE_SYMBOL() annotation. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-16arm64: mm: fix inverted PAR_EL1.F checkMark Rutland
When detecting a spurious EL1 translation fault, we have the CPU retry the translation using an AT S1E1R instruction, and inspect PAR_EL1 to determine if the fault was spurious. When PAR_EL1.F == 0, the AT instruction successfully translated the address without a fault, which implies the original fault was spurious. However, in this case we return false and treat the original fault as if it was not spurious. Invert the return value so that we treat such a case as spurious. Cc: Catalin Marinas <catalin.marinas@arm.com> Fixes: 42f91093b043 ("arm64: mm: Ignore spurious translation faults taken from the kernel") Tested-by: James Morse <james.morse@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-16arm64: mm: Fix unused variable warning in zone_sizes_initNathan Chancellor
When building arm64 allnoconfig, CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 get disabled so there is a warning about max_dma being unused. ../arch/arm64/mm/init.c:215:16: warning: unused variable 'max_dma' [-Wunused-variable] unsigned long max_dma = min; ^ 1 warning generated. Add __maybe_unused to make this clear to the compiler. Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-16arm64/mm: Poison initmem while freeing with free_reserved_area()Anshuman Khandual
Platform implementation for free_initmem() should poison the memory while freeing it up. Hence pass across POISON_FREE_INITMEM while calling into free_reserved_area(). The same is being followed in the generic fallback for free_initmem() and some other platforms overriding it. Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-kernel@vger.kernel.org Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-16arm64: use generic free_initrd_mem()Mike Rapoport
arm64 calls memblock_free() for the initrd area in its implementation of free_initrd_mem(), but this call has no actual effect that late in the boot process. By the time initrd is freed, all the reserved memory is managed by the page allocator and the memblock.reserved is unused, so the only purpose of the memblock_free() call is to keep track of initrd memory for debugging and accounting. Without the memblock_free() call the only difference between arm64 and the generic versions of free_initrd_mem() is the memory poisoning. Move memblock_free() call to the generic code, enable it there for the architectures that define ARCH_KEEP_MEMBLOCK and use the generic implementation of free_initrd_mem() on arm64. Tested-by: Anshuman Khandual <anshuman.khandual@arm.com> #arm64 Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-14arm64: use both ZONE_DMA and ZONE_DMA32Nicolas Saenz Julienne
So far all arm64 devices have supported 32 bit DMA masks for their peripherals. This is not true anymore for the Raspberry Pi 4 as most of it's peripherals can only address the first GB of memory on a total of up to 4 GB. This goes against ZONE_DMA32's intent, as it's expected for ZONE_DMA32 to be addressable with a 32 bit mask. So it was decided to re-introduce ZONE_DMA in arm64. ZONE_DMA will contain the lower 1G of memory, which is currently the memory area addressable by any peripheral on an arm64 device. ZONE_DMA32 will contain the rest of the 32 bit addressable memory. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-14arm64: rename variables used to calculate ZONE_DMA32's sizeNicolas Saenz Julienne
Let the name indicate that they are used to calculate ZONE_DMA32's size as opposed to ZONE_DMA. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-14arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys()Nicolas Saenz Julienne
By the time we call zones_sizes_init() arm64_dma_phys_limit already contains the result of max_zone_dma_phys(). We use the variable instead of calling the function directly to save some precious cpu time. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-07arm64: mm: fix spurious fault detectionMark Rutland
When detecting a spurious EL1 translation fault, we attempt to compare ESR_EL1.DFSC with PAR_EL1.FST. We erroneously use FIELD_PREP() to extract PAR_EL1.FST, when we should be using FIELD_GET(). In the wise words of Robin Murphy: | FIELD_GET() is a UBFX, FIELD_PREP() is a BFI Using FIELD_PREP() means that that dfsc & ESR_ELx_FSC_TYPE is always zero, and hence not equal to ESR_ELx_FSC_FAULT. Thus we detect any unhandled translation fault as spurious. ... so let's use FIELD_GET() to ensure we don't decide all translation faults are spurious. ESR_EL1.DFSC occupies bits [5:0], and requires no shifting. Fixes: 42f91093b043332a ("arm64: mm: Ignore spurious translation faults taken from the kernel") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Robin Murphy <robin.murphy@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-04arm64: mm: avoid virt_to_phys(init_mm.pgd)Mark Rutland
If we take an unhandled fault in the kernel, we call show_pte() to dump the {PGDP,PGD,PUD,PMD,PTE} values for the corresponding page table walk, where the PGDP value is virt_to_phys(mm->pgd). The boot-time and runtime kernel page tables, init_pg_dir and swapper_pg_dir respectively, are kernel symbols. Thus, it is not valid to call virt_to_phys() on either of these, though we'll do so if we take a fault on a TTBR1 address. When CONFIG_DEBUG_VIRTUAL is not selected, virt_to_phys() will silently fix this up. However, when CONFIG_DEBUG_VIRTUAL is selected, this results in splats as below. Depending on when these occur, they can happen to suppress information needed to debug the original unhandled fault, such as the backtrace: | Unable to handle kernel paging request at virtual address ffff7fffec73cf0f | Mem abort info: | ESR = 0x96000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | Data abort info: | ISV = 0, ISS = 0x00000004 | CM = 0, WnR = 0 | ------------[ cut here ]------------ | virt_to_phys used for non-linear address: 00000000102c9dbe (swapper_pg_dir+0x0/0x1000) | WARNING: CPU: 1 PID: 7558 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0xe0/0x170 arch/arm64/mm/physaddr.c:12 | Kernel panic - not syncing: panic_on_warn set ... | SMP: stopping secondary CPUs | Dumping ftrace buffer: | (ftrace buffer empty) | Kernel Offset: disabled | CPU features: 0x0002,23000438 | Memory Limit: none | Rebooting in 1 seconds.. We can avoid this by ensuring that we call __pa_symbol() for init_mm.pgd, as this will always be a kernel symbol. As the dumped {PGD,PUD,PMD,PTE} values are the raw values from the relevant entries we don't need to handle these specially. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-09-26mm: treewide: clarify pgtable_page_{ctor,dtor}() namingMark Rutland
The naming of pgtable_page_{ctor,dtor}() seems to have confused a few people, and until recently arm64 used these erroneously/pointlessly for other levels of page table. To make it incredibly clear that these only apply to the PTE level, and to align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them to pgtable_pte_page_{ctor,dtor}(). These changes were generated with the following shell script: ---- git grep -lw 'pgtable_page_.tor' | while read FILE; do sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE; sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE; done ---- ... with the documentation re-flowed to remain under 80 columns, and whitespace fixed up in macros to keep backslashes aligned. There should be no functional change as a result of this patch. Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24arm64, mm: move generic mmap layout functions to mmAlexandre Ghiti
arm64 handles top-down mmap layout in a way that can be easily reused by other architectures, so make it available in mm. It then introduces a new config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT that can be set by other architectures to benefit from those functions. Note that this new config depends on MMU being enabled, if selected without MMU support, a warning will be thrown. Link: http://lkml.kernel.org/r/20190730055113.23635-5-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Suggested-by: Christoph Hellwig <hch@infradead.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: James Hogan <jhogan@kernel.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24arm64: consider stack randomization for mmap base only when necessaryAlexandre Ghiti
Do not offset mmap base address because of stack randomization if current task does not want randomization. Note that x86 already implements this behaviour. Link: http://lkml.kernel.org/r/20190730055113.23635-4-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: James Hogan <jhogan@kernel.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24arm64: make use of is_compat_task instead of hardcoding this testAlexandre Ghiti
Each architecture has its own way to determine if a task is a compat task, by using is_compat_task in arch_mmap_rnd, it allows more genericity and then it prepares its moving to mm/. Link: http://lkml.kernel.org/r/20190730055113.23635-3-alex@ghiti.fr Signed-off-by: Alexandre Ghiti <alex@ghiti.fr> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: James Hogan <jhogan@kernel.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24mm: consolidate pgtable_cache_init() and pgd_cache_init()Mike Rapoport
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem cache for page table allocations on several architectures that do not use PAGE_SIZE tables for one or more levels of the page table hierarchy. Most architectures do not implement these functions and use __weak default NOP implementation of pgd_cache_init(). Since there is no such default for pgtable_cache_init(), its empty stub is duplicated among most architectures. Rename the definitions of pgd_cache_init() to pgtable_cache_init() and drop empty stubs of pgtable_cache_init(). Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Will Deacon <will@kernel.org> [arm64] Acked-by: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24mm: introduce page_size()Matthew Wilcox (Oracle)
Patch series "Make working with compound pages easier", v2. These three patches add three helpers and convert the appropriate places to use them. This patch (of 3): It's unnecessarily hard to find out the size of a potentially huge page. Replace 'PAGE_SIZE << compound_order(page)' with page_size(page). Link: http://lkml.kernel.org/r/20190721104612.19120-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-19Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping updates from Christoph Hellwig: - add dma-mapping and block layer helpers to take care of IOMMU merging for mmc plus subsequent fixups (Yoshihiro Shimoda) - rework handling of the pgprot bits for remapping (me) - take care of the dma direct infrastructure for swiotlb-xen (me) - improve the dma noncoherent remapping infrastructure (me) - better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me) - cleanup mmaping of coherent DMA allocations (me) - various misc cleanups (Andy Shevchenko, me) * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits) mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE mmc: queue: Fix bigger segments usage arm64: use asm-generic/dma-mapping.h swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page swiotlb-xen: simplify cache maintainance swiotlb-xen: use the same foreign page check everywhere swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable xen: remove the exports for xen_{create,destroy}_contiguous_region xen/arm: remove xen_dma_ops xen/arm: simplify dma_cache_maint xen/arm: use dev_is_dma_coherent xen/arm: consolidate page-coherent.h xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance arm: remove wrappers for the generic dma remap helpers dma-mapping: introduce a dma_common_find_pages helper dma-mapping: always use VM_DMA_COHERENT for generic DMA remap vmalloc: lift the arm flag for coherent mappings to common code dma-mapping: provide a better default ->get_required_mask dma-mapping: remove the dma_declare_coherent_memory export remoteproc: don't allow modular build ...