aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/lib
AgeCommit message (Collapse)Author
2017-03-02sched/headers: Prepare to remove the <linux/mm_types.h> dependency from ↵Ingo Molnar
<linux/sched.h> Update code that relied on sched.h including various MM types for them. This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-02-27kprobes: move kprobe declarations to asm-generic/kprobes.hLuis R. Rodriguez
Often all is needed is these small helpers, instead of compiler.h or a full kprobes.h. This is important for asm helpers, in fact even some asm/kprobes.h make use of these helpers... instead just keep a generic asm file with helpers useful for asm code with the least amount of clutter as possible. Likewise we need now to also address what to do about this file for both when architectures have CONFIG_HAVE_KPROBES, and when they do not. Then for when architectures have CONFIG_HAVE_KPROBES but have disabled CONFIG_KPROBES. Right now most asm/kprobes.h do not have guards against CONFIG_KPROBES, this means most architecture code cannot include asm/kprobes.h safely. Correct this and add guards for architectures missing them. Additionally provide architectures that not have kprobes support with the default asm-generic solution. This lets us force asm/kprobes.h on the header include/linux/kprobes.h always, but most importantly we can now safely include just asm/kprobes.h on architecture code without bringing the full kitchen sink of header files. Two architectures already provided a guard against CONFIG_KPROBES on its kprobes.h: sh, arch. The rest of the architectures needed gaurds added. We avoid including any not-needed headers on asm/kprobes.h unless kprobes have been enabled. In a subsequent atomic change we can try now to remove compiler.h from include/linux/kprobes.h. During this sweep I've also identified a few architectures defining a common macro needed for both kprobes and ftrace, that of the definition of the breakput instruction up. Some refer to this as BREAKPOINT_INSTRUCTION. This must be kept outside of the #ifdef CONFIG_KPROBES guard. [mcgrof@kernel.org: fix arm64 build] Link: http://lkml.kernel.org/r/CAB=NE6X1WMByuARS4mZ1g9+W=LuVBnMDnh_5zyN0CLADaVh=Jw@mail.gmail.com [sfr@canb.auug.org.au: fixup for kprobes declarations moving] Link: http://lkml.kernel.org/r/20170214165933.13ebd4f4@canb.auug.org.au Link: http://lkml.kernel.org/r/20170203233139.32682-1-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-10powerpc/kprobes: Implement OptprobesAnju T
Current infrastructure of kprobe uses the unconditional trap instruction to probe a running kernel. Optprobe allows kprobe to replace the trap with a branch instruction to a detour buffer. Detour buffer contains instructions to create an in memory pt_regs. Detour buffer also has a call to optimized_callback() which in turn call the pre_handler(). After the execution of the pre-handler, a call is made for instruction emulation. The NIP is determined in advanced through dummy instruction emulation and a branch instruction is created to the NIP at the end of the trampoline. To address the limitation of branch instruction in POWER architecture, detour buffer slot is allocated from a reserved area. For the time being, 64KB is reserved in memory for this purpose. Instructions which can be emulated using analyse_instr() are the candidates for optimization. Before optimization ensure that the address range between the detour buffer allocated and the instruction being probed is within +/- 32MB. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-10powerpc: Add helper to check if offset is within relative branch rangeAnju T
To permit the use of relative branch instruction in powerpc, the target address has to be relatively nearby, since the address is specified in an immediate field (24 bit filed) in the instruction opcode itself. Here nearby refers to 32MB on either side of the current instruction. This patch verifies whether the target address is within +/- 32MB range or not. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-06powerpc/64: Fix naming of cache block vs. cache lineBenjamin Herrenschmidt
In a number of places we called "cache line size" what is actually the cache block size, which in the powerpc architecture, means the effective size to use with cache management instructions (it can be different from the actual cache line size). We fix the naming across the board and properly retrieve both pieces of information when available in the device-tree. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25powerpc/sstep: Return directly after a failed address_ok() in emulate_step()Markus Elfring
Setting err and going to ldst_done just returns 0, without using err, so just return 0 directly. We already do that for other call sites in this function. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> [mpe: Rewrite change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25powerpc/64: Use optimized checksum routines on little-endianPaul Mackerras
Currently we have optimized hand-coded assembly checksum routines for big-endian 64-bit systems, but for little-endian we use the generic C routines. This modifies the optimized routines to work for little-endian. With this, we no longer need to enable CONFIG_GENERIC_CSUM. This also fixes a couple of comments in checksum_64.S so they accurately reflect what the associated instruction does. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Use the more common __BIG_ENDIAN__] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-24Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-18powerpc/lib: Fix randconfig build failure in sstep.cMichael Ellerman
Under some configs we need to explicitly include cpu_has_feature.h, otherwise we fail with: arch/powerpc/lib/sstep.c:1992:7: error: implicit declaration of function 'cpu_has_feature' Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-14powerpc: EX_TABLE macro for exception tablesNicholas Piggin
This macro is taken from s390, and allows more flexibility in changing exception table format. mpe: Put it in ppc_asm.h and only define one version using stringinfy_in_c(). Add some empty definitions and headers to keep the selftests happy. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-14Merge branch 'kbuild' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - EXPORT_SYMBOL for asm source by Al Viro. This does bring a regression, because genksyms no longer generates checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is working on a patch to fix this. Plus, we are talking about functions like strcpy(), which rarely change prototypes. - Fixes for PPC fallout of the above by Stephen Rothwell and Nick Piggin - fixdep speedup by Alexey Dobriyan. - preparatory work by Nick Piggin to allow architectures to build with -ffunction-sections, -fdata-sections and --gc-sections - CONFIG_THIN_ARCHIVES support by Stephen Rothwell - fix for filenames with colons in the initramfs source by me. * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits) initramfs: Escape colons in depfile ppc: there is no clear_pages to export powerpc/64: whitelist unresolved modversions CRCs kbuild: -ffunction-sections fix for archs with conflicting sections kbuild: add arch specific post-link Makefile kbuild: allow archs to select link dead code/data elimination kbuild: allow architectures to use thin archives instead of ld -r kbuild: Regenerate genksyms lexer kbuild: genksyms fix for typeof handling fixdep: faster CONFIG_ search ia64: move exports to definitions sparc32: debride memcpy.S a bit [sparc] unify 32bit and 64bit string.h sparc: move exports to definitions ppc: move exports to definitions arm: move exports to definitions s390: move exports to definitions m68k: move exports to definitions alpha: move exports to actual definitions x86: move exports to actual definitions ...
2016-10-14Merge tag 'powerpc-4.9-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc updates from Michael Ellerman: "Some more powerpc updates for 4.9: Freescale updates from Scott Wood: - qbman support (a prerequisite for datapath drivers such as ethernet) - a PCI DMA fix+improvement - reset handler changes - more 8xx optimizations - some cleanups and fixes.' Fixes: - selftests/powerpc: Add missing binaries to .gitignores (Michael Ellerman) - selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes (Michael Ellerman) - powerpc/pseries: Fix stack corruption in htpe code (Laurent Dufour) - powerpc/64s: Fix power4_fixup_nap placement (Nicholas Piggin) - powerpc/64: Fix incorrect return value from __copy_tofrom_user (Paul Mackerras) - powerpc/mm/hash64: Fix might_have_hea() check (Michael Ellerman) Other: - MAINTAINERS: Remove myself from PA Semi entries (Olof Johansson) - MAINTAINERS: Drop separate pseries entry (Michael Ellerman) - MAINTAINERS: Update powerpc website & add selftests (Michael Ellerman): * tag 'powerpc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (35 commits) powerpc/mm/hash64: Fix might_have_hea() check powerpc/64: Fix incorrect return value from __copy_tofrom_user powerpc/64s: Fix power4_fixup_nap placement powerpc/pseries: Fix stack corruption in htpe code selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes MAINTAINERS: Update powerpc website & add selftests MAINTAINERS: Drop separate pseries entry MAINTAINERS: Remove myself from PA Semi entries selftests/powerpc: Add missing binaries to .gitignores arch/powerpc: Add CONFIG_FSL_DPAA to corenetXX_smp_defconfig soc/qman: Add self-test for QMan driver soc/bman: Add self-test for BMan driver soc/fsl: Introduce DPAA 1.x QMan device driver soc/fsl: Introduce DPAA 1.x BMan device driver powerpc/8xx: make user addr DTLB miss the short path powerpc/8xx: Move additional DTLBMiss handlers out of exception area powerpc/8xx: use r3 to scratch CR in ITLBmiss soc/fsl/qe: fix gpio save_regs functions powerpc/8xx: add dedicated machine check handler powerpc/8xx: add system_reset_exception ...
2016-10-12powerpc/64: Fix incorrect return value from __copy_tofrom_userPaul Mackerras
Debugging a data corruption issue with virtio-net/vhost-net led to the observation that __copy_tofrom_user was occasionally returning a value 16 larger than it should. Since the return value from __copy_tofrom_user is the number of bytes not copied, this means that __copy_tofrom_user can occasionally return a value larger than the number of bytes it was asked to copy. In turn this can cause higher-level copy functions such as copy_page_to_iter_iovec to corrupt memory by copying data into the wrong memory locations. It turns out that the failing case involves a fault on the store at label 79, and at that point the first unmodified byte of the destination is at R3 + 16. Consequently the exception handler for that store needs to add 16 to R3 before using it to work out how many bytes were not copied, but in this one case it was not adding the offset to R3. To fix it, this moves the label 179 to the point where we add 16 to R3. I have checked manually all the exception handlers for the loads and stores in this code and the rest of them are correct (it would be excellent to have an automated test of all the exception cases). This bug has been present since this code was initially committed in May 2002 to Linux version 2.5.20. Cc: stable@vger.kernel.org Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-07Merge tag 'powerpc-4.9-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin) - Use gas sections for arranging exception vectors et. al. - Large set of TM cleanups and selftests (Cyril Bur) - Enable transactional memory (TM) lazily for userspace (Cyril Bur) - Support for XZ compression in the zImage wrapper (Oliver O'Halloran) - Add support for bpf constant blinding (Naveen N. Rao) - Beginnings of upstream support for PA Semi Nemo motherboards (Darren Stevens) Fixes: - Ensure .mem(init|exit).text are within _stext/_etext (Michael Ellerman) - xmon: Don't use ld on 32-bit (Michael Ellerman) - vdso64: Use double word compare on pointers (Anton Blanchard) - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui) - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy) - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K (Aneesh Kumar K.V) - Fix memory leak in queue_hotplug_event() error path (Andrew Donnellan) - Replay hypervisor maintenance interrupt first (Nicholas Piggin) Various performance optimisations (Anton Blanchard): - Align hot loops of memset() and backwards_memcpy() - During context switch, check before setting mm_cpumask - Remove static branch prediction in atomic{, 64}_add_unless - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian - Set default CPU type to POWER8 for little endian builds Cleanups & features: - Sparse fixes/cleanups (Daniel Axtens) - Preserve CFAR value on SLB miss caused by access to bogus address (Paul Mackerras) - Radix MMU fixups for POWER9 (Aneesh Kumar K.V) - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU) (Simon Guo) - Optimise syscall entry for virtual, relocatable case (Nicholas Piggin) - Optimise MSR handling in exception handling (Nicholas Piggin) - Support for kexec with Radix MMU (Benjamin Herrenschmidt) - powernv EEH fixes (Russell Currey) - Suprise PCI hotplug support for powernv (Gavin Shan) - Endian/sparse fixes for powernv PCI (Gavin Shan) - Defconfig updates (Anton Blanchard) - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh) - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat) - cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put() (Andrew Donnellan) - Fix HV facility unavailable to use correct handler (Nicholas Piggin) - Remove unnecessary syscall trampoline (Nicholas Piggin) - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael Ellerman) - Quieten EEH message when no adapters are found (Anton Blanchard) - powernv: Add PHB register dump debugfs handle (Russell Currey) - Use kprobe blacklist for exception handlers & asm functions (Nicholas Piggin) - Document the syscall ABI (Nicholas Piggin) - MAINTAINERS: Update cxl maintainers (Michael Neuling) - powerpc: Remove all usages of NO_IRQ (Michael Ellerman) Minor cleanups: - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur, Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng, Simon Guo" * tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits) powerpc/bpf: Add support for bpf constant blinding powerpc/bpf: Implement support for tail calls powerpc/bpf: Introduce accessors for using the tmp local stack space powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n powerpc: tm: Enable transactional memory (TM) lazily for userspace powerpc/tm: Add TM Unavailable Exception powerpc: Remove do_load_up_transact_{fpu,altivec} powerpc: tm: Rename transct_(*) to ck(\1)_state powerpc: tm: Always use fp_state and vr_state to store live registers selftests/powerpc: Add checks for transactional VSXs in signal contexts selftests/powerpc: Add checks for transactional VMXs in signal contexts selftests/powerpc: Add checks for transactional FPUs in signal contexts selftests/powerpc: Add checks for transactional GPRs in signal contexts selftests/powerpc: Check that signals always get delivered selftests/powerpc: Add TM tcheck helpers in C selftests/powerpc: Allow tests to extend their kill timeout selftests/powerpc: Introduce GPR asm helper header file selftests/powerpc: Move VMX stack frame macros to header file selftests/powerpc: Rework FPU stack placement macros and move to header file selftests/powerpc: Check for VSX preservation across userspace preemption ...
2016-10-04powerpc/64: Align hot loops of memset() and backwards_memcpy()Anton Blanchard
Align the hot loops in our assembly implementation of memset() and backwards_memcpy(). backwards_memcpy() is called from tcp_v4_rcv(), so we might want to optimise this a little more. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Nick Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13powerpc/Makefile: Drop CONFIG_WORD_SIZE for BITSMichael Ellerman
Commit 2578bfae84a7 ("[POWERPC] Create and use CONFIG_WORD_SIZE") added CONFIG_WORD_SIZE, and suggests that other arches were going to do likewise. But that never happened, powerpc is the only architecture which uses it. So switch to using a simple make variable, BITS, like x86, sh, sparc and tile. It is also easier to spell and simpler, avoiding any confusion about whether it's defined due to ordering of make vs kconfig. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-08powerpc/32: Fix again csum_partial_copy_generic()Christophe Leroy
Commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()") introduced a bug when destination address is odd and len is lower than cacheline size. In that case the resulting csum value doesn't have to be rotated one byte because the cache-aligned copy part is skipped so no alignment is performed. Fixes: 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()") Cc: stable@vger.kernel.org # v4.6+ Reported-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Tested-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10powerpc/32: Fix crash during static key initBenjamin Herrenschmidt
We cannot do those initializations from apply_feature_fixups() as this function runs in a very restricted environment on 32-bit where the kernel isn't running at its linked address and the PTRRELOC() macro must be used for any global accesss. Instead, split them into a separtate steup_feature_keys() function which is called in a more suitable spot on ppc32. Fixes: 309b315b6ec6 ("powerpc: Call jump_label_init() in apply_feature_fixups()") Reported-and-tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10powerpc/32: Fix csum_partial_copy_generic()Christophe Leroy
Commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()") introduced a bug when destination address is odd and initial csum is not null In that (rare) case the initial csum value has to be rotated one byte as well as the resulting value is This patch also fixes related comments Fixes: 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-07ppc: move exports to definitionsAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-08-05Merge tag 'powerpc-4.8-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc updates from Michael Ellerman: "These were delayed for various reasons, so I let them sit in next a bit longer, rather than including them in my first pull request. Fixes: - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan - Move register_process_table() out of ppc_md from Michael Ellerman Use jump_label use for [cpu|mmu]_has_feature(): - Add mmu_early_init_devtree() from Michael Ellerman - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman - Do hash device tree scanning earlier from Michael Ellerman - Do radix device tree scanning earlier from Michael Ellerman - Do feature patching before MMU init from Michael Ellerman - Check features don't change after patching from Michael Ellerman - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V - Convert mmu_has_feature() to returning bool from Michael Ellerman - Convert cpu_has_feature() to returning bool from Michael Ellerman - Define radix_enabled() in one place & use static inline from Michael Ellerman - Add early_[cpu|mmu]_has_feature() from Michael Ellerman - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V - Remove mfvtb() from Kevin Hao - Move cpu_has_feature() to a separate file from Kevin Hao - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman - Add option to use jump label for cpu_has_feature() from Kevin Hao - Add option to use jump label for mmu_has_feature() from Kevin Hao - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V - Annotate jump label assembly from Michael Ellerman TLB flush enhancements from Aneesh Kumar K.V: - radix: Implement tlb mmu gather flush efficiently - Add helper for finding SLBE LLP encoding - Use hugetlb flush functions - Drop multiple definition of mm_is_core_local - radix: Add tlb flush of THP ptes - radix: Rename function and drop unused arg - radix/hugetlb: Add helper for finding page size - hugetlb: Add flush_hugetlb_tlb_range - remove flush_tlb_page_nohash Add new ptrace regsets from Anshuman Khandual and Simon Guo: - elf: Add powerpc specific core note sections - Add the function flush_tmregs_to_thread - Enable in transaction NT_PRFPREG ptrace requests - Enable in transaction NT_PPC_VMX ptrace requests - Enable in transaction NT_PPC_VSX ptrace requests - Adapt gpr32_get, gpr32_set functions for transaction - Enable support for NT_PPC_CGPR - Enable support for NT_PPC_CFPR - Enable support for NT_PPC_CVMX - Enable support for NT_PPC_CVSX - Enable support for TM SPR state - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR - Enable support for EBB registers - Enable support for Performance Monitor registers" * tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits) powerpc/mm: Move register_process_table() out of ppc_md powerpc/perf: Fix incorrect event codes in power9-event-list powerpc/32: Fix early access to cpu_spec relocation powerpc/ptrace: Enable support for Performance Monitor registers powerpc/ptrace: Enable support for EBB registers powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR powerpc/ptrace: Enable support for TM SPR state powerpc/ptrace: Enable support for NT_PPC_CVSX powerpc/ptrace: Enable support for NT_PPC_CVMX powerpc/ptrace: Enable support for NT_PPC_CFPR powerpc/ptrace: Enable support for NT_PPC_CGPR powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests powerpc/process: Add the function flush_tmregs_to_thread elf: Add powerpc specific core note sections powerpc/mm: remove flush_tlb_page_nohash powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range ...
2016-08-03powerpc/32: Fix early access to cpu_spec relocationBenjamin Herrenschmidt
Commit 9402c6846131 ("powerpc: Factor do_feature_fixup calls") introduced a subtle bug on 32-bit. When reading the cpu spec from the global, we not only need to do a pointer relocation on the global address but also on the pointer we read from it. This fixes crashes reported on MPC5200 based machines. Fixes: 9402c6846131 ("powerpc: Factor do_feature_fixup calls") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-02treewide: replace obsolete _refok by __refFabian Frederick
There was only one use of __initdata_refok and __exit_refok __init_refok was used 46 times against 82 for __ref. Those definitions are obsolete since commit 312b1485fb50 ("Introduce new section reference annotations tags: __ref, __refdata, __refconst") This patch removes the following compatibility definitions and replaces them treewide. /* compatibility defines */ #define __init_refok __ref #define __initdata_refok __refdata #define __exit_refok __ref I can also provide separate patches if necessary. (One patch per tree and check in 1 month or 2 to remove old definitions) [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1466796271-3043-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-01powerpc: Add option to use jump label for mmu_has_feature()Kevin Hao
As we just did for CPU features. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01powerpc: Add option to use jump label for cpu_has_feature()Kevin Hao
We do binary patching of asm code using CPU features, which is a one-time operation, done during early boot. However checks of CPU features in C code are currently done at run time, even though the set of CPU features can never change after boot. We can optimise this by using jump labels to implement cpu_has_feature(), meaning checks in C code are binary patched into a single nop or branch. For a C sequence along the lines of: if (cpu_has_feature(FOO)) return 2; The generated code before is roughly: ld r9,-27640(r2) ld r9,0(r9) lwz r9,32(r9) cmpwi cr7,r9,0 bge cr7, 1f li r3,2 blr 1: ... After (true): nop li r3,2 blr After (false): b 1f li r3,2 blr 1: ... mpe: Rename MAX_CPU_FEATURES as we already have a #define with that name, and define it simply as a constant, rather than doing tricks with sizeof and NULL pointers. Rename the array to cpu_feature_keys. Use the kconfig we added to guard it. Add BUILD_BUG_ON() if the feature is not a compile time constant. Rewrite the change log. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01powerpc: Call jump_label_init() in apply_feature_fixups()Aneesh Kumar K.V
Call jump_label_init() early so that we can use static keys for CPU and MMU feature checks. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-01powerpc/kernel: Check features don't change after patchingMichael Ellerman
Early in boot we binary patch some sections of code based on the CPU and MMU feature bits. But it is a one-time patching, there is no facility for repatching the code later if the set of features change. It is a major bug if the set of features changes after we've done the code patching - so add a check for it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-21powerpc: Factor do_feature_fixup callsBenjamin Herrenschmidt
32 and 64-bit do a similar set of calls early on, we move it all to a single common function to make the boot code more readable. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15powerpc/lib: Clarify that adde is an instruction and we mean pluralStewart Smith
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16powerpc: Introduce asm-prototypes.hDaniel Axtens
Sparse picked up a number of functions that are implemented in C and then only referred to in asm code. This introduces asm-prototypes.h, which provides a place for prototypes of these functions. This silences some sparse warnings. Signed-off-by: Daniel Axtens <dja@axtens.net> [mpe: Add include guards, clean up copyright & GPL text] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14powerpc/spinlock: Fix spin_unlock_wait()Boqun Feng
There is an ordering issue with spin_unlock_wait() on powerpc, because the spin_lock primitive is an ACQUIRE and an ACQUIRE is only ordering the load part of the operation with memory operations following it. Therefore the following event sequence can happen: CPU 1 CPU 2 CPU 3 ================== ==================== ============== spin_unlock(&lock); spin_lock(&lock): r1 = *lock; // r1 == 0; o = object; o = READ_ONCE(object); // reordered here object = NULL; smp_mb(); spin_unlock_wait(&lock); *lock = 1; smp_mb(); o->dead = true; < o = READ_ONCE(object); > // reordered upwards if (o) // true BUG_ON(o->dead); // true!! To fix this, we add a "nop" ll/sc loop in arch_spin_unlock_wait() on ppc, the "nop" ll/sc loop reads the lock value and writes it back atomically, in this way it will synchronize the view of the lock on CPU1 with that on CPU2. Therefore in the scenario above, either CPU2 will fail to get the lock at first or CPU1 will see the lock acquired by CPU2, both cases will eliminate this bug. This is a similar idea as what Will Deacon did for ARM64 in: d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers") Furthermore, if the "nop" ll/sc figures out the lock is locked, we actually don't need to do the "nop" ll/sc trick again, we can just do a normal load+check loop for the lock to be released, because in that case, spin_unlock_wait() is called when someone is holding the lock, and the store part of the "nop" ll/sc happens before the lock release of the current lock holder: "nop" ll/sc -> spin_unlock() and the lock release happens before the next lock acquisition: spin_unlock() -> spin_lock() <next holder> which means the "nop" ll/sc happens before the next lock acquisition: "nop" ll/sc -> spin_unlock() -> spin_lock() <next holder> With a smp_mb() preceding spin_unlock_wait(), the store of object is guaranteed to be observed by the next lock holder: STORE -> smp_mb() -> "nop" ll/sc -> spin_unlock() -> spin_lock() <next holder> This patch therefore fixes the issue and also cleans the arch_spin_unlock_wait() a little bit by removing superfluous memory barriers in loops and consolidating the implementations for PPC32 and PPC64 into one. Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> [mpe: Inline the "nop" ll/sc loop and set EH=0, munge change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14powerpc: Various typo fixesMichael Ellerman
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14powerpc: Align hot loops of some string functionsAnton Blanchard
Align the hot loops in our assembly implementation of strncpy(), strncmp() and memchr(). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-14powerpc: Remove assembly versions of strcpy, strcat, strlen and strcmpAnton Blanchard
A number of our assembly implementations of string functions do not align their hot loops. I was going to align them manually, but I realised that they are are almost instruction for instruction identical to what gcc produces, with the advantage that gcc does align them. In light of that, let's just remove the assembly versions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11powerpc/sstep: Fix emulation fall-throughOliver O'Halloran
There is a switch fallthough in instr_analyze() which can cause an invalid instruction to be emulated as a different, valid, instruction. The rld* (opcode 30) case extracts a sub-opcode from bits 3:1 of the instruction word. However, the only valid values of this field are 001 and 000. These cases are correctly handled, but the others are not which causes execution to fall through into case 31. Breaking out of the switch causes the instruction to be marked as unknown and allows the caller to deal with the invalid instruction in a manner consistent with other invalid instructions. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11powerpc/sstep: Fix sstep.c compile on powerpcspeLennart Sorensen
Commit be96f63375a1 ("powerpc: Split out instruction analysis part of emulate_step()") introduced ldarx and stdcx into the instructions in sstep.c, which are not accepted by the assembler on powerpcspe, but does seem to be accepted by the normal powerpc assembler even in 32 bit mode. Wrap these two instructions in a __powerpc64__ check like it is everywhere else in the file. Fixes: be96f63375a1 ("powerpc: Split out instruction analysis part of emulate_step()") Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27powerpc: rework sparse for lib/xor_vmx.cDaniel Axtens
Sparse doesn't seem to be passing -maltivec around properly, leading to lots of errors: .../include/altivec.h:34:2: error: Use the "-maltivec" flag to enable PowerPC AltiVec support arch/powerpc/lib/xor_vmx.c:27:16: error: Expected ; at end of declaration arch/powerpc/lib/xor_vmx.c:27:16: error: got signed arch/powerpc/lib/xor_vmx.c:60:9: error: No right hand side of '*'-expression arch/powerpc/lib/xor_vmx.c:60:9: error: Expected ; at end of statement arch/powerpc/lib/xor_vmx.c:60:9: error: got v1_in ... arch/powerpc/lib/xor_vmx.c:87:9: error: too many errors Only include the altivec.h header for non-__CHECKER__ builds. For builds with __CHECKER__, make up some stubs instead, as suggested by Balbir. (The vector size of 16 is arbitrary.) Suggested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Tested-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11powerpc: Make generic_memcpy() private to copy_32.SMichael Ellerman
generic_memcpy() is only called from copy_32.S, so there's no reason for it to be global. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-14Merge branch 'next' of ↵Michael Ellerman
git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup."
2016-03-09powerpc: optimise csum_partial() call when len is constantChristophe Leroy
csum_partial is often called for small fixed length packets for which it is suboptimal to use the generic csum_partial() function. For instance, in my configuration, I got: * One place calling it with constant len 4 * Seven places calling it with constant len 8 * Three places calling it with constant len 14 * One place calling it with constant len 20 * One place calling it with constant len 24 * One place calling it with constant len 32 This patch renames csum_partial() to __csum_partial() and implements csum_partial() as a wrapper inline function which * uses csum_add() for small 16bits multiple constant length * uses ip_fast_csum() for other 32bits multiple constant * uses __csum_partial() in all other cases Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-07powerpc/ftrace: Use $(CC_FLAGS_FTRACE) when disabling ftraceTorsten Duwe
Rather than open-coding -pg whereever we want to disable ftrace, use the existing $(CC_FLAGS_FTRACE) variable. This has the advantage that it will work in future when we use a different set of flags to enable ftrace. Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-04powerpc32: optimise csum_partial() loopChristophe Leroy
On the 8xx, load latency is 2 cycles and taking branches also takes 2 cycles. So let's unroll the loop. This patch improves csum_partial() speed by around 10% on both: * 8xx (single issue processor with parallel execution) * 83xx (superscalar 6xx processor with dual instruction fetch and parallel execution) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04powerpc32: optimise a few instructions in csum_partial()Christophe Leroy
r5 does contain the value to be updated, so lets use r5 all way long for that. It makes the code more readable. To avoid confusion, it is better to use adde instead of addc The first addition is useless. Its only purpose is to clear carry. As r4 is a signed int that is always positive, this can be done by using srawi instead of srwi Let's also remove the comment about bdnz having no overhead as it is not correct on all powerpc, at least on MPC8xx In the last part, in our situation, the remaining quantity of bytes to be proceeded is between 0 and 3. Therefore, we can base that part on the value of bit 31 and bit 30 of r4 instead of anding r4 with 3 then proceding on comparisons and substractions. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()Christophe Leroy
csum_partial_copy_generic() does the same as copy_tofrom_user and also calculates the checksum during the copy. Unlike copy_tofrom_user(), the existing version of csum_partial_copy_generic() doesn't take benefit of the cache. This patch is a rewrite of csum_partial_copy_generic() based on copy_tofrom_user(). The previous version of csum_partial_copy_generic() was handling errors. Now we have the checksum wrapper functions to handle the error case like in powerpc64 so we can make the error case simple: just return -EFAULT. copy_tofrom_user() only has r12 available => we use it for the checksum r7 and r8 which contains pointers to error feedback are used, so we stack them. On a TCP benchmark using socklib on the loopback interface on which checksum offload and scatter/gather have been deactivated, we get about 20% performance increase. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04powerpc: inline ip_fast_csum()Christophe Leroy
In several architectures, ip_fast_csum() is inlined There are functions like ip_send_check() which do nothing much more than calling ip_fast_csum(). Inlining ip_fast_csum() allows the compiler to optimise better Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [scottwood: whitespace and cast fixes] Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04powerpc32: checksum_wrappers_64 becomes checksum_wrappersChristophe Leroy
The powerpc64 checksum wrapper functions adds csum_and_copy_to_user() which otherwise is implemented in include/net/checksum.h by using csum_partial() then copy_to_user() Those two wrapper fonctions are also applicable to powerpc32 as it is based on the use of csum_partial_copy_generic() which also exists on powerpc32 This patch renames arch/powerpc/lib/checksum_wrappers_64.c to arch/powerpc/lib/checksum_wrappers.c and makes it non-conditional to CONFIG_WORD_SIZE Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2016-03-04powerpc: unexport csum_tcpudp_magicChristophe Leroy
csum_tcpudp_magic is now an inline function, so there is nothing to export Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Scott Wood <oss@buserror.net>
2015-12-01powerpc: Create disable_kernel_{fp,altivec,vsx,spe}()Anton Blanchard
The enable_kernel_*() functions leave the relevant MSR bits enabled until we exit the kernel sometime later. Create disable versions that wrap the kernel use of FP, Altivec VSX or SPE. While we don't want to disable it normally for performance reasons (MSR writes are slow), it will be used for a debug boot option that does this and catches bad uses in other areas of the kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-17powerpc32: memset: only use dcbz once cache is enabledLEROY Christophe
memset() uses instruction dcbz to speed up clearing by not wasting time loading cache line with data that will be overwritten. Some platform like mpc52xx do no have cache active at startup and can therefore not use memset(). Allthough no part of the code explicitly uses memset(), GCC may make calls to it. This patch modifies memset() such that at startup, memset() unconditionally skip the optimised bloc that uses dcbz instruction. Once the initial MMU is set up, in machine_init() we patch memset() by replacing this inconditional jump by a NOP Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-17powerpc32: memcpy: only use dcbz once cache is enabledLEROY Christophe
memcpy() uses instruction dcbz to speed up copy by not wasting time loading cache line with data that will be overwritten. Some platform like mpc52xx do no have cache active at startup and can therefore not use memcpy(). Allthough no part of the code explicitly uses memcpy(), GCC makes calls to it. This patch modifies memcpy() such that at startup, memcpy() unconditionally jumps to generic_memcpy() which doesn't use the dcbz instruction. Once the initial MMU is set up, in machine_init() we patch memcpy() by replacing this inconditional jump by a NOP Reported-by: Michal Sojka <sojkam1@fel.cvut.cz> Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>