summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/process.c
AgeCommit message (Collapse)Author
2016-05-20exit_thread: remove empty bodiesJiri Slaby
Define HAVE_EXIT_THREAD for archs which want to do something in exit_thread. For others, let's define exit_thread as an empty inline. This is a cleanup before we change the prototype of exit_thread to accept a task parameter. [akpm@linux-foundation.org: fix mips] Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonas Bonn <jonas@southpole.se> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-11powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related codeAneesh Kumar K.V
We also use MMU_FTR_RADIX to branch out from code path specific to hash. No functionality change. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-18Merge branch 'topic/livepatch' into nextMichael Ellerman
Merge the support for live patching on ppc64le using mprofile-kernel. This branch has also been merged into the livepatching tree for v4.7.
2016-04-14powerpc/livepatch: Add livepatch stack to struct thread_infoMichael Ellerman
In order to support live patching we need to maintain an alternate stack of TOC & LR values. We use the base of the stack for this, and store the "live patch stack pointer" in struct thread_info. Unlike the other fields of thread_info, we can not statically initialise that value, so it must be done at run time. This patch just adds the code to support that, it is not enabled until the next patch which actually adds live patch support. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com>
2016-04-12powerpc: sparse: Include headers for __weak symbolsDaniel Axtens
Sometimes when sparse warns about undefined symbols, it isn't because they should have 'static' added, it's because they're overriding __weak symbols defined elsewhere, and the header has been missed. Fix a few of them by adding appropriate headers. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-29powerpc/process: Fix altivec SPR not being savedOliver O'Halloran
In save_sprs() in process.c contains the following test: if (cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC))) t->vrsave = mfspr(SPRN_VRSAVE); CPU feature with the mask 0x1 is CPU_FTR_COHERENT_ICACHE so the test is equivilent to: if (cpu_has_feature(CPU_FTR_ALTIVEC) && cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) On CPUs without support for both (i.e G5) this results in vrsave not being saved between context switches. The vector register save/restore code doesn't use VRSAVE to determine which registers to save/restore, but the value of VRSAVE is used to determine if altivec is being used in several code paths. Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") Cc: stable@vger.kernel.org Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-19Merge tag 'powerpc-4.6-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "This was delayed a day or two by some build-breakage on old toolchains which we've now fixed. There's two PCI commits both acked by Bjorn. There's one commit to mm/hugepage.c which is (co)authored by Kirill. Highlights: - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V - Add POWER9 cputable entry from Michael Neuling - FPU/Altivec/VSX save/restore optimisations from Cyril Bur - Add support for new ftrace ABI on ppc64le from Torsten Duwe Various cleanups & minor fixes from: - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh. General: - atomics: Allow architectures to define their own __atomic_op_* helpers from Boqun Feng - Implement atomic{, 64}_*_return_* variants and acquire/release/ relaxed variants for (cmp)xchg from Boqun Feng - Add powernv_defconfig from Jeremy Kerr - Fix BUG_ON() reporting in real mode from Balbir Singh - Add xmon command to dump OPAL msglog from Andrew Donnellan - Add xmon command to dump process/task similar to ps(1) from Douglas Miller - Clean up memory hotplug failure paths from David Gibson pci/eeh: - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei Yang. - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan. - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang - PCI: Add pcibios_bus_add_device() weak function from Wei Yang - MAINTAINERS: Update EEH details and maintainership from Russell Currey cxl: - Support added to the CXL driver for running on both bare-metal and hypervisor systems, from Christophe Lombard and Frederic Barrat. - Ignore probes for virtual afu pci devices from Vaibhav Jain perf: - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu - hv-24x7: Fix usage with chip events, display change in counter values, display domain indices in sysfs, eliminate domain suffix in event names, from Sukadev Bhattiprolu Freescale: - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and other dt bits, and minor fixes/cleanup" * tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits) powerpc: Fix unrecoverable SLB miss during restore_math() powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers powerpc/rcpm: Fix build break when SMP=n powerpc/book3e-64: Use hardcoded mttmr opcode powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible powerpc/T104xRDB: add tdm riser card node to device tree powerpc32: PAGE_EXEC required for inittext powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s) powerpc/86xx: Introduce and use common dtsi powerpc/86xx: Update device tree powerpc/86xx: Move dts files to fsl directory powerpc/86xx: Switch to kconfig fragments approach powerpc/86xx: Update defconfigs powerpc/86xx: Consolidate common platform code powerpc32: Remove one insn in mulhdu powerpc32: small optimisation in flush_icache_range() powerpc: Simplify test in __dma_sync() powerpc32: move xxxxx_dcache_range() functions inline powerpc32: Remove clear_pages() and define clear_page() inline ...
2016-03-02powerpc: Add the ability to save VSX without giving it upCyril Bur
This patch adds the ability to be able to save the VSX registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch builds on a previous optimisation for the FPU and VEC registers in the thread copy path to avoid a possibly pointless reload of VSX state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Add the ability to save Altivec without giving it upCyril Bur
This patch adds the ability to be able to save the VEC registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch builds on a previous optimisation for the FPU registers in the thread copy path to avoid a possibly pointless reload of VEC state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Add the ability to save FPU without giving it upCyril Bur
This patch adds the ability to be able to save the FPU registers to the thread struct without giving up (disabling the facility) next time the process returns to userspace. This patch optimises the thread copy path (as a result of a fork() or clone()) so that the parent thread can return to userspace with hot registers avoiding a possibly pointless reload of FPU register state. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in twoCyril Bur
This prepares for the decoupling of saving {fpu,altivec,vsx} registers and marking {fpu,altivec,vsx} as being unused by a thread. Currently giveup_{fpu,altivec,vsx}() does both however optimisations to task switching can be made if these two operations are decoupled. save_all() will permit the saving of registers to thread structs and leave threads MSR with bits enabled. This patch introduces no functional change. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-02powerpc: Restore FPU/VEC/VSX if previously usedCyril Bur
Currently the FPU, VEC and VSX facilities are lazily loaded. This is not a problem unless a process is using these facilities. Modern versions of GCC are very good at automatically vectorising code, new and modernised workloads make use of floating point and vector facilities, even the kernel makes use of vectorised memcpy. All this combined greatly increases the cost of a syscall since the kernel uses the facilities sometimes even in syscall fast-path making it increasingly common for a thread to take an *_unavailable exception soon after a syscall, not to mention potentially taking all three. The obvious overcompensation to this problem is to simply always load all the facilities on every exit to userspace. Loading up all FPU, VEC and VSX registers every time can be expensive and if a workload does avoid using them, it should not be forced to incur this penalty. An 8bit counter is used to detect if the registers have been used in the past and the registers are always loaded until the value wraps to back to zero. Several versions of the assembly in entry_64.S were tested: 1. Always calling C. 2. Performing a common case check and then calling C. 3. A complex check in asm. After some benchmarking it was determined that avoiding C in the common case is a performance benefit (option 2). The full check in asm (option 3) greatly complicated that codepath for a negligible performance gain and the trade-off was deemed not worth it. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Move load_vec in the struct to fill an existing hole, reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> fixup
2016-03-02powerpc: Explicitly disable math features when copying threadCyril Bur
Currently when threads get scheduled off they always giveup the FPU, Altivec (VMX) and Vector (VSX) units if they were using them. When they are scheduled back on a fault is then taken to enable each facility and load registers. As a result explicitly disabling FPU/VMX/VSX has not been necessary. Future changes and optimisations remove this mandatory giveup and fault which could cause calls such as clone() and fork() to copy threads and run them later with FPU/VMX/VSX enabled but no registers loaded. This patch starts the process of having MSR_{FP,VEC,VSX} mean that a threads registers are hot while not having MSR_{FP,VEC,VSX} means that the registers must be loaded. This allows for a smarter return to userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-27mm: ASLR: use get_random_long()Daniel Cashman
Replace calls to get_random_int() followed by a cast to (unsigned long) with calls to get_random_long(). Also address shifting bug which, in case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits. Signed-off-by: Daniel Cashman <dcashman@android.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Kralevich <nnk@google.com> Cc: Jeff Vander Stoep <jeffv@google.com> Cc: Mark Salyzyn <salyzyn@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-14Merge tag 'powerpc-4.4-3' into nextMichael Ellerman
Merge the two TM fixes we merged in 4.4. We are about to merge selftests for these, and without the fixes the selftests will oops. powerpc fixes for 4.4 #2 - tm: Block signal return from setting invalid MSR state from Michael Neuling - tm: Check for already reclaimed tasks from Michael Neuling
2015-12-14powerpc: Print MSR TM bits in oops messagesMichael Neuling
Print MSR TM bits in oops messages. This appends them to the end like this: MSR: 8000000502823031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[TE]> You get the TM[] only if at least one TM MSR bit is set. Inside the TM[], E means Enabled (bit 32), S means Suspended (bit 33), and T means Transactional (bit 34) If no bits are set, you get no TM[] output. Include rework of printbits() to handle this case. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-10powerpc: Fix DSCR inheritance over fork()Anton Blanchard
Two DSCR tests have a hack in them: /* * XXX: Force a context switch out so that DSCR * current value is copied into the thread struct * which is required for the child to inherit the * changed value. */ sleep(1); We should not be working around this in the testcase, it is a kernel bug. Fix it by copying the current DSCR to the child, instead of what we had in the thread struct at last context switch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-10powerpc: Call restore_sprs() before _switch()Anton Blanchard
commit 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") moved the restore of SPRs after the call to _switch(). There is an issue with this approach - new tasks do not return through _switch(), they are set up by copy_thread() to directly return through ret_from_fork() or ret_from_kernel_thread(). This means restore_sprs() is not getting called for new tasks. Fix this by moving restore_sprs() before _switch(). Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-10powerpc: Call check_if_tm_restore_required() in enable_kernel_*()Anton Blanchard
Commit a0e72cf12b1a ("powerpc: Create msr_check_and_{set,clear}()") removed a call to check_if_tm_restore_required() in the enable_kernel_*() functions. Add them back in. Fixes: a0e72cf12b1a ("powerpc: Create msr_check_and_{set,clear}()") Reported-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02powerpc: clean up asm/switch_to.hAnton Blanchard
Remove a bunch of unnecessary fallback functions and group things in a more logical way. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02powerpc: Rearrange __switch_to()Anton Blanchard
Most of __switch_to() is housekeeping, TLB batching, timekeeping etc. Move these away from the more complex and critical context switching code. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02powerpc: create flush_all_to_thread()Anton Blanchard
Create a single function that flushes everything (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-02powerpc: create giveup_all()Anton Blanchard
Create a single function that gives everything up (FP, VMX, VSX, SPE). Doing this all at once means we only do one MSR write. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --fp --altivec --vector 0 0 shows an improvement of 3% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> [mpe: giveup_all() needs to be EXPORT_SYMBOL'ed] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Remove fp_enable() and vec_enable(), use msr_check_and_{set, clear}()Anton Blanchard
More consolidation of our MSR available bit handling. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Add ppc_strict_facility_enable boot optionAnton Blanchard
Add a boot option that strictly manages the MSR unavailable bits. This catches kernel uses of FP/Altivec/SPE that would otherwise corrupt user state. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Create msr_check_and_{set,clear}()Anton Blanchard
Create helper functions to set and clear MSR bits after first checking if they are already set. Grouping them will make it easy to avoid the MSR writes in a subsequent optimisation. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Move part of giveup_vsx into cAnton Blanchard
Move the MSR modification into c. Removing it from the assembly function will allow us to avoid costly MSR writes by batching them up. Check the FP and VMX bits before calling the relevant giveup_*() function. This makes giveup_vsx() and flush_vsx_to_thread() perform more like their sister functions, and allows us to use flush_vsx_to_thread() in the signal code. Move the check_if_tm_restore_required() check in. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Move part of giveup_fpu,altivec,spe into cAnton Blanchard
Move the MSR modification into new c functions. Removing it from the low level functions will allow us to avoid costly MSR writes by batching them up. Move the check_if_tm_restore_required() check into these new functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Create mtmsrd_isync()Anton Blanchard
mtmsrd_isync() will do an mtmsrd followed by an isync on older processors. On newer processors we avoid the isync via a feature fixup. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Simplify TM restore checksAnton Blanchard
Instead of having multiple giveup_*_maybe_transactional() functions, separate out the TM check into a new function called check_if_tm_restore_required(). This will make it easier to optimise the giveup_*() functions in a subsequent patch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Remove UP only lazy floating point and vector optimisationsAnton Blanchard
The UP only lazy floating point and vector optimisations were written back when SMP was not common, and neither glibc nor gcc used vector instructions. Now SMP is very common, glibc aggressively uses vector instructions and gcc autovectorises. We want to add new optimisations that apply to both UP and SMP, but in preparation for that remove these UP only optimisations. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-01powerpc: Create context switch helpers save_sprs() and restore_sprs()Anton Blanchard
Move all our context switch SPR save and restore code into two helpers. We do a few optimisations: - Group all mfsprs and all mtsprs. In many cases an mtspr sets a scoreboarding bit that an mfspr waits on, so the current practise of mfspr A; mtspr A; mfpsr B; mtspr B is the worst scheduling we can do. - SPR writes are slow, so check that the value is changing before writing it. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield 0 0 shows an improvement of almost 10% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-23powerpc/tm: Check for already reclaimed tasksMichael Neuling
Currently we can hit a scenario where we'll tm_reclaim() twice. This results in a TM bad thing exception because the second reclaim occurs when not in suspend mode. The scenario in which this can happen is the following. We attempt to deliver a signal to userspace. To do this we need obtain the stack pointer to write the signal context. To get this stack pointer we must tm_reclaim() in case we need to use the checkpointed stack pointer (see get_tm_stackpointer()). Normally we'd then return directly to userspace to deliver the signal without going through __switch_to(). Unfortunatley, if at this point we get an error (such as a bad userspace stack pointer), we need to exit the process. The exit will result in a __switch_to(). __switch_to() will attempt to save the process state which results in another tm_reclaim(). This tm_reclaim() now causes a TM Bad Thing exception as this state has already been saved and the processor is no longer in TM suspend mode. Whee! This patch checks the state of the MSR to ensure we are TM suspended before we attempt the tm_reclaim(). If we've already saved the state away, we should no longer be in TM suspend mode. This has the additional advantage of checking for a potential TM Bad Thing exception. Found using syscall fuzzer. Fixes: fb09692e71f1 ("powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-03Merge tag 'powerpc-4.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt - EEH fixes for SRIOV from Gavin - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth - use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras - seccomp filter support from Michael Ellerman - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar - add powerpc timebase as a trace clock source from Naveen N. Rao - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual - add an inline function to update POWER8 HID0 from Gautham R. Shenoy - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman - drop support for 64K local store on 4K kernels from Michael Ellerman - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan - initialize distance lookup table from drconf path from Nikunj A Dadhania - enable RTC class support from Vaibhav Jain - disable automatically blocked PCI config from Gavin Shan - add LEDs driver for PowerNV platform from Vasant Hegde - fix endianness issues in the HVSI driver from Laurent Dufour - kexec endian fixes from Samuel Mendoza-Jonas - fix corrupted pdn list from Gavin Shan - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan - Freescale updates from Scott: Highlights include 32-bit memcpy/memset optimizations, checksum optimizations, 85xx config fragments and updates, device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor fixes. - a ton of cxl updates & fixes: - add explicit precision specifiers from Rasmus Villemoes - use more common format specifier from Rasmus Villemoes - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn - destroy afu->contexts_idr on release of an afu from Johannes Thumshirn - compile with -Werror from Daniel Axtens - EEH support from Daniel Axtens - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain - add alternate MMIO error handling from Ian Munsie - allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar - release irqs if memory allocation fails from Vaibhav Jain - remove racy attempt to force EEH invocation in reset from Daniel Axtens - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie - fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie - set up and enable PSL Timebase from Philippe Bergheaud * tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits) cxl: Set up and enable PSL Timebase cxl: Fix force unmapping mmaps of contexts allocated through the kernel api cxl: Fix + cleanup error paths in cxl_dev_context_init powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail() powerpc/pseries: Cleanup on pci_dn_reconfig_notifier() powerpc/pseries: Fix corrupted pdn list powerpc/powernv: Enable LEDS support powerpc/iommu: Set default DMA offset in dma_dev_setup cxl: Remove racy attempt to force EEH invocation in reset cxl: Release irqs if memory allocation fails cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver powerpc/powernv: Reset HILE before kexec_sequence() powerpc/kexec: Reset secondary cpu endianness before kexec powerpc/hvsi: Fix endianness issues in the HVSI driver leds/powernv: Add driver for PowerNV platform powerpc/powernv: Create LED platform device powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states powerpc/powernv: Fix the log message when disabling VF cxl: Allow release of contexts which have been OPENED but not STARTED ...
2015-07-16powerpc/tm: Drop tm_orig_msr from thread_structAnshuman Khandual
Currently tm_orig_msr is getting used during process context switch only. Then there is ckpt_regs which saves the checkpointed userspace context The MSR slot contained in ckpt_regs structure can be used during process context switch instead of tm_orig_msr, thus allowing us to drop it from thread_struct structure. This patch does that change. Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-14powerpc: Uncomment and make enable_kernel_vsx() routine availableLeonidas Da Silva Barbosa
enable_kernel_vsx() function was commented since anything was using it. However, vmx-crypto driver uses VSX instructions which are only available if VSX is enable. Otherwise it rises an exception oops. This patch uncomment enable_kernel_vsx() routine and makes it available. Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-06-07powerpc/kernel: Remove the unused extern dscr_defaultAnshuman Khandual
The process context switch code no longer uses dscr_default variable from the sysfs.c file. The variable became unused when we started storing the CPU specific DSCR value in the PACA structure instead. This patch just removes this extern declaration. It was originally added by the following commit. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-03-20powerpc/kernel: Rename copy_thread() 'arg' argument to 'kthread_arg'Alex Dowad
The 'arg' argument to copy_thread() is only ever used when forking a new kernel thread. Hence, rename it to 'kthread_arg' for clarity. Signed-off-by: Alex Dowad <alexinbeijing@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-17powerpc: Use generic PIE randomizationVineeth Vijayan
Back in 2009 we merged 501cb16d3cfd "Randomise PIEs", which added support for randomizing PIE (Position Independent Executable) binaries. That commit added randomize_et_dyn(), which correctly randomized the addresses, but failed to honor PF_RANDOMIZE. That means it was not possible to disable PIE randomization via the personality flag, or /proc/sys/kernel/randomize_va_space. Since then there has been generic support for PIE randomization added to binfmt_elf.c, selectable via ARCH_BINFMT_ELF_RANDOMIZE_PIE. Enabling that allows us to drop randomize_et_dyn(), which means we start honoring PF_RANDOMIZE correctly. It also causes a fairly major change to how we layout PIE binaries. Currently we will place the binary at 512MB-520MB for 32 bit binaries, or 512MB-1.5GB for 64 bit binaries, eg: $ cat /proc/$$/maps 4e550000-4e580000 r-xp 00000000 08:02 129813 /bin/dash 4e580000-4e590000 rw-p 00020000 08:02 129813 /bin/dash 10014110000-10014140000 rw-p 00000000 00:00 0 [heap] 3fffaa3f0000-3fffaa5a0000 r-xp 00000000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fffaa5a0000-3fffaa5b0000 rw-p 001a0000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fffaa5c0000-3fffaa5d0000 rw-p 00000000 00:00 0 3fffaa5d0000-3fffaa5f0000 r-xp 00000000 00:00 0 [vdso] 3fffaa5f0000-3fffaa620000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3fffaa620000-3fffaa630000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3ffffc340000-3ffffc370000 rw-p 00000000 00:00 0 [stack] With this commit applied we don't do any special randomisation for the binary, and instead rely on mmap randomisation. This means the binary ends up at high addresses, eg: $ cat /proc/$$/maps 3fff99820000-3fff999d0000 r-xp 00000000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fff999d0000-3fff999e0000 rw-p 001a0000 08:02 921 /lib/powerpc64le-linux-gnu/libc-2.19.so 3fff999f0000-3fff99a00000 rw-p 00000000 00:00 0 3fff99a00000-3fff99a20000 r-xp 00000000 00:00 0 [vdso] 3fff99a20000-3fff99a50000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3fff99a50000-3fff99a60000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so 3fff99a60000-3fff99a90000 r-xp 00000000 08:02 129813 /bin/dash 3fff99a90000-3fff99aa0000 rw-p 00020000 08:02 129813 /bin/dash 3fffc3de0000-3fffc3e10000 rw-p 00000000 00:00 0 [stack] 3fffc55e0000-3fffc5610000 rw-p 00000000 00:00 0 [heap] Although this should be OK, it's possible it might break badly written binaries that make assumptions about the address space layout. Signed-off-by: Vineeth Vijayan <vvijayan@mvista.com> [mpe: Rewrite changelog] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10powerpc/ftrace: Remove mod_return_to_handlerAnton Blanchard
mod_return_to_handler is the same as return_to_handler, except it handles the change of the TOC (r2). Add this into return_to_handler and remove mod_return_to_handler. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-05powerpc: Use probe_kernel_address in show_instructionsAnton Blanchard
We really don't want to take a pagefault in show_instructions, so use probe_kernel_address instead of __get_user. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-03powerpc: Replace __get_cpu_var usesChristoph Lameter
This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1 __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15powerpc: Rename __get_SP() to current_stack_pointer()Anton Blanchard
Michael points out that __get_SP() is a pretty horrible function name. Let's give it a better name. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15powerpc: Reimplement __get_SP() as a function not a defineAnton Blanchard
Li Zhong points out an issue with our current __get_SP() implementation. If ftrace function tracing is enabled (ie -pg profiling using _mcount) we spill a stack frame on 64bit all the time. If a function calls __get_SP() and later calls a function that is tail call optimised, we will pop the stack frame and the value returned by __get_SP() is no longer valid. An example from Li can be found in save_stack_trace -> save_context_stack: c0000000000432c0 <.save_stack_trace>: c0000000000432c0: mflr r0 c0000000000432c4: std r0,16(r1) c0000000000432c8: stdu r1,-128(r1) <-- stack frame for _mcount c0000000000432cc: std r3,112(r1) c0000000000432d0: bl <._mcount> c0000000000432d4: nop c0000000000432d8: mr r4,r1 <-- __get_SP() c0000000000432dc: ld r5,632(r13) c0000000000432e0: ld r3,112(r1) c0000000000432e4: li r6,1 c0000000000432e8: addi r1,r1,128 <-- pop stack frame c0000000000432ec: ld r0,16(r1) c0000000000432f0: mtlr r0 c0000000000432f4: b <.save_context_stack> <-- tail call optimized save_context_stack ends up with a stack pointer below the current one, and it is likely to be scribbled over. Fix this by making __get_SP() a function which returns the callers stack frame. Also replace inline assembly which grabs the stack pointer in save_stack_trace and show_stack with __get_SP(). This also fixes an issue with perf_arch_fetch_caller_regs(). It currently unwinds the stack once, which will skip a valid stack frame on a leaf function. With the __get_SP() fixes in this patch, we never need to unwind the stack frame to get to the first interesting frame. We have to export __get_SP() because perf_arch_fetch_caller_regs() (which is used in modules) calls it from a header file. Reported-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25powerpc: Move more symbol exports next to function definitionsAnton Blanchard
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-08-05powerpc: Reduce scariness of interrupt frames in stack tracesPaul Mackerras
Some people see things like "Exception: 501" in stack traces in dmesg and assume that means that something has gone badly wrong, when in fact "Exception: 501" just means a device interrupt was taken. This changes "Exception" to "interrupt" to make it clearer that we are just recording the fact of a change in control flow rather than some error condition. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc: Pull out ksp_vsid logic into a helperMichael Ellerman
The previous patch left a bit of a wart in copy_process(). Clean it up a bit by moving the logic out into a helper. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-07-28powerpc: Remove MMU_FTR_SLBMichael Ellerman
We now only support cpus that use an SLB, so we don't need an MMU feature to indicate that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11powerpc: Correct DSCR during TM context switchSam bobroff
Correct the DSCR SPR becoming temporarily corrupted if a task is context switched during a transaction. The problem occurs while suspending the task and is caused by saving the DSCR to thread.dscr after it has already been set to the CPU's default value: __switch_to() calls __switch_to_tm() which calls tm_reclaim_task() which calls tm_reclaim_thread() which calls tm_reclaim() where the DSCR is set to the CPU's default __switch_to() calls _switch() where thread.dscr is set to the DSCR When the task is resumed, it's transaction will be doomed (as usual) and the DSCR SPR will be corrupted, although the checkpointed value will be correct. Therefore the DSCR will be immediately corrected by the transaction aborting, unless it has been suspended. In that case the incorrect value can be seen by the task until it resumes the transaction. The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to(). A program exposing the problem is added to the kernel self tests as: tools/testing/selftests/powerpc/tm/tm-resched-dscr. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> CC: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-20powerpc: Fix smp_processor_id() in preemptible splat in set_breakpointPaul Gortmaker
Currently, on 8641D, which doesn't set CONFIG_HAVE_HW_BREAKPOINT we get the following splat: BUG: using smp_processor_id() in preemptible [00000000] code: login/1382 caller is set_breakpoint+0x1c/0xa0 CPU: 0 PID: 1382 Comm: login Not tainted 3.15.0-rc3-00041-g2aafe1a4d451 #1 Call Trace: [decd5d80] [c0008dc4] show_stack+0x50/0x158 (unreliable) [decd5dc0] [c03c6fa0] dump_stack+0x7c/0xdc [decd5de0] [c01f8818] check_preemption_disabled+0xf4/0x104 [decd5e00] [c00086b8] set_breakpoint+0x1c/0xa0 [decd5e10] [c00d4530] flush_old_exec+0x2bc/0x588 [decd5e40] [c011c468] load_elf_binary+0x2ac/0x1164 [decd5ec0] [c00d35f8] search_binary_handler+0xc4/0x1f8 [decd5ef0] [c00d4ee8] do_execve+0x3d8/0x4b8 [decd5f40] [c001185c] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfeee554 LR = 0xfeee7d4 The call path in this case is: flush_thread --> set_debug_reg_defaults --> set_breakpoint --> __get_cpu_var Since preemption is enabled in the cleanup of flush thread, and there is no need to disable it, introduce the distinction between set_breakpoint and __set_breakpoint, leaving only the flush_thread instance as the current user of set_breakpoint. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>