summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/include/asm/ppc-opcode.h
AgeCommit message (Collapse)Author
2015-10-21powerpc: Revert "Use the POWER8 Micro Partition Prefetch Engine in KVM HV on ↵Paul Mackerras
POWER8" This reverts commit 9678cdaae939 ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8") because the original commit had multiple, partly self-cancelling bugs, that could cause occasional memory corruption. In fact the logmpp instruction was incorrectly using register r0 as the source of the buffer address and operation code, and depending on what was in r0, it would either do nothing or corrupt the 64k page pointed to by r0. The logmpp instruction encoding and the operation code definitions could be corrected, but then there is the problem that there is no clearly defined way to know when the hardware has finished writing to the buffer. The original commit attempted to work around this by aborting the write-out before starting the prefetch, but this is ineffective in the case where the virtual core is now executing on a different physical core from the one where the write-out was initiated. These problems plus advice from the hardware designers not to use the function (since the measured performance improvement from using the feature was actually mostly negative), mean that reverting the code is the best option. Fixes: 9678cdaae939 ("Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8") Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-22KVM: PPC: Fix warnings from sparseThomas Huth
When compiling the KVM code for POWER with "make C=1", sparse complains about functions missing proper prototypes and a 64-bit constant missing the ULL prefix. Let's fix this by making the functions static or by including the proper header with the prototypes, and by appending a ULL prefix to the constant PPC_MPPE_ADDRESS_MASK. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2015-05-11powerpc: Add ICSWX instructionDan Streetman
Add the asm ICSWX and ICSWEPX opcodes. Add definitions for the Coprocessor Request structures needed to use the icswx calls to coprocessors. Add icswx() function to perform the ICSWX asm using the provided Coprocessor Command Word value and Coprocessor Request Block structure. This is required for communication with the NX-842 coprocessor on a PowerNV system. Signed-off-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/asix_common.c drivers/net/usb/sr9800.c drivers/net/usb/usbnet.c include/linux/usb/usbnet.h net/ipv4/tcp_ipv4.c net/ipv6/tcp_ipv6.c The TCP conflicts were overlapping changes. In 'net' we added a READ_ONCE() to the socket cached RX route read, whilst in 'net-next' Eric Dumazet touched the surrounding code dealing with how mini sockets are handled. With USB, it's a case of the same bug fix first going into net-next and then I cherry picked it back into net. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20powerpc/powernv: Fixes for hypervisor doorbell handlingPaul Mackerras
Since we can now use hypervisor doorbells for host IPIs, this makes sure we clear the host IPI flag when taking a doorbell interrupt, and clears any pending doorbell IPI in pnv_smp_cpu_kill_self() (as we already do for IPIs sent via the XICS interrupt controller). Otherwise if there did happen to be a leftover pending doorbell interrupt for an offline CPU thread for any reason, it would prevent that thread from going into a power-saving mode; it would instead keep waking up because of the interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-20ppc: bpf: add reqired opcodes for ppc32Denis Kirjanov
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-19Merge tag 'powerpc-3.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull second batch of powerpc updates from Michael Ellerman: "The highlight is the series that reworks the idle management on powernv, which allows us to use deeper idle states on those machines. There's the fix from Anton for the "BUG at kernel/smpboot.c:134!" problem. An i2c driver for powernv. This is acked by Wolfram Sang, and he asked that we take it through the powerpc tree. A fix for audit from rgb at Red Hat, acked by Paul Moore who is one of the audit maintainers. A patch from Ben to export the symbol map of our OPAL firmware as a sysfs file, so that tools can use it. Also some CXL fixes, a couple of powerpc perf fixes, a fix for smt-enabled, and the patch to add __force to get_user() so we can use bitwise types" * tag 'powerpc-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc/powernv: Ignore smt-enabled on Power8 and later powerpc/uaccess: Allow get_user() with bitwise types powerpc/powernv: Expose OPAL firmware symbol map powernv/powerpc: Add winkle support for offline cpus powernv/cpuidle: Redesign idle states management powerpc/powernv: Enable Offline CPUs to enter deep idle states powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode i2c: Driver to expose PowerNV platform i2c busses powerpc: add little endian flag to syscall_get_arch() power/perf/hv-24x7: Use kmem_cache_free() instead of kfree powerpc/perf/hv-24x7: Use per-cpu page buffer cxl: Unmap MMIO regions when detaching a context cxl: Add timeout to process element commands cxl: Change contexts_lock to a mutex to fix sleep while atomic bug powerpc: Secondary CPUs must set cpu_callin_map after setting active and online
2014-12-15powernv/powerpc: Add winkle support for offline cpusShreyas B. Prabhu
Winkle is a deep idle state supported in power8 chips. A core enters winkle when all the threads of the core enter winkle. In this state power supply to the entire chiplet i.e core, private L2 and private L3 is turned off. As a result it gives higher powersavings compared to sleep. But entering winkle results in a total hypervisor state loss. Hence the hypervisor context has to be preserved before entering winkle and restored upon wake up. Power-on Reset Engine (PORE) is a dedicated engine which is responsible for powering on the chiplet during wake up. It can be programmed to restore the register contests of a few specific registers. This patch uses PORE to restore register state wherever possible and uses stack to save and restore rest of the necessary registers. With hypervisor state restore things fall under three categories- per-core state, per-subcore state and per-thread state. To manage this, extend the infrastructure introduced for sleep. Mainly we add a paca variable subcore_sibling_mask. Using this and the core_idle_state we can distingush first thread in core and subcore. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-03PPC: bpf_jit_comp: add SKF_AD_PKTTYPE instructionDenis Kirjanov
Add BPF extension SKF_AD_PKTTYPE to ppc JIT to load skb->pkt_type field. Before: [ 88.262622] test_bpf: #11 LD_IND_NET 86 97 99 PASS [ 88.265740] test_bpf: #12 LD_PKTTYPE 109 107 PASS After: [ 80.605964] test_bpf: #11 LD_IND_NET 44 40 39 PASS [ 80.607370] test_bpf: #12 LD_PKTTYPE 9 9 PASS CC: Alexei Starovoitov<alexei.starovoitov@gmail.com> CC: Michael Ellerman<mpe@ellerman.id.au> Cc: Matt Evans <matt@ozlabs.org> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> v2: Added test rusults Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull second round of KVM changes from Paolo Bonzini: "Here are the PPC and ARM changes for KVM, which I separated because they had small conflicts (respectively within KVM documentation, and with 3.16-rc changes). Since they were all within the subsystem, I took care of them. Stephen Rothwell reported some snags in PPC builds, but they are all fixed now; the latest linux-next report was clean. New features for ARM include: - KVM VGIC v2 emulation on GICv3 hardware - Big-Endian support for arm/arm64 (guest and host) - Debug Architecture support for arm64 (arm32 is on Christoffer's todo list) And for PPC: - Book3S: Good number of LE host fixes, enable HV on LE - Book3S HV: Add in-guest debug support This release drops support for KVM on the PPC440. As a result, the PPC merge removes more lines than it adds. :) I also included an x86 change, since Davidlohr tied it to an independent bug report and the reporter quickly provided a Tested-by; there was no reason to wait for -rc2" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (122 commits) KVM: Move more code under CONFIG_HAVE_KVM_IRQFD KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use KVM: nVMX: Fix nested vmexit ack intr before load vmcs01 KVM: PPC: Enable IRQFD support for the XICS interrupt controller KVM: Give IRQFD its own separate enabling Kconfig option KVM: Move irq notifier implementation into eventfd.c KVM: Move all accesses to kvm::irq_routing into irqchip.c KVM: irqchip: Provide and use accessors for irq routing table KVM: Don't keep reference to irq routing table in irqfd struct KVM: PPC: drop duplicate tracepoint arm64: KVM: fix 64bit CP15 VM access for 32bit guests KVM: arm64: GICv3: mandate page-aligned GICV region arm64: KVM: GICv3: move system register access to msr_s/mrs_s KVM: PPC: PR: Handle FSCR feature deselects KVM: PPC: HV: Remove generic instruction emulation KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr KVM: PPC: Remove DCR handling KVM: PPC: Expose helper functions for data/inst faults KVM: PPC: Separate loadstore emulation from priv emulation KVM: PPC: Handle magic page in kvmppc_ld/st ...
2014-07-29powerpc/e6500: Add support for hardware threadsAndy Fleming
The general idea is that each core will release all of its threads into the secondary thread startup code, which will eventually wait in the secondary core holding area, for the appropriate bit in the PACA to be set. The kick_cpu function pointer will set that bit in the PACA, and thus "release" the core/thread to boot. We also need to do a few things that U-Boot normally does for CPUs (like enable branch prediction). Signed-off-by: Andy Fleming <afleming@freescale.com> [scottwood@freescale.com: various changes, including only enabling threads if Linux wants to kick them] Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-07-28Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8Stewart Smith
The POWER8 processor has a Micro Partition Prefetch Engine, which is a fancy way of saying "has way to store and load contents of L2 or L2+MRU way of L3 cache". We initiate the storing of the log (list of addresses) using the logmpp instruction and start restore by writing to a SPR. The logmpp instruction takes parameters in a single 64bit register: - starting address of the table to store log of L2/L2+L3 cache contents - 32kb for L2 - 128kb for L2+L3 - Aligned relative to maximum size of the table (32kb or 128kb) - Log control (no-op, L2 only, L2 and L3, abort logout) We should abort any ongoing logging before initiating one. To initiate restore, we write to the MPPR SPR. The format of what to write to the SPR is similar to the logmpp instruction parameter: - starting address of the table to read from (same alignment requirements) - table size (no data, until end of table) - prefetch rate (from fastest possible to slower. about every 8, 16, 24 or 32 cycles) The idea behind loading and storing the contents of L2/L3 cache is to reduce memory latency in a system that is frequently swapping vcores on a physical CPU. The best case scenario for doing this is when some vcores are doing very cache heavy workloads. The worst case is when they have about 0 cache hits, so we just generate needless memory operations. This implementation just does L2 store/load. In my benchmarks this proves to be useful. Benchmark 1: - 16 core POWER8 - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each - No split core/SMT - two guests running sysbench memory test. sysbench --test=memory --num-threads=8 run - one guest running apache bench (of default HTML page) ab -n 490000 -c 400 http://localhost/ This benchmark aims to measure performance of real world application (apache) where other guests are cache hot with their own workloads. The sysbench memory benchmark does pointer sized writes to a (small) memory buffer in a loop. In this benchmark with this patch I can see an improvement both in requests per second (~5%) and in mean and median response times (again, about 5%). The spread of minimum and maximum response times were largely unchanged. benchmark 2: - Same VM config as benchmark 1 - all three guests running sysbench memory benchmark This benchmark aims to see if there is a positive or negative affect to this cache heavy benchmark. Although due to the nature of the benchmark (stores) we may not see a difference in performance, but rather hopefully an improvement in consistency of performance (when vcore switched in, don't have to wait many times for cachelines to be pulled in) The results of this benchmark are improvements in consistency of performance rather than performance itself. With this patch, the few outliers in duration go away and we get more consistent performance in each guest. benchmark 3: - same 3 guests and CPU configuration as benchmark 1 and 2. - two idle guests - 1 guest running STREAM benchmark This scenario also saw performance improvement with this patch. On Copy and Scale workloads from STREAM, I got 5-6% improvement with this patch. For Add and triad, it was around 10% (or more). benchmark 4: - same 3 guests as previous benchmarks - two guests running sysbench --memory, distinctly different cache heavy workload - one guest running STREAM benchmark. Similar improvements to benchmark 3. benchmark 5: - 1 guest, 8 VCPUs, Ubuntu 14.04 - Host configured with split core (SMT8, subcores-per-core=4) - STREAM benchmark In this benchmark, we see a 10-20% performance improvement across the board of STREAM benchmark results with this patch. Based on preliminary investigation and microbenchmarks by Prerna Saxena <prerna@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-31powerpc/bpf: Fix DIVWU instruction opcodeVladimir Murzin
Currently DIVWU stands for *signed* divw opcode: 7d 2a 4b 96 divwu r9,r10,r9 7d 2a 4b d6 divw r9,r10,r9 Use the *unsigned* divw opcode for DIVWU. Suggested-by: Vassili Karpov <av1474@comtv.ru> Reviewed-by: Vassili Karpov <av1474@comtv.ru> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com> Acked-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-31powerpc/bpf: BPF JIT compiler for 64-bit Little EndianPhilippe Bergheaud
This enables the Berkeley Packet Filter JIT compiler for the PowerPC running in 64bit Little Endian. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-16powerpc: Emulate sync instruction variantsJames Yang
Reserved fields of the sync instruction have been used for other instructions (e.g. lwsync). On processors that do not support variants of the sync instruction, emulate it by executing a sync to subsume the effect of the intended instruction. Signed-off-by: James Yang <James.Yang@freescale.com> [scottwood@freescale.com: whitespace and subject line fix] Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-10-11powerpc: Little endian builds double word swap VSX state during context ↵Anton Blanchard
save/restore The elements within VSX loads and stores are big endian ordered regardless of endianness. Our VSX context save/restore code uses lxvd2x and stxvd2x which is a 2x doubleword operation. This means the two doublewords will be swapped and we have to perform another swap to undo it. We need to do this on save and restore. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-30powerpc: Move opcode definitions from kvm/emulate.c to asm/ppc-opcode.hHongtao Jia
Opcode and xopcode are useful definitions not just for KVM. Move these definitions to asm/ppc-opcode.h for public use. Also add the opcodes for LHAUX and LWZUX. Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com> Signed-off-by: Li Yang <leoli@freescale.com> [scottwood@freesacle.com: update commit message and rebase] Signed-off-by: Scott Wood <scottwood@freescale.com>
2013-05-06powerpc: Emulate non privileged DSCR read and writeAnton Blanchard
POWER8 allows read and write of the DSCR in userspace. We added kernel emulation so applications could always use the instructions regardless of the CPU type. Unfortunately there are two SPRs for the DSCR and we only added emulation for the privileged one. Add code to match the non privileged one. A simple test was created to verify the fix: http://ozlabs.org/~anton/junkcode/user_dscr_test.c Without the patch we get a SIGILL and it passes with the patch. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-26powerpc/perf: Add new BHRB related instructions for POWER8Anshuman Khandual
This patch adds new POWER8 instruction encoding for reading and clearing Branch History Rolling Buffer entries. The new instruction 'mfbhrbe' (move from branch history rolling buffer entry) is used to read BHRB buffer entries and instruction 'clrbhrb' (clear branch history rolling buffer) is used to clear the entire buffer. The instruction 'clrbhrb' has straight forward encoding. But the instruction encoding format for reading the BHRB entries is like 'mfbhrbe RT, BHRBE' where it takes two arguments, i.e the index for the BHRB buffer entry to read and a general purpose register to put the value which was read from the buffer entry. Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-15powerpc: Add new instructions for transactional memoryMichael Neuling
Here we define the new instructions we need for transactional memory in the kernel. This is so we can support compiling with binutils that don't support the new transactional memory instructions. Transactional memory results in two sets of architected state (GPRs/VSRs etc). treclaim allows us to read the checkpointed state (from the tbegin) so that we can store it away on a context switch. It does this by overwriting the exiting architected state, so you have to save that away before you treclaim. treclaim will also abort a transaction, so you can give a register value which contains an abort reason. trecheckpoint allows us to inject into the checkpointed state as if it were at the tbegin. It does this by copying the current architected state into the checkpointed state. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10powerpc: Define differences between doorbells on book3e and book3sIan Munsie
There are a few key differences between doorbells on server compared with embedded that we care about on Linux, namely: - We have a new msgsndp instruction for directed privileged doorbells. msgsnd is used for directed hypervisor doorbells. - The tag we use in the instruction is the Thread Identification Register of the recipient thread (since server doorbells can only occur between threads within a single core), and is only 7 bits wide. - A new message type is introduced for server doorbells (none of the existing book3e message types are currently supported on book3s). Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-12-18Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlight is probably some base POWER8 support. There's more to come such as transactional memory support but that will wait for the next one. Overall it's pretty quiet, or rather I've been pretty poor at picking things up from patchwork and reviewing them this time around and Kumar no better on the FSL side it seems..." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits) powerpc+of: Rename and fix OF reconfig notifier error inject module powerpc: mpc5200: Add a3m071 board support powerpc/512x: don't compile any platform DIU code if the DIU is not enabled powerpc/mpc52xx: use module_platform_driver macro powerpc+of: Export of_reconfig_notifier_[register,unregister] powerpc/dma/raidengine: add raidengine device powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct powerpc/mpc85xx: Change spin table to cached memory powerpc/fsl-pci: Add PCI controller ATMU PM support powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS] powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers powerpc: Disable relocation on exceptions when kexecing powerpc: Enable relocation on during exceptions at boot powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function powerpc: Add wrappers to enable/disable relocation on exceptions powerpc: Add set_mode hcall powerpc: Setup relocation on exceptions for bare metal systems powerpc: Move initial mfspr LPCR out of __init_LPCR powerpc: Add relocation on exception vector handlers ...
2012-11-17PPC: net: bpf_jit_comp: add XOR instruction for BPF JITDaniel Borkmann
This patch is a follow-up for patch "filter: add XOR instruction for use with X/K" that implements BPF PowerPC JIT parts for the BPF XOR operation. Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch> Cc: Matt Evans <matt@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Matt Evans <matt@ozlabs.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-15powerpc: Fix typos in Freescale copyright claimsYang Li
There are many cases that Semiconductor is misspelled. The patch fix these typos. Signed-off-by: Li Yang <leoli@freescale.com> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-11-15powerpc/47x: Use the new ppc-opcode infrastructureTony Breeds
Don't use 47x only #defines for TLBIVAX or ICBT, supply and use helpers in ppc-opcode.h This fixes a compile breakage. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-17powerpc: Add denormalisation exception handling for POWER6/7Michael Neuling
On POWER6 and POWER7 if the input operand to an instruction is a denormalised single precision binary floating point value we can take a denormalisation exception where it's expected that the hypervisor (HV=1) will fix up the inputs before the instruction is run. This adds code to handle this denormalisation exception for POWER6 and POWER7. It also add a CONFIG_PPC_DENORMALISATION option and sets it in pseries/ppc64_defconfig. This is useful on bare metal systems only. Based on patch from Milton Miller. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Enforce usage of RA 0-R31 where possibleMichael Neuling
Some macros use RA where when RA=R0 the values is 0, so make this the enforced mnemonic in the macro. Idea suggested by Andreas Schwab. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Add defines for RA 0-R31Michael Neuling
R0 is special since it'll be 0. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Enforce usage of R0-R31 where possibleMichael Neuling
Enforce the use of R0-R31 in macros where possible now we have all the fixes in. R0-R31 macros are removed here so that can't be used anymore. They should not be defined anywhere. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Introduce new __REG_R macrosMichael Neuling
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Start using ___PPC_RA/B/S/T where necessaryMichael Neuling
Now have ___PPC_RA/B/S/T we can use it in some places. These are places where we can't use the existing defines which will soon enforce R0-R31 usage. The macros being changed here are being used in inline asm, which can't convert to enforce the R0-R31 usage. bpf_jit uses a mix of both generated and non-generated with the same code, so just convert all these to use the ___PPC_R versions which won't enforce R usage later. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Introduce new ___PPC_RA/B/S/T macrosMichael Neuling
These are currently the same as __PPC_RA/B/S/T but we'll wrap them soon. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Fix VSX macros so register names aren't wrappedMichael Neuling
We need to do this so we can enforce the name of a and b in called macros PPC_RA/B later. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc/pasemi: Move lbz/stbciz to ppc-opcode.hMichael Neuling
move lbz/stbciz to ppc-opcode.h. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10powerpc: Add defines for R0-R31Michael Neuling
We are going to use these later and convert r0 to %r0 etc. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-03-05KVM: PPC: Implement MMIO emulation support for Book3S HV guestsPaul Mackerras
This provides the low-level support for MMIO emulation in Book3S HV guests. When the guest tries to map a page which is not covered by any memslot, that page is taken to be an MMIO emulation page. Instead of inserting a valid HPTE, we insert an HPTE that has the valid bit clear but another hypervisor software-use bit set, which we call HPTE_V_ABSENT, to indicate that this is an absent page. An absent page is treated much like a valid page as far as guest hcalls (H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that an absent HPTE doesn't need to be invalidated with tlbie since it was never valid as far as the hardware is concerned. When the guest accesses a page for which there is an absent HPTE, it will take a hypervisor data storage interrupt (HDSI) since we now set the VPM1 bit in the LPCR. Our HDSI handler for HPTE-not-present faults looks up the hash table and if it finds an absent HPTE mapping the requested virtual address, will switch to kernel mode and handle the fault in kvmppc_book3s_hv_page_fault(), which at present just calls kvmppc_hv_emulate_mmio() to set up the MMIO emulation. This is based on an earlier patch by Benjamin Herrenschmidt, but since heavily reworked. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-21net: filter: BPF 'JIT' compiler for PPC64Matt Evans
An implementation of a code generator for BPF programs to speed up packet filtering on PPC64, inspired by Eric Dumazet's x86-64 version. Filter code is generated as an ABI-compliant function in module_alloc()'d mem with stackframe & prologue/epilogue generated if required (simple filters don't need anything more than an li/blr). The filter's local variables, M[], live in registers. Supports all BPF opcodes, although "complicated" loads from negative packet offsets (e.g. SKF_LL_OFF) are not yet supported. There are a couple of further optimisations left for future work; many-pass assembly with branch-reach reduction and a register allocator to push M[] variables into volatile registers would improve the code quality further. This currently supports big-endian 64-bit PowerPC only (but is fairly simple to port to PPC32 or LE!). Enabled in the same way as x86-64: echo 1 > /proc/sys/net/core/bpf_jit_enable Or, enabled with extra debug output: echo 2 > /proc/sys/net/core/bpf_jit_enable Signed-off-by: Matt Evans <matt@ozlabs.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-27powerpc: Per process DSCR + some fixes (try#4)Alexey Kardashevskiy
The DSCR (aka Data Stream Control Register) is supported on some server PowerPC chips and allow some control over the prefetch of data streams. This patch allows the value to be specified per thread by emulating the corresponding mfspr and mtspr instructions. Children of such threads inherit the value. Other threads use a default value that can be specified in sysfs - /sys/devices/system/cpu/dscr_default. If a thread starts with non default value in the sysfs entry, all children threads inherit this non default value even if the sysfs value is changed later. Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20powerpc/a2: Add some #defines for A2 specific instructionsBenjamin Herrenschmidt
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20powerpc: Add NAP mode support on Power7 in HV modeBenjamin Herrenschmidt
Wakeup comes from the system reset handler with a potential loss of the non-hypervisor CPU state. We save the non-volatile state on the stack and a pointer to it in the PACA, which the system reset handler uses to restore things Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09powerpc: Hardcode popcnt instructions for old assemblersAnton Blanchard
The popcnt instructions went into binutils relatively recently. As with a number of other instructions, create macros and hardcode them. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-22powerpc: Emulate most Book I instructions in emulate_step()Paul Mackerras
This extends the emulate_step() function to handle a large proportion of the Book I instructions implemented on current 64-bit server processors. The aim is to handle all the load and store instructions used in the kernel, plus all of the instructions that appear between l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7). The new code can emulate user mode instructions, and checks the effective address for a load or store if the saved state is for user mode. It doesn't handle little-endian mode at present. For floating-point, Altivec/VMX and VSX instructions, it checks that the saved MSR has the enable bit for the relevant facility set, and if so, assumes that the FP/VMX/VSX registers contain valid state, and does loads or stores directly to/from the FP/VMX/VSX registers, using assembly helpers in ldstfp.S. Instructions supported now include: * Loads and stores, including some but not all VMX and VSX instructions, and lmw/stmw * Atomic loads and stores (l[dw]arx, st[dw]cx.) * Arithmetic instructions (add, subtract, multiply, divide, etc.) * Compare instructions * Rotate and mask instructions * Shift instructions * Logical instructions (and, or, xor, etc.) * Condition register logical instructions * mtcrf, cntlz[wd], exts[bhw] * isync, sync, lwsync, ptesync, eieio * Cache operations (dcbf, dcbst, dcbt, dcbtst) The overflow-checking arithmetic instructions are not included, but they appear not to be ever used in C code. This uses decimal values for the minor opcodes in the switch statements because that is what appears in the Power ISA specification, thus it is easier to check that they are correct if they are in decimal. If this is used to single-step an instruction where a data breakpoint interrupt occurred, then there is the possibility that the instruction is a lwarx or ldarx. In that case we have to be careful not to lose the reservation until we get to the matching st[wd]cx., or we'll never make forward progress. One alternative is to try to arrange that we can return from interrupts and handle data breakpoint interrupts without losing the reservation, which means not using any spinlocks, mutexes, or atomic ops (including bitops). That seems rather fragile. The other alternative is to emulate the larx/stcx and all the instructions in between. This is why this commit adds support for a wide range of integer instructions. Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-03-16powerpc/85xx: Make sure lwarx hint isn't set on ppc32Kumar Gala
e500v1/v2 based chips will treat any reserved field being set in an opcode as illegal. Thus always setting the hint in the opcode is a bad idea. Anton should be kept away from the powerpc opcode map. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-02-17powerpc: Use lwarx/ldarx hint in bit locksAnton Blanchard
This patch implements the lwarx/ldarx hint bit for bit locks. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17powerpc: Use lwarx hint in spinlocksAnton Blanchard
Recent versions of the PowerPC architecture added a hint bit to the larx instructions to differentiate between an atomic operation and a lock operation: > 0 Other programs might attempt to modify the word in storage addressed by EA > even if the subsequent Store Conditional succeeds. > > 1 Other programs will not attempt to modify the word in storage addressed by > EA until the program that has acquired the lock performs a subsequent store > releasing the lock. To avoid a binutils dependency this patch create macros for the extended lwarx format and uses it in the spinlock code. To test this change I used a simple test case that acquires and releases a global pthread mutex: pthread_mutex_lock(&mutex); pthread_mutex_unlock(&mutex); On a 32 core POWER6, running 32 test threads we spend almost all our time in the futex spinlock code: 94.37% perf [kernel] [k] ._raw_spin_lock | |--99.95%-- ._raw_spin_lock | | | |--63.29%-- .futex_wake | | | |--36.64%-- .futex_wait_setup Which is a good test for this patch. The results (in lock/unlock operations per second) are: before: 1538203 ops/sec after: 2189219 ops/sec An improvement of 42% A 32 core POWER7 improves even more: before: 1279529 ops/sec after: 2282076 ops/sec An improvement of 78% Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20powerpc/mm: Add opcode definitions for tlbivax and tlbsrx.Benjamin Herrenschmidt
This adds the opcode definitions to ppc-opcode.h for the two instructions tlbivax and tlbsrx. as defined by Book3E 2.06 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21powerpc: Add 2.06 tlbie mnemonicsMilton Miller
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards compatibilty for CPUs before 2.06. Only useful for bare metal systems. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21powerpc: Move VSX load/stores into ppc-opcode.hMichael Neuling
Cleans up the VSX load/store instructions by moving them into ppc-opcode.h. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21powerpc: Cleanup macros in ppc-opcode.hMichael Neuling
Make macros more braces happy. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-04-23Revert "powerpc: Add support for early tlbilx opcode"Kumar Gala
This reverts commit e9965577406a2148ade97b5e0ce7c448b4ba4ef6. Our HW guys were able to fix this so it never sees the light of day. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>