aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc/platforms
AgeCommit message (Collapse)Author
2015-01-22powerpc/powernv: Restore LPCR with LPCR_PECE1 clearedShreyas B. Prabhu
LPCR_PECE1 bit controls whether decrementer interrupts are allowed to cause exit from power-saving mode. While waking up from winkle, restoring LPCR with LPCR_PECE1 set (i.e Decrementer interrupts allowed) can cause issue in the following scenario: - All the threads in a core are offlined. The core enters deep winkle. - Spurious interrupt wakes up a thread in the core. Here LPCR is restored with LPCR_PECE1 bit set. - Since it was a spurious interrupt on a offline thread, the thread clears the interrupt and goes back to winkle. - Here before the thread executes winkle and puts the core into deep winkle, if a decrementer interrupt occurs on any of the sibling threads in the core that thread wakes up. - Since in offline loop we are flushing interrupt only in case of external interrupt, the decrementer interrupt does not get flushed. So at this stage the thread is stuck in this is loop of waking up at 0x100 due to decrementer interrupt, not flushing the interrupt as only external interrupts get flushed, entering winkle, waking up at 0x100 again. Fix this by programming PORE to restore LPCR with LPCR_PECE1 bit cleared when waking up from winkle. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-12powernv: Fix OPAL tracepoint codeAnton Blanchard
Patch c49f63530bb6 ("powernv: Add OPAL tracepoints") has a spurious store to the stack: ld r12,opal_tracepoint_refcount@toc(r2); \ std r12,32(r1); \ The store was originally used to save the current tracepoint status so the entry and the exit tracepoints were always balanced. In the end I just created a separate path when tracepoints are enabled. The offset on the stack used for this store is not valid for ABIv2 and it causes strange issues. I noticed it because OPAL console input was broken. Fixes: c49f63530bb6 ("powernv: Add OPAL tracepoints") Cc: <stable@vger.kernel.org> # v3.17+ Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29powerpc/kdump: Ignore failure in enabling big endian exception during crashHari Bathini
In LE kernel, we currently have a hack for kexec that resets the exception endian before starting a new kernel as the kernel that is loaded could be a big endian or a little endian kernel. In kdump case, resetting exception endian fails when one or more cpus is disabled. But we can ignore the failure and still go ahead, as in most cases crashkernel will be of same endianess as primary kernel and reseting endianess is not even needed in those cases. This patch adds a new inline function to say if this is kdump path. This function is used at places where such a check is needed. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> [mpe: Rename to kdump_in_progress(), use bool, and edit comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-19Merge tag 'powerpc-3.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull second batch of powerpc updates from Michael Ellerman: "The highlight is the series that reworks the idle management on powernv, which allows us to use deeper idle states on those machines. There's the fix from Anton for the "BUG at kernel/smpboot.c:134!" problem. An i2c driver for powernv. This is acked by Wolfram Sang, and he asked that we take it through the powerpc tree. A fix for audit from rgb at Red Hat, acked by Paul Moore who is one of the audit maintainers. A patch from Ben to export the symbol map of our OPAL firmware as a sysfs file, so that tools can use it. Also some CXL fixes, a couple of powerpc perf fixes, a fix for smt-enabled, and the patch to add __force to get_user() so we can use bitwise types" * tag 'powerpc-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc/powernv: Ignore smt-enabled on Power8 and later powerpc/uaccess: Allow get_user() with bitwise types powerpc/powernv: Expose OPAL firmware symbol map powernv/powerpc: Add winkle support for offline cpus powernv/cpuidle: Redesign idle states management powerpc/powernv: Enable Offline CPUs to enter deep idle states powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode i2c: Driver to expose PowerNV platform i2c busses powerpc: add little endian flag to syscall_get_arch() power/perf/hv-24x7: Use kmem_cache_free() instead of kfree powerpc/perf/hv-24x7: Use per-cpu page buffer cxl: Unmap MMIO regions when detaching a context cxl: Add timeout to process element commands cxl: Change contexts_lock to a mutex to fix sleep while atomic bug powerpc: Secondary CPUs must set cpu_callin_map after setting active and online
2014-12-18powerpc/powernv: Ignore smt-enabled on Power8 and laterGreg Kurz
Starting with POWER8, the subcore logic relies on all threads of a core being booted so that they can participate in split mode switches. So on those machines we ignore the smt_enabled_at_boot setting (smt-enabled on the kernel command line). Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> [mpe: Update comment and change log to be more precise] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-15powerpc/powernv: Expose OPAL firmware symbol mapBenjamin Herrenschmidt
Newer versions of OPAL will provide this, so let's expose it to user space so tools like perf can use it to properly decode samples in firmware space. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-14Merge tag 'driver-core-3.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core update from Greg KH: "Here's the set of driver core patches for 3.19-rc1. They are dominated by the removal of the .owner field in platform drivers. They touch a lot of files, but they are "simple" changes, just removing a line in a structure. Other than that, a few minor driver core and debugfs changes. There are some ath9k patches coming in through this tree that have been acked by the wireless maintainers as they relied on the debugfs changes. Everything has been in linux-next for a while" * tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits) Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries" fs: debugfs: add forward declaration for struct device type firmware class: Deletion of an unnecessary check before the function call "vunmap" firmware loader: fix hung task warning dump devcoredump: provide a one-way disable function device: Add dev_<level>_once variants ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries ath: use seq_file api for ath9k debugfs files debugfs: add helper function to create device related seq_file drivers/base: cacheinfo: remove noisy error boot message Revert "core: platform: add warning if driver has no owner" drivers: base: support cpu cache information interface to userspace via sysfs drivers: base: add cpu_device_create to support per-cpu devices topology: replace custom attribute macros with standard DEVICE_ATTR* cpumask: factor out show_cpumap into separate helper function driver core: Fix unbalanced device reference in drivers_probe driver core: fix race with userland in device_add() sysfs/kernfs: make read requests on pre-alloc files use the buffer. sysfs/kernfs: allow attributes to request write buffer be pre-allocated. fs: sysfs: return EGBIG on write if offset is larger than file size ...
2014-12-15powernv/powerpc: Add winkle support for offline cpusShreyas B. Prabhu
Winkle is a deep idle state supported in power8 chips. A core enters winkle when all the threads of the core enter winkle. In this state power supply to the entire chiplet i.e core, private L2 and private L3 is turned off. As a result it gives higher powersavings compared to sleep. But entering winkle results in a total hypervisor state loss. Hence the hypervisor context has to be preserved before entering winkle and restored upon wake up. Power-on Reset Engine (PORE) is a dedicated engine which is responsible for powering on the chiplet during wake up. It can be programmed to restore the register contests of a few specific registers. This patch uses PORE to restore register state wherever possible and uses stack to save and restore rest of the necessary registers. With hypervisor state restore things fall under three categories- per-core state, per-subcore state and per-thread state. To manage this, extend the infrastructure introduced for sleep. Mainly we add a paca variable subcore_sibling_mask. Using this and the core_idle_state we can distingush first thread in core and subcore. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-15powernv/cpuidle: Redesign idle states managementShreyas B. Prabhu
Deep idle states like sleep and winkle are per core idle states. A core enters these states only when all the threads enter either the particular idle state or a deeper one. There are tasks like fastsleep hardware bug workaround and hypervisor core state save which have to be done only by the last thread of the core entering deep idle state and similarly tasks like timebase resync, hypervisor core register restore that have to be done only by the first thread waking up from these state. The current idle state management does not have a way to distinguish the first/last thread of the core waking/entering idle states. Tasks like timebase resync are done for all the threads. This is not only is suboptimal, but can cause functionality issues when subcores and kvm is involved. This patch adds the necessary infrastructure to track idle states of threads in a per-core structure. It uses this info to perform tasks like fastsleep workaround and timebase resync only once per core. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Originally-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-15powerpc/powernv: Enable Offline CPUs to enter deep idle statesShreyas B. Prabhu
The secondary threads should enter deep idle states so as to gain maximum powersavings when the entire core is offline. To do so the offline path must be made aware of the available deepest idle state. Hence probe the device tree for the possible idle states in powernv core code and expose the deepest idle state through flags. Since the device tree is probed by the cpuidle driver as well, move the parameters required to discover the idle states into an appropriate common place to both the driver and the powernv core code. Another point is that fastsleep idle state may require workarounds in the kernel to function properly. This workaround is introduced in the subsequent patches. However neither the cpuidle driver or the hotplug path need be bothered about this workaround. They will be taken care of by the core powernv code. Originally-by: Srivatsa S. Bhat <srivatsa@mit.edu> Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: linux-pm@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-14i2c: Driver to expose PowerNV platform i2c bussesNeelesh Gupta
The patch exposes the available i2c busses on the PowerNV platform to the kernel and implements the bus driver to support i2c and smbus commands. The driver uses the platform device infrastructure to probe the busses on the platform and registers them with the i2c driver framework. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Wolfram Sang <wsa@the-dreams.de> (I2C part, excluding the bindings) Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree update from Jiri Kosina: "Usual stuff: documentation updates, printk() fixes, etc" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits) intel_ips: fix a type in error message cpufreq: cpufreq-dt: Move newline to end of error message ps3rom: fix error return code treewide: fix typo in printk and Kconfig ARM: dts: bcm63138: change "interupts" to "interrupts" Replace mentions of "list_struct" to "list_head" kernel: trace: fix printk message scsi: mpt2sas: fix ioctl in comment zbud, zswap: change module author email clocksource: Fix 'clcoksource' typo in comment arm: fix wording of "Crotex" in CONFIG_ARCH_EXYNOS3 help gpio: msm-v1: make boolean argument more obvious usb: Fix typo in usb-serial-simple.c PCI: Fix comment typo 'COMFIG_PM_OPS' powerpc: Fix comment typo 'CONIFG_8xx' powerpc: Fix comment typos 'CONFiG_ALTIVEC' clk: st: Spelling s/stucture/structure/ isci: Spelling s/stucture/structure/ usb: gadget: zero: Spelling s/infrastucture/infrastructure/ treewide: Fix company name in module descriptions ...
2014-12-11Merge tag 'powerpc-3.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Some nice cleanups like removing bootmem, and removal of __get_cpu_var(). There is one patch to mm/gup.c. This is the generic GUP implementation, but is only used by us and arm(64). We have an ack from Steve Capper, and although we didn't get an ack from Andrew he told us to take the patch through the powerpc tree. There's one cxl patch. This is in drivers/misc, but Greg said he was happy for us to manage fixes for it. There is an infrastructure patch to support an IPMI driver for OPAL. There is also an RTC driver for OPAL. We weren't able to get any response from the RTC maintainer, Alessandro Zummo, so in the end we just merged the driver. The usual batch of Freescale updates from Scott" * tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (101 commits) powerpc/powernv: Return to cpu offline loop when finished in KVM guest powerpc/book3s: Fix partial invalidation of TLBs in MCE code. powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault powerpc/xmon: Cleanup the breakpoint flags powerpc/xmon: Enable HW instruction breakpoint on POWER8 powerpc/mm/thp: Use tlbiel if possible powerpc/mm/thp: Remove code duplication powerpc/mm/hugetlb: Sanity check gigantic hugepage count powerpc/oprofile: Disable pagefaults during user stack read powerpc/mm: Check for matching hpte without taking hpte lock powerpc: Drop useless warning in eeh_init() powerpc/powernv: Cleanup unused MCE definitions/declarations. powerpc/eeh: Dump PHB diag-data early powerpc/eeh: Recover EEH error on ownership change for BCM5719 powerpc/eeh: Set EEH_PE_RESET on PE reset powerpc/eeh: Refactor eeh_reset_pe() powerpc: Remove more traces of bootmem powerpc/pseries: Initialise nvram_pstore_info's buf_lock cxl: Name interrupts in /proc/interrupt cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning ...
2014-12-11Merge tag 'devicetree-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux Pull devicetree changes from Grant Likely: "Lots of activity in the devicetree code for v3.18. Most of it is related to getting all of the overlay support code in place, but there are other important things in there. Highlights: - OF_RECONFIG notifiers for SPI, I2C and Platform devices. Those subsystems can now respond to live changes to the device tree. - CONFIG_OF_OVERLAY method for applying live changes to the device tree - Removal of the of_allnodes list. This used to be used to iterate over all the nodes in the device tree, but it is unnecessary because the same thing can be done by iterating over the list of child pointers. Getting rid of of_allnodes saves some memory and avoids the possibility of of_allnodes being sorted differently from the child lists. - Support for retrieving original DTB blob via sysfs. Needed by kexec. - More unittests - Documentation and minor bug fixes" * tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux: (42 commits) of: Delete unnecessary check before calling "of_node_put()" of: Drop ->next pointer from struct device_node spi: Check for spi_of_notifier when CONFIG_OF_DYNAMIC=y of: support passing console options with stdout-path of: add optional options parameter to of_find_node_by_path() of: Add bindings for chosen node, stdout-path of: Remove unneeded and incorrect MODULE_DEVICE_TABLE ARM: dt: fix up PL011 device tree bindings of: base, fix of_property_read_string_helper kernel-doc of: remove select of non-existant OF_DEVICE config symbol spi/of: Add OF notifier handler spi/of: Create new device registration method and accessors i2c/of: Add OF_RECONFIG notifier handler i2c/of: Factor out Devicetree registration code of/overlay: Add overlay unittests of/overlay: Introduce DT overlay support of/reconfig: Add OF_DYNAMIC notifier for platform_bus_type of/reconfig: Always use the same structure for notifiers of/reconfig: Add debug output for OF_RECONFIG notifiers of/reconfig: Add empty stubs for the of_reconfig methods ...
2014-12-10Merge branch 'akpm' (patchbomb from Andrew)Linus Torvalds
Merge first patchbomb from Andrew Morton: - a few minor cifs fixes - dma-debug upadtes - ocfs2 - slab - about half of MM - procfs - kernel/exit.c - panic.c tweaks - printk upates - lib/ updates - checkpatch updates - fs/binfmt updates - the drivers/rtc tree - nilfs - kmod fixes - more kernel/exit.c - various other misc tweaks and fixes * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) exit: pidns: fix/update the comments in zap_pid_ns_processes() exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting exit: exit_notify: re-use "dead" list to autoreap current exit: reparent: call forget_original_parent() under tasklist_lock exit: reparent: avoid find_new_reaper() if no children exit: reparent: introduce find_alive_thread() exit: reparent: introduce find_child_reaper() exit: reparent: document the ->has_child_subreaper checks exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper() exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting exit: proc: don't try to flush /proc/tgid/task/tgid exit: release_task: fix the comment about group leader accounting exit: wait: drop tasklist_lock before psig->c* accounting exit: wait: don't use zombie->real_parent exit: wait: cleanup the ptrace_reparented() checks usermodehelper: kill the kmod_thread_locker logic usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper() fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races ...
2014-12-10ppc/cell: replace get_unused_fd() with get_unused_fd_flags(0)Yann Droneaud
This patch replaces calls to get_unused_fd() with equivalent call to get_unused_fd_flags(0) to preserve current behavor for existing code. In a further patch, get_unused_fd() will be removed so that new code start using get_unused_fd_flags(), with the hope O_CLOEXEC could be used, either by default or choosen by userspace. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-10Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS changes from Al Viro: "First pile out of several (there _definitely_ will be more). Stuff in this one: - unification of d_splice_alias()/d_materialize_unique() - iov_iter rewrite - killing a bunch of ->f_path.dentry users (and f_dentry macro). Getting that completed will make life much simpler for unionmount/overlayfs, since then we'll be able to limit the places sensitive to file _dentry_ to reasonably few. Which allows to have file_inode(file) pointing to inode in a covered layer, with dentry pointing to (negative) dentry in union one. Still not complete, but much closer now. - crapectomy in lustre (dead code removal, mostly) - "let's make seq_printf return nothing" preparations - assorted cleanups and fixes There _definitely_ will be more piles" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) copy_from_iter_nocache() new helper: iov_iter_kvec() csum_and_copy_..._iter() iov_iter.c: handle ITER_KVEC directly iov_iter.c: convert copy_to_iter() to iterate_and_advance iov_iter.c: convert copy_from_iter() to iterate_and_advance iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() iov_iter.c: convert iov_iter_zero() to iterate_and_advance iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds iov_iter.c: convert iov_iter_npages() to iterate_all_kinds iov_iter.c: iterate_and_advance iov_iter.c: macros for iterating over iov_iter kill f_dentry macro dcache: fix kmemcheck warning in switch_names new helper: audit_file() nfsd_vfs_write(): use file_inode() ncpfs: use file_inode() kill f_dentry uses lockd: get rid of ->f_path.dentry->d_sb ...
2014-12-10Merge branch 'irq-irqdomain-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq domain updates from Thomas Gleixner: "The real interesting irq updates: - Support for hierarchical irq domains: For complex interrupt routing scenarios where more than one interrupt related chip is involved we had no proper representation in the generic interrupt infrastructure so far. That made people implement rather ugly constructs in their nested irq chip implementations. The main offenders are x86 and arm/gic. To distangle that mess we have now hierarchical irqdomains which seperate the various interrupt chips and connect them via the hierarchical domains. That keeps the domain specific details internal to the particular hierarchy level and removes the criss/cross referencing of chip internals. The resulting hierarchy for a complex x86 system will look like this: vector mapped: 74 msi-0 mapped: 2 dmar-ir-1 mapped: 69 ioapic-1 mapped: 4 ioapic-0 mapped: 20 pci-msi-2 mapped: 45 dmar-ir-0 mapped: 3 ioapic-2 mapped: 1 pci-msi-1 mapped: 2 htirq mapped: 0 Neither ioapic nor pci-msi know about the dmar interrupt remapping between themself and the vector domain. If interrupt remapping is disabled ioapic and pci-msi become direct childs of the vector domain. In hindsight we should have done that years ago, but in hindsight we always know better :) - Support for generic MSI interrupt domain handling We have more and more non PCI related MSI interrupts, so providing a generic infrastructure for this is better than having all affected architectures implementing their own private hacks. - Support for PCI-MSI interrupt domain handling, based on the generic MSI support. This part carries the pci/msi branch from Bjorn Helgaas pci tree to avoid a massive conflict. The PCI/MSI parts are acked by Bjorn. I have two more branches on top of this. The full conversion of x86 to hierarchical domains and a partial conversion of arm/gic" * 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) genirq: Move irq_chip_write_msi_msg() helper to core PCI/MSI: Allow an msi_controller to be associated to an irq domain PCI/MSI: Provide mechanism to alloc/free MSI/MSIX interrupt from irqdomain PCI/MSI: Enhance core to support hierarchy irqdomain PCI/MSI: Move cached entry functions to irq core genirq: Provide default callbacks for msi_domain_ops genirq: Introduce msi_domain_alloc/free_irqs() asm-generic: Add msi.h genirq: Add generic msi irq domain support genirq: Introduce callback irq_chip.irq_write_msi_msg genirq: Work around __irq_set_handler vs stacked domains ordering issues irqdomain: Introduce helper function irq_domain_add_hierarchy() irqdomain: Implement a method to automatically call parent domains alloc/free genirq: Introduce helper irq_domain_set_info() to reduce duplicated code genirq: Split out flow handler typedefs into seperate header file genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip genirq: Introduce irq_chip.irq_compose_msi_msg() to support stacked irqchip genirq: Add more helper functions to support stacked irq_chip genirq: Introduce helper functions to support stacked irq_chip irqdomain: Do irq_find_mapping and set_type for hierarchy irqdomain in case OF ...
2014-12-09Merge tag 'drivers-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC driver updates from Arnd Bergmann: "These are changes for drivers that are intimately tied to some SoC and for some reason could not get merged through the respective subsystem maintainer tree. The largest single change here this time around is the Tegra iommu/memory controller driver, which gets updated to the new iommu DT binding. More drivers like this are likely to follow for the following merge window, but we should be able to do those through the iommu maintainer. Other notable changes are: - reset controller drivers from the reset maintainer (socfpga, sti, berlin) - fixes for the keystone navigator driver merged last time - at91 rtc driver changes related to the at91 cleanups - ARM perf driver changes from Will Deacon - updates for the brcmstb_gisb driver" * tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (53 commits) clocksource: arch_timer: Allow the device tree to specify uninitialized timer registers clocksource: arch_timer: Fix code to use physical timers when requested memory: Add NVIDIA Tegra memory controller support bus: brcmstb_gisb: Add register offset tables for older chips bus: brcmstb_gisb: Look up register offsets in a table bus: brcmstb_gisb: Introduce wrapper functions for MMIO accesses bus: brcmstb_gisb: Make the driver buildable on MIPS of: Add NVIDIA Tegra memory controller binding ARM: tegra: Move AHB Kconfig to drivers/amba amba: Add Kconfig file clk: tegra: Implement memory-controller clock serial: samsung: Fix serial config dependencies for exynos7 bus: brcmstb_gisb: resolve section mismatch ARM: common: edma: edma_pm_resume may be unused ARM: common: edma: add suspend resume hook powerpc/iommu: Rename iommu_[un]map_sg functions rtc: at91sam9: add DT bindings documentation rtc: at91sam9: use clk API instead of relying on AT91_SLOW_CLOCK ARM: at91: add clk_lookup entry for RTT devices rtc: at91sam9: rework the Kconfig description ...
2014-12-08Merge branch 'iov_iter' into for-nextAl Viro
2014-12-08powerpc/powernv: Return to cpu offline loop when finished in KVM guestPaul Mackerras
When a secondary hardware thread has finished running a KVM guest, we currently put that thread into nap mode using a nap instruction in the KVM code. This changes the code so that instead of doing a nap instruction directly, we instead cause the call to power7_nap() that put the thread into nap mode to return. The reason for doing this is to avoid having the KVM code having to know what low-power mode to put the thread into. In the case of a secondary thread used to run a KVM guest, the thread will be offline from the point of view of the host kernel, and the relevant power7_nap() call is the one in pnv_smp_cpu_disable(). In this case we don't want to clear pending IPIs in the offline loop in that function, since that might cause us to miss the wakeup for the next time the thread needs to run a guest. To tell whether or not to clear the interrupt, we use the SRR1 value returned from power7_nap(), and check if it indicates an external interrupt. We arrange that the return from power7_nap() when we have finished running a guest returns 0, so pending interrupts don't get flushed in that case. Note that it is important a secondary thread that has finished executing in the guest, or that didn't have a guest to run, should not return to power7_nap's caller while the kvm_hstate.hwthread_req flag in the PACA is non-zero, because the return from power7_nap will reenable the MMU, and the MMU might still be in guest context. In this situation we spin at low priority in real mode waiting for hwthread_req to become zero. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-05powerpc/mm: don't do tlbie for updatepp request with NO HPTE faultAneesh Kumar K.V
upatepp can get called for a nohpte fault when we find from the linux page table that the translation was hashed before. In that case we are sure that there is no existing translation, hence we could avoid doing tlbie. We could possibly race with a parallel fault filling the TLB. But that should be ok because updatepp is only ever relaxing permissions. We also look at linux pte permission bits when filling hash pte permission bits. We also hold the linux pte busy bits while inserting/updating a hashpte entry, hence a paralle update of linux pte is not possible. On the other hand mprotect involves ptep_modify_prot_start which cause a hpte invalidate and not updatepp. Performance number: We use randbox_access_bench written by Anton. Kernel with THP disabled and smaller hash page table size. 86.60% random_access_b [kernel.kallsyms] [k] .native_hpte_updatepp 2.10% random_access_b random_access_bench [.] doit 1.99% random_access_b [kernel.kallsyms] [k] .do_raw_spin_lock 1.85% random_access_b [kernel.kallsyms] [k] .native_hpte_insert 1.26% random_access_b [kernel.kallsyms] [k] .native_flush_hash_range 1.18% random_access_b [kernel.kallsyms] [k] .__delay 0.69% random_access_b [kernel.kallsyms] [k] .native_hpte_remove 0.37% random_access_b [kernel.kallsyms] [k] .clear_user_page 0.34% random_access_b [kernel.kallsyms] [k] .__hash_page_64K 0.32% random_access_b [kernel.kallsyms] [k] fast_exception_return 0.30% random_access_b [kernel.kallsyms] [k] .hash_page_mm With Fix: 27.54% random_access_b random_access_bench [.] doit 22.90% random_access_b [kernel.kallsyms] [k] .native_hpte_insert 5.76% random_access_b [kernel.kallsyms] [k] .native_hpte_remove 5.20% random_access_b [kernel.kallsyms] [k] fast_exception_return 5.12% random_access_b [kernel.kallsyms] [k] .__hash_page_64K 4.80% random_access_b [kernel.kallsyms] [k] .hash_page_mm 3.31% random_access_b [kernel.kallsyms] [k] data_access_common 1.84% random_access_b [kernel.kallsyms] [k] .trace_hardirqs_on_caller Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-02Merge remote-tracking branch 'benh/next' into nextMichael Ellerman
Merge updates collected & acked by Ben. A few EEH patches from Gavin, some mm updates from Aneesh and a few odds and ends.
2014-12-02powerpc/mm/thp: Use tlbiel if possibleAneesh Kumar K.V
If we know that user address space has never executed on other cpus we could use tlbiel. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02powerpc/powernv: Cleanup unused MCE definitions/declarations.Mahesh Salgaonkar
Cleanup OpalMCE_* definitions/declarations and other related code which is not used anymore. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Benjamin Herrrenschmidt <benh@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02powerpc/eeh: Dump PHB diag-data earlyGavin Shan
On PowerNV platform, PHB diag-data is dumped after stopping device drivers. In case of recursive EEH errors, the kernel is usually crashed before dumping PHB diag-data for the second EEH error. It's hard to locate the root cause of the second EEH error without PHB diag-data. The patch adds one more EEH option "eeh=early_log", which helps dumping PHB diag-data immediately once frozen PE is detected, in order to get the PHB diag-data for the second EEH error. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02powerpc/eeh: Set EEH_PE_RESET on PE resetGavin Shan
The patch introduces additional flag EEH_PE_RESET to indicate the corresponding PE is under reset. In turn, the PE retrieval bakcend on PowerNV platform can return unfrozen state for the EEH core to moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for the purpose. In PCI passthrou case, the problem is more worse: Guest doesn't recover 6th EEH error. The PE is left in isolated (frozen) and config blocked state on Broadcom adapters. We can't retrieve the PE's state correctly any more, even from the host side via sysfs /sys/bus/pci/devices/xxx/eeh_pe_state. Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-11-30hwmon: (ibmpowernv) Use platform 'id_table' to probe the deviceNeelesh Gupta
The current driver probe() function assumes the sensor device to be always present and gets executed every time if the driver is loaded, but the appropriate hardware could not be present. So, move the platform device creation as part of platform init code and use the 'id_table' to check if the device is present or not. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2014-11-27Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc fixes from Michael Ellerman: "Here are five fixes for you to pull please. They're all CC'ed to stable except the "Fix PE state format" one which went in this release" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc: 32 bit getcpu VDSO function uses 64 bit instructions powerpc/powernv: Replace OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE powerpc/eeh: Fix PE state format powerpc/pseries: Fix endiannes issue in RTAS call from xmon powerpc/powernv: Fix the hmi event version check.
2014-11-27powerpc/powernv: Replace OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATEGavin Shan
The flag passed to ioda_eeh_phb_reset() should be EEH_RESET_DEACTIVATE, which is translated to OPAL_DEASSERT_RESET or something else by the EEH backend accordingly. The patch replaces OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE for ioda_eeh_phb_reset(). Cc: stable@vger.kernel.org Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-27powerpc/powernv: Fix the hmi event version check.Mahesh Salgaonkar
The current HMI event structure is an ABI and carries a version field to accommodate future changes without affecting/rearranging current structure members that are valid for previous versions. The current version check "if (hmi_evt->version != OpalHMIEvt_V1)" doesn't accomodate the fact that the version number may change in future. If firmware starts returning an HMI event with version > 1, this check will fail and no HMI information will be printed on older kernels. This patch fixes this issue. Cc: stable@vger.kernel.org # 3.17+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Reword changelog] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-24of/reconfig: Always use the same structure for notifiersGrant Likely
The OF_RECONFIG notifier callback uses a different structure depending on whether it is a node change or a property change. This is silly, and not very safe. Rework the code to use the same data structure regardless of the type of notifier. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Cc: <linuxppc-dev@lists.ozlabs.org>
2014-11-24powerpc/pseries: Honor the generic "no_64bit_msi" flagBenjamin Herrenschmidt
Instead of the arch specific quirk which we are deprecating Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org>
2014-11-24powerpc/powernv: Honor the generic "no_64bit_msi" flagBenjamin Herrenschmidt
Instead of the arch specific quirk which we are deprecating and that drivers don't understand. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org>
2014-11-23PCI/MSI: Rename mask/unmask_msi_irq treewideThomas Gleixner
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage sites. The conversion helper functions are kept around to avoid conflicts in next and will be removed after merging into mainline. Coccinelle assisted conversion. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: x86@kernel.org Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Mohit Kumar <mohit.kumar@st.com> Cc: Simon Horman <horms@verge.net.au> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Yijing Wang <wangyijing@huawei.com>
2014-11-23PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg()Jiang Liu
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI specific. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-23PCI/MSI: Rename __read_msi_msg() to __pci_read_msi_msg()Jiang Liu
Rename __read_msi_msg() to __pci_read_msi_msg() and kill unused read_msi_msg(). It's a preparation to separate generic MSI code from PCI core. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-20Merge Linus' tree to be be to apply submitted patches to newer code thanJiri Kosina
current trivial.git base
2014-11-19powerpc: Remove more traces of bootmemMichael Ellerman
Although we are now selecting NO_BOOTMEM, we still have some traces of bootmem lying around. That is because even with NO_BOOTMEM there is still a shim that converts bootmem calls into memblock calls, but ultimately we want to remove all traces of bootmem. Most of the patch is conversions from alloc_bootmem() to memblock_virt_alloc(). In general a call such as: p = (struct foo *)alloc_bootmem(x); Becomes: p = memblock_virt_alloc(x, 0); We don't need the cast because memblock_virt_alloc() returns a void *. The alignment value of zero tells memblock to use the default alignment, which is SMP_CACHE_BYTES, the same value alloc_bootmem() uses. We remove a number of NULL checks on the result of memblock_virt_alloc(). That is because memblock_virt_alloc() will panic if it can't allocate, in exactly the same way as alloc_bootmem(), so the NULL checks are and always have been redundant. The memory returned by memblock_virt_alloc() is already zeroed, so we remove several memsets of the result of memblock_virt_alloc(). Finally we convert a few uses of __alloc_bootmem(x, y, MAX_DMA_ADDRESS) to just plain memblock_virt_alloc(). We don't use memblock_alloc_base() because MAX_DMA_ADDRESS is ~0ul on powerpc, so limiting the allocation to that is pointless, 16XB ought to be enough for anyone. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-19powerpc/pseries: Initialise nvram_pstore_info's buf_lockLi Zhong
nvram_pstore_info's buf_lock is not initialized before registering, which is clearly incorrect. It causes some strange behavior when trying to obtain the lock during kdump process. On a UP configuration, the console stopped for a couple of seconds, then "lockup suspected" warning printed out, but then it continued to run. So try lock fails, and lockup reported, but then arch_spin_lock() passes. Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> [mpe: Edited changelog] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-18powerpc/iommu: Rename iommu_[un]map_sg functionsJoerg Roedel
The IOMMU-API gained support for a new iommu_map_sg function. This causes compile failures on powerpc because the function name is already globally used there. This patch renames adds a ppc_ prefix to these functions to solve the compile problem. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-11-18Merge remote-tracking branch 'scottwood/next' into nextMichael Ellerman
Scott says: "Highlights include a bunch of 8xx optimizations, device tree bindings for Freescale BMan, QMan, and FMan datapath components, misc device tree updates, and inbound rio window support."
2014-11-17rtc/tpo: Driver to support rtc and wakeup on PowerNV platformNeelesh Gupta
The patch implements the OPAL rtc driver that binds with the rtc driver subsystem. The driver uses the platform device infrastructure to probe the rtc device and register it to rtc class framework. The 'wakeup' is supported depending upon the property 'has-tpo' present in the OF node. It provides a way to load the generic rtc driver in in the absence of an OPAL driver. The patch also moves the existing OPAL rtc get/set time interfaces to the new driver and exposes the necessary OPAL calls using EXPORT_SYMBOL_GPL. Test results: ------------- Host: [root@tul169p1 ~]# ls -l /sys/class/rtc/ total 0 lrwxrwxrwx 1 root root 0 Oct 14 03:07 rtc0 -> ../../devices/opal-rtc/rtc/rtc0 [root@tul169p1 ~]# cat /sys/devices/opal-rtc/rtc/rtc0/time 08:10:07 [root@tul169p1 ~]# echo `date '+%s' -d '+ 2 minutes'` > /sys/class/rtc/rtc0/wakealarm [root@tul169p1 ~]# cat /sys/class/rtc/rtc0/wakealarm 1413274345 [root@tul169p1 ~]# FSP: $ smgr mfgState standby $ rtim timeofday System time is valid: 2014/10/14 08:12:04.225115 $ smgr mfgState ipling $ CC: devicetree@vger.kernel.org CC: tglx@linutronix.de CC: rtc-linux@googlegroups.com CC: a.zummo@towertech.it Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Fix potential zero devisorGavin Shan
If there're no PHBs under P5IOC2 HUB device tree node, we should bail early to avoid zero devisor and allocating TCE tables. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Bail upon invalid master PEGavin Shan
When freezing compound PEs in pnv_ioda_freeze_pe(), we should bail upon illegal master PE. We needn't freeze slave PE because it should have been put into frozen state by hardware. Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Simplify pnv_ioda_configure_pe()Gavin Shan
Nested if statements are always bad and the patch avoids one by checking PHB type and bail in advance if necessary. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Set PELTV for compound PEsGavin Shan
Commit 262af55 ("powerpc/powernv: Enable M64 aperatus for PHB3") introduced compound PEs in order to support M64 aperatus on PHB3. However, we never configured PELTV for compound PEs. The patch fixes that by: parent PE can freeze all child compound PEs. Any compound PE affects the group. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Initialize M64 PE in timeGavin Shan
The patch initializes PE instance when reserving PE number to keep consistent things as we did before. Also, it replaces the iteration on bridge's windows with the prefered way. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Rename alloc_m64_pe() to reserve_m64_pe()Gavin Shan
The patch renames alloc_m64_pe() to reserve_m64_pe() to reflect its real usage: We reserve PE numbers for M64 segments in advance and then pick up the reserved PE numbers when building the mapping between PE numbers and M64 segments. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14powerpc/powernv: Fix condition to remove M64Gavin Shan
The M64 resource should be removed if we don't have hook to initialize it, or (not and) fail to do that. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>