aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/irqchip/irq-gic-v3-its.c
AgeCommit message (Collapse)Author
2024-05-14Merge tag 'irq-core-2024-05-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull interrupt subsystem updates from Thomas Gleixner: "Core code: - Interrupt storm detection for the lockup watchdog: Lockups which are caused by interrupt storms are not easy to debug because there is no information about the events which make the lockup detector trigger. To make this more user friendly, provide an extenstion to interrupt statistics which allows to take snapshots and an interface to retrieve the delta to the snapshot. Use this new mechanism in the watchdog code to do a two stage lockup analysis by taking the snapshot and printing the deltas for the topmost active interrupts on the second trigger. Note: This contains both the interrupt and the watchdog changes as the latter depend on the former obviously. - Avoid summation loops in the /proc/interrupts output and use the global counter when possible - Skip suspended interrupts on CPU hotplug operations to ensure that they are not delivered before the system resumes the device drivers when coming out of suspend. - On CPU hot-unplug interrupts which are affine to the outgoing CPU are migrated to a different CPU in the affinity mask. This can fail when the CPUs have no vectors left. Instead of giving up try to migrate it to any online CPU and thereby breaking the affinity setting in order to prevent a stale device interrupt which targets an offline CPU - The usual small cleanups Driver code: - Support for the RISCV AIA MSI controller - Make the interrupt allocation for the Loongson PCH controller more flexible to prevent vector exhaustion - The usual set of cleanups and fixes all over the place" * tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_alloc cpuidle: Avoid explicit cpumask allocation on stack irqchip/sifive-plic: Avoid explicit cpumask allocation on stack irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack cpumask: Introduce cpumask_first_and_and() irqchip/irq-brcmstb-l2: Avoid saving mask on shutdown genirq: Reuse irq_is_nmi() genirq/cpuhotplug: Retry with cpu_online_mask when migration fails genirq/cpuhotplug: Skip suspended interrupts when restoring affinity arm64: dts: st: Add interrupt parent to pinctrl on stm32mp251 arm64: dts: st: Add exti1 and exti2 nodes on stm32mp251 ARM: dts: stm32: List exti parent interrupts on stm32mp131 ARM: dts: stm32: List exti parent interrupts on stm32mp151 arm64: Kconfig.platforms: Enable STM32_EXTI for ARCH_STM32 irqchip/stm32-exti: Mark events reserved with RIF configuration check irqchip/stm32-exti: Skip secure events irqchip/stm32-exti: Convert driver to standard PM ...
2024-04-25irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_allocGuanrui Huang
This BUG_ON() is useless, because the same effect will be obtained by letting the code run its course and vm being dereferenced, triggering an exception. So just remove this check. Signed-off-by: Guanrui Huang <guanrui.huang@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240418061053.96803-3-guanrui.huang@linux.alibaba.com
2024-04-25irqchip/gic-v3-its: Prevent double free on errorGuanrui Huang
The error handling path in its_vpe_irq_domain_alloc() causes a double free when its_vpe_init() fails after successfully allocating at least one interrupt. This happens because its_vpe_irq_domain_free() frees the interrupts along with the area bitmap and the vprop_page and its_vpe_irq_domain_alloc() subsequently frees the area bitmap and the vprop_page again. Fix this by unconditionally invoking its_vpe_irq_domain_free() which handles all cases correctly and by removing the bitmap/vprop_page freeing from its_vpe_irq_domain_alloc(). [ tglx: Massaged change log ] Fixes: 7d75bbb4bc1a ("irqchip/gic-v3-its: Add VPE irq domain allocation/teardown") Signed-off-by: Guanrui Huang <guanrui.huang@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240418061053.96803-2-guanrui.huang@linux.alibaba.com
2024-04-24irqchip/gic-v3-its: Avoid explicit cpumask allocation on stackDawei Li
In general it's preferable to avoid placing cpumasks on the stack, as for large values of NR_CPUS these can consume significant amounts of stack space and make stack overflows more likely. Remove cpumask var on stack and use cpumask_any_and() to address it. Signed-off-by: Dawei Li <dawei.li@shingroup.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240416085454.3547175-4-dawei.li@shingroup.cn
2024-04-09irqchip/gic-v3-its: Fix VSYNC referencing an unmapped VPE on GIC v4.1Nianyao Tang
As per the GICv4.1 spec (Arm IHI 0069H, 5.3.19): "A VMAPP with {V, Alloc}=={0, x} is self-synchronizing, This means the ITS command queue does not show the command as consumed until all of its effects are completed." Furthermore, VSYNC is allowed to deliver an SError when referencing a non existent VPE. By these definitions, a VMAPP followed by a VSYNC is a bug, as the later references a VPE that has been unmapped by the former. Fix it by eliding the VSYNC in this scenario. Fixes: 64edfaa9a234 ("irqchip/gic-v4.1: Implement the v4.1 flavour of VMAPP") Signed-off-by: Nianyao Tang <tangnianyao@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20240406022737.3898763-1-tangnianyao@huawei.com
2024-03-11Merge tag 'irq-core-2024-03-10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Core: - Make affinity changes take effect immediately for interrupt threads. This reduces the impact on isolated CPUs as it pulls over the thread right away instead of doing it after the next hardware interrupt arrived. - Cleanup and improvements for the interrupt chip simulator - Deduplication of the interrupt descriptor initialization code so the sparse and non-sparse mode share more code. Drivers: - A set of conversions to platform_drivers::remove_new() which gets rid of the pointless return value. - A new driver for the Starfive JH8100 SoC - Support for Amlogic-T7 SoCs - Improvement for the interrupt handling and EOI management for the loongson interrupt controller. - The usual fixes and improvements all over the place" * tag 'irq-core-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) irqchip/ts4800: Convert to platform_driver::remove_new() callback irqchip/stm32-exti: Convert to platform_driver::remove_new() callback irqchip/renesas-rza1: Convert to platform_driver::remove_new() callback irqchip/renesas-irqc: Convert to platform_driver::remove_new() callback irqchip/renesas-intc-irqpin: Convert to platform_driver::remove_new() callback irqchip/pruss-intc: Convert to platform_driver::remove_new() callback irqchip/mvebu-pic: Convert to platform_driver::remove_new() callback irqchip/madera: Convert to platform_driver::remove_new() callback irqchip/ls-scfg-msi: Convert to platform_driver::remove_new() callback irqchip/keystone: Convert to platform_driver::remove_new() callback irqchip/imx-irqsteer: Convert to platform_driver::remove_new() callback irqchip/imx-intmux: Convert to platform_driver::remove_new() callback irqchip/imgpdc: Convert to platform_driver::remove_new() callback irqchip: Add StarFive external interrupt controller dt-bindings: interrupt-controller: Add starfive,jh8100-intc arm64: dts: Add gpio_intc node for Amlogic-T7 SoCs irqchip/meson-gpio: Add support for Amlogic-T7 SoCs dt-bindings: interrupt-controller: Add support for Amlogic-T7 SoCs irqchip/vic: Fix a kernel-doc warning genirq: Wake interrupt threads immediately when changing affinity ...
2024-02-21irqchip/gic-v3-its: Do not assume vPE tables are preallocatedOliver Upton
The GIC/ITS code is designed to ensure to pick up any preallocated LPI tables on the redistributors, as enabling LPIs is a one-way switch. There is no such restriction for vLPIs, and for GICv4.1 it is expected to allocate a new vPE table at boot. This works as intended when initializing an ITS, however when setting up a redistributor in cpu_init_lpis() the early return for preallocated RD tables skips straight past the GICv4 setup. This all comes to a head when trying to kexec() into a new kernel, as the new kernel silently fails to set up GICv4, leading to a complete loss of SGIs and LPIs for KVM VMs. Slap a band-aid on the problem by ensuring its_cpu_init_lpis() always initializes GICv4 on the way out, even if the other RD tables were preallocated. Fixes: 6479450f72c1 ("irqchip/gic-v4: Fix occasional VLPI drop") Reported-by: George Cherian <gcherian@marvell.com> Co-developed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240219185809.286724-2-oliver.upton@linux.dev
2024-02-13irqchip/gic-v3-its: Fix GICv4.1 VPE affinity updateMarc Zyngier
When updating the affinity of a VPE, the VMOVP command is currently skipped if the two CPUs are part of the same VPE affinity. But this is wrong, as the doorbell corresponding to this VPE is still delivered on the 'old' CPU, which screws up the balancing. Furthermore, offlining that 'old' CPU results in doorbell interrupts generated for this VPE being discarded. The harsh reality is that VMOVP cannot be elided when a set_affinity() request occurs. It needs to be obeyed, and if an optimisation is to be made, it is at the point where the affinity change request is made (such as in KVM). Drop the VMOVP elision altogether, and only use the vpe_table_mask to try and stay within the same ITS affinity group if at all possible. Fixes: dd3f050a216e (irqchip/gic-v4.1: Implement the v4.1 flavour of VMOVP) Reported-by: Kunkun Jiang <jiangkunkun@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240213101206.2137483-4-maz@kernel.org
2024-02-13irqchip/gic-v3-its: Restore quirk probing for ACPI-based systemsMarc Zyngier
While refactoring the way the ITSs are probed, the handling of quirks applicable to ACPI-based platforms was lost. As a result, systems such as HIP07 lose their GICv4 functionnality, and some other may even fail to boot, unless they are configured to boot with DT. Move the enabling of quirks into its_probe_one(), making it common to all firmware implementations. Fixes: 9585a495ac93 ("irqchip/gic-v3-its: Split allocation from initialisation of its_node") Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240213101206.2137483-3-maz@kernel.org
2024-02-13irqchip/gic-v3-its: Handle non-coherent GICv4 redistributorsMarc Zyngier
Although the GICv3 code base has gained some handling of systems failing to handle the shareability attributes, the GICv4 side of things has been firmly ignored. This is unfortunate, as the new recent addition of the "dma-noncoherent" is supposed to apply to all of the GICR tables, and not just the ones that are common to v3 and v4. Add some checks to handle the VPROPBASE/VPENDBASE shareability and cacheability attributes in the same way we deal with the other GICR_BASE registers, wrapping the flag check in a helper for improved readability. Note that this has been found by inspection only, as I don't have access to HW that suffers from this particular issue. Fixes: 3a0fff0fb6a3 ("irqchip/gic-v3: Enable non-coherent redistributors/ITSes DT probing") Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/20240213101206.2137483-2-maz@kernel.org
2024-02-13irqchip/gic-v3-its: Remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET
ida_alloc() and ida_free() should be used instead of the deprecated ida_simple_get() and ida_simple_remove(). The upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_max() is inclusive. Adjust the code accordingly. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/3b472b0e7edf6e483b8b255cf8d1cb0163532adf.1705222332.git.christophe.jaillet@wanadoo.fr
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-06irqchip/gic-v3-its: Flush ITS tables correctly in non-coherent GIC designsFang Xiang
In non-coherent GIC designs, the ITS tables must be flushed before writing to the GITS_BASER<n> registers, otherwise the ITS could read dirty tables, which results in unpredictable behavior. Flush the tables right at the begin of its_setup_baser() to prevent that. [ tglx: Massage changelog ] Fixes: a8707f553884 ("irqchip/gic-v3: Add Rockchip 3588001 erratum workaround") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fang Xiang <fangxiang3@xiaomi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231030083256.4345-1-fangxiang3@xiaomi.com
2023-10-25irqchip/gic-v3-its: Don't override quirk settings with default valuesMarc Zyngier
When splitting the allocation of the ITS node from its configuration, some of the default settings were kept in the latter instead of being moved to the former. This has the side effect of negating some of the quirk detections that have happened in between, amongst which the dreaded Synquacer hack (that also affect Dominic's TI platform). Move the initialisation of these fields early, so that they can again be overriden by the Synquacer quirk. Fixes: 9585a495ac93 ("irqchip/gic-v3-its: Split allocation from initialisation of its_node") Reported by: Dominic Rath <dominic.rath@ibv-augsburg.net> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Dominic Rath <dominic.rath@ibv-augsburg.net> Link: https://lore.kernel.org/r/20231024084831.GA3788@JADEVM-DRA Link: https://lore.kernel.org/r/20231024143431.2144579-1-maz@kernel.org
2023-10-07irqchip/gic-v3: Enable non-coherent redistributors/ITSes DT probingLorenzo Pieralisi
The GIC architecture specification defines a set of registers for redistributors and ITSes that control the sharebility and cacheability attributes of redistributors/ITSes initiator ports on the interconnect (GICR_[V]PROPBASER, GICR_[V]PENDBASER, GITS_BASER<n>). Architecturally the GIC provides a means to drive shareability and cacheability attributes signals and related IWB/OWB/ISH barriers but it is not mandatory for designs to wire up the corresponding interconnect signals that control the cacheability/shareability of transactions. Redistributors and ITSes interconnect ports can be connected to non-coherent interconnects that are not able to manage the shareability/cacheability attributes; this implicitly makes the redistributors and ITSes non-coherent observers. So far, the GIC driver on probe executes a write to "probe" for the redistributors and ITSes registers shareability bitfields by writing a value (ie InnerShareable - the shareability domain the CPUs are in) and check it back to detect whether the value sticks or not; this hinges on a GIC programming model behaviour that predates the current specifications, that just define shareability bits as writeable but do not guarantee that writing certain shareability values enable the expected behaviour for the redistributors/ITSes memory interconnect ports. To enable non-coherent GIC designs, introduce the "dma-noncoherent" device tree property to allow firmware to describe redistributors and ITSes as non-coherent observers on the memory interconnect and use the property to force the shareability attributes to be programmed into the redistributors and ITSes registers through the GIC quirks mechanism. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231006125929.48591-3-lpieralisi@kernel.org
2023-10-07irqchip/gic-v3-its: Split allocation from initialisation of its_nodeMarc Zyngier
In order to pave the way for more fancy quirk handling without making more of a mess of this terrible driver, split the allocation of the ITS descriptor (its_node) from the actual probing. This will allow firmware-specific hooks to be added between these two points. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20231006125929.48591-4-lpieralisi@kernel.org
2023-07-03irqchip/gic-v3: Enable Rockchip 3588001 erratum workaround for RK3588SSebastian Reichel
Commit a8707f553884 ("irqchip/gic-v3: Add Rockchip 3588001 erratum workaround") mentioned RK3588S (the slimmed down variant of RK3588) being affected, but did not check for its compatible value. Thus the quirk is not applied on RK3588S. Since the GIC ITS node got added to the upstream DT, boards using RK3588S are no longer booting without this quirk being applied. Fixes: 06cdac8e8407 ("arm64: dts: rockchip: add GIC ITS support to rk3588") Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230703164129.193991-1-sebastian.reichel@collabora.com
2023-07-03irqchip/gic-v4.1: Properly lock VPEs when doing a directLPI invalidationMarc Zyngier
We normally rely on the irq_to_cpuid_[un]lock() primitives to make sure nothing will change col->idx while performing a LPI invalidation. However, these primitives do not cover VPE doorbells, and we have some open-coded locking for that. Unfortunately, this locking is pretty bogus. Instead, extend the above primitives to cover VPE doorbells and convert the whole thing to it. Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access") Reported-by: Kunkun Jiang <jiangkunkun@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: wanghaibin.wang@huawei.com Tested-by: Kunkun Jiang <jiangkunkun@huawei.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20230617073242.3199746-1-maz@kernel.org
2023-06-16irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIsJames Gowans
GICv3 LPIs are impacted by an architectural design issue: they do not have a global active state and as such a given LPI can be delivered to a new CPU after an affinity change while the previous instance of the same LPI handler has not yet completed on the original CPU. If LPIs had an active state, this second LPI would not be delivered until the first CPU deactivated the initial LPI, just like SPIs. To solve this issue, use the newly introduced IRQD_RESEND_WHEN_IN_PROGRESS flag, ensuring that we do not lose an LPI being delivered during that window by getting the GIC to resend it. This workaround gets enabled for all LPIs, including the VPE doorbells. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Gowans <jgowans@amazon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: KarimAllah Raslan <karahmed@amazon.com> Cc: Yipeng Zou <zouyipeng@huawei.com> Cc: Zhang Jianhua <chris.zjh@huawei.com> [maz: massaged commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230608120021.3273400-4-jgowans@amazon.com
2023-04-27Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
2023-04-18irqchip/gic-v3: Add Rockchip 3588001 erratum workaroundSebastian Reichel
Rockchip RK3588/RK3588s GIC600 integration does not support the sharability feature. Rockchip assigned Erratum ID #3588001 for this issue. Note, that the 0x0201743b ID is not Rockchip specific and thus there is an extra of_machine_is_compatible() check. The flags are named FORCE_NON_SHAREABLE to be vendor agnostic, since apparently similar integration design errors exist in other platforms and they can reuse the same flag. Co-developed-by: XiaoDong Huang <derrick.huang@rock-chips.com> Signed-off-by: XiaoDong Huang <derrick.huang@rock-chips.com> Co-developed-by: Kever Yang <kever.yang@rock-chips.com> Signed-off-by: Kever Yang <kever.yang@rock-chips.com> Co-developed-by: Lucas Tanure <lucas.tanure@collabora.com> Signed-off-by: Lucas Tanure <lucas.tanure@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230418142109.49762-2-sebastian.reichel@collabora.com
2023-04-05mm, treewide: redefine MAX_ORDER sanelyKirill A. Shutemov
MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. [kirill@shutemov.name: fix min() warning] Link: https://lkml.kernel.org/r/20230315153800.32wib3n5rickolvh@box [akpm@linux-foundation.org: fix another min_t warning] [kirill@shutemov.name: fixups per Zi Yan] Link: https://lkml.kernel.org/r/20230316232144.b7ic4cif4kjiabws@box.shutemov.name [akpm@linux-foundation.org: fix underlining in docs] Link: https://lore.kernel.org/oe-kbuild-all/202303191025.VRCTk6mP-lkp@intel.com/ Link: https://lkml.kernel.org/r/20230315113133.11326-11-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-24Merge tag 'for-linus-iommufd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "Some polishing and small fixes for iommufd: - Remove IOMMU_CAP_INTR_REMAP, instead rely on the interrupt subsystem - Use GFP_KERNEL_ACCOUNT inside the iommu_domains - Support VFIO_NOIOMMU mode with iommufd - Various typos - A list corruption bug if HWPTs are used for attach" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Do not add the same hwpt to the ioas->hwpt_list twice iommufd: Make sure to zero vfio_iommu_type1_info before copying to user vfio: Support VFIO_NOIOMMU with iommufd iommufd: Add three missing structures in ucmd_buffer selftests: iommu: Fix test_cmd_destroy_access() call in user_copy iommu: Remove IOMMU_CAP_INTR_REMAP irq/s390: Add arch_is_isolated_msi() for s390 iommu/x86: Replace IOMMU_CAP_INTR_REMAP with IRQ_DOMAIN_FLAG_ISOLATED_MSI genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSI genirq/irqdomain: Remove unused irq_domain_check_msi_remap() code iommufd: Convert to msi_device_has_isolated_msi() vfio/type1: Convert to iommu_group_has_isolated_msi() iommu: Add iommu_group_has_isolated_msi() genirq/msi: Add msi_device_has_isolated_msi()
2023-02-13irqchip/gic-v3-its: Use irq_domain_create_hierarchy()Johan Hovold
Use the irq_domain_create_hierarchy() helper to create the hierarchical domain, which both serves as documentation and avoids poking at irqdomain internals. Note that the domain host_data was first set to the struct its_node during allocation only to immediately be overwritten with the struct msi_domain_info. Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-17-johan+linaro@kernel.org
2023-01-11genirq/msi: Rename IRQ_DOMAIN_MSI_REMAP to IRQ_DOMAIN_ISOLATED_MSIJason Gunthorpe
What x86 calls "interrupt remapping" is one way to achieve isolated MSI, make it clear this is talking about isolated MSI, no matter how it is achieved. This matches the new driver facing API name of msi_device_has_isolated_msi() No functional change. Link: https://lore.kernel.org/r/6-v3-3313bb5dd3a3+10f11-secure_msi_jgg@nvidia.com Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-10Merge tag 'iommu-updates-v6.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - remove the bus_set_iommu() interface which became unnecesary because of IOMMU per-device probing - make the dma-iommu.h header private - Intel VT-d changes from Lu Baolu: - Decouple PASID and PRI from SVA - Add ESRTPS & ESIRTPS capability check - Cleanups - Apple DART support for the M1 Pro/MAX SOCs - support for AMD IOMMUv2 page-tables for the DMA-API layer. The v2 page-tables are compatible with the x86 CPU page-tables. Using them for DMA-API prepares support for hardware-assisted IOMMU virtualization - support for MT6795 Helio X10 M4Us in the Mediatek IOMMU driver - some smaller fixes and cleanups * tag 'iommu-updates-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits) iommu/vt-d: Avoid unnecessary global DMA cache invalidation iommu/vt-d: Avoid unnecessary global IRTE cache invalidation iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_support iommu/vt-d: Remove pasid_set_eafe() iommu/vt-d: Decouple PASID & PRI enabling from SVA iommu/vt-d: Remove unnecessary SVA data accesses in page fault path dt-bindings: iommu: arm,smmu-v3: Relax order of interrupt names iommu: dart: Support t6000 variant iommu/io-pgtable-dart: Add DART PTE support for t6000 iommu/io-pgtable: Add DART subpage protection support iommu/io-pgtable: Move Apple DART support to its own file iommu/mediatek: Add support for MT6795 Helio X10 M4Us iommu/mediatek: Introduce new flag TF_PORT_TO_ADDR_MT8173 dt-bindings: mediatek: Add bindings for MT6795 M4U iommu/iova: Fix module config properly iommu/amd: Fix sparse warning iommu/amd: Remove outdated comment iommu/amd: Free domain ID after domain_flush_pages iommu/amd: Free domain id in error path iommu/virtio: Fix compile error with viommu_capable() ...
2022-09-12irqchip/gic-v3-its: Remove cpumask_var_t allocationPierre Gondois
Running a PREEMPT_RT kernel based on v5.19-rc3-rt4 on an Ampere Altra triggers: [ 22.616229] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 [ 22.616239] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 1884, name: kworker/80:1 [ 22.616243] preempt_count: 3, expected: 0 [ 22.616244] RCU nest depth: 0, expected: 0 [...] [ 22.616250] hardirqs last enabled at (33): _raw_spin_unlock_irq (/home/piegon01/linux/./arch/arm64/include/asm/irqflags.h:35) [ 22.616273] hardirqs last disabled at (34): __schedule (/home/piegon01/linux/kernel/sched/core.c:6432 (discriminator 1)) [ 22.616283] softirqs last enabled at (0): copy_process (/home/piegon01/linux/./include/linux/lockdep.h:191) [ 22.616297] softirqs last disabled at (0): 0x0 [ 22.616305] Preemption disabled at: [ 22.616307] __setup_irq (/home/piegon01/linux/kernel/irq/manage.c:1612) [ 22.616322] CPU: 80 PID: 1884 Comm: kworker/80:1 Tainted: G W [...] [ 22.616328] Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18 [ 22.616333] Workqueue: events work_for_cpu_fn [ 22.616344] Call trace: [...] [ 22.616403] alloc_cpumask_var_node (/home/piegon01/linux/lib/cpumask.c:115) [ 22.616414] alloc_cpumask_var (/home/piegon01/linux/lib/cpumask.c:147) [ 22.616417] its_select_cpu (/home/piegon01/linux/drivers/irqchip/irq-gic-v3-its.c:1580) [ 22.616428] its_set_affinity (/home/piegon01/linux/drivers/irqchip/irq-gic-v3-its.c:1659) [ 22.616431] msi_domain_set_affinity (/home/piegon01/linux/kernel/irq/msi.c:501) [ 22.616440] irq_do_set_affinity (/home/piegon01/linux/kernel/irq/manage.c:276) [ 22.616443] irq_setup_affinity (/home/piegon01/linux/kernel/irq/manage.c:633) [ 22.616447] irq_startup (/home/piegon01/linux/kernel/irq/chip.c:280) [ 22.616453] __setup_irq (/home/piegon01/linux/kernel/irq/manage.c:1777) Follow the pattern established in commit cba4235e6031e ("genirq: Remove mask argument from setup_affinity()") and co to overcome this issue by defining a static struct cpumask and protecting it by a raw spinlock. Since its_select_cpu() can be executed with IRQs enabled or disabled, enforce that the cpumask computation is done with interrupts disabled. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220912141857.1391343-1-pierre.gondois@arm.com
2022-09-07iommu/dma: Move public interfaces to linux/iommu.hRobin Murphy
The iommu-dma layer is now mostly encapsulated by iommu_dma_ops, with only a couple more public interfaces left pertaining to MSI integration. Since these depend on the main IOMMU API header anyway, move their declarations there, taking the opportunity to update the half-baked comments to proper kerneldoc along the way. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/9cd99738f52094e6bed44bfee03fa4f288d20695.1660668998.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-20Merge tag 'irqchip-5.19' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Add new infrastructure to stop gpiolib from rewriting irq_chip structures behind our back. Convert a few of them, but this will obviously be a long effort. - A bunch of GICv3 improvements, such as using MMIO-based invalidations when possible, and reducing the amount of polling we perform when reconfiguring interrupts. - Another set of GICv3 improvements for the Pseudo-NMI functionality, with a nice cleanup making it easy to reason about the various states we can be in when an NMI fires. - The usual bunch of misc fixes and minor improvements. Link: https://lore.kernel.org/all/20220519165308.998315-1-maz@kernel.org
2022-04-10irqchip/gic-v3: Always trust the managed affinity provided by the core codeMarc Zyngier
Now that the core code has been fixed to always give us an affinity that only includes online CPUs, directly use this affinity when computing a target CPU. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20220405185040.206297-4-maz@kernel.org
2022-04-05irqchip/gic-v4: Wait for GICR_VPENDBASER.Dirty to clear before deschedulingMarc Zyngier
The way KVM drives GICv4.{0,1} is as follows: - vcpu_load() makes the VPE resident, instructing the RD to start scanning for interrupts - just before entering the guest, we check that the RD has finished scanning and that we can start running the vcpu - on preemption, we deschedule the VPE by making it invalid on the RD However, we are preemptible between the first two steps. If it so happens *and* that the RD was still scanning, we nonetheless write to the GICR_VPENDBASER register while Dirty is set, and bad things happen (we're in UNPRED land). This affects both the 4.0 and 4.1 implementations. Make sure Dirty is cleared before performing the deschedule, meaning that its_clear_vpend_valid() becomes a sort of full VPE residency barrier. Reported-by: Jingyi Wang <wangjingyi11@huawei.com> Tested-by: Nianyao Tang <tangnianyao@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: 57e3cebd022f ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit") Link: https://lore.kernel.org/r/4aae10ba-b39a-5f84-754b-69c2eb0a2c03@huawei.com
2022-02-02irqchip/gic-v3-its: Skip HP notifier when no ITS is registeredMarc Zyngier
We have some systems out there that have both LPI support and an ITS, but that don't expose the ITS in their firmware tables (either because it is broken or because they run under a hypervisor that hides it...). Is such a configuration, we still register the HP notifier to free the allocated tables if needed, resulting in a warning as there is no memory to free (nothing was allocated the first place). Fix it by keying the HP notifier on the presence of at least one sucessfully probed ITS. Fixes: d23bc2bc1d63 ("irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve") Reported-by: Steev Klimaszewski <steev@kali.org> Tested-by: Steev Klimaszewski <steev@kali.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20220202103454.2480465-1-maz@kernel.org
2022-01-29Merge tag 'irqchip-fixes-5.17-1' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Drop an unused private data field in the AIC driver - Various fixes to the realtek-rtl driver - Make the GICv3 ITS driver compile again in !SMP configurations - Force reset of the GICv3 ITSs at probe time to avoid issues during kexec - Yet another kfree/bitmap_free conversion - Various DT updates (Renesas, SiFive) Link: https://lore.kernel.org/r/20220128174217.517041-1-maz@kernel.org
2022-01-26irqchip/gic-v3-its: Reset each ITS's BASERn register before probeMarc Zyngier
A recent bug report outlined that the way GICv4.1 is handled across kexec is pretty bad. We can end-up in a situation where ITSs share memory (this is the case when SVPET==1) and reprogram the base registers, creating a situation where ITSs that are part of a given affinity group see different pointers. Which is illegal. Boo. In order to restore some sanity, reset the BASERn registers to 0 *before* probing any ITS. Although this isn't optimised at all, this is only a once-per-boot cost, which shouldn't show up on anyone's radar. Cc: Jay Chen <jkchen@linux.alibaba.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Link: https://lore.kernel.org/r/20211216190315.GA14220@lpieralisi Link: https://lore.kernel.org/r/20220124133809.1291195-1-maz@kernel.org
2022-01-22irqchip/gic-v3-its: Fix build for !SMPArd Biesheuvel
Commit 835f442fdbce ("irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime") added a reference to cpus_booted_once_mask, which does not exist on !SMP builds, breaking the build for such configurations. Given the intent of the check, short circuit it to always pass. Cc: Valentin Schneider <valentin.schneider@arm.com> Fixes: 835f442fdbce ("irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220122151614.133766-1-ardb@kernel.org
2022-01-10Merge tag 'irqchip-5.17' of ↵Thomas Gleixner
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - Fix GICv3 redistributor table reservation with RT across kexec - Fix GICv4.1 redistributor view of the VPE table across kexec - Add support for extra interrupts on spear-shirq - Make obtaining some interrupts optional for the Renesas drivers - Various cleanups and bug fixes Link: https://lore.kernel.org/lkml/20220108130807.4109738-1-maz@kernel.org
2021-12-16irqchip/gic-v3-its: Limit memreserve cpuhp state lifetimeValentin Schneider
The new memreserve cpuhp callback only needs to survive up until a point where every CPU in the system has booted once. Beyond that, it becomes a no-op and can be put in the bin. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211027151506.2085066-4-valentin.schneider@arm.com
2021-12-16irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserveValentin Schneider
Memory used by the LPI tables have to be made persistent for kexec to have a chance to work, as explained in [1]. If they have been made persistent and we are booting into a kexec'd kernel, we also need to free the pages that were preemptively allocated by the new kernel for those tables. Both of those operations currently happen during its_cpu_init(), which happens in a _STARTING (IOW atomic) cpuhp callback for secondary CPUs. efi_mem_reserve_iomem() issues a GFP_ATOMIC allocation, which unfortunately doesn't work under PREEMPT_RT (this ends up grabbing a non-raw spinlock, which can sleep under PREEMPT_RT). Similarly, freeing the pages ends up grabbing a sleepable spinlock. Since the memreserve is only required by kexec, it doesn't have to be done so early in the secondary boot process. Issue the reservation in a new CPUHP_AP_ONLINE_DYN cpuhp callback, and piggy-back the page freeing on top of it. A CPU gets to run the body of this new callback exactly once. As kexec issues a machine_shutdown() prior to machine_kexec(), it will be serialized vs a CPU being plugged to life by the hotplug machinery - either the CPU will have been brought up and have had its redistributor's pending table memreserved, or it never went online and will have its table allocated by the new kernel. [1]: https://lore.kernel.org/lkml/20180921195954.21574-1-marc.zyngier@arm.com/ Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211027151506.2085066-3-valentin.schneider@arm.com
2021-12-16irqchip/gic-v3-its: Give the percpu rdist struct its own flags fieldValentin Schneider
Later patches will require tracking some per-rdist status. Reuse the bytes "lost" to padding within the __percpu rdist struct as a flags field, and re-encode ->lpi_enabled within said flags. No change in functionality intended. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211027151506.2085066-2-valentin.schneider@arm.com
2021-12-08irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALLWudi Wang
INVALL CMD specifies that the ITS must ensure any caching associated with the interrupt collection defined by ICID is consistent with the LPI configuration tables held in memory for all Redistributors. SYNC is required to ensure that INVALL is executed. Currently, LPI configuration data may be inconsistent with that in the memory within a short period of time after the INVALL command is executed. Signed-off-by: Wudi Wang <wangwudi@hisilicon.com> Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue") Link: https://lore.kernel.org/r/20211208015429.5007-1-zhangshaokun@hisilicon.com
2021-09-22irqchip/gic-v3-its: Fix potential VPE leak on errorKaige Fu
In its_vpe_irq_domain_alloc, when its_vpe_init() returns an error, there is an off-by-one in the number of VPEs to be freed. Fix it by simply passing the number of VPEs allocated, which is the index of the loop iterating over the VPEs. Fixes: 7d75bbb4bc1a ("irqchip/gic-v3-its: Add VPE irq domain allocation/teardown") Signed-off-by: Kaige Fu <kaige.fu@linux.alibaba.com> [maz: fixed commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/d9e36dee512e63670287ed9eff884a5d8d6d27f2.1631672311.git.kaige.fu@linux.alibaba.com
2021-07-26irqchip/gic-v3: Switch to bitmap_zalloc()Andy Shevchenko
Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that it returns pointer of bitmap type instead of opaque void *. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210618151657.65305-4-andriy.shevchenko@linux.intel.com
2021-06-11irqchip/gic-v3-its: Remove unnecessary oom messageZhen Lei
Fixes scripts/checkpatch.pl warning: WARNING: Possible unnecessary 'out of memory' message Remove it can help us save a bit of memory. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210609140643.14531-1-thunder.leizhen@huawei.com
2021-05-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "This is a large update by KVM standards, including AMD PSP (Platform Security Processor, aka "AMD Secure Technology") and ARM CoreSight (debug and trace) changes. ARM: - CoreSight: Add support for ETE and TRBE - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler x86: - AMD PSP driver changes - Optimizations and cleanup of nested SVM code - AMD: Support for virtual SPEC_CTRL - Optimizations of the new MMU code: fast invalidation, zap under read lock, enable/disably dirty page logging under read lock - /dev/kvm API for AMD SEV live migration (guest API coming soon) - support SEV virtual machines sharing the same encryption context - support SGX in virtual machines - add a few more statistics - improved directed yield heuristics - Lots and lots of cleanups Generic: - Rework of MMU notifier interface, simplifying and optimizing the architecture-specific code - a handful of "Get rid of oprofile leftovers" patches - Some selftests improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits) KVM: selftests: Speed up set_memory_region_test selftests: kvm: Fix the check of return value KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt() KVM: SVM: Skip SEV cache flush if no ASIDs have been used KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids() KVM: SVM: Drop redundant svm_sev_enabled() helper KVM: SVM: Move SEV VMCB tracking allocation to sev.c KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup() KVM: SVM: Unconditionally invoke sev_hardware_teardown() KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported) KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features KVM: SVM: Move SEV module params/variables to sev.c KVM: SVM: Disable SEV/SEV-ES if NPT is disabled KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails KVM: SVM: Zero out the VMCB array used to track SEV ASID association x86/sev: Drop redundant and potentially misleading 'sev_enabled' KVM: x86: Move reverse CPUID helpers to separate header file KVM: x86: Rename GPR accessors to make mode-aware variants the defaults ...
2021-03-24irqchip/gic-v3-its: Drop the setting of PTZ altogetherShenming Lu
GICv4.1 gives a way to get the VLPI state, which needs to map the vPE first, and after the state read, we may remap the vPE back while the VPT is not empty. So we can't assume that the VPT is empty at the first map. Besides, the optimization of PTZ is probably limited since the HW should be fairly efficient to parse the empty VPT. Let's drop the setting of PTZ altogether. Signed-off-by: Shenming Lu <lushenming@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210322060158.1584-3-lushenming@huawei.com
2021-03-24irqchip/gic-v3-its: Add a cache invalidation right after vPE unmappingMarc Zyngier
In order to be able to manipulate the VPT once a vPE has been unmapped, perform the required CMO to invalidate the CPU view of the VPT. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Shenming Lu <lushenming@huawei.com> Link: https://lore.kernel.org/r/20210322060158.1584-2-lushenming@huawei.com
2021-03-22irq: Fix typos in commentsIngo Molnar
Fix ~36 single-word typos in the IRQ, irqchip and irqdomain code comments. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...
2020-12-11irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy deviceMarc Zyngier
The ITS already has some notion of "shared" devices. Let's map the MSI_ALLOC_FLAGS_PROXY_DEVICE flag onto this internal property. Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20201129135208.680293-3-maz@kernel.org
2020-12-11irqchip/gic-v4.1: Reduce the delay when polling GICR_VPENDBASER.DirtyShenming Lu
The 10us delay of the poll on the GICR_VPENDBASER.Dirty bit is too high, which might greatly affect the total scheduling latency of a vCPU in our measurement. So we reduce it to 1 to lessen the impact. Signed-off-by: Shenming Lu <lushenming@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201128141857.983-2-lushenming@huawei.com