aboutsummaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)Author
2019-11-12x86/cpu: Add Tremont to the cpu vulnerability whitelistPawan Gupta
commit cad14885a8d32c1c0d8eaa7bf5c0152a22b6080e upstream. Add the new cpu family ATOM_TREMONT_D to the cpu vunerability whitelist. ATOM_TREMONT_D is not affected by X86_BUG_ITLB_MULTIHIT. ATOM_TREMONT_D might have mitigations against other issues as well, but only the ITLB multihit mitigation is confirmed at this point. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/bugs: Add ITLB_MULTIHIT bug infrastructureVineela Tummalapalli
commit db4d30fbb71b47e4ecb11c4efa5d8aad4b03dfae upstream. Some processors may incur a machine check error possibly resulting in an unrecoverable CPU lockup when an instruction fetch encounters a TLB multi-hit in the instruction TLB. This can occur when the page size is changed along with either the physical address or cache type. The relevant erratum can be found here: https://bugzilla.kernel.org/show_bug.cgi?id=205195 There are other processors affected for which the erratum does not fully disclose the impact. This issue affects both bare-metal x86 page tables and EPT. It can be mitigated by either eliminating the use of large pages or by using careful TLB invalidations when changing the page size in the page tables. Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which are mitigated against this issue. Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com> Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUsJosh Poimboeuf
commit 012206a822a8b6ac09125bfaa210a95b9eb8f1c1 upstream. For new IBRS_ALL CPUs, the Enhanced IBRS check at the beginning of cpu_bugs_smt_update() causes the function to return early, unintentionally skipping the MDS and TAA logic. This is not a problem for MDS, because there appears to be no overlap between IBRS_ALL and MDS-affected CPUs. So the MDS mitigation would be disabled and nothing would need to be done in this function anyway. But for TAA, the TAA_MSG_SMT string will never get printed on Cascade Lake and newer. The check is superfluous anyway: when 'spectre_v2_enabled' is SPECTRE_V2_IBRS_ENHANCED, 'spectre_v2_user' is always SPECTRE_V2_USER_NONE, and so the 'spectre_v2_user' switch statement handles it appropriately by doing nothing. So just remove the check. Fixes: 1b42f017415b ("x86/speculation/taa: Add mitigation for TSX Async Abort") Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/tsx: Add config options to set tsx=on|off|autoMichal Hocko
commit db616173d787395787ecc93eef075fa975227b10 upstream. There is a general consensus that TSX usage is not largely spread while the history shows there is a non trivial space for side channel attacks possible. Therefore the tsx is disabled by default even on platforms that might have a safe implementation of TSX according to the current knowledge. This is a fair trade off to make. There are, however, workloads that really do benefit from using TSX and updating to a newer kernel with TSX disabled might introduce a noticeable regressions. This would be especially a problem for Linux distributions which will provide TAA mitigations. Introduce config options X86_INTEL_TSX_MODE_OFF, X86_INTEL_TSX_MODE_ON and X86_INTEL_TSX_MODE_AUTO to control the TSX feature. The config setting can be overridden by the tsx cmdline options. [ bp: Text cleanups from Josh. ] Suggested-by: Borislav Petkov <bpetkov@suse.de> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/tsx: Add "auto" option to the tsx= cmdline parameterPawan Gupta
commit 7531a3596e3272d1f6841e0d601a614555dc6b65 upstream. Platforms which are not affected by X86_BUG_TAA may want the TSX feature enabled. Add "auto" option to the TSX cmdline parameter. When tsx=auto disable TSX when X86_BUG_TAA is present, otherwise enable TSX. More details on X86_BUG_TAA can be found here: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html [ bp: Extend the arg buffer to accommodate "auto\0". ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12kvm/x86: Export MDS_NO=0 to guests when TSX is enabledPawan Gupta
commit e1d38b63acd843cfdd4222bf19a26700fd5c699e upstream. Export the IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0 to guests on TSX Async Abort(TAA) affected hosts that have TSX enabled and updated microcode. This is required so that the guests don't complain, "Vulnerable: Clear CPU buffers attempted, no microcode" when the host has the updated microcode to clear CPU buffers. Microcode update also adds support for MSR_IA32_TSX_CTRL which is enumerated by the ARCH_CAP_TSX_CTRL bit in IA32_ARCH_CAPABILITIES MSR. Guests can't do this check themselves when the ARCH_CAP_TSX_CTRL bit is not exported to the guests. In this case export MDS_NO=0 to the guests. When guests have CPUID.MD_CLEAR=1, they deploy MDS mitigation which also mitigates TAA. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/speculation/taa: Add sysfs reporting for TSX Async AbortPawan Gupta
commit 6608b45ac5ecb56f9e171252229c39580cc85f0f upstream. Add the sysfs reporting file for TSX Async Abort. It exposes the vulnerability and the mitigation state similar to the existing files for the other hardware vulnerabilities. Sysfs file path is: /sys/devices/system/cpu/vulnerabilities/tsx_async_abort Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Reviewed-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/speculation/taa: Add mitigation for TSX Async AbortPawan Gupta
commit 1b42f017415b46c317e71d41c34ec088417a1883 upstream. TSX Async Abort (TAA) is a side channel vulnerability to the internal buffers in some Intel processors similar to Microachitectural Data Sampling (MDS). In this case, certain loads may speculatively pass invalid data to dependent operations when an asynchronous abort condition is pending in a TSX transaction. This includes loads with no fault or assist condition. Such loads may speculatively expose stale data from the uarch data structures as in MDS. Scope of exposure is within the same-thread and cross-thread. This issue affects all current processors that support TSX, but do not have ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES. On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0, CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers using VERW or L1D_FLUSH, there is no additional mitigation needed for TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by disabling the Transactional Synchronization Extensions (TSX) feature. A new MSR IA32_TSX_CTRL in future and current processors after a microcode update can be used to control the TSX feature. There are two bits in that MSR: * TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted Transactional Memory (RTM). * TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled with updated microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}. The second mitigation approach is similar to MDS which is clearing the affected CPU buffers on return to user space and when entering a guest. Relevant microcode update is required for the mitigation to work. More details on this approach can be found here: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html The TSX feature can be controlled by the "tsx" command line parameter. If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is deployed. The effective mitigation state can be read from sysfs. [ bp: - massage + comments cleanup - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh. - remove partial TAA mitigation in update_mds_branch_idle() - Josh. - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/cpu: Add a "tsx=" cmdline option with TSX disabled by defaultPawan Gupta
commit 95c5824f75f3ba4c9e8e5a4b1a623c95390ac266 upstream. Add a kernel cmdline parameter "tsx" to control the Transactional Synchronization Extensions (TSX) feature. On CPUs that support TSX control, use "tsx=on|off" to enable or disable TSX. Not specifying this option is equivalent to "tsx=off". This is because on certain processors TSX may be used as a part of a speculative side channel attack. Carve out the TSX controlling functionality into a separate compilation unit because TSX is a CPU feature while the TSX async abort control machinery will go to cpu/bugs.c. [ bp: - Massage, shorten and clear the arg buffer. - Clarifications of the tsx= possible options - Josh. - Expand on TSX_CTRL availability - Pawan. ] Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/cpu: Add a helper function x86_read_arch_cap_msr()Pawan Gupta
commit 286836a70433fb64131d2590f4bf512097c255e1 upstream. Add a helper function to read the IA32_ARCH_CAPABILITIES MSR. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Reviewed-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/msr: Add the IA32_TSX_CTRL MSRPawan Gupta
commit c2955f270a84762343000f103e0640d29c7a96f3 upstream. Transactional Synchronization Extensions (TSX) may be used on certain processors as part of a speculative side channel attack. A microcode update for existing processors that are vulnerable to this attack will add a new MSR - IA32_TSX_CTRL to allow the system administrator the option to disable TSX as one of the possible mitigations. The CPUs which get this new MSR after a microcode upgrade are the ones which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all CPU buffers takes care of the TAA case as well. [ Note that future processors that are not vulnerable will also support the IA32_TSX_CTRL MSR. ] Add defines for the new IA32_TSX_CTRL MSR and its bits. TSX has two sub-features: 1. Restricted Transactional Memory (RTM) is an explicitly-used feature where new instructions begin and end TSX transactions. 2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of "old" style locks are used by software. Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the IA32_TSX_CTRL MSR. There are two control bits in IA32_TSX_CTRL MSR: Bit 0: When set, it disables the Restricted Transactional Memory (RTM) sub-feature of TSX (will force all transactions to abort on the XBEGIN instruction). Bit 1: When set, it disables the enumeration of the RTM and HLE feature (i.e. it will make CPUID(EAX=7).EBX{bit4} and CPUID(EAX=7).EBX{bit11} read as 0). The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally disabled by the new microcode but still enumerated as present by CPUID(EAX=7).EBX{bit4}, unless disabled by IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com> Reviewed-by: Mark Gross <mgross@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12KVM: x86: use Intel speculation bugs and features as derived in generic x86 codePaolo Bonzini
commit 0c54914d0c52a15db9954a76ce80fee32cf318f4 upstream. Similar to AMD bits, set the Intel bits from the vendor-independent feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care about the vendor and they should be set on AMD processors as well. Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12x86/apic/32: Avoid bogus LDR warningsJan Beulich
[ Upstream commit fe6f85ca121e9c74e7490fe66b0c5aae38e332c3 ] The removal of the LDR initialization in the bigsmp_32 APIC code unearthed a problem in setup_local_APIC(). The code checks unconditionally for a mismatch of the logical APIC id by comparing the early APIC id which was initialized in get_smp_config() with the actual LDR value in the APIC. Due to the removal of the bogus LDR initialization the check now can trigger on bigsmp_32 APIC systems emitting a warning for every booting CPU. This is of course a false positive because the APIC is not using logical destination mode. Restrict the check and the possibly resulting fixup to systems which are actually using the APIC in logical destination mode. [ tglx: Massaged changelog and added Cc stable ] Fixes: bae3a8d3308 ("x86/apic: Do not initialize LDR and DFR for bigsmp") Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/666d8f91-b5a8-1afd-7add-821e72a35f03@suse.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12x86/apic: Drop logical_smp_processor_id() inlineDou Liyang
[ Upstream commit 8f1561680f42a5491b371b513f1ab8197f31fd62 ] The logical_smp_processor_id() inline which is only called in setup_local_APIC() on x86_32 systems has no real value. Drop it and directly use GET_APIC_LOGICAL_ID() at the call site and use a more suitable variable name for readability Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andy.shevchenko@gmail.com Cc: bhe@redhat.com Cc: ebiederm@xmission.com Link: https://lkml.kernel.org/r/20180301055930.2396-4-douly.fnst@cn.fujitsu.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12x86/apic: Move pending interrupt check code into it's own functionDou Liyang
[ Upstream commit 9b217f33017715903d0956dfc58f82d2a2d00e63 ] The pending interrupt check code is mixed with the local APIC setup code, that looks messy. Extract the related code, move it into a new function named apic_pending_intr_clear(). Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: bhe@redhat.com Cc: ebiederm@xmission.com Link: https://lkml.kernel.org/r/20180301055930.2396-2-douly.fnst@cn.fujitsu.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12perf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h)Kim Phillips
[ Upstream commit e431e79b60603079d269e0c2a5177943b95fa4b6 ] This saves us writing the IBS control MSR twice when disabling the event. I searched revision guides for all families since 10h, and did not find occurrence of erratum #420, nor anything remotely similar: so we isolate the secondary MSR write to family 10h only. Also unconditionally update the count mask for IBS Op implementations that have read & writeable current count (CurCnt) fields in addition to the MaxCnt field. These bits were reserved on prior implementations, and therefore shouldn't have negative impact. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: c9574fe0bdb9 ("perf/x86-ibs: Implement workaround for IBS erratum #420") Link: https://lkml.kernel.org/r/20191023150955.30292-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise ↵Kim Phillips
RIP validity [ Upstream commit 317b96bb14303c7998dbcd5bc606bd8038fdd4b4 ] The loop that reads all the IBS MSRs into *buf stopped one MSR short of reading the IbsOpData register, which contains the RipInvalid status bit. Fix the offset_max assignment so the MSR gets read, so the RIP invalid evaluation is based on what the IBS h/w output, instead of what was left in memory. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: d47e8238cd76 ("perf/x86-ibs: Take instruction pointer from ibs sample") Link: https://lkml.kernel.org/r/20191023150955.30292-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12ARM: dts: dra7: Disable USB metastability workaround for USB2Roger Quadros
commit b8c9c6fa2002b8fd4a9710f76f80f99c6046d48c upstream. The metastability workaround causes Erratic errors [1] on the HighSpeed USB PHY which can cause upto 2 seconds delay in enumerating to a USB host while in Gadget mode. Disable the Run/Stop metastability workaround to avoid this ill effect. We are aware that this opens up the opportunity for Run/Stop metastability, however this issue has never been observed in TI releases so we think that Run/Stop metastability is a lesser evil than the PHY Erratic errors. So disable it. [1] USB controller trace during gadget enumeration irq/90-dwc3-969 [000] d... 52.323145: dwc3_event: event (00000901): Erratic Error [U0] irq/90-dwc3-969 [000] d... 52.560646: dwc3_event: event (00000901): Erratic Error [U0] irq/90-dwc3-969 [000] d... 52.798144: dwc3_event: event (00000901): Erratic Error [U0] Signed-off-by: Roger Quadros <rogerq@ti.com> Acked-by: Felipe Balbi <felipe.balbi@linux/intel.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-12arm64: Do not mask out PTE_RDONLY in pte_same()Catalin Marinas
commit 6767df245f4736d0cf0c6fb7cf9cf94b27414245 upstream. Following commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"), the PTE_RDONLY bit is no longer managed by set_pte_at() but built into the PAGE_* attribute definitions. Consequently, pte_same() must include this bit when checking two PTEs for equality. Remove the arm64-specific pte_same() function, practically reverting commit 747a70e60b72 ("arm64: Fix copy-on-write referencing in HugeTLB") Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Cc: <stable@vger.kernel.org> # 4.14.x- Cc: Will Deacon <will@kernel.org> Cc: Steve Capper <steve.capper@arm.com> Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10powerpc/mm: Fixup tlbie vs mtpidr/mtlpidr ordering issue on POWER9Aneesh Kumar K.V
commit 047e6575aec71d75b765c22111820c4776cd1c43 upstream. On POWER9, under some circumstances, a broadcast TLB invalidation will fail to invalidate the ERAT cache on some threads when there are parallel mtpidr/mtlpidr happening on other threads of the same core. This can cause stores to continue to go to a page after it's unmapped. The workaround is to force an ERAT flush using PID=0 or LPID=0 tlbie flush. This additional TLB flush will cause the ERAT cache invalidation. Since we are using PID=0 or LPID=0, we don't get filtered out by the TLB snoop filtering logic. We need to still follow this up with another tlbie to take care of store vs tlbie ordering issue explained in commit: a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9"). The presence of ERAT cache implies we can still get new stores and they may miss store queue marking flush. Cc: stable@vger.kernel.org # v4.14 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190924035254.24612-3-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10powerpc/book3s64/radix: Rename CPU_FTR_P9_TLBIE_BUG feature flagAneesh Kumar K.V
commit 09ce98cacd51fcd0fa0af2f79d1e1d3192f4cbb0 upstream. Rename the #define to indicate this is related to store vs tlbie ordering issue. In the next patch, we will be adding another feature flag that is used to handles ERAT flush vs tlbie ordering issue. Cc: stable@vger.kernel.org # v4.14 Fixes: a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190924035254.24612-2-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10powerpc/book3s64/mm: Don't do tlbie fixup for some hardware revisionsAneesh Kumar K.V
commit 677733e296b5c7a37c47da391fc70a43dc40bd67 upstream. The store ordering vs tlbie issue mentioned in commit a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") is fixed for Nimbus 2.3 and Cumulus 1.3 revisions. We don't need to apply the fixup if we are running on them We can only do this on PowerNV. On pseries guest with kvm we still don't support redoing the feature fixup after migration. So we should be enabling all the workarounds needed, because whe can possibly migrate between DD 2.3 and DD 2.2 Cc: stable@vger.kernel.org # v4.14 Fixes: a5d4b5891c2f ("powerpc/mm: Fixup tlbie vs store ordering issue on POWER9") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190924035254.24612-1-aneesh.kumar@linux.ibm.com [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10powerpc/mm: Fixup tlbie vs store ordering issue on POWER9Aneesh Kumar K.V
commit a5d4b5891c2f1f865a2def1eb0030f534e77ff86 upstream. On POWER9, under some circumstances, a broadcast TLB invalidation might complete before all previous stores have drained, potentially allowing stale stores from becoming visible after the invalidation. This works around it by doubling up those TLB invalidations which was verified by HW to be sufficient to close the risk window. This will be documented in a yet-to-be-published errata. Cc: stable@vger.kernel.org # v4.14 Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Enable the feature in the DT CPU features code for all Power9, rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20180323045627.16800-3-aneesh.kumar@linux.vnet.ibm.com/ [sandipan: Backported to v4.14] Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-10MIPS: bmips: mark exception vectors as char arraysJonas Gorski
[ Upstream commit e4f5cb1a9b27c0f94ef4f5a0178a3fde2d3d0e9e ] The vectors span more than one byte, so mark them as arrays. Fixes the following build error when building when using GCC 8.3: In file included from ./include/linux/string.h:19, from ./include/linux/bitmap.h:9, from ./include/linux/cpumask.h:12, from ./arch/mips/include/asm/processor.h:15, from ./arch/mips/include/asm/thread_info.h:16, from ./include/linux/thread_info.h:38, from ./include/asm-generic/preempt.h:5, from ./arch/mips/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:81, from ./include/linux/spinlock.h:51, from ./include/linux/mmzone.h:8, from ./include/linux/bootmem.h:8, from arch/mips/bcm63xx/prom.c:10: arch/mips/bcm63xx/prom.c: In function 'prom_init': ./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds] __ret = __builtin_memcpy((dst), (src), __len); \ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy' memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20); ^~~~~~ In file included from arch/mips/bcm63xx/prom.c:14: ./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here extern char bmips_smp_movevec; Fixes: 18a1eef92dcd ("MIPS: BMIPS: Introduce bmips.h") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: linux-mips@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-10ARM: 8926/1: v7m: remove register save to stack before svcafzal mohammed
[ Upstream commit 2ecb287998a47cc0a766f6071f63bc185f338540 ] r0-r3 & r12 registers are saved & restored, before & after svc respectively. Intention was to preserve those registers across thread to handler mode switch. On v7-M, hardware saves the register context upon exception in AAPCS complaint way. Restoring r0-r3 & r12 is done from stack location where hardware saves it, not from the location on stack where these registers were saved. To clarify, on stm32f429 discovery board: 1. before svc, sp - 0x90009ff8 2. r0-r3,r12 saved to 0x90009ff8 - 0x9000a00b 3. upon svc, h/w decrements sp by 32 & pushes registers onto stack 4. after svc, sp - 0x90009fd8 5. r0-r3,r12 restored from 0x90009fd8 - 0x90009feb Above means r0-r3,r12 is not restored from the location where they are saved, but since hardware pushes the registers onto stack, the registers are restored correctly. Note that during register saving to stack (step 2), it goes past 0x9000a000. And it seems, based on objdump, there are global symbols residing there, and it perhaps can cause issues on a non-XIP Kernel (on XIP, data section is setup later). Based on the analysis above, manually saving registers onto stack is at best no-op and at worst can cause data section corruption. Hence remove storing of registers onto stack before svc. Fixes: b70cd406d7fe ("ARM: 8671/1: V7M: Preserve registers across switch from Thread to Handler mode") Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Acked-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-10ARM: davinci: dm365: Fix McBSP dma_slave_map entryPeter Ujfalusi
[ Upstream commit 564b6bb9d42d31fc80c006658cf38940a9b99616 ] dm365 have only single McBSP, so the device name is without .0 Fixes: 0c750e1fe481d ("ARM: davinci: dm365: Add dma_slave_map to edma") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-10ARM: dts: imx7s: Correct GPT's ipg clock sourceAnson Huang
[ Upstream commit 252b9e21bcf46b0d16f733f2e42b21fdc60addee ] i.MX7S/D's GPT ipg clock should be from GPT clock root and controlled by CCM's GPT CCGR, using correct clock source for GPT ipg clock instead of IMX7D_CLK_DUMMY. Fixes: 3ef79ca6bd1d ("ARM: dts: imx7d: use imx7s.dtsi as base device tree") Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-10ARM: mm: fix alignment handler faults under memory pressureRussell King
[ Upstream commit 67e15fa5b487adb9b78a92789eeff2d6ec8f5cee ] When the system has high memory pressure, the page containing the instruction may be paged out. Using probe_kernel_address() means that if the page is swapped out, the resulting page fault will not be handled because page faults are disabled by this function. Use get_user() to read the instruction instead. Reported-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Fixes: b255188f90e2 ("ARM: fix scheduling while atomic warning in alignment handling code") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-10ARM: dts: logicpd-torpedo-som: Remove twl_keypadAdam Ford
[ Upstream commit 6b512b0ee091edcb8e46218894e4c917d919d3dc ] The TWL4030 used on the Logit PD Torpedo SOM does not have the keypad pins routed. This patch disables the twl_keypad driver to remove some splat during boot: twl4030_keypad 48070000.i2c:twl@48:keypad: missing or malformed property linux,keymap: -22 twl4030_keypad 48070000.i2c:twl@48:keypad: Failed to build keymap twl4030_keypad: probe of 48070000.i2c:twl@48:keypad failed with error -22 Signed-off-by: Adam Ford <aford173@gmail.com> [tony@atomide.com: removed error time stamps] Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-10arm64: dts: Fix gpio to pinmux mappingRayagonda Kokatanur
[ Upstream commit 965f6603e3335a953f4f876792074cb36bf65f7f ] There are total of 151 non-secure gpio (0-150) and four pins of pinmux (91, 92, 93 and 94) are not mapped to any gpio pin, hence update same in DT. Fixes: 8aa428cc1e2e ("arm64: dts: Add pinctrl DT nodes for Stingray SOC") Signed-off-by: Rayagonda Kokatanur <rayagonda.kokatanur@broadcom.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06arm64: Ensure VM_WRITE|VM_SHARED ptes are clean by defaultCatalin Marinas
commit aa57157be69fb599bd4c38a4b75c5aad74a60ec0 upstream. Shared and writable mappings (__S.1.) should be clean (!dirty) initially and made dirty on a subsequent write either through the hardware DBM (dirty bit management) mechanism or through a write page fault. A clean pte for the arm64 kernel is one that has PTE_RDONLY set and PTE_DIRTY clear. The PAGE_SHARED{,_EXEC} attributes have PTE_WRITE set (PTE_DBM) and PTE_DIRTY clear. Prior to commit 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()"), it was the responsibility of set_pte_at() to set the PTE_RDONLY bit and mark the pte clean if the software PTE_DIRTY bit was not set. However, the above commit removed the pte_sw_dirty() check and the subsequent setting of PTE_RDONLY in set_pte_at() while leaving the PAGE_SHARED{,_EXEC} definitions unchanged. The result is that shared+writable mappings are now dirty by default Fix the above by explicitly setting PTE_RDONLY in PAGE_SHARED{,_EXEC}. In addition, remove the superfluous PTE_DIRTY bit from the kernel PROT_* attributes. Fixes: 73e86cb03cf2 ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()") Cc: <stable@vger.kernel.org> # 4.14.x- Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06s390/idle: fix cpu idle time calculationHeiko Carstens
commit 3d7efa4edd07be5c5c3ffa95ba63e97e070e1f3f upstream. The idle time reported in /proc/stat sometimes incorrectly contains huge values on s390. This is caused by a bug in arch_cpu_idle_time(). The kernel tries to figure out when a different cpu entered idle by accessing its per-cpu data structure. There is an ordering problem: if the remote cpu has an idle_enter value which is not zero, and an idle_exit value which is zero, it is assumed it is idle since "now". The "now" timestamp however is taken before the idle_enter value is read. Which in turn means that "now" can be smaller than idle_enter of the remote cpu. Unconditionally subtracting idle_enter from "now" can thus lead to a negative value (aka large unsigned value). Fix this by moving the get_tod_clock() invocation out of the loop. While at it also make the code a bit more readable. A similar bug also exists for show_idle_time(). Fix this is as well. Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06s390/cmm: fix information leak in cmm_timeout_handler()Yihui ZENG
commit b8e51a6a9db94bc1fb18ae831b3dab106b5a4b5f upstream. The problem is that we were putting the NUL terminator too far: buf[sizeof(buf) - 1] = '\0'; If the user input isn't NUL terminated and they haven't initialized the whole buffer then it leads to an info leak. The NUL terminator should be: buf[len - 1] = '\0'; Signed-off-by: Yihui Zeng <yzeng56@asu.edu> Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [heiko.carstens@de.ibm.com: keep semantics of how *lenp and *ppos are handled] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06s390/uaccess: avoid (false positive) compiler warningsChristian Borntraeger
[ Upstream commit 062795fcdcb2d22822fb42644b1d76a8ad8439b3 ] Depending on inlining decisions by the compiler, __get/put_user_fn might become out of line. Then the compiler is no longer able to tell that size can only be 1,2,4 or 8 due to the check in __get/put_user resulting in false positives like ./arch/s390/include/asm/uaccess.h: In function ‘__put_user_fn’: ./arch/s390/include/asm/uaccess.h:113:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 113 | return rc; | ^~ ./arch/s390/include/asm/uaccess.h: In function ‘__get_user_fn’: ./arch/s390/include/asm/uaccess.h:143:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized] 143 | return rc; | ^~ These functions are supposed to be always inlined. Mark it as such. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06MIPS: fw: sni: Fix out of bounds init of o32 stackThomas Bogendoerfer
[ Upstream commit efcb529694c3b707dc0471b312944337ba16e4dd ] Use ARRAY_SIZE to caluculate the top of the o32 stack. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06MIPS: include: Mark __xchg as __always_inlineThomas Bogendoerfer
[ Upstream commit 46f1619500d022501a4f0389f9f4c349ab46bb86 ] Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") allows compiler to uninline functions marked as 'inline'. In cace of __xchg this would cause to reference function __xchg_called_with_bad_pointer, which is an error case for catching bugs and will not happen for correct code, if __xchg is inlined. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06perf/x86/amd: Change/fix NMI latency mitigation to use a timestampTom Lendacky
[ Upstream commit df4d29732fdad43a51284f826bec3e6ded177540 ] It turns out that the NMI latency workaround from commit: 6d3edaae16c6 ("x86/perf/amd: Resolve NMI latency issues for active PMCs") ends up being too conservative and results in the perf NMI handler claiming NMIs too easily on AMD hardware when the NMI watchdog is active. This has an impact, for example, on the hpwdt (HPE watchdog timer) module. This module can produce an NMI that is used to reset the system. It registers an NMI handler for the NMI_UNKNOWN type and relies on the fact that nothing has claimed an NMI so that its handler will be invoked when the watchdog device produces an NMI. After the referenced commit, the hpwdt module is unable to process its generated NMI if the NMI watchdog is active, because the current NMI latency mitigation results in the NMI being claimed by the perf NMI handler. Update the AMD perf NMI latency mitigation workaround to, instead, use a window of time. Whenever a PMC is handled in the perf NMI handler, set a timestamp which will act as a perf NMI window. Any NMIs arriving within that window will be claimed by perf. Anything outside that window will not be claimed by perf. The value for the NMI window is set to 100 msecs. This is a conservative value that easily covers any NMI latency in the hardware. While this still results in a window in which the hpwdt module will not receive its NMI, the window is now much, much smaller. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Jerry Hoemann <jerry.hoemann@hpe.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 6d3edaae16c6 ("x86/perf/amd: Resolve NMI latency issues for active PMCs") Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06x86/xen: Return from panic notifierBoris Ostrovsky
[ Upstream commit c6875f3aacf2a5a913205accddabf0bfb75cac76 ] Currently execution of panic() continues until Xen's panic notifier (xen_panic_event()) is called at which point we make a hypercall that never returns. This means that any notifier that is supposed to be called later as well as significant part of panic() code (such as pstore writes from kmsg_dump()) is never executed. There is no reason for xen_panic_event() to be this last point in execution since panic()'s emergency_restart() will call into xen_emergency_restart() from where we can perform our hypercall. Nevertheless, we will provide xen_legacy_crash boot option that will preserve original behavior during crash. This option could be used, for example, if running kernel dumper (which happens after panic notifiers) is undesirable. Reported-by: James Dingwall <james@dingwall.me.uk> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06MIPS: include: Mark __cmpxchg as __always_inlineThomas Bogendoerfer
[ Upstream commit 88356d09904bc606182c625575237269aeece22e ] Commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly") allows compiler to uninline functions marked as 'inline'. In cace of cmpxchg this would cause to reference function __cmpxchg_called_with_bad_pointer, which is a error case for catching bugs and will not happen for correct code, if __cmpxchg is inlined. Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> [paul.burton@mips.com: s/__cmpxchd/__cmpxchg in subject] Signed-off-by: Paul Burton <paul.burton@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06efi/x86: Do not clean dummy variable in kexec pathDave Young
[ Upstream commit 2ecb7402cfc7f22764e7bbc80790e66eadb20560 ] kexec reboot fails randomly in UEFI based KVM guest. The firmware just resets while calling efi_delete_dummy_variable(); Unfortunately I don't know how to debug the firmware, it is also possible a potential problem on real hardware as well although nobody reproduced it. The intention of the efi_delete_dummy_variable is to trigger garbage collection when entering virtual mode. But SetVirtualAddressMap can only run once for each physical reboot, thus kexec_enter_virtual_mode() is not necessarily a good place to clean a dummy object. Drop the efi_delete_dummy_variable so that kexec reboot can work. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Matthew Garrett <mjg59@google.com> Cc: Ben Dooks <ben.dooks@codethink.co.uk> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: Jerry Snitselaar <jsnitsel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lukas Wunner <lukas@wunner.de> Cc: Lyude Paul <lyude@redhat.com> Cc: Octavian Purdila <octavian.purdila@intel.com> Cc: Peter Jones <pjones@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott Talbert <swt@techie.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Cc: linux-integrity@vger.kernel.org Link: https://lkml.kernel.org/r/20191002165904.8819-8-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06arm64: ftrace: Ensure synchronisation in PLT setup for Neoverse-N1 #1542419James Morse
[ Upstream commit dd8a1f13488438c6c220b7cafa500baaf21a6e53 ] CPUs affected by Neoverse-N1 #1542419 may execute a stale instruction if it was recently modified. The affected sequence requires freshly written instructions to be executable before a branch to them is updated. There are very few places in the kernel that modify executable text, all but one come with sufficient synchronisation: * The module loader's flush_module_icache() calls flush_icache_range(), which does a kick_all_cpus_sync() * bpf_int_jit_compile() calls flush_icache_range(). * Kprobes calls aarch64_insn_patch_text(), which does its work in stop_machine(). * static keys and ftrace both patch between nops and branches to existing kernel code (not generated code). The affected sequence is the interaction between ftrace and modules. The module PLT is cleaned using __flush_icache_range() as the trampoline shouldn't be executable until we update the branch to it. Drop the double-underscore so that this path runs kick_all_cpus_sync() too. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06x86/cpu: Add Atom Tremont (Jacobsville)Kan Liang
[ Upstream commit 00ae831dfe4474ef6029558f5eb3ef0332d80043 ] Add the Atom Tremont model number to the Intel family list. [ Tony: Also update comment at head of file to say "_X" suffix is also used for microserver parts. ] Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aristeu Rozanski <aris@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Cc: Megha Dey <megha.dey@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190125195902.17109-4-tony.luck@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-06powerpc/powernv: hold device_hotplug_lock when calling memtrace_offline_pages()David Hildenbrand
[ Upstream commit 5666848774ef43d3db5151ec518f1deb63515c20 ] Let's perform all checking + offlining + removing under device_hotplug_lock, so nobody can mess with these devices via sysfs concurrently. [david@redhat.com: take device_hotplug_lock outside of loop] Link: http://lkml.kernel.org/r/20180927092554.13567-6-david@redhat.com Link: http://lkml.kernel.org/r/20180925091457.28651-6-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-29kvm: vmx: Basic APIC virtualization controls have three settingsJim Mattson
commit 8d860bbeedef97fe981d28fa7b71d77f3b29563f upstream. Previously, we toggled between SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE and SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES, depending on whether or not the EXTD bit was set in MSR_IA32_APICBASE. However, if the local APIC is disabled, we should not set either of these APIC virtualization control bits. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: "Jitindar SIngh, Suraj" <surajjs@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29kvm: apic: Flush TLB after APIC mode/address change if VPIDs are in useJunaid Shahid
commit a468f2dbf921d02f5107378501693137a812999b upstream. Currently, KVM flushes the TLB after a change to the APIC access page address or the APIC mode when EPT mode is enabled. However, even in shadow paging mode, a TLB flush is needed if VPIDs are being used, as specified in the Intel SDM Section 29.4.5. So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will flush if either EPT or VPIDs are in use. Signed-off-by: Junaid Shahid <junaids@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Cc: "Jitindar SIngh, Suraj" <surajjs@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29kvm: vmx: Introduce lapic_mode enumerationJim Mattson
commit 588716494258899389206fa50426e78cc9df89b9 upstream. The local APIC can be in one of three modes: disabled, xAPIC or x2APIC. (A fourth mode, "invalid," is included for completeness.) Using the new enumeration can make some of the APIC mode logic easier to read. In kvm_set_apic_base, for instance, it is clear that one cannot transition directly from x2APIC mode to xAPIC mode or directly from APIC disabled to x2APIC mode. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> [Check invalid bits even if msr_info->host_initiated. Reported by Wanpeng Li. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: "Jitindar SIngh, Suraj" <surajjs@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29KVM: X86: introduce invalidate_gpa argument to tlb flushWanpeng Li
commit c2ba05ccfde2f069a66c0462e5b5ef8a517dcc9c upstream. Introduce a new bool invalidate_gpa argument to kvm_x86_ops->tlb_flush, it will be used by later patches to just flush guest tlb. For VMX, this will use INVVPID instead of INVEPT, which will invalidate combined mappings while keeping guest-physical mappings. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Jitindar SIngh, Suraj" <surajjs@amazon.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel areaSteve Wahl
commit 2aa85f246c181b1fa89f27e8e20c5636426be624 upstream. Our hardware (UV aka Superdome Flex) has address ranges marked reserved by the BIOS. Access to these ranges is caught as an error, causing the BIOS to halt the system. Initial page tables mapped a large range of physical addresses that were not checked against the list of BIOS reserved addresses, and sometimes included reserved addresses in part of the mapped range. Including the reserved range in the map allowed processor speculative accesses to the reserved range, triggering a BIOS halt. Used early in booting, the page table level2_kernel_pgt addresses 1 GiB divided into 2 MiB pages, and it was set up to linearly map a full 1 GiB of physical addresses that included the physical address range of the kernel image, as chosen by KASLR. But this also included a large range of unused addresses on either side of the kernel image. And unlike the kernel image's physical address range, this extra mapped space was not checked against the BIOS tables of usable RAM addresses. So there were times when the addresses chosen by KASLR would result in processor accessible mappings of BIOS reserved physical addresses. The kernel code did not directly access any of this extra mapped space, but having it mapped allowed the processor to issue speculative accesses into reserved memory, causing system halts. This was encountered somewhat rarely on a normal system boot, and much more often when starting the crash kernel if "crashkernel=512M,high" was specified on the command line (this heavily restricts the physical address of the crash kernel, in our case usually within 1 GiB of reserved space). The solution is to invalidate the pages of this table outside the kernel image's space before the page table is activated. It fixes this problem on our hardware. [ bp: Touchups. ] Signed-off-by: Steve Wahl <steve.wahl@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Baoquan He <bhe@redhat.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: dimitri.sivanich@hpe.com Cc: Feng Tang <feng.tang@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jordan Borgner <mail@jordan-borgner.de> Cc: Juergen Gross <jgross@suse.com> Cc: mike.travis@hpe.com Cc: russ.anderson@hpe.com Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com> Link: https://lkml.kernel.org/r/9c011ee51b081534a7a15065b1681d200298b530.1569358539.git.steve.wahl@hpe.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29parisc: Fix vmap memory leak in ioremap()/iounmap()Helge Deller
commit 513f7f747e1cba81f28a436911fba0b485878ebd upstream. Sven noticed that calling ioremap() and iounmap() multiple times leads to a vmap memory leak: vmap allocation for size 4198400 failed: use vmalloc=<size> to increase size It seems we missed calling vunmap() in iounmap(). Signed-off-by: Helge Deller <deller@gmx.de> Noticed-by: Sven Schnelle <svens@stackframe.org> Cc: <stable@vger.kernel.org> # v3.16+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-29xtensa: drop EXPORT_SYMBOL for outs*/ins*Max Filippov
commit 8b39da985194aac2998dd9e3a22d00b596cebf1e upstream. Custom outs*/ins* implementations are long gone from the xtensa port, remove matching EXPORT_SYMBOLs. This fixes the following build warnings issued by modpost since commit 15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL* functions"): WARNING: "insb" [vmlinux] is a static EXPORT_SYMBOL WARNING: "insw" [vmlinux] is a static EXPORT_SYMBOL WARNING: "insl" [vmlinux] is a static EXPORT_SYMBOL WARNING: "outsb" [vmlinux] is a static EXPORT_SYMBOL WARNING: "outsw" [vmlinux] is a static EXPORT_SYMBOL WARNING: "outsl" [vmlinux] is a static EXPORT_SYMBOL Cc: stable@vger.kernel.org Fixes: d38efc1f150f ("xtensa: adopt generic io routines") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>