summaryrefslogtreecommitdiffstats
path: root/arch
AgeCommit message (Collapse)Author
2014-11-14sh: fix sh770x SCIF memory regionsAndriy Skulysh
commit 5417421b270229bfce0795ccc99a4b481e4954ca upstream. Resources scif1_resources & scif2_resources overlap. Actual SCIF region size is 0x10. This is regression from commit d850acf975be ("sh: Declare SCIF register base and IRQ as resources") Signed-off-by: Andriy Skulysh <askulysh@gmail.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86/platform/intel/iosf: Add Braswell PCI IDDavid E. Box
commit 849f5d894383d25c49132437aa289c9a9c98d5df upstream. Add Braswell PCI ID to list of supported ID's for the IOSF driver. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: http://lkml.kernel.org/r/1411017231-20807-2-git-send-email-david.e.box@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86: Add cpu_detect_cache_sizes to init_intel() add Quark legacy_cache()Bryan O'Donoghue
commit aece118e487a744eafcdd0c77fe32b55ee2092a1 upstream. Intel processors which don't report cache information via cpuid(2) or cpuid(4) need quirk code in the legacy_cache_size callback to report this data. For Intel that callback is is intel_size_cache(). This patch enables calling of cpu_detect_cache_sizes() inside of init_intel() and hence the calling of the legacy_cache callback in intel_size_cache(). Adding this call will ensure that PIII Tualatin currently in intel_size_cache() and Quark SoC X1000 being added to intel_size_cache() in this patch will report their respective cache sizes. This model of calling cpu_detect_cache_sizes() is consistent with AMD/Via/Cirix/Transmeta and Centaur. Also added is a string to idenitfy the Quark as Quark SoC X1000 giving better and more descriptive output via /proc/cpuinfo Adding cpu_detect_cache_sizes to init_intel() will enable calling of intel_size_cache() on Intel processors which currently no code can reach. Therefore this patch will also re-enable reporting of PIII Tualatin cache size information as well as add Quark SoC X1000 support. Comment text and cache flow logic suggested by Thomas Gleixner Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie> Cc: davej@redhat.com Cc: hmh@hmh.eng.br Link: http://lkml.kernel.org/r/1412641189-12415-3-git-send-email-pure.logic@nexus-software.ie Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, iosf: Add PCI ID macros for better readabilityOng Boon Leong
commit 04725ad59474d24553d526fa774179ecd2922342 upstream. Introduce PCI IDs macro for the list of supported product: BayTrail & Quark X1000. Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-5-git-send-email-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, iosf: Add Quark X1000 PCI IDOng Boon Leong
commit 90916e048c1e0c1d379577e43ab9b8e331490cfb upstream. Add PCI device ID, i.e. that of the Host Bridge, for IOSF MBI driver. Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-4-git-send-email-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, iosf: Added Quark MBI identifiersOng Boon Leong
commit 7ef1def800e907edd28ddb1a5c64bae6b8749cdd upstream. Added all the MBI units below and their associated read/write opcodes: - Host Bridge Arbiter - Host Bridge - Remote Management Unit - Memory Manager & eSRAM - SoC Unit Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-3-git-send-email-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, iosf: Make IOSF driver modular and usable by more driversDavid E. Box
commit 6b8f0c8780c71d78624f736d7849645b64cc88b7 upstream. Currently drivers that run on non-IOSF systems (Core/Xeon) can't use the IOSF driver on SOC's without selecting it which forces an unnecessary and limiting dependency. Provides dummy functions to allow these modules to conditionally use the driver on IOSF equipped platforms without impacting their ability to compile and load on non-IOSF platforms. Build default m to ensure availability on x86 SOC's. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-2-git-send-email-david.e.box@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Chang Rebecca Swee Fun <rebecca.swee.fun.chang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14MIPS: tlbex: Properly fix HUGE TLB Refill exception handlerDavid Daney
commit 9e0f162a36914937a937358fcb45e0609ef2bfc4 upstream. In commit 8393c524a25609 (MIPS: tlbex: Fix a missing statement for HUGETLB), the TLB Refill handler was fixed so that non-OCTEON targets would work properly with huge pages. The change was incorrect in that it broke the OCTEON case. The problem is shown here: xxx0: df7a0000 ld k0,0(k1) . . . xxxc0: df610000 ld at,0(k1) xxxc4: 335a0ff0 andi k0,k0,0xff0 xxxc8: e825ffcd bbit1 at,0x5,0x0 xxxcc: 003ad82d daddu k1,at,k0 . . . In the non-octeon case there is a destructive test for the huge PTE bit, and then at 0, $k0 is reloaded (that is what the 8393c524a25609 patch added). In the octeon case, we modify k1 in the branch delay slot, but we never need k0 again, so the new load is not needed, but since k1 is modified, if we do the load, we load from a garbage location and then get a nested TLB Refill, which is seen in userspace as either SIGBUS or SIGSEGV (depending on the garbage). The real fix is to only do this reloading if it is needed, and never where it is harmful. Signed-off-by: David Daney <david.daney@cavium.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/8151/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14MIPS: ftrace: Fix a microMIPS build problemMarkos Chandras
commit aedd153f5bb5b1f1d6d9142014f521ae2ec294cc upstream. Code before the .fixup section needs to have the .insn directive. This has no side effects on MIPS32/64 but it affects the way microMIPS loads the address for the return label. Fixes the following build problem: mips-linux-gnu-ld: arch/mips/built-in.o: .fixup+0x4a0: Unsupported jump between ISA modes; consider recompiling with interlinking enabled. mips-linux-gnu-ld: final link failed: Bad value Makefile:819: recipe for target 'vmlinux' failed The fix is similar to 1658f914ff91c3bf ("MIPS: microMIPS: Disable LL/SC and fix linker bug.") Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/8117/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ARC: Disable caches in early boot if so configuredVineet Gupta
commit ef680cdc24376f394841a3f19b3a7ef6d57a009d upstream. Requested-by: Noam Camus <noamc@ezchip.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ARC: fix mmuv2 warningVineet Gupta
commit d75386363ee60eb51c933c7b5e536f3a502ad7d7 upstream. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ARC: [SMP] General FixesVineet Gupta
commit c3441edd2dea83923421fd6050d2ffdc57696323 upstream. -Pass the expected arg to non-boot park'ing routine (It worked so far because existing SMP backends don't use the arg) -CONFIG_DEBUG_PREEMPT warning Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ARC: Update order of registers in KGDB to match GDB 7.5Anton Kolesov
commit ebc0c74e76cec9c4dd860eb0ca1c0b39dc63c482 upstream. Order of registers has changed in GDB moving from 6.8 to 7.5. This patch updates KGDB to work properly with GDB 7.5, though makes it incompatible with 6.8. Signed-off-by: Anton Kolesov <Anton.Kolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14ARC: [nsimosci] Allow "headless" models to bootVineet Gupta
commit 5c05483e2db91890faa9a7be0a831701a3f442d6 upstream. There are certain test configuration of virtual platform which don't have any real console device (uart/pgu). So add tty0 as a fallback console device to allow system to boot and be accessible via telnet Otherwise with ttyS0 as only console, but 8250 disabled in kernel build, init chokes. Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14kvm: vmx: handle invvpid vm exit gracefullyPetr Matousek
commit a642fc305053cc1c6e47e4f4df327895747ab485 upstream. On systems with invvpid instruction support (corresponding bit in IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid causes vm exit, which is currently not handled and results in propagation of unknown exit to userspace. Fix this by installing an invvpid vm exit handler. This is CVE-2014-3646. Signed-off-by: Petr Matousek <pmatouse@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14KVM: x86: Emulator fixes for eip canonical checks on near branchesNadav Amit
commit 234f3ce485d54017f15cf5e0699cff4100121601 upstream. Before changing rip (during jmp, call, ret, etc.) the target should be asserted to be canonical one, as real CPUs do. During sysret, both target rsp and rip should be canonical. If any of these values is noncanonical, a #GP exception should occur. The exception to this rule are syscall and sysenter instructions in which the assigned rip is checked during the assignment to the relevant MSRs. This patch fixes the emulator to behave as real CPUs do for near branches. Far branches are handled by the next patch. This fixes CVE-2014-3647. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14KVM: x86: Fix wrong masking on relative jump/callNadav Amit
commit 05c83ec9b73c8124555b706f6af777b10adf0862 upstream. Relative jumps and calls do the masking according to the operand size, and not according to the address size as the KVM emulator does today. This patch fixes KVM behavior. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14kvm: x86: don't kill guest on unknown exit reasonMichael S. Tsirkin
commit 2bc19dc3754fc066c43799659f0d848631c44cfe upstream. KVM_EXIT_UNKNOWN is a kvm bug, we don't really know whether it was triggered by a priveledged application. Let's not kill the guest: WARN and inject #UD instead. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14KVM: x86: Check non-canonical addresses upon WRMSRNadav Amit
commit 854e8bb1aa06c578c2c9145fa6bfe3680ef63b23 upstream. Upon WRMSR, the CPU should inject #GP if a non-canonical value (address) is written to certain MSRs. The behavior is "almost" identical for AMD and Intel (ignoring MSRs that are not implemented in either architecture since they would anyhow #GP). However, IA32_SYSENTER_ESP and IA32_SYSENTER_EIP cause #GP if non-canonical address is written on Intel but not on AMD (which ignores the top 32-bits). Accordingly, this patch injects a #GP on the MSRs which behave identically on Intel and AMD. To eliminate the differences between the architecutres, the value which is written to IA32_SYSENTER_ESP and IA32_SYSENTER_EIP is turned to canonical value before writing instead of injecting a #GP. Some references from Intel and AMD manuals: According to Intel SDM description of WRMSR instruction #GP is expected on WRMSR "If the source register contains a non-canonical address and ECX specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE, IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP." According to AMD manual instruction manual: LSTAR/CSTAR (SYSCALL): "The WRMSR instruction loads the target RIP into the LSTAR and CSTAR registers. If an RIP written by WRMSR is not in canonical form, a general-protection exception (#GP) occurs." IA32_GS_BASE and IA32_FS_BASE (WRFSBASE/WRGSBASE): "The address written to the base field must be in canonical form or a #GP fault will occur." IA32_KERNEL_GS_BASE (SWAPGS): "The address stored in the KernelGSbase MSR must be in canonical form." This patch fixes CVE-2014-3610. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14KVM: x86: Improve thread safety in pitAndy Honig
commit 2febc839133280d5a5e8e1179c94ea674489dae2 upstream. There's a race condition in the PIT emulation code in KVM. In __kvm_migrate_pit_timer the pit_timer object is accessed without synchronization. If the race condition occurs at the wrong time this can crash the host kernel. This fixes CVE-2014-3611. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14KVM: x86: Prevent host from panicking on shared MSR writes.Andy Honig
commit 8b3c3104c3f4f706e99365c3e0d2aa61b95f969f upstream. The previous patch blocked invalid writes directly when the MSR is written. As a precaution, prevent future similar mistakes by gracefulling handle GPs caused by writes to shared MSRs. Signed-off-by: Andrew Honig <ahonig@google.com> [Remove parts obsoleted by Nadav's patch. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14um: ubd: Fix for processes stuck in D state foreverThorsten Knabe
commit 2a2361228c5e6d8c1733f00653481de918598e50 upstream. Starting with Linux 3.12 processes get stuck in D state forever in UserModeLinux under sync heavy workloads. This bug was introduced by commit 805f11a0d5 (um: ubd: Add REQ_FLUSH suppport). Fix bug by adding a check if FLUSH request was successfully submitted to the I/O thread and keeping the FLUSH request on the request queue on submission failures. Fixes: 805f11a0d5 (um: ubd: Add REQ_FLUSH suppport) Signed-off-by: Thorsten Knabe <linux@thorsten-knabe.de> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAEDexuan Cui
commit d1cd1210834649ce1ca6bafe5ac25d2f40331343 upstream. pte_pfn() returns a PFN of long (32 bits in 32-PAE), so "long << PAGE_SHIFT" will overflow for PFNs above 4GB. Due to this issue, some Linux 32-PAE distros, running as guests on Hyper-V, with 5GB memory assigned, can't load the netvsc driver successfully and hence the synthetic network device can't work (we can use the kernel parameter mem=3000M to work around the issue). Cast pte_pfn() to phys_addr_t before shifting. Fixes: "commit d76565344512: x86, mm: Create slow_virt_to_phys()" Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: gregkh@linuxfoundation.org Cc: linux-mm@kvack.org Cc: olaf@aepfle.de Cc: apw@canonical.com Cc: jasowang@redhat.com Cc: dave.hansen@intel.com Cc: riel@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1414580017-27444-1-git-send-email-decui@microsoft.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86_64, entry: Fix out of bounds read on sysenterAndy Lutomirski
commit 653bc77af60911ead1f423e588f54fc2547c4957 upstream. Rusty noticed a Really Bad Bug (tm) in my NT fix. The entry code reads out of bounds, causing the NT fix to be unreliable. But, and this is much, much worse, if your stack is somehow just below the top of the direct map (or a hole), you read out of bounds and crash. Excerpt from the crash: [ 1.129513] RSP: 0018:ffff88001da4bf88 EFLAGS: 00010296 2b:* f7 84 24 90 00 00 00 testl $0x4000,0x90(%rsp) That read is deterministically above the top of the stack. I thought I even single-stepped through this code when I wrote it to check the offset, but I clearly screwed it up. Fixes: 8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from userspace") Reported-by: Rusty Russell <rusty@ozlabs.org> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86_64, entry: Filter RFLAGS.NT on entry from userspaceAndy Lutomirski
commit 8c7aa698baca5e8f1ba9edb68081f1e7a1abf455 upstream. The NT flag doesn't do anything in long mode other than causing IRET to #GP. Oddly, CPL3 code can still set NT using popf. Entry via hardware or software interrupt clears NT automatically, so the only relevant entries are fast syscalls. If user code causes kernel code to run with NT set, then there's at least some (small) chance that it could cause trouble. For example, user code could cause a call to EFI code with NT set, and who knows what would happen? Apparently some games on Wine sometimes do this (!), and, if an IRET return happens, they will segfault. That segfault cannot be handled, because signal delivery fails, too. This patch programs the CPU to clear NT on entry via SYSCALL (both 32-bit and 64-bit, by my reading of the AMD APM), and it clears NT in software on entry via SYSENTER. To save a few cycles, this borrows a trick from Jan Beulich in Xen: it checks whether NT is set before trying to clear it. As a result, it seems to have very little effect on SYSENTER performance on my machine. There's another minor bug fix in here: it looks like the CFI annotations were wrong if CONFIG_AUDITSYSCALL=n. Testers beware: on Xen, SYSENTER with NT set turns into a GPF. I haven't touched anything on 32-bit kernels. The syscall mask change comes from a variant of this patch by Anish Bhatt. Note to stable maintainers: there is no known security issue here. A misguided program can set NT and cause the kernel to try and fail to deliver SIGSEGV, crashing the program. This patch fixes Far Cry on Wine: https://bugs.winehq.org/show_bug.cgi?id=33275 Reported-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/395749a5d39a29bd3e4b35899cf3a3c1340e5595.1412189265.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()Oleg Nesterov
commit 66463db4fc5605d51c7bb81d009d5bf30a783a2c upstream. save_xstate_sig()->drop_init_fpu() doesn't look right. setup_rt_frame() can fail after that, in this case the next setup_rt_frame() triggered by SIGSEGV won't save fpu simply because the old state was lost. This obviously mean that fpu won't be restored after sys_rt_sigreturn() from SIGSEGV handler. Shift drop_init_fpu() into !failed branch in handle_signal(). Test-case (needs -O2): #include <stdio.h> #include <signal.h> #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> #include <pthread.h> #include <assert.h> volatile double D; void test(double d) { int pid = getpid(); for (D = d; D == d; ) { /* sys_tkill(pid, SIGHUP); asm to avoid save/reload * fp regs around "C" call */ asm ("" : : "a"(200), "D"(pid), "S"(1)); asm ("syscall" : : : "ax"); } printf("ERR!!\n"); } void sigh(int sig) { } char altstack[4096 * 10] __attribute__((aligned(4096))); void *tfunc(void *arg) { for (;;) { mprotect(altstack, sizeof(altstack), PROT_READ); mprotect(altstack, sizeof(altstack), PROT_READ|PROT_WRITE); } } int main(void) { stack_t st = { .ss_sp = altstack, .ss_size = sizeof(altstack), .ss_flags = SS_ONSTACK, }; struct sigaction sa = { .sa_handler = sigh, }; pthread_t pt; sigaction(SIGSEGV, &sa, NULL); sigaltstack(&st, NULL); sa.sa_flags = SA_ONSTACK; sigaction(SIGHUP, &sa, NULL); pthread_create(&pt, NULL, tfunc, NULL); test(123.456); return 0; } Reported-by: Bean Anderson <bean@azulsystems.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175713.GA21646@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86, fpu: __restore_xstate_sig()->math_state_restore() needs preempt_disable()Oleg Nesterov
commit df24fb859a4e200d9324e2974229fbb7adf00aef upstream. Add preempt_disable() + preempt_enable() around math_state_restore() in __restore_xstate_sig(). Otherwise __switch_to() after __thread_fpu_begin() can overwrite fpu->state we are going to restore. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175717.GA21649@redhat.com Reviewed-by: Suresh Siddha <sbsiddha@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-14x86: Reject x32 executables if x32 ABI not supportedBen Hutchings
commit 0e6d3112a4e95d55cf6dca88f298d5f4b8f29bd1 upstream. It is currently possible to execve() an x32 executable on an x86_64 kernel that has only ia32 compat enabled. However all its syscalls will fail, even _exit(). This usually causes it to segfault. Change the ELF compat architecture check so that x32 executables are rejected if we don't support the x32 ABI. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Link: http://lkml.kernel.org/r/1410120305.6822.9.camel@decadent.org.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Implement __get_user_pages_fast().David S. Miller
[ Upstream commit 06090e8ed89ea2113a236befb41f71d51f100e60 ] It is not sufficient to only implement get_user_pages_fast(), you must also implement the atomic version __get_user_pages_fast() otherwise you end up using the weak symbol fallback implementation which simply returns zero. This is dangerous, because it causes the futex code to loop forever if transparent hugepages are supported (see get_futex_key()). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Fix register corruption in top-most kernel stack frame during boot.David S. Miller
[ Upstream commit ef3e035c3a9b81da8a778bc333d10637acf6c199 ] Meelis Roos reported that kernels built with gcc-4.9 do not boot, we eventually narrowed this down to only impacting machines using UltraSPARC-III and derivitive cpus. The crash happens right when the first user process is spawned: [ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 [ 54.451346] [ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab #96 [ 54.666431] Call Trace: [ 54.698453] [0000000000762f8c] panic+0xb0/0x224 [ 54.759071] [000000000045cf68] do_exit+0x948/0x960 [ 54.823123] [000000000042cbc0] fault_in_user_windows+0xe0/0x100 [ 54.902036] [0000000000404ad0] __handle_user_windows+0x0/0x10 [ 54.978662] Press Stop-A (L1-A) to return to the boot prom [ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 Further investigation showed that compiling only per_cpu_patch() with an older compiler fixes the boot. Detailed analysis showed that the function is not being miscompiled by gcc-4.9, but it is using a different register allocation ordering. With the gcc-4.9 compiled function, something during the code patching causes some of the %i* input registers to get corrupted. Perhaps we have a TLB miss path into the firmware that is deep enough to cause a register window spill and subsequent restore when we get back from the TLB miss trap. Let's plug this up by doing two things: 1) Stop using the firmware stack for client interface calls into the firmware. Just use the kernel's stack. 2) As soon as we can, call into a new function "start_early_boot()" to put a one-register-window buffer between the firmware's deepest stack frame and the top-most initial kernel one. Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Increase size of boot string to 1024 bytesDave Kleikamp
[ Upstream commit 1cef94c36bd4d79b5ae3a3df99ee0d76d6a4a6dc ] This is the longest boot string that silo supports. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Kill unnecessary tables and increase MAX_BANKS.David S. Miller
[ Upstream commit d195b71bad4347d2df51072a537f922546a904f1 ] swapper_low_pmd_dir and swapper_pud_dir are actually completely useless and unnecessary. We just need swapper_pg_dir[]. Naturally the other page table chunks will be allocated on an as-needed basis. Since the kernel actually accesses these tables in the PAGE_OFFSET view, there is not even a TLB locality advantage of placing them in the kernel image. Use the hard coded vmlinux.ld.S slot for swapper_pg_dir which is naturally page aligned. Increase MAX_BANKS to 1024 in order to handle heavily fragmented virtual guests. Even with this MAX_BANKS increase, the kernel is 20K+ smaller. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: sparse irqbob picco
[ Upstream commit ee6a9333fa58e11577c1b531b8e0f5ffc0fd6f50 ] This patch attempts to do a few things. The highlights are: 1) enable SPARSE_IRQ unconditionally, 2) kills off !SPARSE_IRQ code 3) allocates ivector_table at boot time and 4) default to cookie only VIRQ mechanism for supported firmware. The first firmware with cookie only support for me appears on T5. You can optionally force the HV firmware to not cookie only mode which is the sysino support. The sysino is a deprecated HV mechanism according to the most recent SPARC Virtual Machine Specification. HV_GRP_INTR is what controls the cookie/sysino firmware versioning. The history of this interface is: 1) Major version 1.0 only supported sysino based interrupt interfaces. 2) Major version 2.0 added cookie based VIRQs, however due to the fact that OSs were using the VIRQs without negoatiating major version 2.0 (Linux and Solaris are both guilty), the VIRQs calls were allowed even with major version 1.0 To complicate things even further, the VIRQ interfaces were only actually hooked up in the hypervisor for LDC interrupt sources. VIRQ calls on other device types would result in HV_EINVAL errors. So effectively, major version 2.0 is unusable. 3) Major version 3.0 was created to signal use of VIRQs and the fact that the hypervisor has these calls hooked up for all interrupt sources, not just those for LDC devices. A new boot option is provided should cookie only HV support have issues. hvirq - this is the version for HV_GRP_INTR. This is related to HV API versioning. The code attempts major=3 first by default. The option can be used to override this default. I've tested with SPARSE_IRQ on T5-8, M7-4 and T4-X and Jalap?no. Signed-off-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Adjust vmalloc region size based upon available virtual address bits.David S. Miller
[ Upstream commit bb4e6e85daa52a9f6210fa06a5ec6269598a202b ] In order to accomodate embedded per-cpu allocation with large numbers of cpus and numa nodes, we have to use as much virtual address space as possible for the vmalloc region. Otherwise we can get things like: PERCPU: max_distance=0x380001c10000 too large for vmalloc space 0xff00000000 So, once we select a value for PAGE_OFFSET, derive the size of the vmalloc region based upon that. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Increase MAX_PHYS_ADDRESS_BITS to 53.David S. Miller
Make sure, at compile time, that the kernel can properly support whatever MAX_PHYS_ADDRESS_BITS is defined to. On M7 chips, use a max_phys_bits value of 49. Based upon a patch by Bob Picco. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Use kernel page tables for vmemmap.David S. Miller
[ Upstream commit c06240c7f5c39c83dfd7849c0770775562441b96 ] For sparse memory configurations, the vmemmap array behaves terribly and it takes up an inordinate amount of space in the BSS section of the kernel image unconditionally. Just build huge PMDs and look them up just like we do for TLB misses in the vmalloc area. Kernel BSS shrinks by about 2MB. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Fix physical memory management regressions with large max_phys_bits.David S. Miller
[ Upstream commit 0dd5b7b09e13dae32869371e08e1048349fd040c ] If max_phys_bits needs to be > 43 (f.e. for T4 chips), things like DEBUG_PAGEALLOC stop working because the 3-level page tables only can cover up to 43 bits. Another problem is that when we increased MAX_PHYS_ADDRESS_BITS up to 47, several statically allocated tables became enormous. Compounding this is that we will need to support up to 49 bits of physical addressing for M7 chips. The two tables in question are sparc64_valid_addr_bitmap and kpte_linear_bitmap. The first holds a bitmap, with 1 bit for each 4MB chunk of physical memory, indicating whether that chunk actually exists in the machine and is valid. The second table is a set of 2-bit values which tell how large of a mapping (4MB, 256MB, 2GB, 16GB, respectively) we can use at each 256MB chunk of ram in the system. These tables are huge and take up an enormous amount of the BSS section of the sparc64 kernel image. Specifically, the sparc64_valid_addr_bitmap is 4MB, and the kpte_linear_bitmap is 128K. So let's solve the space wastage and the DEBUG_PAGEALLOC problem at the same time, by using the kernel page tables (as designed) to manage this information. We have to keep using large mappings when DEBUG_PAGEALLOC is disabled, and we do this by encoding huge PMDs and PUDs. On a T4-2 with 256GB of ram the kernel page table takes up 16K with DEBUG_PAGEALLOC disabled and 256MB with it enabled. Furthermore, this memory is dynamically allocated at run time rather than coded statically into the kernel image. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Adjust KTSB assembler to support larger physical addresses.David S. Miller
[ Upstream commit 8c82dc0e883821c098c8b0b130ffebabf9aab5df ] As currently coded the KTSB accesses in the kernel only support up to 47 bits of physical addressing. Adjust the instruction and patching sequence in order to support arbitrary 64 bits addresses. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Define VA hole at run time, rather than at compile time.David S. Miller
[ Upstream commit 4397bed080598001e88f612deb8b080bb1cc2322 ] Now that we use 4-level page tables, we can provide up to 53-bits of virtual address space to the user. Adjust the VA hole based upon the capabilities of the cpu type probed. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Switch to 4-level page tables.David S. Miller
[ Upstream commit ac55c768143aa34cc3789c4820cbb0809a76fd9c ] This has become necessary with chips that support more than 43-bits of physical addressing. Based almost entirely upon a patch by Bob Picco. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: T5 PMUbob picco
The T5 (niagara5) has different PCR related HV fast trap values and a new HV API Group. This patch utilizes these and shares when possible with niagara4. We use the same sparc_pmu niagara4_pmu. Should there be new effort to obtain the MCU perf statistics then this would have to be changed. Cc: sparclinux@vger.kernel.org Signed-off-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: cpu hardware caps support for sparc M6 and M7Allen Pais
Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: support M6 and M7 for building CPU distribution mapAllen Pais
Add M6 and M7 chip type in cpumap.c to correctly build CPU distribution map that spans all online CPUs. Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: correctly recognise M6 and M7 cpu typeAllen Pais
The following patch adds support for correctly recognising M6 and M7 cpu type. Signed-off-by: Allen Pais <allen.pais@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Fix hibernation code refrence to PAGE_OFFSET.David S. Miller
We changed PAGE_OFFSET to be a variable rather than a constant, but this reference here in the hibernate assembler got missed. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Do not define thread fpregs save area as zero-length array.David S. Miller
[ Upstream commit e2653143d7d79a49f1a961aeae1d82612838b12c ] This breaks the stack end corruption detection facility. What that facility does it write a magic value to "end_of_stack()" and checking to see if it gets overwritten. "end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is the beginning of the FPU register save area. So once the user uses the FPU, the magic value is overwritten and the debug checks trigger. Fix this by making the size explicit. Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we are limited to 7 levels of FPU state saves. So each FPU register set is 256 bytes, allocate 256 * 7 for the fpregs area. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Fix FPU register corruption with AES crypto offload.David S. Miller
[ Upstream commit f4da3628dc7c32a59d1fb7116bb042e6f436d611 ] The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the key material is preloaded into the FPU registers, and then we loop over and over doing the crypt operation, reusing those pre-cooked key registers. There are intervening blkcipher*() calls between the crypt operation calls. And those might perform memcpy() and thus also try to use the FPU. The sparc64 kernel FPU usage mechanism is designed to allow such recursive uses, but with a catch. There has to be a trap between the two FPU using threads of control. The mechanism works by, when the FPU is already in use by the kernel, allocating a slot for FPU saving at trap time. Then if, within the trap handler, we try to use the FPU registers, the pre-trap FPU register state is saved into the slot. Then at trap return time we notice this and restore the pre-trap FPU state. Over the long term there are various more involved ways we can make this work, but for a quick fix let's take advantage of the fact that the situation where this happens is very limited. All sparc64 chips that support the crypto instructiosn also are using the Niagara4 memcpy routine, and that routine only uses the FPU for large copies where we can't get the source aligned properly to a multiple of 8 bytes. We look to see if the FPU is already in use in this context, and if so we use the non-large copy path which only uses integer registers. Furthermore, we also limit this special logic to when we are doing kernel copy, rather than a user copy. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Fix lockdep warnings on reboot on Ultra-5David S. Miller
[ Upstream commit bdcf81b658ebc4c2640c3c2c55c8b31c601b6996 ] Inconsistently, the raw_* IRQ routines do not interact with and update the irqflags tracing and lockdep state, whereas the raw_* spinlock interfaces do. This causes problems in p1275_cmd_direct() because we disable hardirqs by hand using raw_local_irq_restore() and then do a raw_spin_lock() which triggers a lockdep trace because the CPU's hw IRQ state doesn't match IRQ tracing's internal software copy of that state. The CPU's irqs are disabled, yet current->hardirqs_enabled is true. ==================== reboot: Restarting system ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3536 check_flags+0x7c/0x240() DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) Modules linked in: openpromfs CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G W 3.17.0-dirty #145 Call Trace: [000000000045919c] warn_slowpath_common+0x5c/0xa0 [0000000000459210] warn_slowpath_fmt+0x30/0x40 [000000000048f41c] check_flags+0x7c/0x240 [0000000000493280] lock_acquire+0x20/0x1c0 [0000000000832b70] _raw_spin_lock+0x30/0x60 [000000000068f2fc] p1275_cmd_direct+0x1c/0x60 [000000000068ed28] prom_reboot+0x28/0x40 [000000000043610c] machine_restart+0x4c/0x80 [000000000047d2d4] kernel_restart+0x54/0x80 [000000000047d618] SyS_reboot+0x138/0x200 [00000000004060b4] linux_sparc_syscall32+0x34/0x60 ---[ end trace 5c439fe81c05a100 ]--- possible reason: unannotated irqs-off. irq event stamp: 2010267 hardirqs last enabled at (2010267): [<000000000049a358>] vprintk_emit+0x4b8/0x580 hardirqs last disabled at (2010266): [<0000000000499f08>] vprintk_emit+0x68/0x580 softirqs last enabled at (2010046): [<000000000045d278>] __do_softirq+0x378/0x4a0 softirqs last disabled at (2010039): [<000000000042bf08>] do_softirq_own_stack+0x28/0x40 Resetting ... ==================== Use local_* variables of the hw IRQ interfaces so that IRQ tracing sees all of our changes. Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc64: Fix reversed start/end in flush_tlb_kernel_range()David S. Miller
[ Upstream commit 473ad7f4fb005d1bb727e4ef27d370d28703a062 ] When we have to split up a flush request into multiple pieces (in order to avoid the firmware range) we don't specify the arguments in the right order for the second piece. Fix the order, or else we get hangs as the code tries to flush "a lot" of entries and we get lockups like this: [ 4422.981276] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [expect:117032] [ 4422.996130] Modules linked in: ipv6 loop usb_storage igb ptp sg sr_mod ehci_pci ehci_hcd pps_core n2_rng rng_core [ 4423.016617] CPU: 12 PID: 117032 Comm: expect Not tainted 3.17.0-rc4+ #1608 [ 4423.030331] task: fff8003cc730e220 ti: fff8003d99d54000 task.ti: fff8003d99d54000 [ 4423.045282] TSTATE: 0000000011001602 TPC: 00000000004521e8 TNPC: 00000000004521ec Y: 00000000 Not tainted [ 4423.064905] TPC: <__flush_tlb_kernel_range+0x28/0x40> [ 4423.074964] g0: 000000000052fd10 g1: 00000001295a8000 g2: ffffff7176ffc000 g3: 0000000000002000 [ 4423.092324] g4: fff8003cc730e220 g5: fff8003dfedcc000 g6: fff8003d99d54000 g7: 0000000000000006 [ 4423.109687] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000000000003 o3: 00000000f0000000 [ 4423.127058] o4: 0000000000000080 o5: 00000001295a8000 sp: fff8003d99d56d01 ret_pc: 000000000052ff54 [ 4423.145121] RPC: <__purge_vmap_area_lazy+0x314/0x3a0> [ 4423.155185] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000a38040 l3: 0000000000000000 [ 4423.172559] l4: fff8003dae8965e0 l5: ffffffffffffffff l6: 0000000000000000 l7: 00000000f7e2b138 [ 4423.189913] i0: fff8003d99d576a0 i1: fff8003d99d576a8 i2: fff8003d99d575e8 i3: 0000000000000000 [ 4423.207284] i4: 0000000000008008 i5: fff8003d99d575c8 i6: fff8003d99d56df1 i7: 0000000000530c24 [ 4423.224640] I7: <free_vmap_area_noflush+0x64/0x80> [ 4423.234193] Call Trace: [ 4423.239051] [0000000000530c24] free_vmap_area_noflush+0x64/0x80 [ 4423.251029] [0000000000531a7c] remove_vm_area+0x5c/0x80 [ 4423.261628] [0000000000531b80] __vunmap+0x20/0x120 [ 4423.271352] [000000000071cf18] n_tty_close+0x18/0x40 [ 4423.281423] [00000000007222b0] tty_ldisc_close+0x30/0x60 [ 4423.292183] [00000000007225a4] tty_ldisc_reinit+0x24/0xa0 [ 4423.303120] [0000000000722ab4] tty_ldisc_hangup+0xd4/0x1e0 [ 4423.314232] [0000000000719aa0] __tty_hangup+0x280/0x3c0 [ 4423.324835] [0000000000724cb4] pty_close+0x134/0x1a0 [ 4423.334905] [000000000071aa24] tty_release+0x104/0x500 [ 4423.345316] [00000000005511d0] __fput+0x90/0x1e0 [ 4423.354701] [000000000047fa54] task_work_run+0x94/0xe0 [ 4423.365126] [0000000000404b44] __handle_signal+0xc/0x2c Fixes: 4ca9a23765da ("sparc64: Guard against flushing openfirmware mappings.") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-30sparc: Let memset return the address argumentAndreas Larsson
[ Upstream commit 74cad25c076a2f5253312c2fe82d1a4daecc1323 ] This makes memset follow the standard (instead of returning 0 on success). This is needed when certain versions of gcc optimizes around memset calls and assume that the address argument is preserved in %o0. Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>