summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2018-07-22Linux 4.14.57v4.14.57Greg Kroah-Hartman
2018-07-22string: drop __must_check from strscpy() and restore strscpy() usages in cgroupTejun Heo
commit 08a77676f9c5fc69a681ccd2cd8140e65dcb26c7 upstream. e7fd37ba1217 ("cgroup: avoid copying strings longer than the buffers") converted possibly unsafe strncpy() usages in cgroup to strscpy(). However, although the callsites are completely fine with truncated copied, because strscpy() is marked __must_check, it led to the following warnings. kernel/cgroup/cgroup.c: In function ‘cgroup_file_name’: kernel/cgroup/cgroup.c:1400:10: warning: ignoring return value of ‘strscpy’, declared with attribute warn_unused_result [-Wunused-result] strscpy(buf, cft->name, CGROUP_FILE_NAME_MAX); ^ To avoid the warnings, 50034ed49645 ("cgroup: use strlcpy() instead of strscpy() to avoid spurious warning") switched them to strlcpy(). strlcpy() is worse than strlcpy() because it unconditionally runs strlen() on the source string, and the only reason we switched to strlcpy() here was because it was lacking __must_check, which doesn't reflect any material differences between the two function. It's just that someone added __must_check to strscpy() and not to strlcpy(). These basic string copy operations are used in variety of ways, and one of not-so-uncommon use cases is safely handling truncated copies, where the caller naturally doesn't care about the return value. The __must_check doesn't match the actual use cases and forces users to opt for inferior variants which lack __must_check by happenstance or spread ugly (void) casts. Remove __must_check from strscpy() and restore strscpy() usages in cgroup. Signed-off-by: Tejun Heo <tj@kernel.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Chris Metcalf <cmetcalf@ezchip.com> [backport only the string.h portion to remove build warnings starting to show up - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_IDMarc Zyngier
commit 5d81f7dc9bca4f4963092433e27b508cbe524a32 upstream. Now that all our infrastructure is in place, let's expose the availability of ARCH_WORKAROUND_2 to guests. We take this opportunity to tidy up a couple of SMCCC constants. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requestsMarc Zyngier
commit b4f18c063a13dfb33e3a63fe1844823e19c2265e upstream. In order to forward the guest's ARCH_WORKAROUND_2 calls to EL3, add a small(-ish) sequence to handle it at EL2. Special care must be taken to track the state of the guest itself by updating the workaround flags. We also rely on patching to enable calls into the firmware. Note that since we need to execute branches, this always executes after the Spectre-v2 mitigation has been applied. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: KVM: Add ARCH_WORKAROUND_2 support for guestsMarc Zyngier
commit 55e3748e8902ff641e334226bdcb432f9a5d78d3 upstream. In order to offer ARCH_WORKAROUND_2 support to guests, we need a bit of infrastructure. Let's add a flag indicating whether or not the guest uses SSBD mitigation. Depending on the state of this flag, allow KVM to disable ARCH_WORKAROUND_2 before entering the guest, and enable it when exiting it. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: KVM: Add HYP per-cpu accessorsMarc Zyngier
commit 85478bab409171de501b719971fd25a3d5d639f9 upstream. As we're going to require to access per-cpu variables at EL2, let's craft the minimum set of accessors required to implement reading a per-cpu variable, relying on tpidr_el2 to contain the per-cpu offset. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: ssbd: Add prctl interface for per-thread mitigationMarc Zyngier
commit 9cdc0108baa8ef87c76ed834619886a46bd70cbe upstream. If running on a system that performs dynamic SSBD mitigation, allow userspace to request the mitigation for itself. This is implemented as a prctl call, allowing the mitigation to be enabled or disabled at will for this particular thread. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: ssbd: Introduce thread flag to control userspace mitigationMarc Zyngier
commit 9dd9614f5476687abbff8d4b12cd08ae70d7c2ad upstream. In order to allow userspace to be mitigated on demand, let's introduce a new thread flag that prevents the mitigation from being turned off when exiting to userspace, and doesn't turn it on on entry into the kernel (with the assumption that the mitigation is always enabled in the kernel itself). This will be used by a prctl interface introduced in a later patch. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: ssbd: Restore mitigation status on CPU resumeMarc Zyngier
commit 647d0519b53f440a55df163de21c52a8205431cc upstream. On a system where firmware can dynamically change the state of the mitigation, the CPU will always come up with the mitigation enabled, including when coming back from suspend. If the user has requested "no mitigation" via a command line option, let's enforce it by calling into the firmware again to disable it. Similarily, for a resume from hibernate, the mitigation could have been disabled by the boot kernel. Let's ensure that it is set back on in that case. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: ssbd: Skip apply_ssbd if not using dynamic mitigationMarc Zyngier
commit 986372c4367f46b34a3c0f6918d7fb95cbdf39d6 upstream. In order to avoid checking arm64_ssbd_callback_required on each kernel entry/exit even if no mitigation is required, let's add yet another alternative that by default jumps over the mitigation, and that gets nop'ed out if we're doing dynamic mitigation. Think of it as a poor man's static key... Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: ssbd: Add global mitigation state accessorMarc Zyngier
commit c32e1736ca03904c03de0e4459a673be194f56fd upstream. We're about to need the mitigation state in various parts of the kernel in order to do the right thing for userspace and guests. Let's expose an accessor that will let other subsystems know about the state. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: Add 'ssbd' command-line optionMarc Zyngier
commit a43ae4dfe56a01f5b98ba0cb2f784b6a43bafcc6 upstream. On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: Add ARCH_WORKAROUND_2 probingMarc Zyngier
commit a725e3dda1813ed306734823ac4c65ca04e38500 upstream. As for Spectre variant-2, we rely on SMCCC 1.1 to provide the discovery mechanism for detecting the SSBD mitigation. A new capability is also allocated for that purpose, and a config option. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2Marc Zyngier
commit 5cf9ce6e5ea50f805c6188c04ed0daaec7b6887d upstream. In a heterogeneous system, we can end up with both affected and unaffected CPUs. Let's check their status before calling into the firmware. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1Marc Zyngier
commit 8e2906245f1e3b0d027169d9f2e55ce0548cb96e upstream. In order for the kernel to protect itself, let's call the SSBD mitigation implemented by the higher exception level (either hypervisor or firmware) on each transition between userspace and kernel. We must take the PSCI conduit into account in order to target the right exception level, hence the introduction of a runtime patching callback. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm/arm64: smccc: Add SMCCC-specific return codesMarc Zyngier
commit eff0e9e1078ea7dc1d794dc50e31baef984c46d7 upstream. We've so far used the PSCI return codes for SMCCC because they were extremely similar. But with the new ARM DEN 0070A specification, "NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS". Let's bite the bullet and add SMCCC specific return codes. Users can be repainted as and when required. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KVM: arm64: Avoid storing the vcpu pointer on the stackChristoffer Dall
Commit 4464e210de9e80e38de59df052fe09ea2ff80b1b upstream. We already have the percpu area for the host cpu state, which points to the VCPU, so there's no need to store the VCPU pointer on the stack on every context switch. We can be a little more clever and just use tpidr_el2 for the percpu offset and load the VCPU pointer from the host context. This has the benefit of being able to retrieve the host context even when our stack is corrupted, and it has a potential performance benefit because we trade a store plus a load for an mrs and a load on a round trip to the guest. This does require us to calculate the percpu offset without including the offset from the kernel mapping of the percpu array to the linear mapping of the array (which is what we store in tpidr_el1), because a PC-relative generated address in EL2 is already giving us the hyp alias of the linear mapping of a kernel address. We do this in __cpu_init_hyp_mode() by using kvm_ksym_ref(). The code that accesses ESR_EL2 was previously using an alternative to use the _EL1 accessor on VHE systems, but this was actually unnecessary as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2 accessor does the same thing on both systems. Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_stateMarc Zyngier
Commit 44a497abd621a71c645f06d3d545ae2f46448830 upstream. kvm_vgic_global_state is part of the read-only section, and is usually accessed using a PC-relative address generation (adrp + add). It is thus useless to use kern_hyp_va() on it, and actively problematic if kern_hyp_va() becomes non-idempotent. On the other hand, there is no way that the compiler is going to guarantee that such access is always PC relative. So let's bite the bullet and provide our own accessor. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: alternatives: Add dynamic patching featureMarc Zyngier
Commit dea5e2a4c5bcf196f879a66cebdcca07793e8ba4 upstream. We've so far relied on a patching infrastructure that only gave us a single alternative, without any way to provide a range of potential replacement instructions. For a single feature, this is an all or nothing thing. It would be interesting to have a more flexible grained way of patching the kernel though, where we could dynamically tune the code that gets injected. In order to achive this, let's introduce a new form of dynamic patching, assiciating a callback to a patching site. This callback gets source and target locations of the patching request, as well as the number of instructions to be patched. Dynamic patching is declared with the new ALTERNATIVE_CB and alternative_cb directives: asm volatile(ALTERNATIVE_CB("mov %0, #0\n", callback) : "r" (v)); or alternative_cb callback mov x0, #0 alternative_cb_end where callback is the C function computing the alternative. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KVM: arm64: Stop save/restoring host tpidr_el1 on VHEJames Morse
Commit 1f742679c33bc083722cb0b442a95d458c491b56 upstream. Now that a VHE host uses tpidr_el2 for the cpu offset we no longer need KVM to save/restore tpidr_el1. Move this from the 'common' code into the non-vhe code. While we're at it, on VHE we don't need to save the ELR or SPSR as kernel_entry in entry.S will have pushed these onto the kernel stack, and will restore them from there. Move these to the non-vhe code as we need them to get back to the host. Finally remove the always-copy-tpidr we hid in the stage2 setup code, cpufeature's enable callback will do this for VHE, we only need KVM to do it for non-vhe. Add the copy into kvm-init instead. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22arm64: alternatives: use tpidr_el2 on VHE hostsJames Morse
Commit 6d99b68933fbcf51f84fcbba49246ce1209ec193 upstream. Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1 on VHE hosts, and allows future code to blindly access per-cpu variables without triggering world-switch. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KVM: arm64: Change hyp_panic()s dependency on tpidr_el2James Morse
Commit c97e166e54b662717d20ec2e36761758d2b6a7c2 upstream. Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and on VHE they can be the same register. KVM calls hyp_panic() when anything unexpected happens. This may occur while a guest owns the EL1 registers. KVM stashes the vcpu pointer in tpidr_el2, which it uses to find the host context in order to restore the host EL1 registers before parachuting into the host's panic(). The host context is a struct kvm_cpu_context allocated in the per-cpu area, and mapped to hyp. Given the per-cpu offset for this CPU, this is easy to find. Change hyp_panic() to take a pointer to the struct kvm_cpu_context. Wrap these calls with an asm function that retrieves the struct kvm_cpu_context from the host's per-cpu area. Copy the per-cpu offset from the hosts tpidr_el1 into tpidr_el2 during kvm init. (Later patches will make this unnecessary for VHE hosts) We print out the vcpu pointer as part of the panic message. Add a back reference to the 'running vcpu' in the host cpu context to preserve this. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocationJames Morse
Commit 36989e7fd386a9a5822c48691473863f8fbb404d upstream. kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init() used to store the host EL1 registers when KVM switches to a guest. Make it easier for ASM to generate pointers into this per-cpu memory by making it a static allocation. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KVM: arm64: Store vcpu on the stack during __guest_enter()James Morse
Commit 32b03d1059667a39e089c45ee38ec9c16332430f upstream. KVM uses tpidr_el2 as its private vcpu register, which makes sense for non-vhe world switch as only KVM can access this register. This means vhe Linux has to use tpidr_el1, which KVM has to save/restore as part of the host context. If the SDEI handler code runs behind KVMs back, it mustn't access any per-cpu variables. To allow this on systems with vhe we need to make the host use tpidr_el2, saving KVM from save/restoring it. __guest_enter() stores the host_ctxt on the stack, do the same with the vcpu. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22net/nfc: Avoid stalls when nfc_alloc_send_skb() returned NULL.Tetsuo Handa
commit 3bc53be9db21040b5d2de4d455f023c8c494aa68 upstream. syzbot is reporting stalls at nfc_llcp_send_ui_frame() [1]. This is because nfc_llcp_send_ui_frame() is retrying the loop without any delay when nonblocking nfc_alloc_send_skb() returned NULL. Since there is no need to use MSG_DONTWAIT if we retry until sock_alloc_send_pskb() succeeds, let's use blocking call. Also, in case an unexpected error occurred, let's break the loop if blocking nfc_alloc_send_skb() failed. [1] https://syzkaller.appspot.com/bug?id=4a131cc571c3733e0eff6bc673f4e36ae48f19c6 Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+d29d18215e477cfbfbdd@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22rds: avoid unenecessary cong_update in loop transportSantosh Shilimkar
commit f1693c63ab133d16994cc50f773982b5905af264 upstream. Loop transport which is self loopback, remote port congestion update isn't relevant. Infact the xmit path already ignores it. Receive path needs to do the same. Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22bdi: Fix another oops in wb_workfn()Jan Kara
commit 3ee7e8697d5860b173132606d80a9cd35e7113ee upstream. syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was WB_shutting_down after wb->bdi->dev became NULL. This indicates that unregister_bdi() failed to call wb_shutdown() on one of wb objects. The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus drops bdi's reference to wb structures before going through the list of wbs again and calling wb_shutdown() on each of them. This way the loop iterating through all wbs can easily miss a wb if that wb has already passed through cgwb_remove_from_bdi_list() called from wb_shutdown() from cgwb_release_workfn() and as a result fully shutdown bdi although wb_workfn() for this wb structure is still running. In fact there are also other ways cgwb_bdi_unregister() can race with cgwb_release_workfn() leading e.g. to use-after-free issues: CPU1 CPU2 cgwb_bdi_unregister() cgwb_kill(*slot); cgwb_release() queue_work(cgwb_release_wq, &wb->release_work); cgwb_release_workfn() wb = list_first_entry(&bdi->wb_list, ...) spin_unlock_irq(&cgwb_lock); wb_shutdown(wb); ... kfree_rcu(wb, rcu); wb_shutdown(wb); -> oops use-after-free We solve these issues by synchronizing writeback structure shutdown from cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That way we also no longer need synchronization using WB_shutting_down as the mutex provides it for CONFIG_CGROUP_WRITEBACK case and without CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from bdi_unregister(). Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22netfilter: ipv6: nf_defrag: drop skb dst before queueingFlorian Westphal
commit 84379c9afe011020e797e3f50a662b08a6355dcf upstream. Eric Dumazet reports: Here is a reproducer of an annoying bug detected by syzkaller on our production kernel [..] ./b78305423 enable_conntrack Then : sleep 60 dmesg | tail -10 [ 171.599093] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 181.631024] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 191.687076] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 201.703037] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 211.711072] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 221.959070] unregister_netdevice: waiting for lo to become free. Usage count = 2 Reproducer sends ipv6 fragment that hits nfct defrag via LOCAL_OUT hook. skb gets queued until frag timer expiry -- 1 minute. Normally nf_conntrack_reasm gets called during prerouting, so skb has no dst yet which might explain why this wasn't spotted earlier. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22nsh: set mac len based on inner packetWillem de Bruijn
commit bab2c80e5a6c855657482eac9e97f5f3eedb509a upstream. When pulling the NSH header in nsh_gso_segment, set the mac length based on the encapsulated packet type. skb_reset_mac_len computes an offset to the network header, which here still points to the outer packet: > skb_reset_network_header(skb); > [...] > __skb_pull(skb, nsh_len); > skb_reset_mac_header(skb); // now mac hdr starts nsh_len == 8B after net hdr > skb_reset_mac_len(skb); // mac len = net hdr - mac hdr == (u16) -8 == 65528 > [..] > skb_mac_gso_segment(skb, ..) Link: http://lkml.kernel.org/r/CAF=yD-KeAcTSOn4AxirAxL8m7QAS8GBBe1w09eziYwvPbbUeYA@mail.gmail.com Reported-by: syzbot+7b9ed9872dab8c32305d@syzkaller.appspotmail.com Fixes: c411ed854584 ("nsh: add GSO support") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22autofs: fix slab out of bounds read in getname_kernel()Tomas Bortoli
commit 02f51d45937f7bc7f4dee21e9f85b2d5eac37104 upstream. The autofs subsystem does not check that the "path" parameter is present for all cases where it is required when it is passed in via the "param" struct. In particular it isn't checked for the AUTOFS_DEV_IOCTL_OPENMOUNT_CMD ioctl command. To solve it, modify validate_dev_ioctl(function to check that a path has been provided for ioctl commands that require it. Link: http://lkml.kernel.org/r/153060031527.26631.18306637892746301555.stgit@pluto.themaw.net Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com> Signed-off-by: Ian Kent <raven@themaw.net> Reported-by: syzbot+60c837b428dc84e83a93@syzkaller.appspotmail.com Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22tls: Stricter error checking in zerocopy sendmsg pathDave Watson
commit 32da12216e467dea70a09cd7094c30779ce0f9db upstream. In the zerocopy sendmsg() path, there are error checks to revert the zerocopy if we get any error code. syzkaller has discovered that tls_push_record can return -ECONNRESET, which is fatal, and happens after the point at which it is safe to revert the iter, as we've already passed the memory to do_tcp_sendpages. Previously this code could return -ENOMEM and we would want to revert the iter, but AFAIK this no longer returns ENOMEM after a447da7d004 ("tls: fix waitall behavior in tls_sw_recvmsg"), so we fail for all error codes. Reported-by: syzbot+c226690f7b3126c5ee04@syzkaller.appspotmail.com Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com Signed-off-by: Dave Watson <davejwatson@fb.com> Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22KEYS: DNS: fix parsing multiple optionsEric Biggers
commit c604cb767049b78b3075497b80ebb8fd530ea2cc upstream. My recent fix for dns_resolver_preparse() printing very long strings was incomplete, as shown by syzbot which still managed to hit the WARN_ONCE() in set_precision() by adding a crafted "dns_resolver" key: precision 50001 too large WARNING: CPU: 7 PID: 864 at lib/vsprintf.c:2164 vsnprintf+0x48a/0x5a0 The bug this time isn't just a printing bug, but also a logical error when multiple options ("#"-separated strings) are given in the key payload. Specifically, when separating an option string into name and value, if there is no value then the name is incorrectly considered to end at the end of the key payload, rather than the end of the current option. This bypasses validation of the option length, and also means that specifying multiple options is broken -- which presumably has gone unnoticed as there is currently only one valid option anyway. A similar problem also applied to option values, as the kstrtoul() when parsing the "dnserror" option will read past the end of the current option and into the next option. Fix these bugs by correctly computing the length of the option name and by copying the option value, null-terminated, into a temporary buffer. Reproducer for the WARN_ONCE() that syzbot hit: perl -e 'print "#A#", "\0" x 50000' | keyctl padd dns_resolver desc @s Reproducer for "dnserror" option being parsed incorrectly (expected behavior is to fail when seeing the unknown option "foo", actual behavior was to read the dnserror value as "1#foo" and fail there): perl -e 'print "#dnserror=1#foo\0"' | keyctl padd dns_resolver desc @s Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 4a2d789267e0 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22reiserfs: fix buffer overflow with long warning messagesEric Biggers
commit fe10e398e860955bac4d28ec031b701d358465e4 upstream. ReiserFS prepares log messages into a 1024-byte buffer with no bounds checks. Long messages, such as the "unknown mount option" warning when userspace passes a crafted mount options string, overflow this buffer. This causes KASAN to report a global-out-of-bounds write. Fix it by truncating messages to the buffer size. Link: http://lkml.kernel.org/r/20180707203621.30922-1-ebiggers3@gmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+b890b3335a4d8c608963@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22netfilter: ebtables: reject non-bridge targetsFlorian Westphal
commit 11ff7288beb2b7da889a014aff0a7b80bf8efcf3 upstream. the ebtables evaluation loop expects targets to return positive values (jumps), or negative values (absolute verdicts). This is completely different from what xtables does. In xtables, targets are expected to return the standard netfilter verdicts, i.e. NF_DROP, NF_ACCEPT, etc. ebtables will consider these as jumps. Therefore reject any target found due to unspec fallback. v2: also reject watchers. ebtables ignores their return value, so a target that assumes skb ownership (and returns NF_STOLEN) causes use-after-free. The only watchers in the 'ebtables' front-end are log and nflog; both have AF_BRIDGE specific wrappers on kernel side. Reported-by: syzbot+2b43f681169a2a0d306a@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg()Dexuan Cui
commit 35a88a18d7ea58600e11590405bc93b08e16e7f5 upstream. Commit de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") uses local_bh_disable()/enable(), because hv_pci_onchannelcallback() can also run in tasklet context as the channel event callback, so bottom halves should be disabled to prevent a race condition. With CONFIG_PROVE_LOCKING=y in the recent mainline, or old kernels that don't have commit f71b74bca637 ("irq/softirqs: Use lockdep to assert IRQs are disabled/enabled"), when the upper layer IRQ code calls hv_compose_msi_msg() with local IRQs disabled, we'll see a warning at the beginning of __local_bh_enable_ip(): IRQs not enabled as expected WARNING: CPU: 0 PID: 408 at kernel/softirq.c:162 __local_bh_enable_ip The warning exposes an issue in de0aa7b2f97d: local_bh_enable() can potentially call do_softirq(), which is not supposed to run when local IRQs are disabled. Let's fix this by using local_irq_save()/restore() instead. Note: hv_pci_onchannelcallback() is not a hot path because it's only called when the PCI device is hot added and removed, which is infrequent. Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: stable@vger.kernel.org Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22block: do not use interruptible wait anywhereAlan Jenkins
commit 1dc3039bc87ae7d19a990c3ee71cfd8a9068f428 upstream. When blk_queue_enter() waits for a queue to unfreeze, or unset the PREEMPT_ONLY flag, do not allow it to be interrupted by a signal. The PREEMPT_ONLY flag was introduced later in commit 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work reliably"). Note the SCSI device is resumed asynchronously, i.e. after un-freezing userspace tasks. So that commit exposed the bug as a regression in v4.15. A mysterious SIGBUS (or -EIO) sometimes happened during the time the device was being resumed. Most frequently, there was no kernel log message, and we saw Xorg or Xwayland killed by SIGBUS.[1] [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979 Without this fix, I get an IO error in this test: # dd if=/dev/sda of=/dev/null iflag=direct & \ while killall -SIGUSR1 dd; do sleep 0.1; done & \ echo mem > /sys/power/state ; \ sleep 5; killall dd # stop after 5 seconds The interruptible wait was added to blk_queue_enter in commit 3ef28e83ab15 ("block: generic request_queue reference counting"). Before then, the interruptible wait was only in blk-mq, but I don't think it could ever have been correct. Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: stable@vger.kernel.org Signed-off-by: Alan Jenkins <alan.christopher.jenkins@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz unconditionallyMasahiro Yamada
commit 3f6e6986045d47f87bd982910821b7ab9758487e upstream. Since commit 1bb88666775e ("mtd: nand: denali: handle timing parameters by setup_data_interface()"), denali_dt.c gets the clock rate from the clock driver. The driver expects the frequency of the bus interface clock, whereas the clock driver of SOCFPGA provides the core clock. Thus, the setup_data_interface() hook calculates timing parameters based on a wrong frequency. To make it work without relying on the clock driver, hard-code the clock frequency, 200MHz. This is fine for existing DT of UniPhier, and also fixes the issue of SOCFPGA because both platforms use 200 MHz for the bus interface clock. Fixes: 1bb88666775e ("mtd: nand: denali: handle timing parameters by setup_data_interface()") Cc: linux-stable <stable@vger.kernel.org> #4.14+ Reported-by: Philipp Rosenberger <p.rosenberger@linutronix.de> Suggested-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Richard Weinberger <richard@nod.at> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22crypto: af_alg - Initialize sg_num_bytes in error code pathStephan Mueller
commit 2546da99212f22034aecf279da9c47cbfac6c981 upstream. The RX SGL in processing is already registered with the RX SGL tracking list to support proper cleanup. The cleanup code path uses the sg_num_bytes variable which must therefore be always initialized, even in the error code path. Signed-off-by: Stephan Mueller <smueller@chronox.de> Reported-by: syzbot+9c251bdd09f83b92ba95@syzkaller.appspotmail.com #syz test: https://github.com/google/kmsan.git master CC: <stable@vger.kernel.org> #4.14 Fixes: e870456d8e7c ("crypto: algif_skcipher - overhaul memory management") Fixes: d887c52d6ae4 ("crypto: algif_aead - overhaul memory management") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22clocksource: Initialize cs->wd_listPeter Zijlstra
commit 5b9e886a4af97574ca3ce1147f35545da0e7afc7 upstream. A number of places relies on list_empty(&cs->wd_list), however the list_head does not get initialized. Do so upon registration, such that thereafter it is possible to rely on list_empty() correctly reflecting the list membership status. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Diego Viola <diego.viola@gmail.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org Cc: len.brown@intel.com Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Link: https://lkml.kernel.org/r/20180430100344.472662715@infradead.org Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22media: rc: oops in ir_timer_keyup after device unplugSean Young
commit 8d4068810d9926250dd2435719a080b889eb44c3 upstream. If there is IR in the raw kfifo when ir_raw_event_unregister() is called, then kthread_stop() causes ir_raw_event_thread to be scheduled, decode some scancodes and re-arm timer_keyup. The timer_keyup then fires when the rc device is long gone. Cc: stable@vger.kernel.org Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22xhci: Fix USB3 NULL pointer dereference at logical disconnect.Mathias Nyman
commit 2278446e2b7cd33ad894b32e7eb63afc7db6c86e upstream. Hub driver will try to disable a USB3 device twice at logical disconnect, racing with xhci_free_dev() callback from the first port disable. This can be triggered with "udisksctl power-off --block-device <disk>" or by writing "1" to the "remove" sysfs file for a USB3 device in 4.17-rc4. USB3 devices don't have a similar disabled link state as USB2 devices, and use a U3 suspended link state instead. In this state the port is still enabled and connected. hub_port_connect() first disconnects the device, then later it notices that device is still enabled (due to U3 states) it will try to disable the port again (set to U3). The xhci_free_dev() called during device disable is async, so checking for existing xhci->devs[i] when setting link state to U3 the second time was successful, even if device was being freed. The regression was caused by, and whole thing revealed by, Commit 44a182b9d177 ("xhci: Fix use-after-free in xhci_free_virt_device") which sets xhci->devs[i]->udev to NULL before xhci_virt_dev() returned. and causes a NULL pointer dereference the second time we try to set U3. Fix this by checking xhci->devs[i]->udev exists before setting link state. The original patch went to stable so this fix needs to be applied there as well. Fixes: 44a182b9d177 ("xhci: Fix use-after-free in xhci_free_virt_device") Cc: <stable@vger.kernel.org> Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch> Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22net: lan78xx: Fix race in tx pending skb size calculationStefan Wahren
commit dea39aca1d7aef1e2b95b07edeacf04cc8863a2e upstream. The skb size calculation in lan78xx_tx_bh is in race with the start_xmit, which could lead to rare kernel oopses. So protect the whole skb walk with a spin lock. As a benefit we can unlink the skb directly. This patch was tested on Raspberry Pi 3B+ Link: https://github.com/raspberrypi/linux/issues/2608 Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet") Cc: stable <stable@vger.kernel.org> Signed-off-by: Floris Bos <bos@je-eigen-domein.nl> Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22rtlwifi: rtl8821ae: fix firmware is not ready to runPing-Ke Shih
commit 9a98302de19991d51e067b88750585203b2a3ab6 upstream. Without this patch, firmware will not run properly on rtl8821ae, and it causes bad user experience. For example, bad connection performance with low rate, higher power consumption, and so on. rtl8821ae uses two kinds of firmwares for normal and WoWlan cases, and each firmware has firmware data buffer and size individually. Original code always overwrite size of normal firmware rtlpriv->rtlhal.fwsize, and this mismatch causes firmware checksum error, then firmware can't start. In this situation, driver gives message "Firmware is not ready to run!". Fixes: fe89707f0afa ("rtlwifi: rtl8821ae: Simplify loading of WOWLAN firmware") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Cc: Stable <stable@vger.kernel.org> # 4.0+ Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22rtlwifi: Fix kernel Oops "Fw download fail!!"Ping-Ke Shih
commit 12dfa2f68ab659636e092db13b5d17cf9aac82af upstream. When connecting to AP, mac80211 asks driver to enter and leave PS quickly, but driver deinit doesn't wait for delayed work complete when entering PS, then driver reinit procedure and delay work are running simultaneously. This will cause unpredictable kernel oops or crash like rtl8723be: error H2C cmd because of Fw download fail!!! WARNING: CPU: 3 PID: 159 at drivers/net/wireless/realtek/rtlwifi/ rtl8723be/fw.c:227 rtl8723be_fill_h2c_cmd+0x182/0x510 [rtl8723be] CPU: 3 PID: 159 Comm: kworker/3:2 Tainted: G O 4.16.13-2-ARCH #1 Hardware name: ASUSTeK COMPUTER INC. X556UF/X556UF, BIOS X556UF.406 10/21/2016 Workqueue: rtl8723be_pci rtl_c2hcmd_wq_callback [rtlwifi] RIP: 0010:rtl8723be_fill_h2c_cmd+0x182/0x510 [rtl8723be] RSP: 0018:ffffa6ab01e1bd70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffa26069071520 RCX: 0000000000000001 RDX: 0000000080000001 RSI: ffffffff8be70e9c RDI: 00000000ffffffff RBP: 0000000000000000 R08: 0000000000000048 R09: 0000000000000348 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: ffffa26069071520 R14: 0000000000000000 R15: ffffa2607d205f70 FS: 0000000000000000(0000) GS:ffffa26081d80000(0000) knlGS:000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000443b39d3000 CR3: 000000037700a005 CR4: 00000000003606e0 Call Trace: ? halbtc_send_bt_mp_operation.constprop.17+0xd5/0xe0 [btcoexist] ? ex_btc8723b1ant_bt_info_notify+0x3b8/0x820 [btcoexist] ? rtl_c2hcmd_launcher+0xab/0x110 [rtlwifi] ? process_one_work+0x1d1/0x3b0 ? worker_thread+0x2b/0x3d0 ? process_one_work+0x3b0/0x3b0 ? kthread+0x112/0x130 ? kthread_create_on_node+0x60/0x60 ? ret_from_fork+0x35/0x40 Code: 00 76 b4 e9 e2 fe ff ff 4c 89 ee 4c 89 e7 e8 56 22 86 ca e9 5e ... This patch ensures all delayed works done before entering PS to satisfy our expectation, so use cancel_delayed_work_sync() instead. An exception is delayed work ips_nic_off_wq because running task may be itself, so add a parameter ips_wq to deinit function to handle this case. This issue is reported and fixed in below threads: https://github.com/lwfinger/rtlwifi_new/issues/367 https://github.com/lwfinger/rtlwifi_new/issues/366 Tested-by: Evgeny Kapun <abacabadabacaba@gmail.com> # 8723DE Tested-by: Shivam Kakkar <shivam543@gmail.com> # 8723BE on 4.18-rc1 Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Fixes: cceb0a597320 ("rtlwifi: Add work queue for c2h cmd.") Cc: Stable <stable@vger.kernel.org> # 4.11+ Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22net: cxgb3_main: fix potential Spectre v1Gustavo A. R. Silva
commit 676bcfece19f83621e905aa55b5ed2d45cc4f2d3 upstream. t.qset_idx can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl() warn: potential spectre issue 'adapter->msix_info' Fix this by sanitizing t.qset_idx before using it to index adapter->msix_info Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22VSOCK: fix loopback on big-endian systemsClaudio Imbrenda
[ Upstream commit e5ab564c9ebee77794842ca7d7476147b83d6a27 ] The dst_cid and src_cid are 64 bits, therefore 64 bit accessors should be used, and in fact in virtio_transport_common.c only 64 bit accessors are used. Using 32 bit accessors for 64 bit values breaks big endian systems. This patch fixes a wrong use of le32_to_cpu in virtio_transport_send_pkt. Fixes: b9116823189e85ccf384 ("VSOCK: add loopback to virtio_transport") Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22vhost_net: validate sock before trying to put its fdJason Wang
[ Upstream commit b8f1f65882f07913157c44673af7ec0b308d03eb ] Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when we meet errors during ubuf allocation, the code does not check for NULL before calling sockfd_put(), this will lead NULL dereferencing. Fixing by checking sock pointer before. Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22tcp: prevent bogus FRTO undos with non-SACK flowsIlpo Järvinen
[ Upstream commit 1236f22fbae15df3736ab4a984c64c0c6ee6254c ] If SACK is not enabled and the first cumulative ACK after the RTO retransmission covers more than the retransmitted skb, a spurious FRTO undo will trigger (assuming FRTO is enabled for that RTO). The reason is that any non-retransmitted segment acknowledged will set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is no indication that it would have been delivered for real (the scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK case so the check for that bit won't help like it does with SACK). Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo in tcp_process_loss. We need to use more strict condition for non-SACK case and check that none of the cumulatively ACKed segments were retransmitted to prove that progress is due to original transmissions. Only then keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in non-SACK case. (FLAG_ORIG_SACK_ACKED is planned to be renamed to FLAG_ORIG_PROGRESS to better indicate its purpose but to keep this change minimal, it will be done in another patch). Besides burstiness and congestion control violations, this problem can result in RTO loop: When the loss recovery is prematurely undoed, only new data will be transmitted (if available) and the next retransmission can occur only after a new RTO which in case of multiple losses (that are not for consecutive packets) requires one RTO per loss to recover. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22tcp: fix Fast Open key endiannessYuchung Cheng
[ Upstream commit c860e997e9170a6d68f9d1e6e2cf61f572191aaf ] Fast Open key could be stored in different endian based on the CPU. Previously hosts in different endianness in a server farm using the same key config (sysctl value) would produce different cookies. This patch fixes it by always storing it as little endian to keep same API for LE hosts. Reported-by: Daniele Iamartino <danielei@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-22strparser: Remove early eaten to fix full tcp receive buffer stallDoron Roberts-Kedes
[ Upstream commit 977c7114ebda2e746a114840d3a875e0cdb826fb ] On receving an incomplete message, the existing code stores the remaining length of the cloned skb in the early_eaten field instead of incrementing the value returned by __strp_recv. This defers invocation of sock_rfree for the current skb until the next invocation of __strp_recv, which returns early_eaten if early_eaten is non-zero. This behavior causes a stall when the current message occupies the very tail end of a massive skb, and strp_peek/need_bytes indicates that the remainder of the current message has yet to arrive on the socket. The TCP receive buffer is totally full, causing the TCP window to go to zero, so the remainder of the message will never arrive. Incrementing the value returned by __strp_recv by the amount otherwise stored in early_eaten prevents stalls of this nature. Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>