summaryrefslogtreecommitdiffstats
path: root/Documentation
AgeCommit message (Collapse)Author
2019-10-05ALSA: hda - Add a quirk model for fixing Huawei Matebook X right speakerTomas Espeleta
[ Upstream commit a2ef03fe617a8365fb7794531b11ba587509a9b9 ] [ This is rather a revival of the patch Tomas sent in months ago, but applying only with the quirk model option -- tiwai ] Hard coded coefficients to make Huawuei Matebook X right speaker work. The Matebook X has a ALC298, please refer to bug 197801 on how these numbers were reverse engineered from the Windows driver The reversed engineered sequence represents a repeating pattern of verbs, and the only values that are changing periodically are written on indexes 0x23 and 0x25: 0x500, 0x23 0x400, VALUE1 0x500, 0x25 0x400, VALUE2 * skipped reading sequences (0x500 - 0xc00 sequences are ignored) * static values from reverse engineering are used NOTE: since a significant risk is still considered, this is provided as an experimental fix that isn't applied as default for now. For enabling the fix, you'll have to choose huawei-mbx-stereo via model option of snd-hda-intel module. If we get feedback from users that this works stably, we may apply it per default. [ Some coding style fixes and replacement with AC_VERB_* by tiwai ] BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=197801 Signed-off-by: Tomas Espeleta <tomas.espeleta@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-09-21ovl: fix regression caused by overlapping layers detectionAmir Goldstein
commit 0be0bfd2de9dfdd2098a9c5b14bdd8f739c9165d upstream. Once upon a time, commit 2cac0c00a6cd ("ovl: get exclusive ownership on upper/work dirs") in v4.13 added some sanity checks on overlayfs layers. This change caused a docker regression. The root cause was mount leaks by docker, which as far as I know, still exist. To mitigate the regression, commit 85fdee1eef1a ("ovl: fix regression caused by exclusive upper/work dir protection") in v4.14 turned the mount errors into warnings for the default index=off configuration. Recently, commit 146d62e5a586 ("ovl: detect overlapping layers") in v5.2, re-introduced exclusive upper/work dir checks regardless of index=off configuration. This changes the status quo and mount leak related bug reports have started to re-surface. Restore the status quo to fix the regressions. To clarify, index=off does NOT relax overlapping layers check for this ovelayfs mount. index=off only relaxes exclusive upper/work dir checks with another overlayfs mount. To cover the part of overlapping layers detection that used the exclusive upper/work dir checks to detect overlap with self upper/work dir, add a trap also on the work base dir. Link: https://github.com/moby/moby/issues/34672 Link: https://lore.kernel.org/linux-fsdevel/20171006121405.GA32700@veci.piliscsaba.szeredi.hu/ Link: https://github.com/containers/libpod/issues/3540 Fixes: 146d62e5a586 ("ovl: detect overlapping layers") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Tested-by: Colin Walters <walters@verbum.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16hTom Lendacky
commit c49a0a80137c7ca7d6ced4c812c9e07a949f6f24 upstream. There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. This issue stems from a BIOS not performing the proper steps during resume to ensure RDRAND continues to function properly. RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. Update the CPU initialization to clear the RDRAND CPUID bit for any family 15h and 16h processor that supports RDRAND. If it is known that the family 15h or family 16h system does not have an RDRAND resume issue or that the system will not be placed in suspend, the "rdrand=force" kernel parameter can be used to stop the clearing of the RDRAND CPUID bit. Additionally, update the suspend and resume path to save and restore the MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in place after resuming from suspend. Note, that clearing the RDRAND CPUID bit does not prevent a processor that normally supports the RDRAND instruction from executing it. So any code that determined the support based on family and model won't #UD. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chen Yu <yu.c.chen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org> Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "x86@kernel.org" <x86@kernel.org> Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29dt-bindings: riscv: fix the schema compatible string for the HiFive ↵Paul Walmsley
Unleashed board [ Upstream commit b390e0bfd2996f1215231395f4e25a4c011eeaf9 ] The YAML binding document for SiFive boards has an incorrect compatible string for the HiFive Unleashed board. Change it to match the name of the board on the SiFive web site: https://www.sifive.com/boards/hifive-unleashed which also matches the contents of the board DT data file: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts#n13 Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-25net/tls: prevent skb_orphan() from leaking TLS plain text with offloadJakub Kicinski
[ Upstream commit 414776621d1006e57e80e6db7fdc3837897aaa64 ] sk_validate_xmit_skb() and drivers depend on the sk member of struct sk_buff to identify segments requiring encryption. Any operation which removes or does not preserve the original TLS socket such as skb_orphan() or skb_clone() will cause clear text leaks. Make the TCP socket underlying an offloaded TLS connection mark all skbs as decrypted, if TLS TX is in offload mode. Then in sk_validate_xmit_skb() catch skbs which have no socket (or a socket with no validation) and decrypted flag set. Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and sk->sk_validate_xmit_skb are slightly interchangeable right now, they all imply TLS offload. The new checks are guarded by CONFIG_TLS_DEVICE because that's the option guarding the sk_buff->decrypted member. Second, smaller issue with orphaning is that it breaks the guarantee that packets will be delivered to device queues in-order. All TLS offload drivers depend on that scheduling property. This means skb_orphan_partial()'s trick of preserving partial socket references will cause issues in the drivers. We need a full orphan, and as a result netem delay/throttling will cause all TLS offload skbs to be dropped. Reusing the sk_buff->decrypted flag also protects from leaking clear text when incoming, decrypted skb is redirected (e.g. by TC). See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP") for justification why the internal flag is safe. The only location which could leak the flag in is tcp_bpf_sendmsg(), which is taken care of by clearing the previously unused bit. v2: - remove superfluous decrypted mark copy (Willem); - remove the stale doc entry (Boris); - rely entirely on EOR marking to prevent coalescing (Boris); - use an internal sendpages flag instead of marking the socket (Boris). v3 (Willem): - reorganize the can_skb_orphan_partial() condition; - fix the flag leak-in through tcp_bpf_sendmsg. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-25mm/hmm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}Christoph Hellwig
[ Upstream commit 2bcbeaefde2f0384d6ad351c151b1a9fe7791a0a ] We should not have two different error codes for the same condition. EAGAIN must be reserved for the FAULT_FLAG_ALLOW_RETRY retry case and signals to the caller that the mmap_sem has been unlocked. Use EBUSY for the !valid case so that callers can get the locking right. Link: https://lore.kernel.org/r/20190724065258.16603-2-hch@lst.de Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> [jgg: elaborated commit message] Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-06Documentation: Add swapgs description to the Spectre v1 documentationJosh Poimboeuf
commit 4c92057661a3412f547ede95715641d7ee16ddac upstream Add documentation to the Spectre document about the new swapgs variant of Spectre v1. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06x86/speculation: Enable Spectre v1 swapgs mitigationsJosh Poimboeuf
commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream The previous commit added macro calls in the entry code which mitigate the Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are enabled. Enable those features where applicable. The mitigations may be disabled with "nospectre_v1" or "mitigations=off". There are different features which can affect the risk of attack: - When FSGSBASE is enabled, unprivileged users are able to place any value in GS, using the wrgsbase instruction. This means they can write a GS value which points to any value in kernel space, which can be useful with the following gadget in an interrupt/exception/NMI handler: if (coming from user space) swapgs mov %gs:<percpu_offset>, %reg1 // dependent load or store based on the value of %reg // for example: mov %(reg1), %reg2 If an interrupt is coming from user space, and the entry code speculatively skips the swapgs (due to user branch mistraining), it may speculatively execute the GS-based load and a subsequent dependent load or store, exposing the kernel data to an L1 side channel leak. Note that, on Intel, a similar attack exists in the above gadget when coming from kernel space, if the swapgs gets speculatively executed to switch back to the user GS. On AMD, this variant isn't possible because swapgs is serializing with respect to future GS-based accesses. NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case doesn't exist quite yet. - When FSGSBASE is disabled, the issue is mitigated somewhat because unprivileged users must use prctl(ARCH_SET_GS) to set GS, which restricts GS values to user space addresses only. That means the gadget would need an additional step, since the target kernel address needs to be read from user space first. Something like: if (coming from user space) swapgs mov %gs:<percpu_offset>, %reg1 mov (%reg1), %reg2 // dependent load or store based on the value of %reg2 // for example: mov %(reg2), %reg3 It's difficult to audit for this gadget in all the handlers, so while there are no known instances of it, it's entirely possible that it exists somewhere (or could be introduced in the future). Without tooling to analyze all such code paths, consider it vulnerable. Effects of SMAP on the !FSGSBASE case: - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not susceptible to Meltdown), the kernel is prevented from speculatively reading user space memory, even L1 cached values. This effectively disables the !FSGSBASE attack vector. - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP still prevents the kernel from speculatively reading user space memory. But it does *not* prevent the kernel from reading the user value from L1, if it has already been cached. This is probably only a small hurdle for an attacker to overcome. Thanks to Dave Hansen for contributing the speculative_smap() function. Thanks to Andrew Cooper for providing the inside scoop on whether swapgs is serializing on AMD. [ tglx: Fixed the USER fence decision and polished the comment as suggested by Dave Hansen ] Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31drm/panel: Add support for Armadeus ST0700 AdaptSébastien Szymanski
commit c479450f61c7f1f248c9a54aedacd2a6ca521ff8 upstream. This patch adds support for the Armadeus ST0700 Adapt. It comes with a Santek ST0700I5Y-RBSLW 7.0" WVGA (800x480) TFT and an adapter board so that it can be connected on the TFT header of Armadeus Dev boards. Cc: stable@vger.kernel.org # v4.19 Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190507152713.27494-1-sebastien.szymanski@armadeus.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31Revert "usb: usb251xb: Add US lanes inversion dts-bindings"Lucas Stach
commit bafe64e5f0edaa689e72e2f8dc236641da37fed4 upstream. This reverts commit 3342ce35a1, as there is no need for this separate property and it breaks compatibility with existing devicetree files (arch/arm64/boot/dts/freescale/imx8mq.dtsi). CC: stable@vger.kernel.org #5.2 Fixes: 3342ce35a183 ("usb: usb251xb: Add US lanes inversion dts-bindings") Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Link: https://lore.kernel.org/r/20190719084407.28041-1-l.stach@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31dt-bindings: backlight: lm3630a: correct schema validationBrian Masney
[ Upstream commit ef4db28c1f45cda6989bc8a8e45294894786d947 ] The '#address-cells' and '#size-cells' properties were not defined in the lm3630a bindings and would cause the following error when attempting to validate the examples against the schema: Documentation/devicetree/bindings/leds/backlight/lm3630a-backlight.example.dt.yaml: '#address-cells', '#size-cells' do not match any of the regexes: '^led@[01]$', 'pinctrl-[0-9]+' Correct this by adding those two properties. While we're here, move the ti,linear-mapping-mode property to the led@[01] child nodes to correct the following validation error: Documentation/devicetree/bindings/leds/backlight/lm3630a-backlight.example.dt.yaml: led@0: 'ti,linear-mapping-mode' does not match any of the regexes: 'pinctrl-[0-9]+' Fixes: 32fcb75c66a0 ("dt-bindings: backlight: Add lm3630a bindings") Signed-off-by: Brian Masney <masneyb@onstation.org> Reported-by: Rob Herring <robh+dt@kernel.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Acked-by: Dan Murphy <dmurphy@ti.com> [robh: also drop maxItems from child reg] Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-26dt-bindings: allow up to four clocks for orion-mdioJosua Mayer
commit 80785f5a22e9073e2ded5958feb7f220e066d17b upstream. Armada 8040 needs four clocks to be enabled for MDIO accesses to work. Update the binding to allow the extra clock to be specified. Cc: stable@vger.kernel.org Fixes: 6d6a331f44a1 ("dt-bindings: allow up to three clocks for orion-mdio") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Josua Mayer <josua@solid-run.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26x86/atomic: Fix smp_mb__{before,after}_atomic()Peter Zijlstra
[ Upstream commit 69d927bba39517d0980462efc051875b7f4db185 ] Recent probing at the Linux Kernel Memory Model uncovered a 'surprise'. Strongly ordered architectures where the atomic RmW primitive implies full memory ordering and smp_mb__{before,after}_atomic() are a simple barrier() (such as x86) fail for: *x = 1; atomic_inc(u); smp_mb__after_atomic(); r0 = *y; Because, while the atomic_inc() implies memory order, it (surprisingly) does not provide a compiler barrier. This then allows the compiler to re-order like so: atomic_inc(u); *x = 1; smp_mb__after_atomic(); r0 = *y; Which the CPU is then allowed to re-order (under TSO rules) like: atomic_inc(u); r0 = *y; *x = 1; And this very much was not intended. Therefore strengthen the atomic RmW ops to include a compiler barrier. NOTE: atomic_{or,and,xor} and the bitops already had the compiler barrier. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-26sched/fair: Fix "runnable_avg_yN_inv" not used warningsQian Cai
[ Upstream commit 509466b7d480bc5d22e90b9fbe6122ae0e2fbe39 ] runnable_avg_yN_inv[] is only used in kernel/sched/pelt.c but was included in several other places because they need other macros all came from kernel/sched/sched-pelt.h which was generated by Documentation/scheduler/sched-pelt. As the result, it causes compilation a lot of warnings, kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=] ... Silence it by appending the __maybe_unused attribute for it, so all generated variables and macros can still be kept in the same file. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1559596304-31581-1-git-send-email-cai@lca.pw Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-14Documentation/admin: Remove the vsyscall=native documentationAndy Lutomirski
commit d974ffcfb7447db5f29a4b662a3eaf99a4e1109e upstream. The vsyscall=native feature is gone -- remove the docs. Fixes: 076ca272a14c ("x86/vsyscall/64: Drop "native" vsyscalls") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Kernel Hardening <kernel-hardening@lists.openwall.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/d77c7105eb4c57c1a95a95b6a5b8ba194a18e764.1561610354.git.luto@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-14Documentation: Add section about CPU vulnerabilities for SpectreTim Chen
commit 6e88559470f581741bcd0f2794f9054814ac9740 upstream. Add documentation for Spectre vulnerability and the mitigation mechanisms: - Explain the problem and risks - Document the mitigation mechanisms - Document the command line controls - Document the sysfs files Co-developed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-26dt-bindings: riscv: resolve 'make dt_binding_check' warningsPaul Walmsley
Rob pointed out that one of the examples in the RISC-V 'cpus' YAML schema results in warnings from 'make dt_binding_check'. Fix these. While here, make the whitespace in the second example consistent with the first example. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Cc: Rob Herring <robh@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> # for fixing the dtc warnings
2019-06-21Merge tag 'char-misc-5.2-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a number of small driver fixes for 5.2-rc6 Nothing major, just fixes for reported issues: - soundwire fixes - thunderbolt fixes - MAINTAINERS update for fpga maintainer change - binder bugfix - habanalabs 64bit pointer fix - documentation updates All of these have been in linux-next with no reported issues" * tag 'char-misc-5.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: habanalabs: use u64_to_user_ptr() for reading user pointers doc: fix documentation about UIO_MEM_LOGICAL using MAINTAINERS / Documentation: Thorsten Scherer is the successor of Gavin Schenk docs: fb: Add TER16x32 to the available font names MAINTAINERS: fpga: hand off maintainership to Moritz thunderbolt: Implement CIO reset correctly for Titan Ridge binder: fix possible UAF when freeing buffer thunderbolt: Make sure device runtime resume completes before taking domain lock soundwire: intel: set dai min and max channels correctly soundwire: stream: fix bad unlock balance soundwire: stream: fix out of boundary access on port properties
2019-06-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "Fixes for ARM and x86, plus selftest patches and nicer structs for nested state save/restore" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: nVMX: reorganize initial steps of vmx_set_nested_state KVM: arm/arm64: Fix emulated ptimer irq injection tests: kvm: Check for a kernel warning kvm: tests: Sort tests in the Makefile alphabetically KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT KVM: x86: Modify struct kvm_nested_state to have explicit fields for data KVM: fix typo in documentation KVM: nVMX: use correct clean fields when copying from eVMCS KVM: arm/arm64: vgic: Fix kvm_device leak in vgic_its_destroy KVM: arm64: Filter out invalid core register IDs in KVM_GET_REG_LIST KVM: arm64: Implement vq_present() as a macro
2019-06-19doc: fix documentation about UIO_MEM_LOGICAL usingYang Yingliang
After commit d4fc5069a394 ("mm: switch s_mem and slab_cache in struct page") page->mapping will be re-used by slab allocations and page->mapping->host will be used in balance_dirty_pages_ratelimited() as an inode member but it's not an inode in fact and leads an oops. [ 159.906493] Unable to handle kernel paging request at virtual address ffff200012d90be8 [ 159.908029] Mem abort info: [ 159.908552] ESR = 0x96000007 [ 159.909138] Exception class = DABT (current EL), IL = 32 bits [ 159.910155] SET = 0, FnV = 0 [ 159.910690] EA = 0, S1PTW = 0 [ 159.911241] Data abort info: [ 159.911846] ISV = 0, ISS = 0x00000007 [ 159.912567] CM = 0, WnR = 0 [ 159.913105] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000042acd000 [ 159.914269] [ffff200012d90be8] pgd=000000043ffff003, pud=000000043fffe003, pmd=000000043fffa003, pte=0000000000000000 [ 159.916280] Internal error: Oops: 96000007 [#1] SMP [ 159.917195] Dumping ftrace buffer: [ 159.917845] (ftrace buffer empty) [ 159.918521] Modules linked in: uio_dev(OE) [ 159.919276] CPU: 1 PID: 295 Comm: uio_test Tainted: G OE 5.2.0-rc4+ #46 [ 159.920859] Hardware name: linux,dummy-virt (DT) [ 159.921815] pstate: 60000005 (nZCv daif -PAN -UAO) [ 159.922809] pc : balance_dirty_pages_ratelimited+0x68/0xc38 [ 159.923965] lr : fault_dirty_shared_page.isra.8+0xe4/0x100 [ 159.925134] sp : ffff800368a77ae0 [ 159.925824] x29: ffff800368a77ae0 x28: 1ffff0006d14ce1a [ 159.926906] x27: ffff800368a670d0 x26: ffff800368a67120 [ 159.927985] x25: 1ffff0006d10f5fe x24: ffff200012d90be8 [ 159.929089] x23: ffff200013732000 x22: ffff80036ec03200 [ 159.930172] x21: ffff200012d90bc0 x20: 1fffe400025b217d [ 159.931253] x19: ffff80036ec03200 x18: 0000000000000000 [ 159.932348] x17: 0000000000000000 x16: 0ffffe0000010208 [ 159.933439] x15: 0000000000000000 x14: 0000000000000000 [ 159.934518] x13: 0000000000000000 x12: 0000000000000000 [ 159.935596] x11: 1fffefc001b452c0 x10: ffff0fc001b452c0 [ 159.936697] x9 : dfff200000000000 x8 : dfff200000000001 [ 159.937781] x7 : ffff7e000da29607 x6 : ffff0fc001b452c1 [ 159.938859] x5 : ffff0fc001b452c1 x4 : ffff0fc001b452c1 [ 159.939944] x3 : ffff200010523ad4 x2 : 1fffe400026e659b [ 159.941065] x1 : dfff200000000000 x0 : ffff200013732cd8 [ 159.942205] Call trace: [ 159.942732] balance_dirty_pages_ratelimited+0x68/0xc38 [ 159.943797] fault_dirty_shared_page.isra.8+0xe4/0x100 [ 159.944867] do_fault+0x608/0x1250 [ 159.945571] __handle_mm_fault+0x93c/0xfb8 [ 159.946412] handle_mm_fault+0x1c0/0x360 [ 159.947224] do_page_fault+0x358/0x8d0 [ 159.947993] do_translation_fault+0xf8/0x124 [ 159.948884] do_mem_abort+0x70/0x190 [ 159.949624] el0_da+0x24/0x28 According another commit 5e901d0b15c0 ("scsi: qedi: Fix bad pte call trace when iscsiuio is stopped."), using kmalloc also cause other problem. But the documentation about UIO_MEM_LOGICAL allows using kmalloc(), remove and don't allow using kmalloc() in documentation. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19MAINTAINERS / Documentation: Thorsten Scherer is the successor of Gavin SchenkGavin Schenk
Due to new challenges in my life I can no longer take care of SIOX. Thorsten takes over my SIOX tasks. Signed-off-by: Gavin Schenk <g.schenk@eckelmann.de> Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19docs: fb: Add TER16x32 to the available font namesTakashi Iwai
The new font is available since recently. Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19KVM: x86: Modify struct kvm_nested_state to have explicit fields for dataLiran Alon
Improve the KVM_{GET,SET}_NESTED_STATE structs by detailing the format of VMX nested state data in a struct. In order to avoid changing the ioctl values of KVM_{GET,SET}_NESTED_STATE, there is a need to preserve sizeof(struct kvm_nested_state). This is done by defining the data struct as "data.vmx[0]". It was the most elegant way I found to preserve struct size while still keeping struct readable and easy to maintain. It does have a misfortunate side-effect that now it has to be accessed as "data.vmx[0]" rather than just "data.vmx". Because we are already modifying these structs, I also modified the following: * Define the "format" field values as macros. * Rename vmcs_pa to vmcs12_pa for better readability. Signed-off-by: Liran Alon <liran.alon@oracle.com> [Remove SVM stubs, add KVM_STATE_NESTED_VMX_VMCS12_SIZE. - Paolo] Reviewed-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-18KVM: fix typo in documentationDennis Restle
The documentation mentions a non-existing capability KVM_CAP_USER_MEM.s The right name is KVM_CAP_USER_MEMORY. Signed-off-by: Dennis Restle <derestle@htwg-konstanz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Lots of bug fixes here: 1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer. 2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John Crispin. 3) Use after free in psock backlog workqueue, from John Fastabend. 4) Fix source port matching in fdb peer flow rule of mlx5, from Raed Salem. 5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet. 6) Network header needs to be set for packet redirect in nfp, from John Hurley. 7) Fix udp zerocopy refcnt, from Willem de Bruijn. 8) Don't assume linear buffers in vxlan and geneve error handlers, from Stefano Brivio. 9) Fix TOS matching in mlxsw, from Jiri Pirko. 10) More SCTP cookie memory leak fixes, from Neil Horman. 11) Fix VLAN filtering in rtl8366, from Linus Walluij. 12) Various TCP SACK payload size and fragmentation memory limit fixes from Eric Dumazet. 13) Use after free in pneigh_get_next(), also from Eric Dumazet. 14) LAPB control block leak fix from Jeremy Sowden" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits) lapb: fixed leak of control-blocks. tipc: purge deferredq list for each grp member in tipc_group_delete ax25: fix inconsistent lock state in ax25_destroy_timer neigh: fix use-after-free read in pneigh_get_next tcp: fix compile error if !CONFIG_SYSCTL hv_sock: Suppress bogus "may be used uninitialized" warnings be2net: Fix number of Rx queues used for flow hashing net: handle 802.1P vlan 0 packets properly tcp: enforce tcp_min_snd_mss in tcp_mtu_probing() tcp: add tcp_min_snd_mss sysctl tcp: tcp_fragment() should apply sane memory limits tcp: limit payload size of sacked skbs Revert "net: phylink: set the autoneg state in phylink_phy_change" bpf: fix nested bpf tracepoints with per-cpu data bpf: Fix out of bounds memory access in bpf_sk_storage vsock/virtio: set SOCK_DONE on peer shutdown net: dsa: rtl8366: Fix up VLAN filtering net: phylink: set the autoneg state in phylink_phy_change net: add high_order_alloc_disable sysctl/static key tcp: add tcp_tx_skb_cache sysctl ...
2019-06-17Merge tag 'riscv-for-v5.2/fixes-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "This contains fixes, defconfig, and DT data changes for the v5.2-rc series. The fixes are relatively straightforward: - Addition of a TLB fence in the vmalloc_fault path, so the CPU doesn't enter an infinite page fault loop - Readdition of the pm_power_off export, so device drivers that reassign it can now be built as modules - A udelay() fix for RV32, fixing a miscomputation of the delay time - Removal of deprecated smp_mb__*() barriers This also adds initial DT data infrastructure for arch/riscv, along with initial data for the SiFive FU540-C000 SoC and the corresponding HiFive Unleashed board. We also update the RV64 defconfig to include some core drivers for the FU540 in the build" * tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: remove unused barrier defines riscv: mm: synchronize MMU after pte change riscv: dts: add initial board data for the SiFive HiFive Unleashed riscv: dts: add initial support for the SiFive FU540-C000 SoC dt-bindings: riscv: convert cpu binding to json-schema dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540 arch: riscv: add support for building DTB files from DT source data riscv: Fix udelay in RV32. riscv: export pm_power_off again RISC-V: defconfig: enable clocks, serial console
2019-06-17dt-bindings: riscv: convert cpu binding to json-schemaPaul Walmsley
At Rob's request, we're starting to migrate our DT binding documentation to json-schema YAML format. Start by converting our cpu binding documentation. While doing so, document more properties and nodes. This includes adding binding documentation support for the E51 and U54 CPU cores ("harts") that are present on this SoC. These cores are described in: https://static.dev.sifive.com/FU540-C000-v1.0.pdf This cpus.yaml file is intended to be a starting point and to evolve over time. It passes dt-doc-validate as of the yaml-bindings commit 4c79d42e9216. This patch was originally based on the ARM json-schema binding documentation as added by commit 672951cbd1b7 ("dt-bindings: arm: Convert cpu binding to json-schema"). Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-riscv@lists.infradead.org
2019-06-17dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540Paul Walmsley
Add YAML DT binding documentation for the SiFive FU540 SoC. This SoC is documented at: https://static.dev.sifive.com/FU540-C000-v1.0.pdf Passes dt-doc-validate, as of yaml-bindings commit 4c79d42e9216. Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: devicetree@vger.kernel.org Cc: linux-riscv@lists.infradead.org Cc: linux-kernel@vger.kernel.org
2019-06-15tcp: add tcp_min_snd_mss sysctlEric Dumazet
Some TCP peers announce a very small MSS option in their SYN and/or SYN/ACK messages. This forces the stack to send packets with a very high network/cpu overhead. Linux has enforced a minimal value of 48. Since this value includes the size of TCP options, and that the options can consume up to 40 bytes, this means that each segment can include only 8 bytes of payload. In some cases, it can be useful to increase the minimal value to a saner value. We still let the default to 48 (TCP_MIN_SND_MSS), for compatibility reasons. Note that TCP_MAXSEG socket option enforces a minimal value of (TCP_MIN_MSS). David Miller increased this minimal value in commit c39508d6f118 ("tcp: Make TCP_MAXSEG minimum more correct.") from 64 to 88. We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS. CVE-2019-11479 -- tcp mss hardcoded to 48 Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Jonathan Looney <jtl@netflix.com> Acked-by: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Bruce Curtis <brucec@netflix.com> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14Merge branch 'for-5.2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "This has an unusually high density of tricky fixes: - task_get_css() could deadlock when it races against a dying cgroup. - cgroup.procs didn't list thread group leaders with live threads. This could mislead readers to think that a cgroup is empty when it's not. Fixed by making PROCS iterator include dead tasks. I made a couple mistakes making this change and this pull request contains a couple follow-up patches. - When cpusets run out of online cpus, it updates cpusmasks of member tasks in bizarre ways. Joel improved the behavior significantly" * 'for-5.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: restore sanity to cpuset_cpus_allowed_fallback() cgroup: Fix css_task_iter_advance_css_set() cset skip condition cgroup: css_task_iter_skip()'d iterators must be advanced before accessed cgroup: Include dying leaders with live threads in PROCS iterations cgroup: Implement css_task_iter_skip() cgroup: Call cgroup_release() before __exit_signal() docs cgroups: add another example size for hugetlb cgroup: Use css_tryget() instead of css_tryget_online() in task_get_css()
2019-06-14tcp: add tcp_rx_skb_cache sysctlEric Dumazet
Instead of relying on rps_needed, it is safer to use a separate static key, since we do not want to enable TCP rx_skb_cache by default. This feature can cause huge increase of memory usage on hosts with millions of sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14qmi_wwan: extend permitted QMAP mux_id value rangeReinhard Speyerer
Permit mux_id values up to 254 to be used in qmimux_register_device() for compatibility with ip(8) and the rmnet driver. Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Cc: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Reinhard Speyerer <rspmn@arcor.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-14Merge tag 'for-linus-20190614' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Remove references to old schedulers for the scheduler switching and blkio controller documentation (Andreas) - Kill duplicate check for report zone for null_blk (Chaitanya) - Two bcache fixes (Coly) - Ensure that mq-deadline is selected if zoned block device is enabled, as we need that to support them (Damien) - Fix io_uring memory leak (Eric) - ps3vram fallout from LBDAF removal (Geert) - Redundant blk-mq debugfs debugfs_create return check cleanup (Greg) - Extend NOPLM quirk for ST1000LM024 drives (Hans) - Remove error path warning that can now trigger after the queue removal/addition fixes (Ming) * tag 'for-linus-20190614' of git://git.kernel.dk/linux-block: block/ps3vram: Use %llu to format sector_t after LBDAF removal libata: Extend quirks for the ST1000LM024 drives with NOLPM quirk bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached bcache: fix stack corruption by PRECEDING_KEY() blk-mq: remove WARN_ON(!q->elevator) from blk_mq_sched_free_requests blkio-controller.txt: Remove references to CFQ block/switching-sched.txt: Update to blk-mq schedulers null_blk: remove duplicate check for report zone blk-mq: no need to check return value of debugfs_create functions io_uring: fix memory leak of UNIX domain socket inode block: force select mq-deadline for zoned block devices
2019-06-14Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Here are some arm64 fixes for -rc5. The only non-trivial change (in terms of the diffstat) is fixing our SVE ptrace API for big-endian machines, but the majority of this is actually the addition of much-needed comments and updates to the documentation to try to avoid this mess biting us again in future. There are still a couple of small things on the horizon, but nothing major at this point. Summary: - Fix broken SVE ptrace API when running in a big-endian configuration - Fix performance regression due to off-by-one in TLBI range checking - Fix build regression when using Clang" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/sve: Fix missing SVE/FPSIMD endianness conversions arm64: tlbflush: Ensure start/end of address range are aligned to stride arm64: Don't unconditionally add -Wno-psabi to KBUILD_CFLAGS
2019-06-13arm64/sve: Fix missing SVE/FPSIMD endianness conversionsDave Martin
The in-memory representation of SVE and FPSIMD registers is different: the FPSIMD V-registers are stored as single 128-bit host-endian values, whereas SVE registers are stored in an endianness-invariant byte order. This means that the two representations differ when running on a big-endian host. But we blindly copy data from one representation to another when converting between the two, resulting in the register contents being unintentionally byteswapped in certain situations. Currently this can be triggered by the first SVE instruction after a syscall, for example (though the potential trigger points may vary in future). So, fix the conversion functions fpsimd_to_sve(), sve_to_fpsimd() and sve_sync_from_fpsimd_zeropad() to swab where appropriate. There is no common swahl128() or swab128() that we could use here. Maybe it would be worth making this generic, but for now add a simple local hack. Since the byte order differences are exposed in ABI, also clarify the documentation. Cc: Alex Bennée <alex.bennee@linaro.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Alan Hayward <alan.hayward@arm.com> Cc: Julien Grall <julien.grall@arm.com> Fixes: bc0ee4760364 ("arm64/sve: Core task context handling") Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support") Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") Signed-off-by: Dave Martin <Dave.Martin@arm.com> [will: Fix typos in comments and docs spotted by Julien] Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-13blkio-controller.txt: Remove references to CFQAndreas Herrmann
CFQ is gone. No need anymore to document its "proportional weight time based division of disk policy". Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13block/switching-sched.txt: Update to blk-mq schedulersAndreas Herrmann
Remove references to CFQ and legacy block layer which are gone. Update example with what's available under blk-mq. Signed-off-by: Andreas Herrmann <aherrmann@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-12linux-next: DOC: RDS: Fix a typo in rds.txtMasanari Iida
This patch fixes a spelling typo in rds.txt Signed-off-by: Masanari Iida <standby24x7@gmail.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09Merge tag 'linux-can-fixes-for-5.2-20190607' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2019-06-07 this is a pull reqeust of 9 patches for net/master. The first patch is by Alexander Dahl and removes a duplicate menu entry from the Kconfig. The next patch by Joakim Zhang fixes the timeout in the flexcan driver when setting small bit rates. Anssi Hannula's patch for the xilinx_can driver fixes the bittiming_const for CAN FD core. The two patches by Sean Nyekjaer bring mcp25625 to the existing mcp251x driver. The patch by Eugen Hristev implements an errata for the m_can driver. YueHaibing's patch fixes the error handling ing can_init(). The patch by Fabio Estevam for the flexcan driver removes an unneeded registration message during flexcan_probe(). And the last patch is by Willem de Bruijn and adds the missing purging the socket error queue on sock destruct. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2019-06-07 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix several bugs in riscv64 JIT code emission which forgot to clear high 32-bits for alu32 ops, from Björn and Luke with selftests covering all relevant BPF alu ops from Björn and Jiong. 2) Two fixes for UDP BPF reuseport that avoid calling the program in case of __udp6_lib_err and UDP GRO which broke reuseport_select_sock() assumption that skb->data is pointing to transport header, from Martin. 3) Two fixes for BPF sockmap: a use-after-free from sleep in psock's backlog workqueue, and a missing restore of sk_write_space when psock gets dropped, from Jakub and John. 4) Fix unconnected UDP sendmsg hook API which is insufficient as-is since it breaks standard applications like DNS if reverse NAT is not performed upon receive, from Daniel. 5) Fix an out-of-bounds read in __bpf_skc_lookup which in case of AF_INET6 fails to verify that the length of the tuple is long enough, from Lorenz. 6) Fix libbpf's libbpf__probe_raw_btf to return an fd instead of 0/1 (for {un,}successful probe) as that is expected to be propagated as an fd to load_sk_storage_btf() and thus closing the wrong descriptor otherwise, from Michal. 7) Fix bpftool's JSON output for the case when a lookup fails, from Krzesimir. 8) Minor misc fixes in docs, samples and selftests, from various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07dt-bindings: can: mcp251x: add mcp25625 supportSean Nyekjaer
Fully compatible with mcp2515, the mcp25625 have integrated transceiver. This patch add the mcp25625 to the device tree bindings documentation. Signed-off-by: Sean Nyekjaer <sean@geanix.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-06Merge tag 'ovl-fixes-5.2-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs fixes from Miklos Szeredi: "Here's one fix for a class of bugs triggered by syzcaller, and one that makes xfstests fail less" * tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: ovl: doc: add non-standard corner cases ovl: detect overlapping layers ovl: support the FS_IOC_FS[SG]ETXATTR ioctls
2019-06-01mm, memcg: consider subtrees in memory.eventsChris Down
memory.stat and other files already consider subtrees in their output, and we should too in order to not present an inconsistent interface. The current situation is fairly confusing, because people interacting with cgroups expect hierarchical behaviour in the vein of memory.stat, cgroup.events, and other files. For example, this causes confusion when debugging reclaim events under low, as currently these always read "0" at non-leaf memcg nodes, which frequently causes people to misdiagnose breach behaviour. The same confusion applies to other counters in this file when debugging issues. Aggregation is done at write time instead of at read-time since these counters aren't hot (unlike memory.stat which is per-page, so it does it at read time), and it makes sense to bundle this with the file notifications. After this patch, events are propagated up the hierarchy: [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events low 0 high 0 max 0 oom 0 oom_kill 0 [root@ktst ~]# systemd-run -p MemoryMax=1 true Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events low 0 high 0 max 7 oom 1 oom_kill 1 As this is a change in behaviour, this can be reverted to the old behaviour by mounting with the `memory_localevents' flag set. However, we use the new behaviour by default as there's a lack of evidence that there are any current users of memory.events that would find this change undesirable. akpm: this is a behaviour change, so Cc:stable. THis is so that forthcoming distros which use cgroup v2 are more likely to pick up the revised behaviour. Link: http://lkml.kernel.org/r/20190208224419.GA24772@chrisdown.name Signed-off-by: Chris Down <chris@chrisdown.name> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Roman Gushchin <guro@fb.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-01mm: fix Documentation/vm/hmm.rst Sphinx warningsRandy Dunlap
Fix Sphinx warnings in Documentation/vm/hmm.rst by using "::" notation and inserting a blank line. Also add a missing ';'. Documentation/vm/hmm.rst:292: WARNING: Unexpected indentation. Documentation/vm/hmm.rst:300: WARNING: Unexpected indentation. Link: http://lkml.kernel.org/r/c5995359-7c82-4e47-c7be-b58a4dda0953@infradead.org Fixes: 023a019a9b4e ("mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Jérôme Glisse <jglisse@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-31Merge tag 'usb-5.2-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some tiny USB fixes for a number of reported issues for 5.2-rc3. Nothing huge here, just a small collection of xhci and other driver bugs that syzbot has been finding in some drivers. There is also a usbip fix and a fix for the usbip fix in here :) All have been in linux-next with no reported issues" * tag 'usb-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usbip: usbip_host: fix stub_dev lock context imbalance regression media: smsusb: better handle optional alignment xhci: Use %zu for printing size_t type xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic() xhci: Fix immediate data transfer if buffer is already DMA mapped usb: xhci: avoid null pointer deref when bos field is NULL usb: xhci: Fix a potential null pointer dereference in xhci_debugfs_create_endpoint() xhci: update bounce buffer with correct sg num media: usb: siano: Fix false-positive "uninitialized variable" warning USB: rio500: update Documentation USB: rio500: simplify locking USB: rio500: fix memory leak in close after disconnect USB: rio500: refuse more than one device at a time usbip: usbip_host: fix BUG: sleeping function called from invalid context USB: sisusbvga: fix oops in error path of sisusb_probe USB: Add LPM quirk for Surface Dock GigE adapter media: usb: siano: Fix general protection fault in smsusb usb: mtu3: fix up undefined reference to usb_debug_root USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor
2019-05-31ovl: doc: add non-standard corner casesMiklos Szeredi
While most corner cases have already been dealt with, some remain and should be documented. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-05-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix OOPS during nf_tables rule dump, from Florian Westphal. 2) Use after free in ip_vs_in, from Yue Haibing. 3) Fix various kTLS bugs (NULL deref during device removal resync, netdev notification ignoring, etc.) From Jakub Kicinski. 4) Fix ipv6 redirects with VRF, from David Ahern. 5) Memory leak fix in igmpv3_del_delrec(), from Eric Dumazet. 6) Missing memory allocation failure check in ip6_ra_control(), from Gen Zhang. And likewise fix ip_ra_control(). 7) TX clean budget logic error in aquantia, from Igor Russkikh. 8) SKB leak in llc_build_and_send_ui_pkt(), from Eric Dumazet. 9) Double frees in mlx5, from Parav Pandit. 10) Fix lost MAC address in r8169 during PCI D3, from Heiner Kallweit. 11) Fix botched register access in mvpp2, from Antoine Tenart. 12) Use after free in napi_gro_frags(), from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (89 commits) net: correct zerocopy refcnt with udp MSG_MORE ethtool: Check for vlan etype or vlan tci when parsing flow_rule net: don't clear sock->sk early to avoid trouble in strparser net-gro: fix use-after-free read in napi_gro_frags() net: dsa: tag_8021q: Create a stable binary format net: dsa: tag_8021q: Change order of rx_vid setup net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value ipv4: tcp_input: fix stack out of bounds when parsing TCP options. mlxsw: spectrum: Prevent force of 56G mlxsw: spectrum_acl: Avoid warning after identical rules insertion net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT r8169: fix MAC address being lost in PCI D3 net: core: support XDP generic on stacked devices. netvsc: unshare skb in VF rx handler udp: Avoid post-GRO UDP checksum recalculation net: phy: dp83867: Set up RGMII TX delay net: phy: dp83867: do not call config_init twice net: phy: dp83867: increase SGMII autoneg timer duration net: phy: dp83867: fix speed 10 in sgmii mode net: phy: marvell10g: report if the PHY fails to boot firmware ...
2019-05-30docs cgroups: add another example size for hugetlbOdin Ugedal
Add another example to clarify that HugePages smaller than 1MB will be displayed using "KB", with an uppercased K (eg. 20KB), and not the normal SI prefix kilo (small k). Because of a misunderstanding/copy-paste error inside runc (see https://github.com/opencontainers/runc/pull/2065), it tried accessing the cgroup control file of a 64kB HugePage using "hugetlb.64kB._____" instead of the correct "hugetlb.64KB._____". Adding a new example will make it clear how sizes smaller than 1MB are handled. Signed-off-by: Odin Ugedal <odin@ugedal.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-05-29Merge tag 'docs-5.2-fixes2' of git://git.lwn.net/linuxLinus Torvalds
Pull documentation fixes from Jonathan Corbet: "The Sphinx 2.0 release contained a few incompatible API changes that broke our extensions and, thus, the documentation build in general. Who knew that those deprecation warnings it was outputting actually meant we should change something? This set of fixes makes the build work again with Sphinx 2.0 and eliminates the warnings for 1.8. As part of that, we also need a few fixes to the docs for places where the new Sphinx is more strict. It is a bit late in the cycle for this kind of change, but it does fix problems that people are experiencing now. There has been some talk of raising the minimum version of Sphinx we support. I don't want to do that abruptly, though, so these changes add some glue to continue to support versions back to 1.3. We will be adding some infrastructure soon to nudge users of old versions forward, with the idea of maybe increasing our minimum version (and removing this glue) sometime in the future" * tag 'docs-5.2-fixes2' of git://git.lwn.net/linux: drm/i915: Maintain consistent documentation subsection ordering scripts/sphinx-pre-install: make it handle Sphinx versions docs: Fix conf.py for Sphinx 2.0 docs: fix multiple doc build warnings in enumeration.rst lib/list_sort: fix kerneldoc build error docs: fix numaperf.rst and add it to the doc tree doc: Cope with the deprecation of AutoReporter doc: Cope with Sphinx logging deprecations
2019-05-28Documentation: net-sysfs: Remove duplicate PHY device documentationFlorian Fainelli
Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same duplication information. There is not currently any MDIO bus specific attribute, but there are PHY device (struct phy_device) specific attributes. Use the more precise description from sysfs-bus-mdio and carry that over to sysfs-class-net-phydev. Fixes: 86f22d04dfb5 ("net: sysfs: Document PHY device sysfs attributes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>