summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)Author
2017-02-26Linux 4.8.20v4.8.20Paul Gortmaker
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.Francisco Jerez
commit 4fc020d864647ea3ae8cb8f17d63e48e87ebd0bf upstream. The WaDisableLSQCROPERFforOCL workaround has the side effect of disabling an L3SQ optimization that has huge performance implications and is unlikely to be necessary for the correct functioning of usual graphic workloads. Userspace is free to re-enable the workaround on demand, and is generally in a better position to determine whether the workaround is necessary than the DRM is (e.g. only during the execution of compute kernels that rely on both L3 fences and HDC R/W requests). The same workaround seems to apply to BDW (at least to production stepping G1) and SKL as well (the internal workaround database claims that it does for all steppings, while the BSpec workaround table only mentions pre-production steppings), but the DRM doesn't do anything beyond whitelisting the L3SQCREG4 register so userspace can enable it when it sees fit. Do the same on KBL platforms. Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%, and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master -- This is followed by a regression of 35% and 10% respectively for the same benchmarks and platform caused by my recent patch series switching userspace to use the dataport constant cache instead of the sampler to implement uniform pull constant loads, which caused us to hit more heavily the L3 cache (and on platforms other than KBL had the opposite effect of improving performance of the same two benchmarks). The overall effect on KBL of this change combined with the recent userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf was affected by the constant cache changes (though it improved as it did on other platforms rather than regressing), but is not significantly affected by this patch (with statistical significance of 5% and sample size 20). v2: Drop some more code to avoid unused variable warning. Fixes: 738fa1b3123f ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256 Signed-off-by: Francisco Jerez <currojerez@riseup.net> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: beignet@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v4.7+ Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [Removed double Fixes tag] Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com (cherry picked from commit 8726f2faa371514fba2f594d799db95203dfeee0) Signed-off-by: Jani Nikula <jani.nikula@intel.com> [PG: 4.8 baseline also checks IS_SKL_REVID in removed code block.] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm, memcg: do not retry precharge chargesDavid Rientjes
commit 3674534b775354516e5c148ea48f51d4d1909a78 upstream. When memory.move_charge_at_immigrate is enabled and precharges are depleted during move, mem_cgroup_move_charge_pte_range() will attempt to increase the size of the precharge. Prevent precharges from ever looping by setting __GFP_NORETRY. This was probably the intention of the GFP_KERNEL & ~__GFP_NORETRY, which is pointless as written. Fixes: 0029e19ebf84 ("mm: memcontrol: remove explicit OOM parameter in charge path") Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701130208510.69402@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26platform/x86: intel_mid_powerbtn: Set IRQ_ONESHOTAndy Shevchenko
commit 5a00b6c2438460b870a451f14593fc40d3c7edf6 upstream. The commit 1c6c69525b40 ("genirq: Reject bogus threaded irq requests") starts refusing misconfigured interrupt handlers. This makes intel_mid_powerbtn not working anymore. Add a mandatory flag to a threaded IRQ request in the driver. Fixes: 1c6c69525b40 ("genirq: Reject bogus threaded irq requests") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26virtio_mmio: Set DMA masks appropriatelyRobin Murphy
commit f7f6634d23830ff74335734fbdb28ea109c1f349 upstream. Once DMA API usage is enabled, it becomes apparent that virtio-mmio is inadvertently relying on the default 32-bit DMA mask, which leads to problems like rapidly exhausting SWIOTLB bounce buffers. Ensure that we set the appropriate 64-bit DMA mask whenever possible, with the coherent mask suitably limited for the legacy vring as per a0be1db4304f ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio devices"). Cc: Andy Lutomirski <luto@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Fixes: b42111382f0e ("virtio_mmio: Use the DMA API if enabled") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26memory_hotplug: make zone_can_shift() return a boolean valueYasuaki Ishimatsu
commit 8a1f780e7f28c7c1d640118242cf68d528c456cd upstream. online_{kernel|movable} is used to change the memory zone to ZONE_{NORMAL|MOVABLE} and online the memory. To check that memory zone can be changed, zone_can_shift() is used. Currently the function returns minus integer value, plus integer value and 0. When the function returns minus or plus integer value, it means that the memory zone can be changed to ZONE_{NORNAL|MOVABLE}. But when the function returns 0, there are two meanings. One of the meanings is that the memory zone does not need to be changed. For example, when memory is in ZONE_NORMAL and onlined by online_kernel the memory zone does not need to be changed. Another meaning is that the memory zone cannot be changed. When memory is in ZONE_NORMAL and onlined by online_movable, the memory zone may not be changed to ZONE_MOVALBE due to memory online limitation(see Documentation/memory-hotplug.txt). In this case, memory must not be onlined. The patch changes the return type of zone_can_shift() so that memory online operation fails when memory zone cannot be changed as follows: Before applying patch: # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal <snip> node_scanned 0 spanned 8388608 present 7864320 managed 7864320 # echo online_movable > memory4097/state # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal <snip> node_scanned 0 spanned 8388608 present 8388608 managed 8388608 online_movable operation succeeded. But memory is onlined as ZONE_NORMAL, not ZONE_MOVABLE. After applying patch: # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal <snip> node_scanned 0 spanned 8388608 present 7864320 managed 7864320 # echo online_movable > memory4097/state bash: echo: write error: Invalid argument # grep -A 35 "Node 2" /proc/zoneinfo Node 2, zone Normal <snip> node_scanned 0 spanned 8388608 present 7864320 managed 7864320 online_movable operation failed because of failure of changing the memory zone from ZONE_NORMAL to ZONE_MOVABLE Fixes: df429ac03936 ("memory-hotplug: more general validation of zone during online") Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26pinctrl: baytrail: Rectify debounce supportAndy Shevchenko
commit 04ff5a095d662e0879f0eb04b9247e092210aeff upstream. The commit 658b476c742f ("pinctrl: baytrail: Add debounce configuration") implements debounce for Baytrail pin control, but seems wasn't tested properly. The register which keeps debounce value is separated from the configuration one. Writing wrong values to the latter will guarantee wrong behaviour of the driver and even might break something physically. Besides above there is missed case how to disable it, which is actually done through the bit in configuration register. Rectify implementation here by using proper register for debounce value. Fixes: 658b476c742f ("pinctrl: baytrail: Add debounce configuration") Cc: Cristina Ciocan <cristina.ciocan@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26pinctrl: uniphier: fix Ethernet (RMII) pin-mux setting for LD20Masahiro Yamada
commit df1539c25cce98e2ac69881958850c6535240707 upstream. Fix the pin-mux values for the MDC, MDIO, MDIO_INTL, PHYRSTL pins. Fixes: 1e359ab1285e ("pinctrl: uniphier: add Ethernet pin-mux settings") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26pinctrl: broxton: Use correct PADCFGLOCK offsetMika Westerberg
commit ecc8995363ee6231b32dad61c955b371b79cc4cf upstream. PADCFGLOCK (and PADCFGLOCK_TX) offset in Broxton actually starts at 0x060 and not 0x090 as used in the driver. Fix it to use the correct offset. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26[media] s5k4ecgx: select CRC32 helperArnd Bergmann
commit c739c0a7c3c2472d7562b8f802cdce44d2597c8b upstream. A rare randconfig build failure shows up in this driver when the CRC32 helper is not there: drivers/media/built-in.o: In function `s5k4ecgx_s_power': s5k4ecgx.c:(.text+0x9eb4): undefined reference to `crc32_le' This adds the 'select' that all other users of this function have. Fixes: 8b99312b7214 ("[media] Add v4l2 subdev driver for S5K4ECGX sensor") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26IB/rxe: Prevent from completer to operate on non valid QPYonatan Cohen
commit 2d4b21e0a2913612274a69a3ba1bfee4cffc6e77 upstream. On UD QP completer tasklet is scheduled for each packet sent. If it is followed by a destroy_qp(), the kernel panic will happen as the completer tries to operate on a destroyed QP. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26IB/rxe: Fix rxe dev insertion to rxe_dev_listMaor Gottlieb
commit f39f775218a7520e3700de2003c84a042c3b5972 upstream. The first argument of list_add_tail is the new item and the second is the head of the list. Fix the code to pass arguments in the right order, otherwise not all the rxe devices will be removed during teardown. Fixes: 8700e3e7c4857 ('Soft RoCE driver') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26IB/umem: Release pid in error and ODP flowKenneth Lee
commit 828f6fa65ce7e80f77f5ab12942e44eb3d9d174e upstream. 1. Release pid before enter odp flow 2. Release pid when fail to allocate memory Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get") Fixes: 8ada2c1c0c1d ("IB/core: Add support for on demand paging regions") Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/i915: Don't leak edid in intel_crt_detect_ddc()Ander Conselvan de Oliveira
commit c34f078675f505c4437919bb1897b1351f16a050 upstream. In the path where intel_crt_detect_ddc() detects a CRT, if would return true without freeing the edid. Fixes: a2bd1f541f19 ("drm/i915: check whether we actually received an edid in detect_ddc") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v3.6+ Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484922525-6131-1-git-send-email-ander.conselvan.de.oliveira@intel.com (cherry picked from commit c96b63a6a7ac4bd670ec2e663793a9a31418b790) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/i915: prevent crash with .disable_display parameterClint Taylor
commit 27892bbdc9233f33bf0f44e08aab8f12e0dec142 upstream. The .disable_display parameter was causing a fatal crash when fbdev was dereferenced during driver init. V1: protection in i915_drv.c V2: Moved protection to intel_fbdev.c Fixes: 43cee314345a ("drm/i915/fbdev: Limit the global async-domain synchronization") Testcase: igt/drv_module_reload/basic-no-display Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484775523-29428-1-git-send-email-clinton.a.taylor@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lukas Wunner <lukas@wunner.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit 5b8cd0755f8a06a851c436a013e7be0823fb155a) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26[media] v4l: tvp5150: Don't override output pinmuxing at stream on/off timeLaurent Pinchart
commit 79d6205a3f741c9fb89cfc47dfa0eddb1526726d upstream. The s_stream() handler incorrectly writes the whole MISC_CTL register to enable or disable the outputs, overriding the output pinmuxing configuration. Fix it to only touch the output enable bits. The CONF_SHARED_PIN register is also written by the same function, resulting in muxing the INTREQ signal instead of the VBLK/GPCL signal on the INTREQ/GPCL/VBLK pin. As the driver doesn't support interrupts this is obviously incorrect, and breaks operation on other devices. Fix it by removing the write. Cc: stable@vger.kernel.org # For Kernel 4.5 and upper Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26[media] v4l: tvp5150: Fix comment regarding output pin muxingLaurent Pinchart
commit b4b2de386bbb6589d81596999d4a924928dc119b upstream. The FID/GLCO/VLK/HVLK and INTREQ/GPCL/VBLK pins are muxed differently depending on whether the input is an S-Video or composite signal. The comment that explains the logic doesn't reflect the code. It appears that the comment is incorrect, as disabling the output data bus in composite mode makes no sense. Update the comment to match the code. While at it define macros for the MISC_CTL register bits, the code is too confusing with numerical values. Cc: stable@vger.kernel.org # For Kernel 4.5 and upper Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26[media] v4l: tvp5150: Reset device at probe time, not in get/set format handlersLaurent Pinchart
commit aff808e813fc2d311137754165cf53d4ee6ddcc2 upstream. The tvp5150 doesn't support format setting through the subdev pad API and thus implements the set format handler as a get format operation. The single handler, tvp5150_fill_fmt(), resets the device by calling tvp5150_reset(). This causes malfunction as the device can be reset at will, possibly from userspace when the subdev userspace API is enabled. The reset call was added in commit ec2c4f3f93cb ("[media] media: tvp5150: Add mbus_fmt callbacks"), probably as an attempt to set the device to a known state before detecting the current TV standard. However, the get format handler doesn't access the hardware to get the TV standard since commit 963ddc63e20d ("[media] media: tvp5150: Add cropping support"). There is thus no need to reset the device when getting the format. However, removing the tvp5150_reset() from the get/set format handlers results in the function not being called at all if the bridge driver doesn't use the .reset() operation. The operation is nowadays abused and shouldn't be used, so shouldn't expect bridge drivers to call it. To make sure the device is properly initialize, move the reset call from the format handlers to the probe function. Cc: stable@vger.kernel.org # For Kernel 4.5 and upper Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26iw_cxgb4: free EQ queue memory on last derefSteve Wise
commit c12a67fec8d99bb554e8d4e99120d418f1a39c87 upstream. Commit ad61a4c7a9b7 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") introduced a bug where the RDMA QP EQ queue memory (and QIDs) are possibly freed before the underlying connection has been fully shutdown. The result being a possible DMA read issued by HW after the queue memory has been unmapped and freed. This results in possible WR corruption in the worst case, system bus errors if an IOMMU is in use, and SGE "bad WR" errors reported in the very least. The fix is to defer unmap/free of queue memory and QID resources until the QP struct has been fully dereferenced. To do this, the c4iw_ucontext must also be kept around until the last QP that references it is fully freed. In addition, since the last QP deref can happen in an IRQ disabled context, we need a new workqueue thread to do the final unmap/free of the EQ queue memory. Fixes: ad61a4c7a9b7 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref") Cc: stable@vger.kernel.org Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26SUNRPC: cleanup ida information when removing sunrpc moduleKinglong Mee
commit c929ea0b910355e1876c64431f3d5802f95b3d75 upstream. After removing sunrpc module, I get many kmemleak information as, unreferenced object 0xffff88003316b1e0 (size 544): comm "gssproxy", pid 2148, jiffies 4294794465 (age 4200.081s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffffb0cfb58a>] kmemleak_alloc+0x4a/0xa0 [<ffffffffb03507fe>] kmem_cache_alloc+0x15e/0x1f0 [<ffffffffb0639baa>] ida_pre_get+0xaa/0x150 [<ffffffffb0639cfd>] ida_simple_get+0xad/0x180 [<ffffffffc06054fb>] nlmsvc_lookup_host+0x4ab/0x7f0 [lockd] [<ffffffffc0605e1d>] lockd+0x4d/0x270 [lockd] [<ffffffffc06061e5>] param_set_timeout+0x55/0x100 [lockd] [<ffffffffc06cba24>] svc_defer+0x114/0x3f0 [sunrpc] [<ffffffffc06cbbe7>] svc_defer+0x2d7/0x3f0 [sunrpc] [<ffffffffc06c71da>] rpc_show_info+0x8a/0x110 [sunrpc] [<ffffffffb044a33f>] proc_reg_write+0x7f/0xc0 [<ffffffffb038e41f>] __vfs_write+0xdf/0x3c0 [<ffffffffb0390f1f>] vfs_write+0xef/0x240 [<ffffffffb0392fbd>] SyS_write+0xad/0x130 [<ffffffffb0d06c37>] entry_SYSCALL_64_fastpath+0x1a/0xa9 [<ffffffffffffffff>] 0xffffffffffffffff I found, the ida information (dynamic memory) isn't cleanup. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Fixes: 2f048db4680a ("SUNRPC: Add an identifier for struct rpc_clnt") Cc: stable@vger.kernel.org # v3.12+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26NFSv4.0: always send mode in SETATTR after EXCLUSIVE4Benjamin Coddington
commit a430607b2ef7c3be090f88c71cfcb1b3988aa7c0 upstream. Some nfsv4.0 servers may return a mode for the verifier following an open with EXCLUSIVE4 createmode, but this does not mean the client should skip setting the mode in the following SETATTR. It should only do that for EXCLUSIVE4_1 or UNGAURDED createmode. Fixes: 5334c5bdac92 ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26NFSv4.1: Fix a deadlock in layoutgetTrond Myklebust
commit 8ac092519ad91931c96d306c4bfae2c6587c325f upstream. We cannot call nfs4_handle_exception() without first ensuring that the slot has been freed. If not, we end up deadlocking with the process waiting for recovery to complete, and recovery waiting for the slot table to drain. Fixes: 2e80dbe7ac51 ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26nfs: Don't increment lock sequence ID after NFS4ERR_MOVEDChuck Lever
commit 059aa734824165507c65fd30a55ff000afd14983 upstream. Xuan Qi reports that the Linux NFSv4 client failed to lock a file that was migrated. The steps he observed on the wire: 1. The client sent a LOCK request to the source server 2. The source server replied NFS4ERR_MOVED 3. The client switched to the destination server 4. The client sent the same LOCK request to the destination server with a bumped lock sequence ID 5. The destination server rejected the LOCK request with NFS4ERR_BAD_SEQID RFC 3530 section 8.1.5 provides a list of NFS errors which do not bump a lock sequence ID. However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section 9.1.7, this list has been updated by the addition of NFS4ERR_MOVED. Reported-by: Xuan Qi <xuan.qi@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org # v3.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26parisc: Don't use BITS_PER_LONG in userspace-exported swab.h headerHelge Deller
commit 2ad5d52d42810bed95100a3d912679d8864421ec upstream. In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin. Solve this problem by using __BITS_PER_LONG instead. Since we now #include asm/bitsperlong.h avoid further potential userspace pollution by moving the #define of SHIFT_PER_LONG to bitops.h which is not exported to userspace. This patch unbreaks compiling qemu on hppa/parisc. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26ARC: [arcompact] handle unaligned access delay slot corner caseVineet Gupta
commit 9aed02feae57bf7a40cb04ea0e3017cb7a998db4 upstream. After emulating an unaligned access in delay slot of a branch, we pretend as the delay slot never happened - so return back to actual branch target (or next PC if branch was not taken). Curently we did this by handling STATUS32.DE, we also need to clear the BTA.T bit, which is disregarded when returning from original misaligned exception, but could cause weirdness if it took the interrupt return path (in case interrupt was acive too) One ARC700 customer ran into this when enabling unaligned access fixup for kernel mode accesses as well Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26can: ti_hecc: add missing prepare and unprepare of the clockYegor Yefremov
commit befa60113ce7ea270cb51eada28443ca2756f480 upstream. In order to make the driver work with the common clock framework, this patch converts the clk_enable()/clk_disable() to clk_prepare_enable()/clk_disable_unprepare(). Also add error checking for clk_prepare_enable(). Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26can: c_can_pci: fix null-pointer-deref in c_can_start() - set device pointerEinar Jón
commit c97c52be78b8463ac5407f1cf1f22f8f6cf93a37 upstream. The priv->device pointer for c_can_pci is never set, but it is used without a NULL check in c_can_start(). Setting it in c_can_pci_probe() like c_can_plat_probe() prevents c_can_pci.ko from crashing, with and without CONFIG_PM. This might also cause the pm_runtime_*() functions in c_can.c to actually be executed for c_can_pci devices - they are the only other place where priv->device is used, but they all contain a null check. Signed-off-by: Einar Jón <tolvupostur@gmail.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26IB/srp: fix invalid indirect_sg_entries parameter valueIsrael Rukshin
commit 0a475ef4226e305bdcffe12b401ca1eab06c4913 upstream. After setting indirect_sg_entries module_param to huge value (e.g 500,000), srp_alloc_req_data() fails to allocate indirect descriptors for the request ring (kmalloc fails). This commit enforces the maximum value of indirect_sg_entries to be SG_MAX_SEGMENTS as signified in module param description. Fixes: 65e8617fba17 (scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS) Fixes: c07d424d6118 (IB/srp: add support for indirect tables that don't fit in SRP_CMD) Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>-- Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26IB/srp: fix mr allocation when the device supports sg gapsIsrael Rukshin
commit ad8e66b4a80182174f73487ed25fd2140cf43361 upstream. If the device support arbitrary sg list mapping (device cap IB_DEVICE_SG_GAPS_REG set) we allocate the memory regions with IB_MR_TYPE_SG_GAPS. Fixes: 509c5f33f4f6 ("IB/srp: Prevent mapping failures") Cc: <stable@vger.kernel.org> # 4.7+ Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26IB/iser: Fix sg_tablesize calculationMax Gurtovoy
commit 1e5db6c31ade4150c2e2b1a21e39f776c38fea39 upstream. For devices that can register page list that is bigger than USHRT_MAX, we actually take the wrong value for sg_tablesize. E.g: for CX4 max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX) so we set sg_tablesize to 0 by mistake. Therefore, each IO that is bigger than 4k splitted to "< 4k" chunks that cause performance degredation. Remove wrong sg_tablesize assignment, and use the value that was set during address resolution handler with the needed casting. Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26s390/ptrace: Preserve previous registers for short regset writeMartin Schwidefsky
commit 9dce990d2cf57b5ed4e71a9cdbd7eae4335111ff upstream. Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. convert_vx_to_fp() is adapted to handle only a specified number of registers rather than unconditionally handling all of them: other callers of this function are adapted appropriately. Based on an initial patch by Dave Martin. Cc: stable@vger.kernel.org Reported-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26s390/mm: Fix cmma unused transfer from pgste into pteChristian Borntraeger
commit 0d6da872d3e4a60f43c295386d7ff9a4cdcd57e9 upstream. The last pgtable rework silently disabled the CMMA unused state by setting a local pte variable (a parameter) instead of propagating it back into the caller. Fix it. Fixes: ebde765c0e85 ("s390/mm: uninline ptep_xxx functions from pgtable.h") Cc: stable@vger.kernel.org # v4.6+ Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabledJack Morgenstein
commit b4cfe3971f6eab542dd7ecc398bfa1aeec889934 upstream. If IPV6 has not been enabled in the underlying kernel, we must avoid calling IPV6 procedures in rdma_cm.ko. This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements surrounding any code which calls external IPV6 procedures. In the instance fixed here, procedure cma_bind_addr() called ipv6_addr_type() -- which resulted in calling external procedure __ipv6_addr_type(). Fixes: 6c26a77124ff ("RDMA/cma: fix IPv6 address resolution") Cc: <stable@vger.kernel.org> # v4.2+ Cc: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operationsOmar Sandoval
commit 57b59ed2e5b91e958843609c7884794e29e6c4cb upstream. Subvolume directory inodes can't have ACLs. Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26Btrfs: remove old tree_root case in btrfs_read_locked_inode()Omar Sandoval
commit 67ade058ef2c65a3e56878af9c293ec76722a2e5 upstream. As Jeff explained in c2951f32d36c ("btrfs: remove old tree_root dirent processing in btrfs_real_readdir()"), supporting this old format is no longer necessary since the Btrfs magic number has been updated since we changed to the current format. There are other places where we still handle this old format, but since this is part of a fix that is going to stable, I'm only removing this one for now. Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26ISDN: eicon: silence misleading array-bounds warningArnd Bergmann
commit 950eabbd6ddedc1b08350b9169a6a51b130ebaaf upstream. With some gcc versions, we get a warning about the eicon driver, and that currently shows up as the only remaining warning in one of the build bots: In file included from ../drivers/isdn/hardware/eicon/message.c:30:0: eicon/message.c: In function 'mixer_notify_update': eicon/platform.h:333:18: warning: array subscript is above array bounds [-Warray-bounds] The code is easily changed to open-code the unusual PUT_WORD() line causing this to avoid the warning. Cc: stable@vger.kernel.org Link: http://arm-soc.lixom.net/buildlogs/stable-rc/v4.4.45/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26xfs: prevent quotacheck from overloading inode lruBrian Foster
commit e0d76fa4475ef2cf4b52d18588b8ce95153d021b upstream. Quotacheck runs at mount time in situations where quota accounting must be recalculated. In doing so, it uses bulkstat to visit every inode in the filesystem. Historically, every inode processed during quotacheck was released and immediately tagged for reclaim because quotacheck runs before the superblock is marked active by the VFS. In other words, the final iput() lead to an immediate ->destroy_inode() call, which allowed the XFS background reclaim worker to start reclaiming inodes. Commit 17c12bcd3 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped") marks the XFS superblock active sooner as part of the mount process to support caching inodes processed during log recovery. This occurs before quotacheck and thus means all inodes processed by quotacheck are inserted to the LRU on release. The s_umount lock is held until the mount has completed and thus prevents the shrinkers from operating on the sb. This means that quotacheck can excessively populate the inode LRU and lead to OOM conditions on systems without sufficient RAM. Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on inodes processed by quotacheck. This causes ->drop_inode() to return 1 and in turn causes iput_final() to evict the inode. This preserves the original quotacheck behavior and prevents it from overloading the LRU and running out of memory. CC: stable@vger.kernel.org # v4.9 Reported-by: Martin Svec <martin.svec@zoner.cz> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26sysctl: fix proc_doulongvec_ms_jiffies_minmax()Eric Dumazet
commit ff9f8a7cf935468a94d9927c68b00daae701667e upstream. We perform the conversion between kernel jiffies and ms only when exporting kernel value to user space. We need to do the opposite operation when value is written by user. Only matters when HZ != 1000 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26vring: Force use of DMA API for ARM-based systems with legacy devicesWill Deacon
commit c7070619f3408d9a0dffbed9149e6f00479cf43b upstream. Booting Linux on an ARM fastmodel containing an SMMU emulation results in an unexpected I/O page fault from the legacy virtio-blk PCI device: [ 1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received: [ 1.211800] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010 [ 1.211880] arm-smmu-v3 2b400000.smmu: 0x0000020800000000 [ 1.211959] arm-smmu-v3 2b400000.smmu: 0x00000008fa081002 [ 1.212075] arm-smmu-v3 2b400000.smmu: 0x0000000000000000 [ 1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received: [ 1.212234] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010 [ 1.212314] arm-smmu-v3 2b400000.smmu: 0x0000020800000000 [ 1.212394] arm-smmu-v3 2b400000.smmu: 0x00000008fa081000 [ 1.212471] arm-smmu-v3 2b400000.smmu: 0x0000000000000000 <system hangs failing to read partition table> This is because the legacy virtio-blk device is behind an SMMU, so we have consequently swizzled its DMA ops and configured the SMMU to translate accesses. This then requires the vring code to use the DMA API to establish translations, otherwise all transactions will result in fatal faults and termination. Given that ARM-based systems only see an SMMU if one is really present (the topology is all described by firmware tables such as device-tree or IORT), then we can safely use the DMA API for all legacy virtio devices. Modern devices can advertise the prescense of an IOMMU using the VIRTIO_F_IOMMU_PLATFORM feature flag. Cc: Andy Lutomirski <luto@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: <stable@vger.kernel.org> Fixes: 876945dbf649 ("arm64: Hook up IOMMU dma_ops") Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm, page_alloc: fix premature OOM when racing with cpuset mems updateVlastimil Babka
commit e47483bca2cc59a4593b37a270b16ee42b1d9f08 upstream. Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode triggers OOM killer in few seconds, despite lots of free memory. The test attempts to repeatedly fault in memory in one process in a cpuset, while changing allowed nodes of the cpuset between 0 and 1 in another process. The problem comes from insufficient protection against cpuset changes, which can cause get_page_from_freelist() to consider all zones as non-eligible due to nodemask and/or current->mems_allowed. This was masked in the past by sufficient retries, but since commit 682a3385e773 ("mm, page_alloc: inline the fast path of the zonelist iterator") we fix the preferred_zoneref once, and don't iterate over the whole zonelist in further attempts, thus the only eligible zones might be placed in the zonelist before our starting point and we always miss them. A previous patch fixed this problem for current->mems_allowed. However, cpuset changes also update the task's mempolicy nodemask. The fix has two parts. We have to repeat the preferred_zoneref search when we detect cpuset update by way of seqcount, and we have to check the seqcount before considering OOM. [akpm@linux-foundation.org: fix typo in comment] Link: http://lkml.kernel.org/r/20170120103843.24587-5-vbabka@suse.cz Fixes: c33d6c06f60f ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm, page_alloc: move cpuset seqcount checking to slowpathVlastimil Babka
commit 5ce9bfef1d27944c119a397a9d827bef795487ce upstream. This is a preparation for the following patch to make review simpler. While the primary motivation is a bug fix, this also simplifies the fast path, although the moved code is only enabled when cpusets are in use. Link: http://lkml.kernel.org/r/20170120103843.24587-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm, page_alloc: fix fast-path race with cpuset update or removalVlastimil Babka
commit 16096c25bf0ca5d87e4fa6ec6108ba53feead212 upstream. Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode triggers OOM killer in few seconds, despite lots of free memory. The test attempts to repeatedly fault in memory in one process in a cpuset, while changing allowed nodes of the cpuset between 0 and 1 in another process. One possible cause is that in the fast path we find the preferred zoneref according to current mems_allowed, so that it points to the middle of the zonelist, skipping e.g. zones of node 1 completely. If the mems_allowed is updated to contain only node 1, we never reach it in the zonelist, and trigger OOM before checking the cpuset_mems_cookie. This patch fixes the particular case by redoing the preferred zoneref search if we switch back to the original nodemask. The condition is also slightly changed so that when the last non-root cpuset is removed, we don't miss it. Note that this is not a full fix, and more patches will follow. Link: http://lkml.kernel.org/r/20170120103843.24587-3-vbabka@suse.cz Fixes: 682a3385e773 ("mm, page_alloc: inline the fast path of the zonelist iterator") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm, page_alloc: fix check for NULL preferred_zoneVlastimil Babka
commit ea57485af8f4221312a5a95d63c382b45e7840dc upstream. Patch series "fix premature OOM regression in 4.7+ due to cpuset races". This is v2 of my attempt to fix the recent report based on LTP cpuset stress test [1]. The intention is to go to stable 4.9 LTSS with this, as triggering repeated OOMs is not nice. That's why the patches try to be not too intrusive. Unfortunately why investigating I found that modifying the testcase to use per-VMA policies instead of per-task policies will bring the OOM's back, but that seems to be much older and harder to fix problem. I have posted a RFC [2] but I believe that fixing the recent regressions has a higher priority. Longer-term we might try to think how to fix the cpuset mess in a better and less error prone way. I was for example very surprised to learn, that cpuset updates change not only task->mems_allowed, but also nodemask of mempolicies. Until now I expected the parameter to alloc_pages_nodemask() to be stable. I wonder why do we then treat cpusets specially in get_page_from_freelist() and distinguish HARDWALL etc, when there's unconditional intersection between mempolicy and cpuset. I would expect the nodemask adjustment for saving overhead in g_p_f(), but that clearly doesn't happen in the current form. So we have both crazy complexity and overhead, AFAICS. [1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com [2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz This patch (of 4): Since commit c33d6c06f60f ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") we have a wrong check for NULL preferred_zone, which can theoretically happen due to concurrent cpuset modification. We check the zoneref pointer which is never NULL and we should check the zone pointer. Also document this in first_zones_zonelist() comment per Michal Hocko. Fixes: c33d6c06f60f ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm/mempolicy.c: do not put mempolicy before using its nodemaskVlastimil Babka
commit d51e9894d27492783fc6d1b489070b4ba66ce969 upstream. Since commit be97a41b291e ("mm/mempolicy.c: merge alloc_hugepage_vma to alloc_pages_vma") alloc_pages_vma() can potentially free a mempolicy by mpol_cond_put() before accessing the embedded nodemask by __alloc_pages_nodemask(). The commit log says it's so "we can use a single exit path within the function" but that's clearly wrong. We can still do that when doing mpol_cond_put() after the allocation attempt. Make sure the mempolicy is not freed prematurely, otherwise __alloc_pages_nodemask() can end up using a bogus nodemask, which could lead e.g. to premature OOM. Fixes: be97a41b291e ("mm/mempolicy.c: merge alloc_hugepage_vma to alloc_pages_vma") Link: http://lkml.kernel.org/r/20170118141124.8345-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26mm/huge_memory.c: respect FOLL_FORCE/FOLL_COW for thpKeno Fischer
commit 8310d48b125d19fcd9521d83b8293e63eb1646aa upstream. In commit 19be0eaffa3a ("mm: remove gup_flags FOLL_WRITE games from __get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE after a COW was resolved to setting the (newly introduced) FOLL_COW instead. Simultaneously, the check in gup.c was updated to still allow writes with FOLL_FORCE set if FOLL_COW had also been set. However, a similar check in huge_memory.c was forgotten. As a result, remote memory writes to ro regions of memory backed by transparent huge pages cause an infinite loop in the kernel (handle_mm_fault sets FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails out immidiately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is true. While in this state the process is stil SIGKILLable, but little else works (e.g. no ptrace attach, no other signals). This is easily reproduced with the following code (assuming thp are set to always): #include <assert.h> #include <fcntl.h> #include <stdint.h> #include <stdio.h> #include <string.h> #include <sys/mman.h> #include <sys/stat.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #define TEST_SIZE 5 * 1024 * 1024 int main(void) { int status; pid_t child; int fd = open("/proc/self/mem", O_RDWR); void *addr = mmap(NULL, TEST_SIZE, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); assert(addr != MAP_FAILED); pid_t parent_pid = getpid(); if ((child = fork()) == 0) { void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); assert(addr2 != MAP_FAILED); memset(addr2, 'a', TEST_SIZE); pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr); return 0; } assert(child == waitpid(child, &status, 0)); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0); return 0; } Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously to the update in gup.c in the original commit. The same pattern exists in follow_devmap_pmd. However, we should not be able to reach that check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we ever do. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.com Signed-off-by: Keno Fischer <keno@juliacomputing.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Greg Thelen <gthelen@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/atomic: clear out fence when duplicating stateLucas Stach
commit a2104c7cd3b24c4329b6193f7ec0882ce612f110 upstream. [Fixed differently in 4.10] The fence needs to be cleared out, otherwise the following commit might wait on a stale fence from the previous commit. This was fixed as a side effect of 9626014258a5 (drm/fence: add in-fences support) in kernel 4.10. As this commit introduces new functionality and as such can not be applied to stable, this patch is the minimal fix for the kernel 4.9 stable series. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/vc4: fix a bounds checkDan Carpenter
commit 21ccc32496b2f63228f5232b3ac0e426e8fb3c31 upstream. We accidentally return success even if vc4_full_res_bounds_check() fails. Fixes: d5b1a78a772f ("drm/vc4: Add support for drawing 3D frames.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/vc4: Return -EINVAL on the overflow checks failing.Eric Anholt
commit 6b8ac63847bc2f958dd93c09edc941a0118992d9 upstream. By failing to set the errno, we'd continue on to trying to set up the RCL, and then oops on trying to dereference the tile_bo that binning validation should have set up. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: d5b1a78a772f ("drm/vc4: Add support for drawing 3D frames.") Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/vc4: Fix an integer overflow in temporary allocation layout.Eric Anholt
commit 0f2ff82e11c86c05d051cae32b58226392d33bbf upstream. We copy the unvalidated ioctl arguments from the user into kernel temporary memory to run the validation from, to avoid a race where the user updates the unvalidate contents in between validating them and copying them into the validated BO. However, in setting up the layout of the kernel side, we failed to check one of the additions (the roundup() for shader_rec_offset) against integer overflow, allowing a nearly MAX_UINT value of bin_cl_size to cause us to under-allocate the temporary space that we then copy_from_user into. Reported-by: Murray McAllister <murray.mcallister@insomniasec.com> Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: d5b1a78a772f ("drm/vc4: Add support for drawing 3D frames.") Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2017-02-26drm/vc4: Fix memory leak of the CRTC state.Eric Anholt
commit 7622b25543665567d8830a63210385b7d705924b upstream. The underscores variant frees the pointers inside, while the no-underscores variant calls underscores and then frees the struct. Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: d8dbf44f13b9 ("drm/vc4: Make the CRTCs cooperate on allocating display lists.") Cc: stable@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>