summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2016-10-31libnvdimm: clear the internal poison_list when clearing badblocksVishal Verma
commit e046114af5fcafe8d6d3f0b6ccb99804bad34bfb upstream. nvdimm_clear_poison cleared the user-visible badblocks, and sent commands to the NVDIMM to clear the areas marked as 'poison', but it neglected to clear the same areas from the internal poison_list which is used to marshal ARS results before sorting them by namespace. As a result, once on-demand ARS functionality was added: 37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand A scrub triggered from either sysfs or an MCE was found to be adding stale entries that had been cleared from gendisk->badblocks, but were still present in nvdimm_bus->poison_list. Additionally, the stale entries could be triggered into producing stale disk->badblocks by simply disabling and re-enabling the namespace or region. This adds the missing step of clearing poison_list entries when clearing poison, so that it is always in sync with badblocks. Fixes: 37b137f ("nfit, libnvdimm: allow an ARS scrub to be triggered on demand") Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31mm/hugetlb: check for reserved hugepages during memory offlineGerald Schaefer
commit 082d5b6b60e9f25e1511557fcfcb21eedd267446 upstream. In dissolve_free_huge_pages(), free hugepages will be dissolved without making sure that there are enough of them left to satisfy hugepage reservations. Fix this by adding a return value to dissolve_free_huge_pages() and checking h->free_huge_pages vs. h->resv_huge_pages. Note that this may lead to the situation where dissolve_free_huge_page() returns an error and all free hugepages that were dissolved before that error are lost, while the memory block still cannot be set offline. Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Link: http://lkml.kernel.org/r/20160926172811.94033-3-gerald.schaefer@de.ibm.com Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rui Teng <rui.teng@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31posix_acl: Clear SGID bit when setting file permissionsJan Kara
commit 073931017b49d9458aa351605b43a7e34598caef upstream. When file permissions are modified via chmod(2) and the user is not in the owning group or capable of CAP_FSETID, the setgid bit is cleared in inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file permissions as well as the new ACL, but doesn't clear the setgid bit in a similar way; this allows to bypass the check in chmod(2). Fix that. References: CVE-2016-7097 Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Juerg Haefliger <juerg.haefliger@hpe.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-31drm/prime: Pass the right module owner through to dma_buf_export()Chris Wilson
commit 56a76c0123d6cb034975901c80fce2627338ef9e upstream. dma_buf_export() adds a reference to the owning module to the dmabuf (to prevent the driver from being unloaded whilst a third party still refers to the dmabuf). However, drm_gem_prime_export() was passing its own THIS_MODULE (i.e. drm.ko) rather than the driver. Extract the right owner from the device->fops instead. v2: Use C99 initializers to zero out unset elements of dma_buf_export_info v3: Extract the right module from dev->fops. Testcase: igt/vgem_basic/unload Reported-by: Petri Latvala <petri.latvala@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Petri Latvala <petri.latvala@intel.com> Cc: Christian König <christian.koenig@amd.com> Tested-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161005122145.1507-1-chris@chris-wilson.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLENicholas Bellinger
commit 449a137846c84829a328757cd21fd9ca65c08519 upstream. This patch addresses a bug where EXTENDED_COPY across multiple LUNs results in a CHECK_CONDITION when the source + destination are not located on the same physical node. ESX Host environments expect sense COPY_ABORTED w/ COPY TARGET DEVICE NOT REACHABLE to be returned when this occurs, in order to signal fallback to local copy method. As described in section 6.3.3 of spc4r22: "If it is not possible to complete processing of a segment because the copy manager is unable to establish communications with a copy target device, because the copy target device does not respond to INQUIRY, or because the data returned in response to INQUIRY indicates an unsupported logical unit, then the EXTENDED COPY command shall be terminated with CHECK CONDITION status, with the sense key set to COPY ABORTED, and the additional sense code set to COPY TARGET DEVICE NOT REACHABLE." Tested on v4.1.y with ESX v5.5u2+ with BlockCopy across multiple nodes. Reported-by: Nixon Vincent <nixon.vincent@calsoftinc.com> Tested-by: Nixon Vincent <nixon.vincent@calsoftinc.com> Cc: Nixon Vincent <nixon.vincent@calsoftinc.com> Tested-by: Dinesh Israni <ddi@datera.io> Signed-off-by: Dinesh Israni <ddi@datera.io> Cc: Dinesh Israni <ddi@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28irqchip/gic-v3-its: Fix entry size mask for GITS_BASERVladimir Murzin
commit 9224eb77e63f70f16c0b6b7a20ca7d395f3bc077 upstream. Entry Size in GITS_BASER<n> occupies 5 bits [52:48], but we mask out 8 bits. Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28cpufreq: fix overflow in cpufreq_table_find_index_dl()Sergey Senozhatsky
commit c6fe46a79ecd79606bb96fada4515f6b23f87b62 upstream. 'best' is always less or equals to 'pos', so `best - pos' returns a negative value which is then getting casted to `unsigned int' and passed to __cpufreq_driver_target()->acpi_cpufreq_target() for policy->freq_table selection. This results in BUG: unable to handle kernel paging request at ffff881019b469f8 IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq] PGD 267f067 PUD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty Workqueue: events dbs_work_handler task: ffff88041b808000 task.stack: ffff88041b810000 RIP: 0010:[<ffffffffa00356c1>] [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq] RSP: 0018:ffff88041b813c60 EFLAGS: 00010282 RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80 RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400 RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040 R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000 R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4 FS: 0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0 Stack: ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663 ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000 0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc Call Trace: [<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc [<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6 [<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e [<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135 [<ffffffff81336fda>] dbs_work_handler+0x39/0x62 [<ffffffff81057823>] process_one_work+0x280/0x4a5 [<ffffffff81058719>] worker_thread+0x24f/0x397 [<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b [<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a [<ffffffff8105d2b7>] kthread+0xfc/0x104 [<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20 [<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f [<ffffffff814b2092>] ret_from_fork+0x22/0x30 Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41 08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45 3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00 RIP [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq] RSP <ffff88041b813c60> CR2: ffff881019b469f8 ---[ end trace 16d9fc7a17897d37 ]--- [ rjw: In some cases this bug may also cause incorrect frequencies to be selected by cpufreq governors. ] Fixes: 899bb6642f2a (cpufreq: skip invalid entries when searching the frequency) Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2 Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28cpufreq: skip invalid entries when searching the frequencyAaro Koskinen
commit 899bb6642f2a2f2cd3f77abd6c5a14550e3b37e6 upstream. Skip invalid entries when searching the frequency. This fixes cpufreq at least on loongson2 MIPS board. Fixes: da0c6dc00c69 (cpufreq: Handle sorted frequency tables more efficiently) Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28PM / devfreq: event: remove duplicate devfreq_event_get_drvdata()Lin Huang
commit c8a9a6daccad495c48d5435d3487956ce01bc6a1 upstream. there define two devfreq_event_get_drvdata() function in devfreq-event.h when disable CONFIG_PM_DEVFREQ_EVENT, it will lead to build fail. So remove devfreq_event_get_drvdata() function. Fixes: f262f28c1470 ("PM / devfreq: event: Add devfreq_event class") Signed-off-by: Lin Huang <hl@rock-chips.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28clk: imx6: fix i.MX6DL clock tree to reflect realityLucas Stach
commit b1d51b448e4e6a392283b3eab06a7c5ec6d8a4e2 upstream. The current clock tree only implements the minimal set of differences between the i.MX6Q and the i.MX6DL, but that doesn't really reflect reality. Apply the following fixes to match the RM: - DL has no GPU3D_SHADER_SEL/PODF, the shader domain is clocked by GPU3D_CORE - GPU3D_SHADER_SEL/PODF has been repurposed as GPU2D_CORE_SEL/PODF - GPU2D_CORE_SEL/PODF has been repurposed as MLB_SEL/PODF Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22vfs: move permission checking into notify_change() for utimes(NULL)Miklos Szeredi
commit f2b20f6ee842313a0d681dbbf7f87b70291a6a3b upstream. This fixes a bug where the permission was not properly checked in overlayfs. The testcase is ltp/utimensat01. It is also cleaner and safer to do the permission checking in the vfs helper instead of the caller. This patch introduces an additional ia_valid flag ATTR_TOUCH (since touch(1) is the most obvious user of utimes(NULL)) that is passed into notify_change whenever the conditions for this special permission checking mode are met. Reported-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22crypto: ghash-generic - move common definitions to a new header fileMarcelo Cerri
commit a397ba829d7f8aff4c90af3704573a28ccd61a59 upstream. Move common values and types used by ghash-generic to a new header file so drivers can directly use ghash-generic as a fallback implementation. Fixes: cc333cd68dfa ("crypto: vmx - Adding GHASH routines for VMX module") Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22ipc/sem.c: fix complex_count vs. simple op raceManfred Spraul
commit 5864a2fd3088db73d47942370d0f7210a807b9bc upstream. Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a race: sem_lock has a fast path that allows parallel simple operations. There are two reasons why a simple operation cannot run in parallel: - a non-simple operations is ongoing (sma->sem_perm.lock held) - a complex operation is sleeping (sma->complex_count != 0) As both facts are stored independently, a thread can bypass the current checks by sleeping in the right positions. See below for more details (or kernel bugzilla 105651). The patch fixes that by creating one variable (complex_mode) that tracks both reasons why parallel operations are not possible. The patch also updates stale documentation regarding the locking. With regards to stable kernels: The patch is required for all kernels that include the commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") (3.10?) The alternative is to revert the patch that introduced the race. The patch is safe for backporting, i.e. it makes no assumptions about memory barriers in spin_unlock_wait(). Background: Here is the race of the current implementation: Thread A: (simple op) - does the first "sma->complex_count == 0" test Thread B: (complex op) - does sem_lock(): This includes an array scan. But the scan can't find Thread A, because Thread A does not own sem->lock yet. - the thread does the operation, increases complex_count, drops sem_lock, sleeps Thread A: - spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock) - sleeps before the complex_count test Thread C: (complex op) - does sem_lock (no array scan, complex_count==1) - wakes up Thread B. - decrements complex_count Thread A: - does the complex_count test Bug: Now both thread A and thread C operate on the same array, without any synchronization. Fixes: 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com Reported-by: <felixh@informatik.uni-bremen.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: <1vier1@web.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22mm: filemap: don't plant shadow entries without radix tree nodeJohannes Weiner
commit d3798ae8c6f3767c726403c2ca6ecc317752c9dd upstream. When the underflow checks were added to workingset_node_shadow_dec(), they triggered immediately: kernel BUG at ./include/linux/swap.h:276! invalid opcode: 0000 [#1] SMP Modules linked in: isofs usb_storage fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_REJECT nf_reject_ipv6 soundcore wmi acpi_als pinctrl_sunrisepoint kfifo_buf tpm_tis industrialio acpi_pad pinctrl_intel tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt CPU: 0 PID: 20929 Comm: blkid Not tainted 4.8.0-rc8-00087-gbe67d60ba944 #1 Hardware name: System manufacturer System Product Name/Z170-K, BIOS 1803 05/06/2016 task: ffff8faa93ecd940 task.stack: ffff8faa7f478000 RIP: page_cache_tree_insert+0xf1/0x100 Call Trace: __add_to_page_cache_locked+0x12e/0x270 add_to_page_cache_lru+0x4e/0xe0 mpage_readpages+0x112/0x1d0 blkdev_readpages+0x1d/0x20 __do_page_cache_readahead+0x1ad/0x290 force_page_cache_readahead+0xaa/0x100 page_cache_sync_readahead+0x3f/0x50 generic_file_read_iter+0x5af/0x740 blkdev_read_iter+0x35/0x40 __vfs_read+0xe1/0x130 vfs_read+0x96/0x130 SyS_read+0x55/0xc0 entry_SYSCALL_64_fastpath+0x13/0x8f Code: 03 00 48 8b 5d d8 65 48 33 1c 25 28 00 00 00 44 89 e8 75 19 48 83 c4 18 5b 41 5c 41 5d 41 5e 5d c3 0f 0b 41 bd ef ff ff ff eb d7 <0f> 0b e8 88 68 ef ff 0f 1f 84 00 RIP page_cache_tree_insert+0xf1/0x100 This is a long-standing bug in the way shadow entries are accounted in the radix tree nodes. The shrinker needs to know when radix tree nodes contain only shadow entries, no pages, so node->count is split in half to count shadows in the upper bits and pages in the lower bits. Unfortunately, the radix tree implementation doesn't know of this and assumes all entries are in node->count. When there is a shadow entry directly in root->rnode and the tree is later extended, the radix tree implementation will copy that entry into the new node and and bump its node->count, i.e. increases the page count bits. Once the shadow gets removed and we subtract from the upper counter, node->count underflows and triggers the warning. Afterwards, without node->count reaching 0 again, the radix tree node is leaked. Limit shadow entries to when we have actual radix tree nodes and can count them properly. That means we lose the ability to detect refaults from files that had only the first page faulted in at eviction time. Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22Btrfs: catch invalid free space treesOmar Sandoval
commit 6675df311db87aa2107a04ef97e19420953cbace upstream. There are two separate issues that can lead to corrupted free space trees. 1. The free space tree bitmaps had an endianness issue on big-endian systems which is fixed by an earlier patch in this series. 2. btrfs-progs before v4.7.3 modified filesystems without updating the free space tree. To catch both of these issues at once, we need to force the free space tree to be rebuilt. To do so, add a FREE_SPACE_TREE_VALID compat_ro bit. If the bit isn't set, we know that it was either produced by a broken big-endian kernel or may have been corrupted by btrfs-progs. This also provides us with a way to add rudimentary read-write support for the free space tree to btrfs-progs: it can just clear this bit and have the kernel rebuild the free space tree. Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-22debugfs: introduce a public file_operations accessorChristian Lamparter
commit 86f0e06767dda7863d6d2a8f0b3b857e6ea876a0 upstream. This patch introduces an accessor which can be used by the users of debugfs (drivers, fs, ...) to get the original file_operations struct. It also removes the REAL_FOPS_DEREF macro in file.c and converts the code to use the public version. Previously, REAL_FOPS_DEREF was only available within the file.c of debugfs. But having a public getter available for debugfs users is important as some drivers (carl9170 and b43) use the pointer of the original file_operations in conjunction with container_of() within their debugfs implementations. Reviewed-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-20mm: remove gup_flags FOLL_WRITE games from __get_user_pages()Linus Torvalds
commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream. This is an ancient bug that was actually attempted to be fixed once (badly) by me eleven years ago in commit 4ceb5db9757a ("Fix get_user_pages() race for write access") but that was then undone due to problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug"). In the meantime, the s390 situation has long been fixed, and we can now fix it by checking the pte_dirty() bit properly (and do it better). The s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement software dirty bits") which made it into v3.9. Earlier kernels will have to look at the page state itself. Also, the VM has become more scalable, and what used a purely theoretical race back then has become easier to trigger. To fix it, we introduce a new internal FOLL_COW flag to mark the "yes, we already did a COW" rather than play racy games with FOLL_WRITE that is very fundamental, and then use the pte dirty flag to validate that the FOLL_COW flag is still valid. Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Nick Piggin <npiggin@gmail.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-20v4l: rcar-fcp: Don't force users to check for disabled FCP supportLaurent Pinchart
commit fd44aa9a254b18176ec3792a18e7de6977030ca8 upstream. The rcar_fcp_enable() function immediately returns successfully when the FCP device pointer is NULL to avoid forcing the users to check the FCP device manually before every call. However, the stub version of the function used when the FCP driver is disabled returns -ENOSYS unconditionally, resulting in a different API contract for the two versions of the function. As a user that requires FCP support will fail at probe time when calling rcar_fcp_get() if the FCP driver is disabled, the stub version of the rcar_fcp_enable() function will only be called with a NULL FCP device. We can thus return 0 unconditionally to align the behaviour with the normal version of the function. Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-16mfd: 88pm80x: Double shifting bug in suspend/resumeDan Carpenter
commit 9a6dc644512fd083400a96ac4a035ac154fe6b8d upstream. set_bit() and clear_bit() take the bit number so this code is really doing "1 << (1 << irq)" which is a double shift bug. It's done consistently so it won't cause a problem unless "irq" is more than 4. Fixes: 70c6cce04066 ('mfd: Support 88pm80x in 80x driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07Using BUG_ON() as an assert() is _never_ acceptableLinus Torvalds
commit 21f54ddae449f4bdd9f1498124901d67202243d9 upstream. That just generally kills the machine, and makes debugging only much harder, since the traces may long be gone. Debugging by assert() is a disease. Don't do it. If you can continue, you're much better off doing so with a live machine where you have a much higher chance that the report actually makes it to the system logs, rather than result in a machine that is just completely dead. The only valid situation for BUG_ON() is when continuing is not an option, because there is massive corruption. But if you are just verifying that something is true, you warn about your broken assumptions (preferably just once), and limp on. Fixes: 22f2ac51b6d6 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()") Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix wrong TCP checksums on MTU probing when checksum offloading is disabled, from Douglas Caetano dos Santos. 2) Fix qdisc backlog updates in qfq and sfb schedulers, from Cong Wang. 3) Route lookup flow key protocol value is wrong in ip6gre_xmit_other(), fix from Lance Richardson. 4) Scheduling while atomic in multicast routing code of ipv4 and ipv6, fix from Nikolay Aleksandrov. 5) Fix packet alignment in fec driver, from Eric Nelson. 6) Fix perf regression in sctp due to struct layout and cache misses, from Xin Long. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock sctp: change to check peer prsctp_capable when using prsctp polices sctp: remove prsctp_param from sctp_chunk sctp: move sent_count to the memory hole in sctp_chunk tg3: Avoid NULL pointer dereference in tg3_io_error_detected() act_ife: Fix false encoding act_ife: Fix external mac header on encode VSOCK: Don't dec ack backlog twice for rejected connections Revert "net: ethernet: bcmgenet: use phydev from struct net_device" net: fec: align IP header in hardware net: fec: remove QUIRK_HAS_RACC from i.mx27 net: fec: remove QUIRK_HAS_RACC from i.mx25 ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route ip6_gre: fix flowi6_proto value in ip6gre_xmit_other() tcp: fix a compile error in DBGUNDO() tcp: fix wrong checksum calculation on MTU probing sch_sfb: keep backlog updated with qlen sch_qfq: keep backlog updated with qlen can: dev: fix deadlock reported after bus-off
2016-10-01Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "One final fix before 4.8. There was a memory leak triggered by turning scsi mq off due to the fact that we assume on host release that the already running hosts weren't mq based because that's the state of the global flag (even though they were). Fix it by tracking this on a per host host basis" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: Avoid that toggling use_blk_mq triggers a memory leak
2016-09-30include/linux/property.h: fix typo/compile errorJohn Youn
This fixes commit d76eebfa175e ("include/linux/property.h: fix build issues with gcc-4.4.4"). With that commit we get the following compile error when using the PROPERTY_ENTRY_INTEGER_ARRAY macro. include/linux/property.h:201:39: error: `u32_data' undeclared (first use in this function) PROPERTY_ENTRY_INTEGER_ARRAY(_name_, u32, _val_) ^ include/linux/property.h:193:17: note: in definition of macro `PROPERTY_ENTRY_INTEGER_ARRAY' { .pointer = { _type_##_data = _val_ } }, \ ^ This needs a '.' to reference the union member. It seems this was just overlooked here since it is done correctly in similar constructs in other parts of the original commit. This fix is in preparation of upcoming commits that will use this macro. Fixes: commit d76eebfa175e ("include/linux/property.h: fix build issues with gcc-4.4.4") Link: http://lkml.kernel.org/r/2de3b929290d88a723ed829a3e3cbd02044714df.1475114627.git.johnyoun@synopsys.com Signed-off-by: John Youn <johnyoun@synopsys.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30mm: workingset: fix crash in shadow node shrinker caused by ↵Johannes Weiner
replace_page_cache_page() Antonio reports the following crash when using fuse under memory pressure: kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346! invalid opcode: 0000 [#1] SMP Modules linked in: all of them CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013 task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000 RIP: shadow_lru_isolate+0x181/0x190 Call Trace: __list_lru_walk_one.isra.3+0x8f/0x130 list_lru_walk_one+0x23/0x30 scan_shadow_nodes+0x34/0x50 shrink_slab.part.40+0x1ed/0x3d0 shrink_zone+0x2ca/0x2e0 kswapd+0x51e/0x990 kthread+0xd8/0xf0 ret_from_fork+0x3f/0x70 which corresponds to the following sanity check in the shadow node tracking: BUG_ON(node->count & RADIX_TREE_COUNT_MASK); The workingset code tracks radix tree nodes that exclusively contain shadow entries of evicted pages in them, and this (somewhat obscure) line checks whether there are real pages left that would interfere with reclaim of the radix tree node under memory pressure. While discussing ways how fuse might sneak pages into the radix tree past the workingset code, Miklos pointed to replace_page_cache_page(), and indeed there is a problem there: it properly accounts for the old page being removed - __delete_from_page_cache() does that - but then does a raw raw radix_tree_insert(), not accounting for the replacement page. Eventually the page count bits in node->count underflow while leaving the node incorrectly linked to the shadow node LRU. To address this, make sure replace_page_cache_page() uses the tracked page insertion code, page_cache_tree_insert(). This fixes the page accounting and makes sure page-containing nodes are properly unlinked from the shadow node LRU again. Also, make the sanity checks a bit less obscure by using the helpers for checking the number of pages and shadows in a radix tree node. Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check") Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Antonio SJ Musumeci <trapexit@spawn.link> Debugged-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org> [3.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30sctp: remove prsctp_param from sctp_chunkXin Long
Now sctp uses chunk->prsctp_param to save the prsctp param for all the prsctp polices, we didn't need to introduce prsctp_param to sctp_chunk. We can just use chunk->sinfo.sinfo_timetolive for RTX and BUF polices, and reuse msg->expires_at for TTL policy, as the prsctp polices and old expires policy are mutual exclusive. This patch is to remove prsctp_param from sctp_chunk, and reuse msg's expires_at for TTL and chunk's sinfo.sinfo_timetolive for RTX and BUF polices. Note that sctp can't use chunk's sinfo.sinfo_timetolive for TTL policy, as it needs a u64 variables to save the expires_at time. This one also fixes the "netperf-Throughput_Mbps -37.2% regression" issue. Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-30sctp: move sent_count to the memory hole in sctp_chunkXin Long
Now pahole sctp_chunk, it has 2 memory holes: struct sctp_chunk { struct list_head list; atomic_t refcnt; /* XXX 4 bytes hole, try to pack */ ... long unsigned int prsctp_param; int sent_count; /* XXX 4 bytes hole, try to pack */ This patch is to move up sent_count to fill the 1st one and eliminate the 2nd one. It's not just another struct compaction, it also fixes the "netperf- Throughput_Mbps -37.2% regression" issue when overloading the CPU. Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-28dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUGAndrey Smirnov
When CONFIG_DMA_API_DEBUG is enabled we need to preserve unmapping address even if "unmap" is a no-op for our architecutre because we need debug_dma_unmap_page() to correctly cleanup all of the debug bookkeeping. Failing to do so results in a false positive warnings about previously mapped areas never being unmapped. Link: http://lkml.kernel.org/r/1474387125-3713-1-git-send-email-andrew.smirnov@gmail.com Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Geliang Tang <geliangtang@163.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-27Merge remote-tracking branch 'mkp-scsi/4.8/scsi-fixes' into fixesJames Bottomley
2016-09-26scsi: Avoid that toggling use_blk_mq triggers a memory leakBart Van Assche
This patch avoids that the following memory leak is triggered if use_blk_mq is disabled after a SCSI host has been allocated by the ib_srp driver and before the same SCSI host is freed: unreferenced object 0xffff8803a168c568 (size 256): backtrace: [<ffffffff81620c95>] kmemleak_alloc+0x45/0xa0 [<ffffffff811bb104>] __kmalloc_node+0x1e4/0x400 [<ffffffff81309fe4>] blk_mq_alloc_tag_set+0xb4/0x230 [<ffffffff814731b7>] scsi_mq_setup_tags+0xc7/0xd0 [<ffffffff81469c26>] scsi_add_host_with_dma+0x216/0x2d0 [<ffffffffa064bef5>] srp_create_target+0xe55/0x13d0 [ib_srp] [<ffffffff8143ce23>] dev_attr_store+0x13/0x20 [<ffffffff8125f030>] sysfs_kf_write+0x40/0x50 [<ffffffff8125e397>] kernfs_fop_write+0x137/0x1c0 [<ffffffff811d8c13>] __vfs_write+0x23/0x140 [<ffffffff811d92e0>] vfs_write+0xb0/0x190 [<ffffffff811da5b4>] SyS_write+0x44/0xa0 [<ffffffff8162c8a5>] entry_SYSCALL_64_fastpath+0x18/0xa8 Fixes: 9aa9cc4221f5 ("scsi: remove the disable_blk_mq host flag") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-09-25ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_routeNikolay Aleksandrov
Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid instead of the previous dst_pid which was copied from in_skb's portid. Since the skb is new the portid is 0 at that point so the packets are sent to the kernel and we get scheduling while atomic or a deadlock (depending on where it happens) by trying to acquire rtnl two times. Also since this is RTM_GETROUTE, it can be triggered by a normal user. Here's the sleeping while atomic trace: [ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620 [ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0 [ 7858.212881] 2 locks held by swapper/0/0: [ 7858.213013] #0: (((&mrt->ipmr_expire_timer))){+.-...}, at: [<ffffffff810fbbf5>] call_timer_fn+0x5/0x350 [ 7858.213422] #1: (mfc_unres_lock){+.....}, at: [<ffffffff8161e005>] ipmr_expire_process+0x25/0x130 [ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179 [ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 7858.214108] 0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000 [ 7858.214412] ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e [ 7858.214716] 000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f [ 7858.215251] Call Trace: [ 7858.215412] <IRQ> [<ffffffff813a7804>] dump_stack+0x85/0xc1 [ 7858.215662] [<ffffffff810a4a72>] ___might_sleep+0x192/0x250 [ 7858.215868] [<ffffffff810a4b9f>] __might_sleep+0x6f/0x100 [ 7858.216072] [<ffffffff8165bea3>] mutex_lock_nested+0x33/0x4d0 [ 7858.216279] [<ffffffff815a7a5f>] ? netlink_lookup+0x25f/0x460 [ 7858.216487] [<ffffffff8157474b>] rtnetlink_rcv+0x1b/0x40 [ 7858.216687] [<ffffffff815a9a0c>] netlink_unicast+0x19c/0x260 [ 7858.216900] [<ffffffff81573c70>] rtnl_unicast+0x20/0x30 [ 7858.217128] [<ffffffff8161cd39>] ipmr_destroy_unres+0xa9/0xf0 [ 7858.217351] [<ffffffff8161e06f>] ipmr_expire_process+0x8f/0x130 [ 7858.217581] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180 [ 7858.217785] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180 [ 7858.217990] [<ffffffff810fbc95>] call_timer_fn+0xa5/0x350 [ 7858.218192] [<ffffffff810fbbf5>] ? call_timer_fn+0x5/0x350 [ 7858.218415] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180 [ 7858.218656] [<ffffffff810fde10>] run_timer_softirq+0x260/0x640 [ 7858.218865] [<ffffffff8166379b>] ? __do_softirq+0xbb/0x54f [ 7858.219068] [<ffffffff816637c8>] __do_softirq+0xe8/0x54f [ 7858.219269] [<ffffffff8107a948>] irq_exit+0xb8/0xc0 [ 7858.219463] [<ffffffff81663452>] smp_apic_timer_interrupt+0x42/0x50 [ 7858.219678] [<ffffffff816625bc>] apic_timer_interrupt+0x8c/0xa0 [ 7858.219897] <EOI> [<ffffffff81055f16>] ? native_safe_halt+0x6/0x10 [ 7858.220165] [<ffffffff810d64dd>] ? trace_hardirqs_on+0xd/0x10 [ 7858.220373] [<ffffffff810298e3>] default_idle+0x23/0x190 [ 7858.220574] [<ffffffff8102a20f>] arch_cpu_idle+0xf/0x20 [ 7858.220790] [<ffffffff810c9f8c>] default_idle_call+0x4c/0x60 [ 7858.221016] [<ffffffff810ca33b>] cpu_startup_entry+0x39b/0x4d0 [ 7858.221257] [<ffffffff8164f995>] rest_init+0x135/0x140 [ 7858.221469] [<ffffffff81f83014>] start_kernel+0x50e/0x51b [ 7858.221670] [<ffffffff81f82120>] ? early_idt_handler_array+0x120/0x120 [ 7858.221894] [<ffffffff81f8243f>] x86_64_start_reservations+0x2a/0x2c [ 7858.222113] [<ffffffff81f8257c>] x86_64_start_kernel+0x13b/0x14a Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-25fault_in_multipages_readable() throws set-but-unused errorDave Chinner
When building XFS with -Werror, it now fails with: include/linux/pagemap.h: In function 'fault_in_multipages_readable': include/linux/pagemap.h:602:16: error: variable 'c' set but not used [-Werror=unused-but-set-variable] volatile char c; ^ This is a regression caused by commit e23d4159b109 ("fix fault_in_multipages_...() on architectures with no-op access_ok()"). Fix it by re-adding the "(void)c" trick taht was previously used to make the compiler think the variable is used. Signed-off-by: Dave Chinner <dchinner@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-23Merge tag 'linux-can-fixes-for-4.8-20160922' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-09-22 this is a pull request of one patch for the upcoming linux-4.8 release. The patch by Sergei Miroshnichenko fixes a potential deadlock in the generic CAN device code that cann occour after a bus-off. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-22Merge tag 'media/v4.8-7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - several fixes for new drivers added for Kernel 4.8 addition (cec core, pulse8 cec driver and Mediatek vcodec) - a regression fix for cx23885 and saa7134 drivers - an important fix for rcar-fcp, making rcar_fcp_enable() return 0 on success * tag 'media/v4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (25 commits) [media] cx23885/saa7134: assign q->dev to the PCI device [media] rcar-fcp: Make sure rcar_fcp_enable() returns 0 on success [media] cec: fix ioctl return code when not registered [media] cec: don't Feature Abort broadcast msgs when unregistered [media] vcodec:mediatek: Refine VP8 encoder driver [media] vcodec:mediatek: Refine H264 encoder driver [media] vcodec:mediatek: change H264 profile default to profile high [media] vcodec:mediatek: Add timestamp and timecode copy for V4L2 Encoder [media] vcodec:mediatek: Fix visible_height larger than coded_height issue in s_fmt_out [media] vcodec:mediatek: Fix fops_vcodec_release flow for V4L2 Encoder [media] vcodec:mediatek:code refine for v4l2 Encoder driver [media] cec-funcs.h: add missing vendor-specific messages [media] cec-edid: check for IEEE identifier [media] pulse8-cec: fix error handling [media] pulse8-cec: set correct Signal Free Time [media] mtk-vcodec: add HAS_DMA dependency [media] cec: ignore messages when log_addr_mask == 0 [media] cec: add item to TODO [media] cec: set unclaimed addresses to CEC_LOG_ADDR_INVALID [media] cec: add CEC_LOG_ADDRS_FL_ALLOW_UNREG_FALLBACK flag ...
2016-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Mostly small bits scattered all over the place, which is usually how things go this late in the -rc series. 1) Proper driver init device resets in bnx2, from Baoquan He. 2) Fix accounting overflow in __tcp_retransmit_skb(), sk_forward_alloc, and ip_idents_reserve, from Eric Dumazet. 3) Fix crash in bna driver ethtool stats handling, from Ivan Vecera. 4) Missing check of skb_linearize() return value in mac80211, from Johannes Berg. 5) Endianness fix in nf_table_trace dumps, from Liping Zhang. 6) SSN comparison fix in SCTP, from Marcelo Ricardo Leitner. 7) Update DSA and b44 MAINTAINERS entries. 8) Make input path of vti6 driver work again, from Nicolas Dichtel. 9) Off-by-one in mlx4, from Sebastian Ott. 10) Fix fallback route lookup handling in ipv6, from Vincent Bernat. 11) Fix stack corruption on probe in qed driver, from Yuval Mintz. 12) PHY init fixes in r8152 from Hayes Wang. 13) Missing SKB free in irda_accept error path, from Phil Turnbull" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) tcp: properly account Fast Open SYN-ACK retrans tcp: fix under-accounting retransmit SNMP counters MAINTAINERS: Update b44 maintainer. net: get rid of an signed integer overflow in ip_idents_reserve() net/mlx4_core: Fix to clean devlink resources net: can: ifi: Configure transmitter delay vti6: fix input path ipmr, ip6mr: return lastuse relative to now r8152: disable ALDPS and EEE before setting PHY r8152: remove r8153_enable_eee r8152: move PHY settings to hw_phy_cfg r8152: move enabling PHY r8152: move some functions cxgb4/cxgb4vf: Allocate more queues for 25G and 100G adapter qed: Fix stack corruption on probe MAINTAINERS: Add an entry for the core network DSA code net: ipv6: fallback to full lookup if table lookup is unsuitable net/mlx5: E-Switch, Handle mode change failures net/mlx5: E-Switch, Fix error flow in the SRIOV e-switch init code net/mlx5: Fix flow counter bulk command out mailbox allocation ...
2016-09-22can: dev: fix deadlock reported after bus-offSergei Miroshnichenko
A timer was used to restart after the bus-off state, leading to a relatively large can_restart() executed in an interrupt context, which in turn sets up pinctrl. When this happens during system boot, there is a high probability of grabbing the pinctrl_list_mutex, which is locked already by the probe() of other device, making the kernel suspect a deadlock condition [1]. To resolve this issue, the restart_timer is replaced by a delayed work. [1] https://github.com/victronenergy/venus/issues/24 Signed-off-by: Sergei Miroshnichenko <sergeimir@emcraft.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2016-09-22Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2016-09-21 1) Propagate errors on security context allocation. From Mathias Krause. 2) Fix inbound policy checks for inter address family tunnels. From Thomas Zeitlhofer. 3) Fix an old memory leak on aead algorithm usage. From Ilan Tayari. 4) A recent patch fixed a possible NULL pointer dereference but broke the vti6 input path. Fix from Nicolas Dichtel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21vti6: fix input pathNicolas Dichtel
Since commit 1625f4529957, vti6 is broken, all input packets are dropped (LINUX_MIB_XFRMINNOSTATES is incremented). XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip6 is set by vti6_rcv() before calling xfrm6_rcv()/xfrm6_rcv_spi(), thus we cannot set to NULL that value in xfrm6_rcv_spi(). A new function xfrm6_rcv_tnl() that enables to pass a value to xfrm6_rcv_spi() is added, so that xfrm6_rcv() is not touched (this function is used in several handlers). CC: Alexey Kodanev <alexey.kodanev@oracle.com> Fixes: 1625f4529957 ("net/xfrm_input: fix possible NULL deref of tunnel.ip6->parms.i_key") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2016-09-20fix fault_in_multipages_...() on architectures with no-op access_ok()Al Viro
Switching iov_iter fault-in to multipages variants has exposed an old bug in underlying fault_in_multipages_...(); they break if the range passed to them wraps around. Normally access_ok() done by callers will prevent such (and it's a guaranteed EFAULT - ERR_PTR() values fall into such a range and they should not point to any valid objects). However, on architectures where userland and kernel live in different MMU contexts (e.g. s390) access_ok() is a no-op and on those a range with a wraparound can reach fault_in_multipages_...(). Since any wraparound means EFAULT there, the fix is trivial - turn those while (uaddr <= end) ... into if (unlikely(uaddr > end)) return -EFAULT; do ... while (uaddr <= end); Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Cc: stable@vger.kernel.org # v3.5+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-19fanotify: fix list corruption in fanotify_get_response()Jan Kara
fanotify_get_response() calls fsnotify_remove_event() when it finds that group is being released from fanotify_release() (bypass_perm is set). However the event it removes need not be only in the group's notification queue but it can have already moved to access_list (userspace read the event before closing the fanotify instance fd) which is protected by a different lock. Thus when fsnotify_remove_event() races with fanotify_release() operating on access_list, the list can get corrupted. Fix the problem by moving all the logic removing permission events from the lists to one place - fanotify_release(). Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events") Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-19fsnotify: add a way to stop queueing events on group shutdownJan Kara
Implement a function that can be called when a group is being shutdown to stop queueing new events to the group. Fanotify will use this. Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events") Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-18Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP build fixlet from Thomas Gleixner: "Add a missing include in cpuhotplug.h" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Include linux/types.h in linux/cpuhotplug.h
2016-09-18Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two patches from Boris which address a potential deadlock in the atmel irq chip driver" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/atmel-aic: Fix potential deadlock in ->xlate() genirq: Provide irq_gc_{lock_irqsave,unlock_irqrestore}() helpers
2016-09-17fix iov_iter_fault_in_readable()Al Viro
... by turning it into what used to be multipages counterpart Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-17sctp: fix SSN comparisionMarcelo Ricardo Leitner
This function actually operates on u32 yet its paramteres were declared as u16, causing integer truncation upon calling. Note in patch context that ADDIP_SERIAL_SIGN_BIT is already 32 bits. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-17net: avoid sk_forward_alloc overflowsEric Dumazet
A malicious TCP receiver, sending SACK, can force the sender to split skbs in write queue and increase its memory usage. Then, when socket is closed and its write queue purged, we might overflow sk_forward_alloc (It becomes negative) sk_mem_reclaim() does nothing in this case, and more than 2GB are leaked from TCP perspective (tcp_memory_allocated is not changed) Then warnings trigger from inet_sock_destruct() and sk_stream_kill_queues() seeing a not zero sk_forward_alloc All TCP stack can be stuck because TCP is under memory pressure. A simple fix is to preemptively reclaim from sk_mem_uncharge(). This makes sure a socket wont have more than 2 MB forward allocated, after burst and idle period. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-14Merge branch 'uaccess-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull uaccess fixes from Al Viro: "Fixes for broken uaccess primitives - mostly lack of proper zeroing in copy_from_user()/get_user()/__get_user(), but for several architectures there's more (broken clear_user() on frv and strncpy_from_user() on hexagon)" * 'uaccess-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits) avr32: fix copy_from_user() microblaze: fix __get_user() microblaze: fix copy_from_user() m32r: fix __get_user() blackfin: fix copy_from_user() sparc32: fix copy_from_user() sh: fix copy_from_user() sh64: failing __get_user() should zero score: fix copy_from_user() and friends score: fix __get_user/get_user s390: get_user() should zero on failure ppc32: fix copy_from_user() parisc: fix copy_from_user() openrisc: fix copy_from_user() nios2: fix __get_user() nios2: copy_from_user() should zero the tail of destination mn10300: copy_from_user() should zero on access_ok() failure... mn10300: failing __get_user() and get_user() should zero mips: copy_from_user() must zero the destination on access_ok() failure ARC: uaccess: get_user to zero out dest in cause of fault ...
2016-09-14cpu/hotplug: Include linux/types.h in linux/cpuhotplug.hPaul Burton
The linux/cpuhotplug.h header makes use of the bool type, but wasn't including linux/types.h to ensure that type has been defined. Fix this by including linux/types.h in preparation for including linux/cpuhotplug.h in a file that doesn't do so already. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Richard Cochran <rcochran@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: http://lkml.kernel.org/r/20160914100027.20945-1-paul.burton@imgtec.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-13asm-generic: make get_user() clear the destination on errorsAl Viro
both for access_ok() failures and for faults halfway through Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-13Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Another lockless_dereference() Sparse fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/barriers: Don't use sizeof(void) in lockless_dereference()
2016-09-13Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Ingo Molnar: "This contains a Xen fix, an arm64 fix and a race condition / robustization set of fixes related to ExitBootServices() usage and boundary conditions" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Use efi_exit_boot_services() efi/libstub: Use efi_exit_boot_services() in FDT efi/libstub: Introduce ExitBootServices helper efi/libstub: Allocate headspace in efi_get_memory_map() efi: Fix handling error value in fdt_find_uefi_params efi: Make for_each_efi_memory_desc_in_map() cope with running on Xen