summaryrefslogtreecommitdiffstats
path: root/include/trace
AgeCommit message (Collapse)Author
2020-04-01afs: Fix some tracing detailsDavid Howells
commit 4636cf184d6d9a92a56c2554681ea520dd4fe49a upstream. Fix a couple of tracelines to indicate the usage count after the atomic op, not the usage count before it to be consistent with other afs and rxrpc trace lines. Change the wording of the afs_call_trace_work trace ID label from "WORK" to "QUEUE" to reflect the fact that it's queueing work, not doing work. Fixes: 341f741f04be ("afs: Refcount the afs_call struct") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24rcu: Fix data-race due to atomic_t copy-by-valueMarco Elver
[ Upstream commit 6cf539a87a61a4fbc43f625267dbcbcf283872ed ] This fixes a data-race where `atomic_t dynticks` is copied by value. The copy is performed non-atomically, resulting in a data-race if `dynticks` is updated concurrently. This data-race was found with KCSAN: ================================================================== BUG: KCSAN: data-race in dyntick_save_progress_counter / rcu_irq_enter write to 0xffff989dbdbe98e0 of 4 bytes by task 10 on cpu 3: atomic_add_return include/asm-generic/atomic-instrumented.h:78 [inline] rcu_dynticks_snap kernel/rcu/tree.c:310 [inline] dyntick_save_progress_counter+0x43/0x1b0 kernel/rcu/tree.c:984 force_qs_rnp+0x183/0x200 kernel/rcu/tree.c:2286 rcu_gp_fqs kernel/rcu/tree.c:1601 [inline] rcu_gp_fqs_loop+0x71/0x880 kernel/rcu/tree.c:1653 rcu_gp_kthread+0x22c/0x3b0 kernel/rcu/tree.c:1799 kthread+0x1b5/0x200 kernel/kthread.c:255 <snip> read to 0xffff989dbdbe98e0 of 4 bytes by task 154 on cpu 7: rcu_nmi_enter_common kernel/rcu/tree.c:828 [inline] rcu_irq_enter+0xda/0x240 kernel/rcu/tree.c:870 irq_enter+0x5/0x50 kernel/softirq.c:347 <snip> Reported by Kernel Concurrency Sanitizer on: CPU: 7 PID: 154 Comm: kworker/7:1H Not tainted 5.3.0+ #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 Workqueue: kblockd blk_mq_run_work_fn ================================================================== Signed-off-by: Marco Elver <elver@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: rcu@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-11memcg: fix a crash in wb_workfn when a device disappearsTheodore Ts'o
commit 68f23b89067fdf187763e75a56087550624fdbee upstream. Without memcg, there is a one-to-one mapping between the bdi and bdi_writeback structures. In this world, things are fairly straightforward; the first thing bdi_unregister() does is to shutdown the bdi_writeback structure (or wb), and part of that writeback ensures that no other work queued against the wb, and that the wb is fully drained. With memcg, however, there is a one-to-many relationship between the bdi and bdi_writeback structures; that is, there are multiple wb objects which can all point to a single bdi. There is a refcount which prevents the bdi object from being released (and hence, unregistered). So in theory, the bdi_unregister() *should* only get called once its refcount goes to zero (bdi_put will drop the refcount, and when it is zero, release_bdi gets called, which calls bdi_unregister). Unfortunately, del_gendisk() in block/gen_hd.c never got the memo about the Brave New memcg World, and calls bdi_unregister directly. It does this without informing the file system, or the memcg code, or anything else. This causes the root wb associated with the bdi to be unregistered, but none of the memcg-specific wb's are shutdown. So when one of these wb's are woken up to do delayed work, they try to dereference their wb->bdi->dev to fetch the device name, but unfortunately bdi->dev is now NULL, thanks to the bdi_unregister() called by del_gendisk(). As a result, *boom*. Fortunately, it looks like the rest of the writeback path is perfectly happy with bdi->dev and bdi->owner being NULL, so the simplest fix is to create a bdi_dev_name() function which can handle bdi->dev being NULL. This also allows us to bulletproof the writeback tracepoints to prevent them from dereferencing a NULL pointer and crashing the kernel if one is tracing with memcg's enabled, and an iSCSI device dies or a USB storage stick is pulled. The most common way of triggering this will be hotremoval of a device while writeback with memcg enabled is going on. It was triggering several times a day in a heavily loaded production environment. Google Bug Id: 145475544 Link: https://lore.kernel.org/r/20191227194829.150110-1-tytso@mit.edu Link: http://lkml.kernel.org/r/20191228005211.163952-1-tytso@mit.edu Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: Chris Mason <clm@fb.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29tracing: xen: Ordered comparison of function pointersChangbin Du
commit d0695e2351102affd8efae83989056bc4b275917 upstream. Just as commit 0566e40ce7 ("tracing: initcall: Ordered comparison of function pointers"), this patch fixes another remaining one in xen.h found by clang-9. In file included from arch/x86/xen/trace.c:21: In file included from ./include/trace/events/xen.h:475: In file included from ./include/trace/define_trace.h:102: In file included from ./include/trace/trace_events.h:473: ./include/trace/events/xen.h:69:7: warning: ordered comparison of function \ pointers ('xen_mc_callback_fn_t' (aka 'void (*)(void *)') and 'xen_mc_callback_fn_t') [-Wordered-compare-function-pointers] __field(xen_mc_callback_fn_t, fn) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/trace/trace_events.h:421:29: note: expanded from macro '__field' ^ ./include/trace/trace_events.h:407:6: note: expanded from macro '__field_ext' is_signed_type(type), filter_type); \ ^ ./include/linux/trace_events.h:554:44: note: expanded from macro 'is_signed_type' ^ Fixes: c796f213a6934 ("xen/trace: add multicall tracing") Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23mm: khugepaged: add trace status description for SCAN_PAGE_HAS_PRIVATEYang Shi
commit 554913f600b45d73de12ad58c1ac7baa0f22a703 upstream. Commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") introduced a new khugepaged scan result: SCAN_PAGE_HAS_PRIVATE, but the corresponding description for trace events were not added. Link: http://lkml.kernel.org/r/1574793844-2914-1-git-send-email-yang.shi@linux.alibaba.com Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: Song Liu <songliubraving@fb.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17afs: Fix use-after-loss-of-refDavid Howells
commit 40a708bd622b78582ae3d280de29b09b50bd04c0 upstream. afs_lookup() has a tracepoint to indicate the outcome of d_splice_alias(), passing it the inode to retrieve the fid from. However, the function gave up its ref on that inode when it called d_splice_alias(), which may have failed and dropped the inode. Fix this by caching the fid. Fixes: 80548b03991f ("afs: Add more tracepoints") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17xprtrdma: Add unique trace points for posting Local Invalidate WRsChuck Lever
commit 4b93dab36f28e673725e5e6123ebfccf7697f96a upstream. When adding frwr_unmap_async way back when, I re-used the existing trace_xprtrdma_post_send() trace point to record the return code of ib_post_send. Unfortunately there are some cases where re-using that trace point causes a crash. Instead, construct a trace point specific to posting Local Invalidate WRs that will always be safe to use in that context, and will act as a trace log eye-catcher for Local Invalidation. Fixes: 847568942f93 ("xprtrdma: Remove fr_state") Fixes: d8099feda483 ("xprtrdma: Reduce context switching due ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Bill Baker <bill.baker@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-14tracing: Change offset type to s32 in preempt/irq tracepointsJoel Fernandes (Google)
commit bf44f488e168368cae4139b4b33c3d0aaa11679c upstream. Discussion in the below link reported that symbols in modules can appear to be before _stext on ARM architecture, causing wrapping with the offsets of this tracepoint. Change the offset type to s32 to fix this. Link: http://lore.kernel.org/r/20191127154428.191095-1-antonio.borneo@st.com Link: http://lkml.kernel.org/r/20200102194625.226436-1-joel@joelfernandes.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: David Sterba <dsterba@suse.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: Antonio Borneo <antonio.borneo@st.com> Cc: stable@vger.kernel.org Fixes: d59158162e032 ("tracing: Add support for preempt and irq enable/disable events") Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31block: Fix writeback throttling W=1 compiler warningsBart Van Assche
[ Upstream commit 1d200e9d6f635ae894993a7d0f1b9e0b6e522e3b ] Fix the following compiler warnings: In file included from ./include/linux/bitmap.h:9, from ./include/linux/cpumask.h:12, from ./arch/x86/include/asm/cpumask.h:5, from ./arch/x86/include/asm/msr.h:11, from ./arch/x86/include/asm/processor.h:21, from ./arch/x86/include/asm/cpufeature.h:5, from ./arch/x86/include/asm/thread_info.h:53, from ./include/linux/thread_info.h:38, from ./arch/x86/include/asm/preempt.h:7, from ./include/linux/preempt.h:78, from ./include/linux/spinlock.h:51, from ./include/linux/mmzone.h:8, from ./include/linux/gfp.h:6, from ./include/linux/mm.h:10, from ./include/linux/bvec.h:13, from ./include/linux/blk_types.h:10, from block/blk-wbt.c:23: In function 'strncpy', inlined from 'perf_trace_wbt_stat' at ./include/trace/events/wbt.h:15:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'perf_trace_wbt_lat' at ./include/trace/events/wbt.h:58:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'perf_trace_wbt_step' at ./include/trace/events/wbt.h:87:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'perf_trace_wbt_timer' at ./include/trace/events/wbt.h:126:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'trace_event_raw_event_wbt_stat' at ./include/trace/events/wbt.h:15:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'trace_event_raw_event_wbt_lat' at ./include/trace/events/wbt.h:58:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'trace_event_raw_event_wbt_timer' at ./include/trace/events/wbt.h:126:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'strncpy', inlined from 'trace_event_raw_event_wbt_step' at ./include/trace/events/wbt.h:87:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Cc: Christoph Hellwig <hch@infradead.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism"; v4.10). Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-18page_pool: do not release pool until inflight == 0.Jonathan Lemon
[ Upstream commit c3f812cea0d7006469d1cf33a4a9f0a12bb4b3a3 ] The page pool keeps track of the number of pages in flight, and it isn't safe to remove the pool until all pages are returned. Disallow removing the pool until all pages are back, so the pool is always available for page producers. Make the page pool responsible for its own delayed destruction instead of relying on XDP, so the page pool can be used without the xdp memory model. When all pages are returned, free the pool and notify xdp if the pool is registered with the xdp memory system. Have the callback perform a table walk since some drivers (cpsw) may share the pool among multiple xdp_rxq_info. Note that the increment of pages_state_release_cnt may result in inflight == 0, resulting in the pool being released. Fixes: d956a048cd3f ("xdp: force mem allocator removal and periodic warning") Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-09tcp: remove redundant new line from tcp_event_sk_skbTony Lu
This removes '\n' from trace event class tcp_event_sk_skb to avoid redundant new blank line and make output compact. Fixes: af4325ecc24f ("tcp: expose sk_state in tcp_retransmit_skb tracepoint") Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-23Merge tag 'for-5.4-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fixes of error handling cleanup of metadata accounting with qgroups enabled - fix swapped values for qgroup tracepoints - fix race when handling full sync flag - don't start unused worker thread, functionality removed already * tag 'for-5.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: check for the full sync flag while holding the inode lock during fsync Btrfs: fix qgroup double free after failure to reserve metadata for delalloc btrfs: tracepoints: Fix bad entry members of qgroup events btrfs: tracepoints: Fix wrong parameter order for qgroup events btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() btrfs: don't needlessly create extent-refs kernel thread btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group() Btrfs: add missing extents release on file extent cluster relocation error
2019-10-17btrfs: tracepoints: Fix bad entry members of qgroup eventsQu Wenruo
[BUG] For btrfs:qgroup_meta_reserve event, the trace event can output garbage: qgroup_meta_reserve: 9c7f6acc-b342-4037-bc47-7f6e4d2232d7: refroot=5(FS_TREE) type=DATA diff=2 qgroup_meta_reserve: 9c7f6acc-b342-4037-bc47-7f6e4d2232d7: refroot=5(FS_TREE) type=0x258792 diff=2 The @type can be completely garbage, as DATA type is not possible for trace_qgroup_meta_reserve() trace event. [CAUSE] Ther are several problems related to qgroup trace events: - Unassigned entry member Member entry::type of trace_qgroup_update_reserve() and trace_qgourp_meta_reserve() is not assigned - Redundant entry member Member entry::type is completely useless in trace_qgroup_meta_convert() Fixes: 4ee0d8832c2e ("btrfs: qgroup: Update trace events for metadata reservation") CC: stable@vger.kernel.org # 4.10+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-10-13tcp: annotate sk->sk_wmem_queued lockless readsEric Dumazet
For the sake of tcp_poll(), there are few places where we fetch sk->sk_wmem_queued while this field can change from IRQ or other cpu. We need to add READ_ONCE() annotations, and also make sure write sides use corresponding WRITE_ONCE() to avoid store-tearing. sk_wmem_queued_add() helper is added so that we can in the future convert to ADD_ONCE() or equivalent if/when available. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-13tcp: annotate sk->sk_rcvbuf lockless readsEric Dumazet
For the sake of tcp_poll(), there are few places where we fetch sk->sk_rcvbuf while this field can change from IRQ or other cpu. We need to add READ_ONCE() annotations, and also make sure write sides use corresponding WRITE_ONCE() to avoid store-tearing. Note that other transports probably need similar fixes. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-07rxrpc: Fix trace-after-put looking at the put call recordDavid Howells
rxrpc_put_call() calls trace_rxrpc_call() after it has done the decrement of the refcount - which looks at the debug_id in the call record. But unless the refcount was reduced to zero, we no longer have the right to look in the record and, indeed, it may be deleted by some other thread. Fix this by getting the debug_id out before decrementing the refcount and then passing that into the tracepoint. Fixes: e34d4234b0b7 ("rxrpc: Trace rxrpc_call usage") Signed-off-by: David Howells <dhowells@redhat.com>
2019-10-07rxrpc: Fix trace-after-put looking at the put connection recordDavid Howells
rxrpc_put_*conn() calls trace_rxrpc_conn() after they have done the decrement of the refcount - which looks at the debug_id in the connection record. But unless the refcount was reduced to zero, we no longer have the right to look in the record and, indeed, it may be deleted by some other thread. Fix this by getting the debug_id out before decrementing the refcount and then passing that into the tracepoint. Fixes: 363deeab6d0f ("rxrpc: Add connection tracepoint and client conn state tracepoint") Signed-off-by: David Howells <dhowells@redhat.com>
2019-10-07rxrpc: Fix trace-after-put looking at the put peer recordDavid Howells
rxrpc_put_peer() calls trace_rxrpc_peer() after it has done the decrement of the refcount - which looks at the debug_id in the peer record. But unless the refcount was reduced to zero, we no longer have the right to look in the record and, indeed, it may be deleted by some other thread. Fix this by getting the debug_id out before decrementing the refcount and then passing that into the tracepoint. This can cause the following symptoms: BUG: KASAN: use-after-free in __rxrpc_put_peer net/rxrpc/peer_object.c:411 [inline] BUG: KASAN: use-after-free in rxrpc_put_peer+0x685/0x6a0 net/rxrpc/peer_object.c:435 Read of size 8 at addr ffff888097ec0058 by task syz-executor823/24216 Fixes: 1159d4b496f5 ("rxrpc: Add a tracepoint to track rxrpc_peer refcounting") Reported-by: syzbot+b9be979c55f2bea8ed30@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com>
2019-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold. 2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric Dumazet. 3) txq null deref in mac80211, from Miaoqing Pan. 4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann. 5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso. 6) Avoid division by zero in taprio scheduler, from Vladimir Oltean. 7) Various xgmac fixes in stmmac driver from Jose Abreu. 8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from Michal Kubecek. 9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij. 10) Fix sleep while atomic in sja1105, from Vladimir Oltean. 11) Suspend/resume deadlock in stmmac, from Thierry Reding. 12) Various UDP GSO fixes from Josh Hunt. 13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric Dumazet. 14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern. 15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) selftests/net: add nettest to .gitignore net: qlogic: Fix memory leak in ql_alloc_large_buffers nfc: fix memory leak in llcp_sock_bind() sch_dsmark: fix potential NULL deref in dsmark_init() net: phy: at803x: use operating parameters from PHY-specific status net: phy: extract pause mode net: phy: extract link partner advertisement reading net: phy: fix write to mii-ctrl1000 register ipv6: Handle missing host route in __ipv6_ifa_notify net: phy: allow for reset line to be tied to a sleepy GPIO controller net: ipv4: avoid mixed n_redirects and rate_tokens usage r8152: Set macpassthru in reset_resume callback cxgb4:Fix out-of-bounds MSI-X info array access Revert "ipv6: Handle race in addrconf_dad_work" net: make sock_prot_memory_pressure() return "const char *" rxrpc: Fix rxrpc_recvmsg tracepoint qmi_wwan: add support for Cinterion CLS8 devices tcp: fix slab-out-of-bounds in tcp_zerocopy_receive() lib: textsearch: fix escapes in example code udp: only do GSO if # of segs > 1 ...
2019-10-04rxrpc: Fix rxrpc_recvmsg tracepointDavid Howells
Fix the rxrpc_recvmsg tracepoint to handle being called with a NULL call parameter. Fixes: a25e21f0bcd2 ("rxrpc, afs: Use debug_ids rather than pointers in traces") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-30Merge tag 'trace-v5.4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "A few more tracing fixes: - Fix a buffer overflow by checking nr_args correctly in probes - Fix a warning that is reported by clang - Fix a possible memory leak in error path of filter processing - Fix the selftest that checks for failures, but wasn't failing - Minor clean up on call site output of a memory trace event" * tag 'trace-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: selftests/ftrace: Fix same probe error test mm, tracing: Print symbol name for call_site in trace events tracing: Have error path in predicate_parse() free its allocated memory tracing: Fix clang -Wint-in-bool-context warnings in IF_ASSIGN macro tracing/probe: Fix to check the difference of nr_args before adding probe
2019-09-28mm, tracing: Print symbol name for call_site in trace eventsChangbin Du
To improve the readability of raw slab trace points, print the call_site ip using '%pS'. Then we can grep events with function names. [002] .... 808.188897: kmem_cache_free: call_site=putname+0x47/0x50 ptr=00000000cef40c80 [002] .... 808.188898: kfree: call_site=security_cred_free+0x42/0x50 ptr=0000000062400820 [002] .... 808.188904: kmem_cache_free: call_site=put_cred_rcu+0x88/0xa0 ptr=0000000058d74ef8 [002] .... 808.188913: kmem_cache_alloc: call_site=prepare_creds+0x26/0x100 ptr=0000000058d74ef8 bytes_req=168 bytes_alloc=576 gfp_flags=GFP_KERNEL [002] .... 808.188917: kmalloc: call_site=security_prepare_creds+0x77/0xa0 ptr=0000000062400820 bytes_req=8 bytes_alloc=336 gfp_flags=GFP_KERNEL|__GFP_ZERO [002] .... 808.188920: kmem_cache_alloc: call_site=getname_flags+0x4f/0x1e0 ptr=00000000cef40c80 bytes_req=4096 bytes_alloc=4480 gfp_flags=GFP_KERNEL [002] .... 808.188925: kmem_cache_free: call_site=putname+0x47/0x50 ptr=00000000cef40c80 [002] .... 808.188926: kfree: call_site=security_cred_free+0x42/0x50 ptr=0000000062400820 [002] .... 808.188931: kmem_cache_free: call_site=put_cred_rcu+0x88/0xa0 ptr=0000000058d74ef8 Link: http://lkml.kernel.org/r/20190914103215.23301-1-changbin.du@gmail.com Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-09-26Merge tag 'nfs-for-5.4-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "Stable bugfixes: - Dequeue the request from the receive queue while we're re-encoding # v4.20+ - Fix buffer handling of GSS MIC without slack # 5.1 Features: - Increase xprtrdma maximum transport header and slot table sizes - Add support for nfs4_call_sync() calls using a custom rpc_task_struct - Optimize the default readahead size - Enable pNFS filelayout LAYOUTGET on OPEN Other bugfixes and cleanups: - Fix possible null-pointer dereferences and memory leaks - Various NFS over RDMA cleanups - Various NFS over RDMA comment updates - Don't receive TCP data into a reset request buffer - Don't try to parse incomplete RPC messages - Fix congestion window race with disconnect - Clean up pNFS return-on-close error handling - Fixes for NFS4ERR_OLD_STATEID handling" * tag 'nfs-for-5.4-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (53 commits) pNFS/filelayout: enable LAYOUTGET on OPEN NFS: Optimise the default readahead size NFSv4: Handle NFS4ERR_OLD_STATEID in LOCKU NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE NFSv4: Fix OPEN_DOWNGRADE error handling pNFS: Handle NFS4ERR_OLD_STATEID on layoutreturn by bumping the state seqid NFSv4: Add a helper to increment stateid seqids NFSv4: Handle RPC level errors in LAYOUTRETURN NFSv4: Handle NFS4ERR_DELAY correctly in return-on-close NFSv4: Clean up pNFS return-on-close error handling pNFS: Ensure we do clear the return-on-close layout stateid on fatal errors NFS: remove unused check for negative dentry NFSv3: use nfs_add_or_obtain() to create and reference inodes NFS: Refactor nfs_instantiate() for dentry referencing callers SUNRPC: Fix congestion window race with disconnect SUNRPC: Don't try to parse incomplete RPC messages SUNRPC: Rename xdr_buf_read_netobj to xdr_buf_read_mic SUNRPC: Fix buffer handling of GSS MIC without slack SUNRPC: RPC level errors should always set task->tk_rpc_status SUNRPC: Don't receive TCP data into a request buffer that has been reset ...
2019-09-25include/trace/events/writeback.h: fix -Wstringop-truncation warningsQian Cai
There are many of those warnings. In file included from ./arch/powerpc/include/asm/paca.h:15, from ./arch/powerpc/include/asm/current.h:13, from ./include/linux/thread_info.h:21, from ./include/asm-generic/preempt.h:5, from ./arch/powerpc/include/generated/asm/preempt.h:1, from ./include/linux/preempt.h:78, from ./include/linux/spinlock.h:51, from fs/fs-writeback.c:19: In function 'strncpy', inlined from 'perf_trace_writeback_page_template' at ./include/trace/events/writeback.h:56:1: ./include/linux/string.h:260:9: warning: '__builtin_strncpy' specified bound 32 equals destination size [-Wstringop-truncation] return __builtin_strncpy(p, q, size); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix it by using the new strscpy_pad() which was introduced in "lib/string: Add strscpy_pad() function" and will always be NUL-terminated instead of strncpy(). Also, change strlcpy() to use strscpy_pad() in this file for consistency. Link: http://lkml.kernel.org/r/1564075099-27750-1-git-send-email-cai@lca.pw Fixes: 455b2864686d ("writeback: Initial tracing support") Fixes: 028c2dd184c0 ("writeback: Add tracing to balance_dirty_pages") Fixes: e84d0a4f8e39 ("writeback: trace event writeback_queue_io") Fixes: b48c104d2211 ("writeback: trace event bdi_dirty_ratelimit") Fixes: cc1676d917f3 ("writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()") Fixes: 9fb0a7da0c52 ("writeback: add more tracepoints") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Tobin C. Harding <tobin@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Dave Chinner <dchinner@redhat.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Nitin Gote <nitin.r.gote@intel.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Cc: Stephen Kitt <steve@sk2.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-18Merge tag 'for-5.4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "This continues with work on code refactoring, sanity checks and space handling. There are some less user visible changes, nothing that would particularly stand out. User visible changes: - tree checker, more sanity checks of: - ROOT_ITEM (key, size, generation, level, alignment, flags) - EXTENT_ITEM and METADATA_ITEM checks (key, size, offset, alignment, refs) - tree block reference items - EXTENT_DATA_REF (key, hash, offset) - deprecate flag BTRFS_SUBVOL_CREATE_ASYNC for subvolume creation ioctl, scheduled removal in 5.7 - delete stale and unused UAPI definitions BTRFS_DEV_REPLACE_ITEM_STATE_* - improved export of debugging information available via existing sysfs directory structure - try harder to delete relations between qgroups and allow to delete orphan entries - remove unreliable space checks before relocation starts Core: - space handling: - improved ticket reservations and other high level logic in order to remove special cases - factor flushing infrastructure and use it for different contexts, allows to remove some special case handling - reduce metadata reservation when only updating inodes - reduce global block reserve minimum size (affects small filesystems) - improved overcommit logic wrt global block reserve - tests: - fix memory leaks in extent IO tree - catch all TRIM range Fixes: - fix ENOSPC errors, leading to transaction aborts, when cloning extents - several fixes for inode number cache (mount option inode_cache) - fix potential soft lockups during send when traversing large trees - fix unaligned access to space cache pages with SLUB debug on (PowerPC) Other: - refactoring public/private functions, moving to new or more appropriate files - defines converted to enums - error handling improvements - more assertions and comments - old code deletion" * tag 'for-5.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (138 commits) btrfs: Relinquish CPUs in btrfs_compare_trees btrfs: Don't assign retval of btrfs_try_tree_write_lock/btrfs_tree_read_lock_atomic btrfs: create structure to encode checksum type and length btrfs: turn checksum type define into an enum btrfs: add enospc debug messages for ticket failure btrfs: do not account global reserve in can_overcommit btrfs: use btrfs_try_granting_tickets in update_global_rsv btrfs: always reserve our entire size for the global reserve btrfs: change the minimum global reserve size btrfs: rename btrfs_space_info_add_old_bytes btrfs: remove orig_bytes from reserve_ticket btrfs: fix may_commit_transaction to deal with no partial filling btrfs: rework wake_all_tickets btrfs: refactor the ticket wakeup code btrfs: stop partially refilling tickets when releasing space btrfs: add space reservation tracepoint for reserved bytes btrfs: roll tracepoint into btrfs_space_info_update helper btrfs: do not allow reservations if we have pending tickets btrfs: stop clearing EXTENT_DIRTY in inode I/O tree btrfs: treat RWF_{,D}SYNC writes as sync for CRCs ...
2019-09-18Merge tag 'filelock-v5.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "Just a couple of minor bugfixes, a revision to a tracepoint to account for some earlier changes to the internals, and a patch to add a pr_warn message when someone tries to mount a filesystem with '-o mand' on a kernel that has that support disabled" * tag 'filelock-v5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: fix a memory leak bug in __break_lease() locks: print a warning when mount fails due to lack of "mand" support locks: Fix procfs output for file leases locks: revise generic_add_lease tracepoint
2019-09-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Support IPV6 RA Captive Portal Identifier, from Maciej Żenczykowski. 2) Use bio_vec in the networking instead of custom skb_frag_t, from Matthew Wilcox. 3) Make use of xmit_more in r8169 driver, from Heiner Kallweit. 4) Add devmap_hash to xdp, from Toke Høiland-Jørgensen. 5) Support all variants of 5750X bnxt_en chips, from Michael Chan. 6) More RTNL avoidance work in the core and mlx5 driver, from Vlad Buslov. 7) Add TCP syn cookies bpf helper, from Petar Penkov. 8) Add 'nettest' to selftests and use it, from David Ahern. 9) Add extack support to drop_monitor, add packet alert mode and support for HW drops, from Ido Schimmel. 10) Add VLAN offload to stmmac, from Jose Abreu. 11) Lots of devm_platform_ioremap_resource() conversions, from YueHaibing. 12) Add IONIC driver, from Shannon Nelson. 13) Several kTLS cleanups, from Jakub Kicinski. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1930 commits) mlxsw: spectrum_buffers: Add the ability to query the CPU port's shared buffer mlxsw: spectrum: Register CPU port with devlink mlxsw: spectrum_buffers: Prevent changing CPU port's configuration net: ena: fix incorrect update of intr_delay_resolution net: ena: fix retrieval of nonadaptive interrupt moderation intervals net: ena: fix update of interrupt moderation register net: ena: remove all old adaptive rx interrupt moderation code from ena_com net: ena: remove ena_restore_ethtool_params() and relevant fields net: ena: remove old adaptive interrupt moderation code from ena_netdev net: ena: remove code duplication in ena_com_update_nonadaptive_moderation_interval _*() net: ena: enable the interrupt_moderation in driver_supported_features net: ena: reimplement set/get_coalesce() net: ena: switch to dim algorithm for rx adaptive interrupt moderation net: ena: add intr_moder_rx_interval to struct ena_com_dev and use it net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable ethtool: implement Energy Detect Powerdown support via phy-tunable xen-netfront: do not assume sk_buff_head list is empty in error handling s390/ctcm: Delete unnecessary checks before the macro call “dev_kfree_skb” net: ena: don't wake up tx queue when down drop_monitor: Better sanitize notified packets ...
2019-09-18Merge tag 'staging-5.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver updates from Greg KH: "Here is the big staging/iio driver update for 5.4-rc1. Lots of churn here, with a few driver/filesystems moving out of staging finally: - erofs moved out of staging - greybus core code moved out of staging Along with that, a new filesytem has been added: - extfat to provide support for those devices requiring that filesystem (i.e. transfer devices to/from windows systems or printers) Other than that, there a number of new IIO drivers, and lots and lots and lots of staging driver cleanups and minor fixes as people continue to dig into those for easy changes. All of these have been in linux-next for a while with no reported issues" * tag 'staging-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (453 commits) Staging: gasket: Use temporaries to reduce line length. Staging: octeon: Avoid several usecases of strcpy staging: vhciq_core: replace snprintf with scnprintf staging: wilc1000: avoid twice IRQ handler execution for each single interrupt staging: wilc1000: remove unused interrupt status handling code staging: fbtft: make several arrays static const, makes object smaller staging: rtl8188eu: make two arrays static const, makes object smaller staging: rtl8723bs: core: Remove Macro "IS_MAC_ADDRESS_BROADCAST" dt-bindings: anybus-controller: move to staging/ tree staging: emxx_udc: remove local TRUE/FALSE definition staging: wilc1000: look for rtc_clk clock staging: dt-bindings: wilc1000: add optional rtc_clk property staging: nvec: make use of devm_platform_ioremap_resource staging: exfat: drop unused function parameter Staging: exfat: Avoid use of strcpy staging: exfat: use integer constants staging: exfat: cleanup spacing for casts staging: exfat: cleanup spacing for operators staging: rtl8723bs: hal: remove redundant variable n staging: pi433: Fix typo in documentation ...
2019-09-17Merge tag 'pm-5.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These include a rework of the main suspend-to-idle code flow (related to the handling of spurious wakeups), a switch over of several users of cpufreq notifiers to QoS-based limits, a new devfreq driver for Tegra20, a new cpuidle driver and governor for virtualized guests, an extension of the wakeup sources framework to expose wakeup sources as device objects in sysfs, and more. Specifics: - Rework the main suspend-to-idle control flow to avoid repeating "noirq" device resume and suspend operations in case of spurious wakeups from the ACPI EC and decouple the ACPI EC wakeups support from the LPS0 _DSM support (Rafael Wysocki). - Extend the wakeup sources framework to expose wakeup sources as device objects in sysfs (Tri Vo, Stephen Boyd). - Expose system suspend statistics in sysfs (Kalesh Singh). - Introduce a new haltpoll cpuidle driver and a new matching governor for virtualized guests wanting to do guest-side polling in the idle loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell). - Fix the menu and teo cpuidle governors to allow the scheduler tick to be stopped if PM QoS is used to limit the CPU idle state exit latency in some cases (Rafael Wysocki). - Increase the resolution of the play_idle() argument to microseconds for more fine-grained injection of CPU idle cycles (Daniel Lezcano). - Switch over some users of cpuidle notifiers to the new QoS-based frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events (Viresh Kumar). - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li). - Add support for MT8183 and MT8516 to the mediatek cpufreq driver (Andrew-sh.Cheng, Fabien Parent). - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson Huang). - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz). - Update the qcom cpufreq driver (among other things, to make it easier to extend and to use kryo cpufreq for other nvmem-based SoCs) and add qcs404 support to it (Niklas Cassel, Douglas RAILLARD, Sibi Sankar, Sricharan R). - Fix assorted issues and make assorted minor improvements in the cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli, Gustavo Silva, Hariprasad Kelam). - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd Bergmann). - Add new Exynos PPMU events to devfreq events and extend that mechanism (Lukasz Luba). - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny). - Improve devfreq documentation and governor code, fix spelling typos in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez, MyungJoo Ham, Gaël PORTAY). - Add regulators enable and disable to the OPP (operating performance points) framework (Kamil Konieczny). - Update the OPP framework to support multiple opp-suspend properties (Anson Huang). - Fix assorted issues and make assorted minor improvements in the OPP code (Niklas Cassel, Viresh Kumar, Yue Hu). - Clean up the generic power domains (genpd) framework (Ulf Hansson). - Clean up assorted pieces of power management code and documentation (Akinobu Mita, Amit Kucheria, Chuhong Yuan). - Update the pm-graph tool to version 5.5 including multiple fixes and improvements (Todd Brandt). - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven, Sébastien Szymanski)" * tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (126 commits) cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available cpuidle-haltpoll: do not set an owner to allow modunload cpuidle-haltpoll: return -ENODEV on modinit failure cpuidle-haltpoll: set haltpoll as preferred governor cpuidle: allow governor switch on cpuidle_register_driver() PM: runtime: Documentation: add runtime_status ABI document pm-graph: make setVal unbuffered again for python2 and python3 powercap: idle_inject: Use higher resolution for idle injection cpuidle: play_idle: Increase the resolution to usec cpuidle-haltpoll: vcpu hotplug support cpufreq: Add qcs404 to cpufreq-dt-platdev blacklist cpufreq: qcom: Add support for qcs404 on nvmem driver cpufreq: qcom: Refactor the driver to make it easier to extend cpufreq: qcom: Re-organise kryo cpufreq to use it for other nvmem based qcom socs dt-bindings: opp: Add qcom-opp bindings with properties needed for CPR dt-bindings: opp: qcom-nvmem: Support pstates provided by a power domain Documentation: cpufreq: Update policy notifier documentation cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events PM / Domains: Verify PM domain type in dev_pm_genpd_set_performance_state() PM / Domains: Simplify genpd_lookup_dev() ...
2019-09-17Merge tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block updates from Jens Axboe: - Two NVMe pull requests: - ana log parse fix from Anton - nvme quirks support for Apple devices from Ben - fix missing bio completion tracing for multipath stack devices from Hannes and Mikhail - IP TOS settings for nvme rdma and tcp transports from Israel - rq_dma_dir cleanups from Israel - tracing for Get LBA Status command from Minwoo - Some nvme-tcp cleanups from Minwoo, Potnuri and Myself - Some consolidation between the fabrics transports for handling the CAP register - reset race with ns scanning fix for fabrics (move fabrics commands to a dedicated request queue with a different lifetime from the admin request queue)." - controller reset and namespace scan races fixes - nvme discovery log change uevent support - naming improvements from Keith - multiple discovery controllers reject fix from James - some regular cleanups from various people - Series fixing (and re-fixing) null_blk debug printing and nr_devices checks (André) - A few pull requests from Song, with fixes from Andy, Guoqing, Guilherme, Neil, Nigel, and Yufen. - REQ_OP_ZONE_RESET_ALL support (Chaitanya) - Bio merge handling unification (Christoph) - Pick default elevator correctly for devices with special needs (Damien) - Block stats fixes (Hou) - Timeout and support devices nbd fixes (Mike) - Series fixing races around elevator switching and device add/remove (Ming) - sed-opal cleanups (Revanth) - Per device weight support for BFQ (Fam) - Support for blk-iocost, a new model that can properly account cost of IO workloads. (Tejun) - blk-cgroup writeback fixes (Tejun) - paride queue init fixes (zhengbin) - blk_set_runtime_active() cleanup (Stanley) - Block segment mapping optimizations (Bart) - lightnvm fixes (Hans/Minwoo/YueHaibing) - Various little fixes and cleanups * tag 'for-5.4/block-2019-09-16' of git://git.kernel.dk/linux-block: (186 commits) null_blk: format pr_* logs with pr_fmt null_blk: match the type of parameter nr_devices null_blk: do not fail the module load with zero devices block: also check RQF_STATS in blk_mq_need_time_stamp() block: make rq sector size accessible for block stats bfq: Fix bfq linkage error raid5: use bio_end_sector in r5_next_bio raid5: remove STRIPE_OPS_REQ_PENDING md: add feature flag MD_FEATURE_RAID0_LAYOUT md/raid0: avoid RAID0 data corruption due to layout confusion. raid5: don't set STRIPE_HANDLE to stripe which is in batch list raid5: don't increment read_errors on EILSEQ return nvmet: fix a wrong error status returned in error log page nvme: send discovery log page change events to userspace nvme: add uevent variables for controller devices nvme: enable aen regardless of the presence of I/O queues nvme-fabrics: allow discovery subsystems accept a kato nvmet: Use PTR_ERR_OR_ZERO() in nvmet_init_discovery() nvme: Remove redundant assignment of cq vector nvme: Assign subsys instance from first ctrl ...
2019-09-17Merge branches 'pm-opp', 'pm-qos', 'acpi-pm', 'pm-domains' and 'pm-tools'Rafael J. Wysocki
* pm-opp: PM / OPP: Correct Documentation about library location opp: of: Support multiple suspend OPPs defined in DT dt-bindings: opp: Support multiple opp-suspend properties opp: core: add regulators enable and disable opp: Don't decrement uninitialized list_kref * pm-qos: PM: QoS: Get rid of unused flags * acpi-pm: ACPI: PM: Print debug messages on device power state changes * pm-domains: PM / Domains: Verify PM domain type in dev_pm_genpd_set_performance_state() PM / Domains: Simplify genpd_lookup_dev() PM / Domains: Align in-parameter names for some genpd functions * pm-tools: pm-graph: make setVal unbuffered again for python2 and python3 cpupower: update German translation tools/power/cpupower: fix 64bit detection when cross-compiling cpupower: Add missing newline at end of file pm-graph v5.5
2019-09-16Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "This cycle's RCU changes were: - A few more RCU flavor consolidation cleanups. - Updates to RCU's list-traversal macros improving lockdep usability. - Forward-progress improvements for no-CBs CPUs: Avoid ignoring incoming callbacks during grace-period waits. - Forward-progress improvements for no-CBs CPUs: Use ->cblist structure to take advantage of others' grace periods. - Also added a small commit that avoids needlessly inflicting scheduler-clock ticks on callback-offloaded CPUs. - Forward-progress improvements for no-CBs CPUs: Reduce contention on ->nocb_lock guarding ->cblist. - Forward-progress improvements for no-CBs CPUs: Add ->nocb_bypass list to further reduce contention on ->nocb_lock guarding ->cblist. - Miscellaneous fixes. - Torture-test updates. - minor LKMM updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (86 commits) MAINTAINERS: Update from paulmck@linux.ibm.com to paulmck@kernel.org rcu: Don't include <linux/ktime.h> in rcutiny.h rcu: Allow rcu_do_batch() to dynamically adjust batch sizes rcu/nocb: Don't wake no-CBs GP kthread if timer posted under overload rcu/nocb: Reduce __call_rcu_nocb_wake() leaf rcu_node ->lock contention rcu/nocb: Reduce nocb_cb_wait() leaf rcu_node ->lock contention rcu/nocb: Advance CBs after merge in rcutree_migrate_callbacks() rcu/nocb: Avoid synchronous wakeup in __call_rcu_nocb_wake() rcu/nocb: Print no-CBs diagnostics when rcutorture writer unduly delayed rcu/nocb: EXP Check use and usefulness of ->nocb_lock_contended rcu/nocb: Add bypass callback queueing rcu/nocb: Atomic ->len field in rcu_segcblist structure rcu/nocb: Unconditionally advance and wake for excessive CBs rcu/nocb: Reduce ->nocb_lock contention with separate ->nocb_gp_lock rcu/nocb: Reduce contention at no-CBs invocation-done time rcu/nocb: Reduce contention at no-CBs registry-time CB advancement rcu/nocb: Round down for number of no-CBs grace-period kthreads rcu/nocb: Avoid ->nocb_lock capture by corresponding CPU rcu/nocb: Avoid needless wakeups of no-CBs grace-period kthread rcu/nocb: Make __call_rcu_nocb_wake() safe for many callbacks ...
2019-09-11Merge branches 'arm/omap', 'arm/exynos', 'arm/smmu', 'arm/mediatek', ↵Joerg Roedel
'arm/qcom', 'arm/renesas', 'x86/amd', 'x86/vt-d' and 'core' into next
2019-09-11iommu/vt-d: Add trace events for device dma map/unmapLu Baolu
This adds trace support for the Intel IOMMU driver. It also declares some events which could be used to trace the events when an IOVA is being mapped or unmapped in a domain. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-09-09btrfs: add a flush step for delayed iputsJosef Bacik
Delayed iputs could very well free up enough space without needing to commit the transaction, so make this step it's own step. This will allow us to skip the step for evictions in a later patch. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-09-09btrfs: Remove unused locking functionsNikolay Borisov
Those were split out of btrfs_clear_lock_blocking_rw by aa12c02778a9 ("btrfs: split btrfs_clear_lock_blocking_rw to read and write helpers") however at that time this function was unused due to commit 523983401644 ("Btrfs: kill btrfs_clear_path_blocking"). Put the final nail in the coffin of those 2 functions. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2019-09-05erofs: use erofs_inode namingGao Xiang
As Christoph suggested [1], "Why is this called vnode instead of inode? That seems like a rather odd naming for a Linux file system." [1] https://lore.kernel.org/r/20190829101545.GC20598@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190904020912.63925-10-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-09-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
r8152 conflicts are the NAPI fixes in 'net' overlapping with some tasklet stuff in net-next Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-30writeback: don't access page->mapping directly in track_foreign_dirty TPTejun Heo
page->mapping may encode different values in it and page_mapping() should always be used to access the mapping pointer. track_foreign_dirty tracepoint was incorrectly accessing page->mapping directly. Use page_mapping() instead. Also, add NULL checks while at it. Fixes: 3a8e9ac89e6a ("writeback: add tracepoints for cgroup foreign writebacks") Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-30writeback: add tracepoints for cgroup foreign writebacksTejun Heo
cgroup foreign inode handling has quite a bit of heuristics and internal states which sometimes makes it difficult to understand what's going on. Add tracepoints to improve visibility. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-29blkcg: blk-iocost: predeclare used structsStephen Rothwell
Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost") Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-28blkcg: implement blk-iocostTejun Heo
This patchset implements IO cost model based work-conserving proportional controller. While io.latency provides the capability to comprehensively prioritize and protect IOs depending on the cgroups, its protection is binary - the lowest latency target cgroup which is suffering is protected at the cost of all others. In many use cases including stacking multiple workload containers in a single system, it's necessary to distribute IO capacity with better granularity. One challenge of controlling IO resources is the lack of trivially observable cost metric. The most common metrics - bandwidth and iops - can be off by orders of magnitude depending on the device type and IO pattern. However, the cost isn't a complete mystery. Given several key attributes, we can make fairly reliable predictions on how expensive a given stream of IOs would be, at least compared to other IO patterns. The function which determines the cost of a given IO is the IO cost model for the device. This controller distributes IO capacity based on the costs estimated by such model. The more accurate the cost model the better but the controller adapts based on IO completion latency and as long as the relative costs across differents IO patterns are consistent and sensible, it'll adapt to the actual performance of the device. Currently, the only implemented cost model is a simple linear one with a few sets of default parameters for different classes of device. This covers most common devices reasonably well. All the infrastructure to tune and add different cost models is already in place and a later patch will also allow using bpf progs for cost models. Please see the top comment in blk-iocost.c and documentation for more details. v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix for a divide-by-zero bug in current_hweight() triggered by zero inuse_sum. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andy Newell <newella@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-08-27erofs: fix compile warnings when moving out include/trace/events/erofs.hGao Xiang
As Stephon reported [1], many compile warnings are raised when moving out include/trace/events/erofs.h: In file included from include/trace/events/erofs.h:8, from <command-line>: include/trace/events/erofs.h:28:37: warning: 'struct dentry' declared inside parameter list will not be visible outside of this definition or declaration TP_PROTO(struct inode *dir, struct dentry *dentry, unsigned int flags), ^~~~~~ include/linux/tracepoint.h:233:34: note: in definition of macro '__DECLARE_TRACE' static inline void trace_##name(proto) \ ^~~~~ include/linux/tracepoint.h:396:24: note: in expansion of macro 'PARAMS' __DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), \ ^~~~~~ include/linux/tracepoint.h:532:2: note: in expansion of macro 'DECLARE_TRACE' DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~~~~~~~~ include/linux/tracepoint.h:532:22: note: in expansion of macro 'PARAMS' DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~ include/trace/events/erofs.h:26:1: note: in expansion of macro 'TRACE_EVENT' TRACE_EVENT(erofs_lookup, ^~~~~~~~~~~ include/trace/events/erofs.h:28:2: note: in expansion of macro 'TP_PROTO' TP_PROTO(struct inode *dir, struct dentry *dentry, unsigned int flags), ^~~~~~~~ That makes me very confused since most original EROFS tracepoint code was taken from f2fs, and finally I found commit 43c78d88036e ("kbuild: compile-test kernel headers to ensure they are self-contained") It seems these warnings are generated from KERNEL_HEADER_TEST feature and ext4/f2fs tracepoint files were in blacklist. Anyway, let's fix these issues for KERNEL_HEADER_TEST feature instead of adding to blacklist... [1] https://lore.kernel.org/lkml/20190826162432.11100665@canb.auug.org.au/ Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20190826132653.100731-1-gaoxiang25@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-27rxrpc: Use skb_unshare() rather than skb_cow_data()David Howells
The in-place decryption routines in AF_RXRPC's rxkad security module currently call skb_cow_data() to make sure the data isn't shared and that the skb can be written over. This has a problem, however, as the softirq handler may be still holding a ref or the Rx ring may be holding multiple refs when skb_cow_data() is called in rxkad_verify_packet() - and so skb_shared() returns true and __pskb_pull_tail() dislikes that. If this occurs, something like the following report will be generated. kernel BUG at net/core/skbuff.c:1463! ... RIP: 0010:pskb_expand_head+0x253/0x2b0 ... Call Trace: __pskb_pull_tail+0x49/0x460 skb_cow_data+0x6f/0x300 rxkad_verify_packet+0x18b/0xb10 [rxrpc] rxrpc_recvmsg_data.isra.11+0x4a8/0xa10 [rxrpc] rxrpc_kernel_recv_data+0x126/0x240 [rxrpc] afs_extract_data+0x51/0x2d0 [kafs] afs_deliver_fs_fetch_data+0x188/0x400 [kafs] afs_deliver_to_call+0xac/0x430 [kafs] afs_wait_for_call_to_complete+0x22f/0x3d0 [kafs] afs_make_call+0x282/0x3f0 [kafs] afs_fs_fetch_data+0x164/0x300 [kafs] afs_fetch_data+0x54/0x130 [kafs] afs_readpages+0x20d/0x340 [kafs] read_pages+0x66/0x180 __do_page_cache_readahead+0x188/0x1a0 ondemand_readahead+0x17d/0x2e0 generic_file_read_iter+0x740/0xc10 __vfs_read+0x145/0x1a0 vfs_read+0x8c/0x140 ksys_read+0x4a/0xb0 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by using skb_unshare() instead in the input path for DATA packets that have a security index != 0. Non-DATA packets don't need in-place encryption and neither do unencrypted DATA packets. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Julian Wollrath <jwollrath@web.de> Signed-off-by: David Howells <dhowells@redhat.com>
2019-08-27rxrpc: Use the tx-phase skb flag to simplify tracingDavid Howells
Use the previously-added transmit-phase skbuff private flag to simplify the socket buffer tracing a bit. Which phase the skbuff comes from can now be divined from the skb rather than having to be guessed from the call state. We can also reduce the number of rxrpc_skb_trace values by eliminating the difference between Tx and Rx in the symbols. Signed-off-by: David Howells <dhowells@redhat.com>
2019-08-24erofs: move erofs out of stagingGao Xiang
EROFS filesystem has been merged into linux-staging for a year. EROFS is designed to be a better solution of saving extra storage space with guaranteed end-to-end performance for read-only files with the help of reduced metadata, fixed-sized output compression and decompression inplace technologies. In the past year, EROFS was greatly improved by many people as a staging driver, self-tested, betaed by a large number of our internal users, successfully applied to almost all in-service HUAWEI smartphones as the part of EMUI 9.1 and proven to be stable enough to be moved out of staging. EROFS is a self-contained filesystem driver. Although there are still some TODOs to be more generic, we have a dedicated team actively keeping on working on EROFS in order to make it better with the evolution of Linux kernel as the other in-kernel filesystems. As Pavel suggested, it's better to do as one commit since git can do moves and all histories will be saved in this way. Let's promote it from staging and enhance it more actively as a "real" part of kernel for more wider scenarios! Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Pavel Machek <pavel@denx.de> Cc: David Sterba <dsterba@suse.cz> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Darrick J . Wong <darrick.wong@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Richard Weinberger <richard@nod.at> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chao Yu <yuchao0@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Li Guifu <bluce.liguifu@huawei.com> Cc: Fang Wei <fangwei1@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Link: https://lore.kernel.org/r/20190822213659.5501-1-hsiangkao@aol.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-22Merge branch 'for-mingo' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU and LKMM changes from Paul E. McKenney: - A few more RCU flavor consolidation cleanups. - Miscellaneous fixes. - Updates to RCU's list-traversal macros improving lockdep usability. - Torture-test updates. - Forward-progress improvements for no-CBs CPUs: Avoid ignoring incoming callbacks during grace-period waits. - Forward-progress improvements for no-CBs CPUs: Use ->cblist structure to take advantage of others' grace periods. - Also added a small commit that avoids needlessly inflicting scheduler-clock ticks on callback-offloaded CPUs. - Forward-progress improvements for no-CBs CPUs: Reduce contention on ->nocb_lock guarding ->cblist. - Forward-progress improvements for no-CBs CPUs: Add ->nocb_bypass list to further reduce contention on ->nocb_lock guarding ->cblist. - LKMM updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-08-21xprtrdma: Cache free MRs in each rpcrdma_reqChuck Lever
Instead of a globally-contended MR free list, cache MRs in each rpcrdma_req as they are released. This means acquiring and releasing an MR will be lock-free in the common case, even outside the transport send lock. The original idea of per-rpcrdma_req MR free lists was suggested by Shirley Ma <shirley.ma@oracle.com> several years ago. I just now figured out how to make that idea work with on-demand MR allocation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21PM: QoS: Get rid of unused flagsAmit Kucheria
The network_latency and network_throughput flags for PM-QoS have not found much use in drivers or in userspace since they were introduced. Commit 4a733ef1bea7 ("mac80211: remove PM-QoS listener") removed the only user PM_QOS_NETWORK_LATENCY in the kernel a while ago and there don't seem to be any userspace tools using the character device files either. PM_QOS_MEMORY_BANDWIDTH was never even added to the trace events. Remove all the flags except cpu_dma_latency. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-08-20xprtrdma: Move rpcrdma_mr_get out of frwr_mapChuck Lever
Refactor: Retrieve an MR and handle error recovery entirely in rpc_rdma.c, as this is not a device-specific function. Note that since commit 89f90fe1ad8b ("SUNRPC: Allow calls to xprt_transmit() to drain the entire transmit queue"), the xprt_transmit function handles the cond_resched. The transport no longer has to do this itself. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>