Age | Commit message | Author
2014-03-26 | tracing: Add 'hash' event trigger command [tzanussi/hashtriggers-v0] | Tom Zanussi
Hash triggers allow users to continually hash events, which can then be dumped later by simply reading the trigger file. This is done strictly via one-liners and without any kind of programming language.

The syntax follows the existing trigger syntax:

    # echo hash:key(s):value(s)[:sort_keys()][ if filter] > event/trigger

The fields used as keys and values are just the fields that define the trace event, as listed in the event's 'format' file. For example, the kmalloc event:

    root@ie:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format
    name: kmalloc
    ID: 370
    format:
        field:unsigned short common_type; offset:0; size:2; signed:0;
        field:unsigned char common_flags; offset:2; size:1; signed:0;
        field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
        field:int common_pid; offset:4; size:4; signed:1;

        field:unsigned long call_site; offset:8; size:4; signed:0;
        field:const void * ptr; offset:12; size:4; signed:0;
        field:size_t bytes_req; offset:16; size:4; signed:0;
        field:size_t bytes_alloc; offset:20; size:4; signed:0;
        field:gfp_t gfp_flags; offset:24; size:4; signed:0;

The key can be made up of one or more of these fields and any number of values can be specified - these are automatically tallied in the hash entry any time the event is hit. Stacktraces can also be used as keys.

For example, the following uses the stacktrace leading up to a kmalloc as the key for hashing kmalloc events. For each hash entry a tally of the bytes_alloc field is kept.
Dumping out the trigger shows the sum of bytes allocated for each execution path that led to a kmalloc:

    # echo 'hash:stacktrace:bytes_alloc' > /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
    # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
    key: stacktrace:
        kmem_cache_alloc_trace+0xeb/0x140
        intel_ring_begin+0xd8/0x1a0 [i915]
        gen6_ring_sync+0x3c/0x140 [i915]
        i915_gem_object_sync+0xd1/0x130 [i915]
        i915_gem_do_execbuffer.isra.21+0x632/0x10d0 [i915]
        i915_gem_execbuffer2+0xac/0x280 [i915]
        drm_ioctl+0x4e9/0x610 [drm]
        do_vfs_ioctl+0x83/0x510
        SyS_ioctl+0x91/0xb0
        system_call_fastpath+0x16/0x1b
        vals: count:1595 bytes_alloc:153120
    key: stacktrace:
        __kmalloc+0x10b/0x180
        i915_gem_do_execbuffer.isra.21+0x67a/0x10d0 [i915]
        i915_gem_execbuffer2+0xac/0x280 [i915]
        drm_ioctl+0x4e9/0x610 [drm]
        do_vfs_ioctl+0x83/0x510
        SyS_ioctl+0x91/0xb0
        system_call_fastpath+0x16/0x1b
        vals: count:2850 bytes_alloc:888736
    key: stacktrace:
        __kmalloc+0x10b/0x180
        i915_gem_execbuffer2+0x60/0x280 [i915]
        drm_ioctl+0x4e9/0x610 [drm]
        do_vfs_ioctl+0x83/0x510
        SyS_ioctl+0x91/0xb0
        system_call_fastpath+0x16/0x1b
        vals: count:2850 bytes_alloc:2560384
    key: stacktrace:
        __kmalloc+0x10b/0x180
        hid_report_raw_event+0x15b/0x450 [hid]
        hid_input_report+0x119/0x1a0 [hid]
        hid_irq_in+0x20b/0x250 [usbhid]
        __usb_hcd_giveback_urb+0x7c/0x130
        usb_giveback_urb_bh+0x96/0xe0
        tasklet_hi_action+0xd7/0xe0
        __do_softirq+0x125/0x2e0
        irq_exit+0xb5/0xc0
        do_IRQ+0x67/0x110
        ret_from_intr+0x0/0x13
        cpuidle_idle_call+0xbb/0x1f0
        arch_cpu_idle+0xe/0x30
        cpu_startup_entry+0x9f/0x240
        rest_init+0x77/0x80
        start_kernel+0x3db/0x3e8
        vals: count:5968 bytes_alloc:131296
    Totals:
        Hits: 22648
        Entries: 119
        Dropped: 0

This turns the hash trigger off:

    # echo '!hash:stacktrace:bytes_alloc' > /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger

Stack traces, of course, are very useful but a bit of overkill for many uses. For instance, suppose we just want a line per caller. Here, we keep a tally of bytes_alloc per caller.
Note that you don't need to explicitly keep a 'count' tally - counts are automatically tallied and displayed (and are in fact the default sort key). Also note that the raw call_site printed here isn't very useful (we'll remedy that later). # echo 'hash:call_site:bytes_alloc' > /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger hash:unlimited key: call_site:18446744071579450186 vals: count:1 bytes_alloc:64 key: call_site:18446744071579439780 vals: count:1 bytes_alloc:64 key: call_site:18446744071579400894 vals: count:1 bytes_alloc:1024 key: call_site:18446744072104627352 vals: count:1 bytes_alloc:512 key: call_site:18446744071580027351 vals: count:1 bytes_alloc:512 key: call_site:18446744071580991590 vals: count:1 bytes_alloc:16 key: call_site:18446744071579463899 vals: count:1 bytes_alloc:64 key: call_site:18446744072102260685 vals: count:1 bytes_alloc:512 key: call_site:18446744071579439821 vals: count:1 bytes_alloc:64 key: call_site:18446744071579532598 vals: count:1 bytes_alloc:1024 key: call_site:18446744071584838347 vals: count:1 bytes_alloc:64 key: call_site:18446744071579450148 vals: count:1 bytes_alloc:64 key: call_site:18446744071580886173 vals: count:2 bytes_alloc:256 key: call_site:18446744071580886422 vals: count:2 bytes_alloc:1024 key: call_site:18446744071580987082 vals: count:2 bytes_alloc:8192 key: call_site:18446744071580652885 vals: count:2 bytes_alloc:128 key: call_site:18446744071580565960 vals: count:2 bytes_alloc:512 key: call_site:18446744071580680412 vals: count:2 bytes_alloc:64 key: call_site:18446744071580891052 vals: count:2 bytes_alloc:1024 key: call_site:18446744071580886777 vals: count:2 bytes_alloc:64 key: call_site:18446744071580572594 vals: count:3 bytes_alloc:3072 key: call_site:18446744071580592783 vals: count:3 bytes_alloc:48 key: call_site:18446744071580679805 vals: count:3 bytes_alloc:12288 key: call_site:18446744071582021108 vals: count:3 bytes_alloc:768 key: 
call_site:18446744071580572564 vals: count:3 bytes_alloc:576 key: call_site:18446744071581165381 vals: count:4 bytes_alloc:256 key: call_site:18446744071580953553 vals: count:4 bytes_alloc:256 key: call_site:18446744072102160648 vals: count:4 bytes_alloc:1024 key: call_site:18446744071580652708 vals: count:4 bytes_alloc:4224 key: call_site:18446744071580680238 vals: count:5 bytes_alloc:640 key: call_site:18446744071581375333 vals: count:6 bytes_alloc:384 key: call_site:18446744072102162313 vals: count:16 bytes_alloc:7616 key: call_site:18446744071581165832 vals: count:24 bytes_alloc:1600 key: call_site:18446744071582016247 vals: count:26 bytes_alloc:832 key: call_site:18446744071580843814 vals: count:35 bytes_alloc:2240 key: call_site:18446744071581367368 vals: count:39 bytes_alloc:3744 key: call_site:18446744072101806931 vals: count:39 bytes_alloc:1248 key: call_site:18446744072103721852 vals: count:89 bytes_alloc:8544 key: call_site:18446744072101850501 vals: count:89 bytes_alloc:8544 key: call_site:18446744072103729728 vals: count:89 bytes_alloc:17088 key: call_site:18446744071583128580 vals: count:154 bytes_alloc:157696 key: call_site:18446744072103573325 vals: count:643 bytes_alloc:10288 key: call_site:18446744071582381017 vals: count:643 bytes_alloc:159008 key: call_site:18446744072103563942 vals: count:645 bytes_alloc:123840 key: call_site:18446744071582043239 vals: count:765 bytes_alloc:6120 key: call_site:18446744072101884462 vals: count:776 bytes_alloc:49664 key: call_site:18446744072103903864 vals: count:1026 bytes_alloc:98496 key: call_site:18446744072103596026 vals: count:1026 bytes_alloc:287040 key: call_site:18446744072103599888 vals: count:1026 bytes_alloc:724736 key: call_site:18446744071580813202 vals: count:2433 bytes_alloc:155712 key: call_site:18446744072099520315 vals: count:2948 bytes_alloc:64856 Totals: Hits: 12601 Entries: 51 Dropped: 0 A little more useful, but not much, would be to display the call_sites as hex addresses. 
To do this we add a '.hex' modifier to the call_site key : root@trz-ThinkPad-T420:~# echo 'hash:call_site.hex:bytes_alloc' > /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger root@trz-ThinkPad-T420:~# cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger hash:unlimited key: call_site:ffffffff811e7f26 vals: count:1 bytes_alloc:64 key: call_site:ffffffff811a5bb2 vals: count:1 bytes_alloc:1024 key: call_site:ffffffff811a41c8 vals: count:1 bytes_alloc:256 key: call_site:ffffffff811c002e vals: count:1 bytes_alloc:128 key: call_site:ffffffff811209d7 vals: count:1 bytes_alloc:256 key: call_site:ffffffff811f26f9 vals: count:1 bytes_alloc:32 key: call_site:ffffffff811f2596 vals: count:1 bytes_alloc:512 key: call_site:ffffffff811f249d vals: count:1 bytes_alloc:128 key: call_site:ffffffff811f37ac vals: count:1 bytes_alloc:512 key: call_site:ffffffff811bfe7d vals: count:1 bytes_alloc:4096 key: call_site:ffffffff811a5b94 vals: count:1 bytes_alloc:192 key: call_site:ffffffff813075f4 vals: count:1 bytes_alloc:256 key: call_site:ffffffff811b9555 vals: count:1 bytes_alloc:64 key: call_site:ffffffff811b94a4 vals: count:2 bytes_alloc:2112 key: call_site:ffffffff81236745 vals: count:2 bytes_alloc:128 key: call_site:ffffffff813062f7 vals: count:5 bytes_alloc:160 key: call_site:ffffffff811e0792 vals: count:8 bytes_alloc:512 key: call_site:ffffffff81236908 vals: count:12 bytes_alloc:800 key: call_site:ffffffffa0491a40 vals: count:12 bytes_alloc:2304 key: call_site:ffffffffa02c6d85 vals: count:12 bytes_alloc:1152 key: call_site:ffffffffa048fb7c vals: count:12 bytes_alloc:1152 key: call_site:ffffffffa0470ffa vals: count:144 bytes_alloc:40192 key: call_site:ffffffffa0471f10 vals: count:144 bytes_alloc:96192 key: call_site:ffffffffa04bc278 vals: count:144 bytes_alloc:13824 key: call_site:ffffffffa04692a6 vals: count:218 bytes_alloc:41856 key: call_site:ffffffffa046b74d vals: count:218 bytes_alloc:3488 key: call_site:ffffffff8135f3d9 vals: count:218 bytes_alloc:53344 key: 
call_site:ffffffffa02cf22e vals: count:230 bytes_alloc:14720 key: call_site:ffffffff8130cc67 vals: count:1229 bytes_alloc:9832 Totals: Hits: 2623 Entries: 29 Dropped: 0 Even more useful would be to display the call_sites as symbolic names. To do that we can add a '.sym' modifier to the call_site key: root@trz-ThinkPad-T420:~# echo 'hash:call_site.sym:bytes_alloc' > /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger root@trz-ThinkPad-T420:~# cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger hash:unlimited key: call_site:[ffffffff8120aeca] stat_open vals: count:1 bytes_alloc:4096 key: call_site:[ffffffff811a5bb2] alloc_pipe_info vals: count:1 bytes_alloc:1024 key: call_site:[ffffffff811f2596] load_elf_binary vals: count:1 bytes_alloc:512 key: call_site:[ffffffff811209d7] event_hash_trigger_print vals: count:1 bytes_alloc:256 key: call_site:[ffffffff811f26f9] load_elf_binary vals: count:1 bytes_alloc:32 key: call_site:[ffffffff811b9555] alloc_fdtable vals: count:1 bytes_alloc:64 key: call_site:[ffffffff811f37ac] load_elf_binary vals: count:1 bytes_alloc:512 key: call_site:[ffffffff811a41c8] do_execve_common.isra.28 vals: count:1 bytes_alloc:256 key: call_site:[ffffffff811c00dc] single_open vals: count:1 bytes_alloc:32 key: call_site:[ffffffff811f249d] load_elf_binary vals: count:1 bytes_alloc:128 key: call_site:[ffffffff811a5b94] alloc_pipe_info vals: count:1 bytes_alloc:192 key: call_site:[ffffffff813075f4] aa_path_name vals: count:1 bytes_alloc:256 key: call_site:[ffffffff811dd155] mounts_open_common vals: count:2 bytes_alloc:384 key: call_site:[ffffffff811b94a4] alloc_fdmem vals: count:2 bytes_alloc:2112 key: call_site:[ffffffff81202bd1] proc_reg_open vals: count:2 bytes_alloc:128 key: call_site:[ffffffff8120c066] proc_self_follow_link vals: count:2 bytes_alloc:32 key: call_site:[ffffffff811c002e] seq_open vals: count:3 bytes_alloc:384 key: call_site:[ffffffff811bfe7d] seq_read vals: count:4 bytes_alloc:16384 key: call_site:[ffffffff811e0792] 
inotify_handle_event vals: count:4 bytes_alloc:256
key: call_site:[ffffffff813062f7] aa_alloc_task_context vals: count:5 bytes_alloc:160
key: call_site:[ffffffffa0491a40] intel_framebuffer_create vals: count:8 bytes_alloc:1536
key: call_site:[ffffffffa02c6d85] drm_mode_page_flip_ioctl vals: count:8 bytes_alloc:768
key: call_site:[ffffffffa048fb7c] intel_crtc_page_flip vals: count:8 bytes_alloc:768
key: call_site:[ffffffffa04692a6] i915_gem_obj_lookup_or_create_vma vals: count:112 bytes_alloc:21504
key: call_site:[ffffffffa046b74d] i915_gem_object_get_pages_gtt vals: count:112 bytes_alloc:1792
key: call_site:[ffffffff8135f3d9] sg_kmalloc vals: count:112 bytes_alloc:33088
key: call_site:[ffffffffa02cf22e] drm_vma_node_allow vals: count:120 bytes_alloc:7680
key: call_site:[ffffffffa0470ffa] i915_gem_do_execbuffer.isra.21 vals: count:122 bytes_alloc:34432
key: call_site:[ffffffffa0471f10] i915_gem_execbuffer2 vals: count:122 bytes_alloc:80960
key: call_site:[ffffffffa04bc278] intel_ring_begin vals: count:122 bytes_alloc:11712
key: call_site:[ffffffff8130cc67] apparmor_file_alloc_security vals: count:126 bytes_alloc:1008
Totals: Hits: 1008 Entries: 31 Dropped: 0

Most useful of all would be to display the call_sites symbolically along with tallies of the total number of bytes requested and the number allocated by each caller, sorted by the difference between the two. That essentially gives you a listing of the callers that waste the most bytes due to the lack of allocation granularity. This is a good demonstration of hashing multiple values, tallying the difference between values (- is the only 'operator' supported), and specifying a non-default sort order.
# echo 'hash:call_site.sym:bytes_req,bytes_alloc,bytes_alloc-bytes_req:sort=bytes_alloc-bytes_req' > /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger key: call_site:[ffffffff813062f7] aa_alloc_task_context vals: count:30 bytes_req:960, bytes_alloc:960, bytes_alloc-bytes_req:0 key: call_site:[ffffffff813075f4] aa_path_name vals: count:4 bytes_req:1024, bytes_alloc:1024, bytes_alloc-bytes_req:0 key: call_site:[ffffffff811c002e] seq_open vals: count:18 bytes_req:2304, bytes_alloc:2304, bytes_alloc-bytes_req:0 key: call_site:[ffffffff811bfd3a] seq_read vals: count:3 bytes_req:24576, bytes_alloc:24576, bytes_alloc-bytes_req:0 key: call_site:[ffffffff810912cd] alloc_fair_sched_group vals: count:1 bytes_req:64, bytes_alloc:64, bytes_alloc-bytes_req:0 key: call_site:[ffffffff810970db] sched_autogroup_create_attach vals: count:1 bytes_req:64, bytes_alloc:64, bytes_alloc-bytes_req:0 key: call_site:[ffffffff811aaa8f] vfs_rename vals: count:2 bytes_req:22, bytes_alloc:32, bytes_alloc-bytes_req:10 key: call_site:[ffffffff8120c066] proc_self_follow_link vals: count:3 bytes_req:36, bytes_alloc:48, bytes_alloc-bytes_req:12 key: call_site:[ffffffff811f26f9] load_elf_binary vals: count:4 bytes_req:112, bytes_alloc:128, bytes_alloc-bytes_req:16 key: call_site:[ffffffff811f2596] load_elf_binary vals: count:4 bytes_req:2016, bytes_alloc:2048, bytes_alloc-bytes_req:32 key: call_site:[ffffffff81269b65] ext4_ext_remove_space vals: count:3 bytes_req:144, bytes_alloc:192, bytes_alloc-bytes_req:48 key: call_site:[ffffffff811dd155] mounts_open_common vals: count:2 bytes_req:320, bytes_alloc:384, bytes_alloc-bytes_req:64 key: call_site:[ffffffff811b9555] alloc_fdtable vals: count:4 bytes_req:192, bytes_alloc:256, bytes_alloc-bytes_req:64 key: call_site:[ffffffff81236745] ext4_readdir vals: count:13 bytes_req:624, bytes_alloc:832, bytes_alloc-bytes_req:208 key: call_site:[ffffffff811a5b94] alloc_pipe_info vals: count:5 
bytes_req:680, bytes_alloc:960, bytes_alloc-bytes_req:280 key: call_site:[ffffffff81202bd1] proc_reg_open vals: count:14 bytes_req:560, bytes_alloc:896, bytes_alloc-bytes_req:336 key: call_site:[ffffffff81087abe] sched_create_group vals: count:1 bytes_req:664, bytes_alloc:1024, bytes_alloc-bytes_req:360 key: call_site:[ffffffffa0312f89] cfg80211_inform_bss_width_frame vals: count:2 bytes_req:546, bytes_alloc:1024, bytes_alloc-bytes_req:478 key: call_site:[ffffffff811f37ac] load_elf_binary vals: count:4 bytes_req:1568, bytes_alloc:2048, bytes_alloc-bytes_req:480 key: call_site:[ffffffff811209d7] event_hash_trigger_print vals: count:7 bytes_req:2520, bytes_alloc:3328, bytes_alloc-bytes_req:808 key: call_site:[ffffffff811e7f26] eventfd_file_create vals: count:71 bytes_req:3408, bytes_alloc:4544, bytes_alloc-bytes_req:1136 key: call_site:[ffffffff81236908] ext4_htree_store_dirent vals: count:100 bytes_req:6246, bytes_alloc:7456, bytes_alloc-bytes_req:1210 key: call_site:[ffffffff811a5bb2] alloc_pipe_info vals: count:5 bytes_req:3200, bytes_alloc:5120, bytes_alloc-bytes_req:1920 key: call_site:[ffffffffa02c6d85] drm_mode_page_flip_ioctl vals: count:370 bytes_req:32560, bytes_alloc:35520, bytes_alloc-bytes_req:2960 key: call_site:[ffffffff8120aeca] stat_open vals: count:7 bytes_req:24752, bytes_alloc:28672, bytes_alloc-bytes_req:3920 key: call_site:[ffffffff811e0792] inotify_handle_event vals: count:644 bytes_req:37470, bytes_alloc:41792, bytes_alloc-bytes_req:4322 key: call_site:[ffffffffa048fb7c] intel_crtc_page_flip vals: count:370 bytes_req:26640, bytes_alloc:35520, bytes_alloc-bytes_req:8880 key: call_site:[ffffffffa008df3b] hid_report_raw_event vals: count:7048 bytes_req:140960, bytes_alloc:155056, bytes_alloc-bytes_req:14096 key: call_site:[ffffffffa0491a40] intel_framebuffer_create vals: count:370 bytes_req:53280, bytes_alloc:71040, bytes_alloc-bytes_req:17760 key: call_site:[ffffffff8130cc67] apparmor_file_alloc_security vals: count:3058 bytes_req:6116, 
bytes_alloc:24464, bytes_alloc-bytes_req:18348 key: call_site:[ffffffffa04bc278] intel_ring_begin vals: count:2754 bytes_req:242352, bytes_alloc:264384, bytes_alloc-bytes_req:22032 key: call_site:[ffffffffa04692a6] i915_gem_obj_lookup_or_create_vma vals: count:1835 bytes_req:308280, bytes_alloc:352320, bytes_alloc-bytes_req:44040 key: call_site:[ffffffffa02cf22e] drm_vma_node_allow vals: count:2291 bytes_req:91640, bytes_alloc:146624, bytes_alloc-bytes_req:54984 key: call_site:[ffffffff8135f3d9] sg_kmalloc vals: count:1827 bytes_req:432512, bytes_alloc:491808, bytes_alloc-bytes_req:59296 key: call_site:[ffffffffa0470ffa] i915_gem_do_execbuffer.isra.21 vals: count:2754 bytes_req:534960, bytes_alloc:922624, bytes_alloc-bytes_req:387664 key: call_site:[ffffffffa0471f10] i915_gem_execbuffer2 vals: count:2754 bytes_req:2030840, bytes_alloc:2729792, bytes_alloc-bytes_req:698952 Totals: Hits: 28354 Entries: 48 Dropped: 0 Here's an example of using a compound key. The below tallies syscall hits for every unique combination of pid/syscall id ('hitcount' is essentially a placeholder - as mentioned before, counts are always kept - using 'hitcount' essentially references that 'fake' event field in the hash trigger specification). Both the syscall id and the pid are displayed symbolically via the .syscall and .execname modifiers. 
# echo 'hash:common_pid.execname,id.syscall:hitcount:sort=common_pid,hitcount' > /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger key: common_pid:bash[3112], id:sys_write vals: count:69 key: common_pid:bash[3112], id:sys_rt_sigprocmask vals: count:218 key: common_pid:update-notifier[3164], id:sys_poll vals: count:37 key: common_pid:update-notifier[3164], id:sys_recvfrom vals: count:118 key: common_pid:deja-dup-monito[3194], id:sys_sendto vals: count:1 key: common_pid:deja-dup-monito[3194], id:sys_read vals: count:4 key: common_pid:deja-dup-monito[3194], id:sys_poll vals: count:8 key: common_pid:deja-dup-monito[3194], id:sys_recvmsg vals: count:8 key: common_pid:deja-dup-monito[3194], id:sys_geteuid vals: count:8 key: common_pid:deja-dup-monito[3194], id:sys_write vals: count:8 key: common_pid:deja-dup-monito[3194], id:sys_getegid vals: count:8 key: common_pid:emacs[3275], id:sys_fsync vals: count:1 key: common_pid:emacs[3275], id:sys_open vals: count:1 key: common_pid:emacs[3275], id:sys_unlink vals: count:1 key: common_pid:emacs[3275], id:sys_close vals: count:1 key: common_pid:emacs[3275], id:sys_symlink vals: count:2 key: common_pid:emacs[3275], id:sys_readlink vals: count:2 key: common_pid:emacs[3275], id:sys_access vals: count:2 key: common_pid:emacs[3275], id:sys_geteuid vals: count:2 key: common_pid:emacs[3275], id:sys_getgid vals: count:2 key: common_pid:emacs[3275], id:sys_getuid vals: count:2 key: common_pid:emacs[3275], id:sys_getegid vals: count:3 key: common_pid:emacs[3275], id:sys_newlstat vals: count:4 key: common_pid:emacs[3275], id:sys_setitimer vals: count:7 key: common_pid:emacs[3275], id:sys_newstat vals: count:8 key: common_pid:emacs[3275], id:sys_read vals: count:9 key: common_pid:emacs[3275], id:sys_write vals: count:14 key: common_pid:emacs[3275], id:sys_kill vals: count:14 key: common_pid:emacs[3275], id:sys_poll vals: count:23 key: common_pid:emacs[3275], 
id:sys_select vals: count:23 key: common_pid:emacs[3275], id:unknown_syscall vals: count:34 key: common_pid:emacs[3275], id:sys_ioctl vals: count:60 key: common_pid:emacs[3275], id:sys_rt_sigprocmask vals: count:116 key: common_pid:cat[3323], id:sys_munmap vals: count:1 key: common_pid:cat[3323], id:sys_fadvise64 vals: count:1 Finally, the below uses a string as a hash key, and simply tallies and displays the default count ('hitcount'). # echo 'hash:child_comm:hitcount' > /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger hash:unlimited key: child_comm:pool vals: count:1 key: child_comm:unity-panel-ser vals: count:1 key: child_comm:pool vals: count:1 key: child_comm:hud-service vals: count:1 key: child_comm:Cache I/O vals: count:1 key: child_comm:postgres vals: count:1 key: child_comm:gdbus vals: count:1 key: child_comm:bash vals: count:1 key: child_comm:ubuntu-webapps- vals: count:2 key: child_comm:dbus-daemon vals: count:2 key: child_comm:compiz vals: count:3 key: child_comm:apt-cache vals: count:3 key: child_comm:unity-webapps-s vals: count:4 key: child_comm:java vals: count:6 key: child_comm:firefox vals: count:52 Totals: Hits: 80 Entries: 15 Dropped: 0 Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
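The tallying described in this commit message can be modelled in user space. The following is a minimal Python sketch, not the kernel implementation: `hash_events`, `field_value`, and the sample events are hypothetical names and made-up data, and only the `a-b` difference form mentioned in the message is handled.

```python
def field_value(event, spec):
    """Resolve a value spec: a plain field name or a 'left-right' difference."""
    if "-" in spec:
        left, right = spec.split("-", 1)
        return event[left] - event[right]
    return event[spec]


def hash_events(events, keys, values, sort_key="count"):
    """Tally `values` for each unique combination of `keys`.

    A hit count is always kept, and entries are returned sorted
    ascending by `sort_key` (the count by default).
    """
    table = {}
    for event in events:
        key = tuple(event[k] for k in keys)  # compound keys become tuples
        entry = table.setdefault(key, {"count": 0, **{v: 0 for v in values}})
        entry["count"] += 1
        for spec in values:
            entry[spec] += field_value(event, spec)
    return sorted(table.items(), key=lambda item: item[1][sort_key])


# Made-up kmalloc-like events, for illustration only.
events = [
    {"call_site": "vfs_rename", "bytes_req": 11, "bytes_alloc": 16},
    {"call_site": "seq_open", "bytes_req": 128, "bytes_alloc": 128},
    {"call_site": "vfs_rename", "bytes_req": 11, "bytes_alloc": 16},
]
result = hash_events(
    events,
    keys=["call_site"],
    values=["bytes_req", "bytes_alloc", "bytes_alloc-bytes_req"],
    sort_key="bytes_alloc-bytes_req",
)
for key, vals in result:
    print(key, vals)
```

Passing two names in `keys` gives the compound-key behaviour of the pid/syscall example, since each distinct tuple gets its own entry.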
2014-03-26 | tracing: Add hash trigger to Documentation | Tom Zanussi
Add documentation and usage examples for 'hash' triggers. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
2014-03-26 | tracing: Add get_syscall_name() | Tom Zanussi
Add a utility function to grab the syscall name from the syscall metadata, given a syscall id. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
2014-03-26 | tracing: Add event record param to trigger_ops.func() | Tom Zanussi
Some triggers may need access to the trace event, so pass it in. Also fix up the existing trigger funcs and their callers. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
2014-03-26 | tracing: Make ftrace_event_field checking functions available | Tom Zanussi
Make is_string_field() and is_function_field() accessible outside of trace_event_filters.c for other users of ftrace_event_fields. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
2014-03-26 | Merge tag 'trace-fixes-v3.14-rc7-v2' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "While on my flight to Linux Collaboration Summit, I was working on my slides for the event trigger tutorial. I booted a 3.14-rc7 kernel to perform what I wanted to teach and cut and paste it into my slides. When I tried the traceon event trigger with a condition attached to it (turns tracing on only if a field of the trigger event matches a condition set by the user), nothing happened. Tracing would not turn on. I stopped working on my presentation in order to find what was wrong. It ended up being the way trace event triggers work when they have conditions. Instead of copying the fields, the condition code just looks at the fields that were copied into the ring buffer. This works great, unless tracing is off. That's because when the event is reserved on the ring buffer, the ring buffer returns a NULL pointer, this tells the tracing code that the ring buffer is disabled. This ends up being a problem for the traceon trigger if it is using this information to check its condition. Luckily the code that checks if tracing is on returns the ring buffer to use (because the ring buffer is determined by the event file also passed to that field). I was able to easily solve this bug by checking in that helper function if the returned ring buffer entry is NULL, and if so, also check the file flag if it has a trace event trigger condition, and if so, to pass back a temp ring buffer to use. This will allow the trace event trigger condition to still test the event fields, but nothing will be recorded" * tag 'trace-fixes-v3.14-rc7-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix traceon trigger condition to actually turn tracing on
2014-03-25 | tracing: Fix traceon trigger condition to actually turn tracing on | Steven Rostedt (Red Hat)
While working on my tutorial for the 2014 Linux Collaboration Summit I found that the traceon trigger did not work when conditions were used. The other triggers worked fine though. Looking into it, it is because of the way the triggers use the ring buffer to store the fields they will use for the condition. But if tracing is off, nothing is stored in the buffer, and the tracepoint exits before calling the trigger to test the condition. This is fine for all the triggers that only work when tracing is on, but for the traceon trigger, which is meant to work when tracing is off, nothing happens.

The fix is simple: just use a temp ring buffer to record the event if tracing is off and the event has a trace event conditional trigger enabled. The rest of the tracepoint code will work just fine, but the tracepoint won't be recorded in the other buffers.

Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
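The control flow of this fix can be illustrated with a toy model. This is a hedged Python sketch under stated assumptions, not the kernel code: `RingBuffer`, `traceon_event`, and the `bytes_req` condition are hypothetical stand-ins chosen only to show why a scratch buffer lets the condition be evaluated while tracing is off.

```python
class RingBuffer:
    """Toy stand-in for a trace ring buffer (hypothetical, not a kernel API)."""

    def __init__(self, enabled):
        self.enabled = enabled
        self.records = []

    def reserve(self, fields):
        """Return a new entry, or None when the buffer is disabled."""
        if not self.enabled:
            return None
        entry = dict(fields)
        self.records.append(entry)
        return entry


def traceon_event(main, scratch, fields, condition):
    """Handle one event that has a conditional traceon trigger attached."""
    entry = main.reserve(fields)
    used_scratch = False
    if entry is None:
        # Tracing is off: stage the fields in a temporary buffer so the
        # condition still has something to test.
        entry = scratch.reserve(fields)
        used_scratch = True
    fired = condition(entry)
    if used_scratch:
        scratch.records.clear()  # the staged copy is never kept
    if fired:
        main.enabled = True  # the traceon action
    return fired


main = RingBuffer(enabled=False)    # tracing starts off
scratch = RingBuffer(enabled=True)  # always writable
fired = traceon_event(main, scratch, {"bytes_req": 512},
                      lambda e: e["bytes_req"] > 256)
print(fired, main.enabled, main.records)
```

Without the scratch path, `reserve` returning `None` would end event handling before the condition was ever tested, which is the behaviour the commit message describes.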
2014-03-25 | fs: remove now stale label in anon_inode_init() | Linus Torvalds
The previous commit removed the register_filesystem() call and the associated error handling, but left the label for the error path that no longer exists. Remove that too. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-25 | fs: Avoid userspace mounting anon_inodefs filesystem | Jan Kara
The anon_inodefs filesystem is a kernel-internal filesystem that userspace shouldn't mess with. Remove its registration so userspace cannot even try to mount it (which would fail anyway because the filesystem is MS_NOUSER).

This fixes an oops triggered by trinity when it tried mounting anon_inodefs, which overwrote the anon_inode_inode pointer while another CPU was in anon_inode_getfile() between ihold() and d_instantiate(), thus effectively creating a dentry pointing to an inode without holding a reference to it.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-25 | Merge branch 'nfsd-next' of git://linux-nfs.org/~bfields/linux | Linus Torvalds
Pull nfsd fix from Bruce Fields:
 "J R Okajima sent this early and I was just slow to pass it along, apologies. Fortunately it's a simple fix"

* 'nfsd-next' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix lost nfserrno() call in nfsd_setattr()
2014-03-25 | Merge branch 'for-linus' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "These four commits are obvious fixes (a couple of fdget_pos()-related ones from Eric Biggers, prepend_name() fix, missing checks for false negatives from __lookup_mnt() in fs/namei.c)"

For now I'm pulling just the four obvious fixes, there's another four pending in Al's 'for-linus' branch wrt the mnt_hash list that were more involved.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  rcuwalk: recheck mount_lock after mountpoint crossing attempts
  make prepend_name() work correctly when called with negative *buflen
  vfs: Don't let __fdget_pos() get FMODE_PATH files
  vfs: atomic f_pos access in llseek()
2014-03-24 | Linux 3.14-rc8 | Linus Torvalds
2014-03-24 | Merge branch 'parisc-3.14' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc updates from Helge Deller:
 - revert parts of the latest patch regarding font selection with STICON console
 - wire up the utimes() syscall for parisc
 - remove the unused parisc tmpalias code and unnecessary arch_*_relax defines

* 'parisc-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: locks: remove redundant arch_*_relax operations
  parisc: wire up sys_utimes
  parisc: Remove unused CONFIG_PARISC_TMPALIAS code
  partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over built-in ROM fonts
2014-03-24 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc | Linus Torvalds
Pull sparc fixes from David Miller:

 1) Do serial locking in a way that makes things clear that these are IRQ spinlocks.

 2) Conversion to generic idle loop broke first generation Niagara machines, need to have %pil interrupts enabled during cpu yield hypervisor call.

 3) Do not use magic constants for iterations over tsb tables, from Doug Wilson.

 4) Fix erroneous truncation of 64-bit system call return values to 32-bit. From Dave Kleikamp.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Make sure %pil interrupts are enabled during hypervisor yield.
  sparc64:tsb.c:use array size macro rather than number
  sparc64: don't treat 64-bit syscall return codes as 32-bit
  sparc: serial: Clean up the locking for -rt
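Item 4 concerns 64-bit return values being cut down to 32 bits. The sketch below shows, in general terms, why that loses information; it is an illustrative Python model with hypothetical names (`truncate_to_32`, the 5 GiB offset) and does not reproduce the sparc64 code path.

```python
MASK_32 = 0xFFFFFFFF


def truncate_to_32(value):
    """Keep only the low 32 bits, as a 32-bit view of a register would."""
    return value & MASK_32


offset = 5 * 2**30  # a 5 GiB file offset, as a 64-bit return value
truncated = truncate_to_32(offset)
print(offset, truncated)
```

Values that fit in 32 bits pass through unchanged, so a bug of this kind only shows up for large results.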
2014-03-24 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds
Pull networking fixes from David Miller:

 1) OpenVswitch's lookup_datapath() returns error pointers, so don't check against NULL. From Jiri Pirko.

 2) pfkey_compile_policy() code path tries to do a GFP_KERNEL allocation under RCU locks, fix by using GFP_ATOMIC when necessary. From Nikolay Aleksandrov.

 3) phy_suspend() indirectly passes uninitialized data into the ethtool get wake-on-LAN implementations. Fix from Sebastian Hesselbarth.

 4) CPSW driver unregisters CPTS twice, fix from Benedikt Spranger.

 5) If SKB allocation of reply packet fails, vxlan's arp_reduce() dereferences a NULL pointer. Fix from David Stevens.

 6) IPV6 neigh handling in vxlan doesn't validate the destination address properly, and it builds a packet with the src and dst reversed. Fix also from David Stevens.

 7) Fix spinlock recursion during subscription failures in TIPC stack, from Erik Hugne.

 8) Revert buggy conversion of davinci_emac to devm_request_irq, from Christian Riesch.

 9) Wrong flags passed into forwarding database netlink notifications, from Nicolas Dichtel.

 10) The netpoll neighbour solicitation handler checks wrong ethertype, needs to be ETH_P_IPV6 rather than ETH_P_ARP. Fix from Li RongQing.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
  tipc: fix spinlock recursion bug for failed subscriptions
  vxlan: fix nonfunctional neigh_reduce()
  net: davinci_emac: Fix rollback of emac_dev_open()
  net: davinci_emac: Replace devm_request_irq with request_irq
  netpoll: fix the skb check in pkt_is_ns
  net: micrel : ks8851-ml: add vdd-supply support
  ip6mr: fix mfc notification flags
  ipmr: fix mfc notification flags
  rtnetlink: fix fdb notification flags
  tcp: syncookies: do not use getnstimeofday()
  netlink: fix setsockopt in mmap examples in documentation
  openvswitch: Correctly report flow used times for first 5 minutes after boot.
  via-rhine: Disable device in error path
  ATHEROS-ATL1E: Convert iounmap to pci_iounmap
  vxlan: fix potential NULL dereference in arp_reduce()
  cnic: Update version to 2.5.20 and copyright year.
  cnic,bnx2i,bnx2fc: Fix inconsistent use of page size
  cnic: Use proper ulp_ops for per device operations.
  net: cdc_ncm: fix control message ordering
  ipv6: ip6_append_data_mtu do not handle the mtu of the second fragment properly
  ...
2014-03-24 | tipc: fix spinlock recursion bug for failed subscriptions | Erik Hugne
If a topology event subscription fails for any reason, such as out of memory, max number reached or because we received an invalid request, the correct behavior is to terminate the subscriber's connection to the topology server. This is currently broken and produces the following oops:

[27.953662] tipc: Subscription rejected, illegal request
[27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
[27.957066] lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
[27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5
[27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
[27.961430] ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
[27.962292] ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
[27.963152] ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
[27.964023] Call Trace:
[27.964292] [<ffffffff815c0207>] dump_stack+0x45/0x56
[27.964874] [<ffffffff815beec5>] spin_dump+0x8c/0x91
[27.965420] [<ffffffff815beeeb>] spin_bug+0x21/0x26
[27.965995] [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140
[27.966631] [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20
[27.967256] [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
[27.968051] [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc]
[27.968722] [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc]
[27.969436] [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
[27.970209] [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc]
[27.970972] [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc]
[27.971633] [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0
[27.972267] [<ffffffff8105e869>] worker_thread+0x119/0x3a0
[27.972896] [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0
[27.973622] [<ffffffff810648af>] kthread+0xdf/0x100
[27.974168] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
[27.974893] [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0
[27.975466] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0

The recursion occurs when subscr_terminate tries to grab the subscriber lock, which is already taken by subscr_conn_msg_event. We fix this by checking if the request to establish a new subscription was successful, and if not we initiate termination of the subscriber after we have released the subscriber lock.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
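The locking pattern described in this fix can be modelled in user space. The sketch below is a hypothetical Python model, not the TIPC code: `Subscriber`, `handle_request` and `terminate` are invented names, and a non-reentrant `threading.Lock` stands in for the subscriber spinlock. The handler only records the failure while it holds the lock and calls the terminate path after the lock has been released.

```python
import threading


class Subscriber:
    """Hypothetical stand-in for a subscriber guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()  # non-reentrant, like a spinlock
        self.terminated = False

    def terminate(self):
        # The shutdown path takes the subscriber lock itself. If the caller
        # still held it, this would be the recursion from the oops; the
        # non-blocking acquire turns that into an exception, not a hang.
        if not self.lock.acquire(blocking=False):
            raise RuntimeError("lock recursion: caller still holds the lock")
        try:
            self.terminated = True
        finally:
            self.lock.release()


def handle_request(sub, request_ok):
    """Process one subscription request; terminate only after unlocking."""
    with sub.lock:
        # While the lock is held, only record the outcome.
        failed = not request_ok
    if failed:
        sub.terminate()  # safe: the lock has been released
    return not failed
```

Calling `terminate()` from inside the `with sub.lock:` block would raise in this model, which corresponds to the recursion reported above.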
2014-03-24vxlan: fix nonfunctional neigh_reduce()David Stevens
The VXLAN neigh_reduce() code has been completely non-functional since check-in. Specific errors:

1) The original code drops all packets with a multicast destination address, even though neighbor solicitations are sent to the solicited-node address, a multicast address. The code after this check was never run.

2) The neighbor table lookup used the IPv6 header destination, which is the solicited-node address, rather than the target address from the neighbor solicitation. So neighbor lookups would always fail if it got this far. Also for L3MISSes.

3) The code calls ndisc_send_na(), which does a send on the tunnel device. The context for neigh_reduce() is the transmit path, vxlan_xmit(), where the host or a bridge-attached neighbor is trying to transmit a neighbor solicitation. To respond to it, the tunnel endpoint needs to do a *receive* of the appropriate neighbor advertisement. Doing a send would only try to send the advertisement, encapsulated, to the remote destinations in the fdb -- hosts that definitely did not do the corresponding solicitation.

4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the isrouter flag in the advertisement. This has nothing to do with whether or not the target is a router, and generally won't be set since the tunnel endpoint is bridging, not routing, traffic.

The patch below creates a proxy neighbor advertisement to respond to neighbor solicitations as intended, providing proper IPv6 support for neighbor reduction.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
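Points 1) and 2) both follow from how a solicited-node address is derived from the target. The sketch below uses Python's standard `ipaddress` module to build the solicited-node multicast address (prefix ff02::1:ff00:0/104 plus the low 24 bits of the target); `solicited_node` and the example address are illustrative names, not part of the VXLAN code.

```python
import ipaddress

# Solicited-node multicast prefix: ff02::1:ff00:0/104
SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node(target):
    """Return the solicited-node multicast address for a unicast target."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


target = ipaddress.IPv6Address("2001:db8::1234:5678")  # documentation prefix
dst = solicited_node(target)
# dst is the IPv6 header destination of the solicitation: it is multicast,
# and it is not the address a neighbor table would be keyed on.
```

In this model a filter that drops every multicast destination discards the solicitation, and a table lookup keyed on `dst` instead of `target` cannot match an entry stored under the target address.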
2014-03-24Merge branch 'davinci_emac'David S. Miller
Christian Riesch says:

====================
net: davinci_emac: Fix interrupt requests and error handling

Since commit 6892b41d9701283085b655c6086fb57a5d63fa47 (Linux 3.11) the davinci_emac driver is broken. After doing ifconfig down, ifconfig up, requesting the interrupts for the driver fails. The interface remains dead until the board is rebooted.

The first patch in this patchset reverts commit 6892b41d9701283085b655c6086fb57a5d63fa47 partially and makes the driver useable again.

During the work on the first patch, a number of bugs in the error handling of the driver's ndo_open code were found. The second patch fixes these bugs.

I believe the first patch meets the rules for stable kernels, I therefore added the stable tag to this patch. The second patch is just cleanup, the code that is fixed by this patch is only executed in case of an error.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-24net: davinci_emac: Fix rollback of emac_dev_open()Christian Riesch
If an error occurs during the initialization in emac_dev_open() (the driver's ndo_open function), interrupts, DMA descriptors etc. must be freed. The current rollback code is buggy in several ways.

1) Freeing the interrupts. The current code will not free all interrupts that were requested by the driver. Furthermore, the code tries to do a platform_get_resource(priv->pdev, IORESOURCE_IRQ, -1) in its last iteration. This patch fixes these bugs.

2) Wrong order of err: and rollback: labels. If the setup of the PHY in the code fails, the interrupts that have been requested before are not freed:

    request irq
    if requesting irqs fails, goto rollback
    setup phy
    if phy setup fails, goto err
    return 0

    rollback:
    free irqs
    err:

This patch brings the code into the correct order.

3) The code calls napi_enable() and emac_int_enable(), but does not undo both in case of an error. This patch adds calls of emac_int_disable() and napi_disable() to the rollback code.

4) RX DMA descriptors are not freed in case of an error: Right before requesting the irqs, the function creates DMA descriptors for the RX channel. These RX descriptors are never freed when we jump to either rollback or err. This patch adds code for freeing the DMA descriptors in the case of an initialization error. This required a modification of cpdma_ctlr_stop() in davinci_cpdma.c: We must be able to call this function to free the DMA descriptors while the DMA channels are in IDLE state (before cpdma_ctlr_start() was called).

Tested on a custom board with the Texas Instruments AM1808.

Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
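The ordering rule behind points 2) to 4) is that everything set up before the failing step must be undone, in reverse order. The following is a minimal Python sketch of that rule, not the driver code; `open_device` and the step names in the usage note are hypothetical.

```python
def open_device(steps):
    """Run (name, setup, undo) steps in order.

    If a setup callable raises, call the undo callables of all steps that
    already completed, newest first, and report failure.
    """
    done = []  # undo callbacks of the steps that completed
    for _name, setup, undo in steps:
        try:
            setup()
        except Exception:
            for undo_fn in reversed(done):
                undo_fn()
            return False
        done.append(undo)
    return True
```

With steps such as "create RX descriptors", "request irqs" and "set up PHY", a failure in the PHY step frees the irqs first and the RX descriptors second, which is the order the corrected labels are meant to give.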
2014-03-24net: davinci_emac: Replace devm_request_irq with request_irqChristian Riesch
In

  commit 6892b41d9701283085b655c6086fb57a5d63fa47
  Author: Lad, Prabhakar <prabhakar.csengg@gmail.com>
  Date: Tue Jun 25 21:24:51 2013 +0530

  net: davinci: emac: Convert to devm_* api

the call of request_irq is replaced by devm_request_irq and the call of free_irq is removed. But since interrupts are requested in emac_dev_open, doing ifconfig up/down on the board requests the interrupts again each time, causing devm_request_irq to fail. The interface is dead until the device is rebooted.

This patch reverts said commit partially: It changes the driver back to use request_irq instead of devm_request_irq, puts free_irq back in place, but keeps the remaining changes of the original patch.

Reported-by: Jon Ringle <jon@ringle.org>
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
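The failure mode can be illustrated with a toy model. The classes below are hypothetical and only mimic the lifetime rules described above: a device-managed IRQ is released when the device is unbound, not when the interface is closed, so a second open finds the line still held. That a repeated request is rejected is an assumption of the model.

```python
class IrqController:
    """Toy model: each IRQ line can be held by at most one requester."""

    def __init__(self):
        self.held = set()

    def request_irq(self, irq):
        if irq in self.held:
            return False  # stands in for an error return
        self.held.add(irq)
        return True

    def free_irq(self, irq):
        self.held.discard(irq)


class ManagedDevice:
    """Tracks device-managed IRQs, released only when the device unbinds."""

    def __init__(self, controller):
        self.controller = controller
        self.managed = []

    def devm_request_irq(self, irq):
        ok = self.controller.request_irq(irq)
        if ok:
            self.managed.append(irq)
        return ok

    def unbind(self):
        for irq in self.managed:
            self.controller.free_irq(irq)
        self.managed.clear()


def up_down_up(irq, managed):
    """Simulate ifconfig up, down, up; return the result of each open."""
    ctrl = IrqController()
    dev = ManagedDevice(ctrl)
    results = []
    for _ in range(2):
        if managed:
            results.append(dev.devm_request_irq(irq))  # open
            # close: nothing is freed, the device stays bound
        else:
            results.append(ctrl.request_irq(irq))      # open
            ctrl.free_irq(irq)                         # close
    return results
```

In this model the managed variant succeeds on the first open and fails on the second, while the explicit request/free pair succeeds both times.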
2014-03-24netpoll: fix the skb check in pkt_is_nsLi RongQing
Neighbor Solicitation is an IPv6 message, so we should check skb->protocol against ETH_P_IPV6. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Cc: WANG Cong <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
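As an illustration of the check, the sketch below classifies a raw Ethernet frame in user space. It is not the netpoll code: `frame_is_ns` is a hypothetical helper, and it assumes an untagged frame with no IPv6 extension headers. The constants are the standard values for the IPv6 EtherType, the ICMPv6 next-header number and the Neighbor Solicitation message type.

```python
import struct

ETH_P_IPV6 = 0x86DD          # EtherType for IPv6
IPPROTO_ICMPV6 = 58          # IPv6 next-header value for ICMPv6
ND_NEIGHBOR_SOLICIT = 135    # ICMPv6 type of a Neighbor Solicitation

ETH_HLEN = 14
IPV6_HLEN = 40


def frame_is_ns(frame):
    """Return True if an untagged Ethernet frame carries an IPv6 NS."""
    if len(frame) < ETH_HLEN + IPV6_HLEN + 1:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IPV6:
        return False  # e.g. an ARP frame can never be an NS
    if frame[ETH_HLEN + 6] != IPPROTO_ICMPV6:
        return False  # next header is not ICMPv6 (extension headers ignored)
    return frame[ETH_HLEN + IPV6_HLEN] == ND_NEIGHBOR_SOLICIT
```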
2014-03-24sparc64: Make sure %pil interrupts are enabled during hypervisor yield.David S. Miller
In arch_cpu_idle() we must enable %pil based interrupts before potentially invoking the hypervisor cpu yield call.

As per the Hypervisor API documentation for cpu_yield:

	Interrupts which are blocked by some mechanism other than pstate.ie (for example %pil) are not guaranteed to cause a return from this service.

It seems that only first generation Niagara chips are hit by this bug. My best guess is that later chips implement this in hardware and wake up anyways from %pil events, whereas in first generation chips the yield is implemented completely in hypervisor code and requires %pil to be enabled in order to wake properly from this call.

Fixes: 87fa05aeb3a5 ("sparc: Use generic idle loop")
Reported-by: Fabio M. Di Nitto <fabbione@fabbione.net>
Reported-by: Jan Engelhardt <jengelh@inai.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-24net: micrel : ks8851-ml: add vdd-supply supportNishanth Menon
A few platforms use an external regulator to keep the ethernet MAC supplied. So, request and enable the regulator for driver functionality. Fixes: 66fda75f47dc (regulator: core: Replace direct ops->disable usage) Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Suggested-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-23parisc: locks: remove redundant arch_*_relax operationsWill Deacon
Now that the arch_{spin,read,write}_relax macros default to cpu_relax(), remove the redundant definitions for parisc. Cc: Helge Deller <deller@gmx.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Helge Deller <deller@gmx.de>
2014-03-23parisc: wire up sys_utimesHelge Deller
We seem to be nearly the only platform which does not provide the sys_utimes syscall. Adding it now makes our life much easier with userspace applications (like dietlibc and e2fsprogs) since we then behave like all other platforms too and don't need extra patches which are hard to get upstream anyway because we are not a mainstream architecture. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v3.13
2014-03-23parisc: Remove unused CONFIG_PARISC_TMPALIAS codeJohn David Anglin
The attached change removes the unused and experimental CONFIG_PARISC_TMPALIAS code. It doesn't work and I don't believe it will ever be used. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2014-03-23partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over ↵Helge Deller
built-in ROM fonts

STI console is used on parisc and m68k HP machines. This patch partly reverts my previous commit and as such restores the fonts for the m68k machines. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v3.13
2014-03-23rcuwalk: recheck mount_lock after mountpoint crossing attemptsAl Viro
We can get a false negative from __lookup_mnt() if an unrelated vfsmount gets moved. In that case legitimize_mnt() is guaranteed to fail, and we will fall back to non-RCU walk... unless we end up running into a hard error on a filesystem object we wouldn't have reached if not for that false negative. IOW, delaying that check until the end of pathname resolution is wrong - we should recheck right after we attempt to cross the mountpoint. We don't need to recheck unless we see d_mountpoint() being true - in that case even if we have just raced with mount/umount, we can simply go on as if we'd come at the moment when the sucker wasn't a mountpoint; if we run into a hard error as the result, it was a legitimate outcome. __lookup_mnt() returning NULL is different in that respect, since it might've happened due to an operation on a completely unrelated mountpoint. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-03-23make prepend_name() work correctly when called with negative *buflenAl Viro
In all callchains leading to prepend_name(), the value left in *buflen is eventually discarded unused if prepend_name() has returned a negative. So we are free to do what prepend() does, and subtract from *buflen *before* checking for underflow (which turns into checking the sign of subtraction result, of course). Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
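The difference between the two orderings can be sketched in Python. This is a model under a stated assumption, not the dcache code: it assumes the name length is an unsigned 32-bit value, so that a check performed before the subtraction compares in unsigned arithmetic and a negative `buflen` looks like a very large one. Both function names are hypothetical.

```python
import errno

U32 = 0xFFFFFFFF


def check_then_subtract(buflen, dlen):
    """Compare first, in unsigned 32-bit arithmetic (assumed old order)."""
    if (buflen & U32) < ((dlen + 1) & U32):
        return -errno.ENAMETOOLONG, buflen
    return 0, buflen - (dlen + 1)


def subtract_then_check(buflen, dlen):
    """Subtract first, then test the sign, as prepend() is said to do."""
    buflen -= dlen + 1
    if buflen < 0:
        return -errno.ENAMETOOLONG, buflen
    return 0, buflen
```

With a positive `buflen` both variants agree on success or failure. With `buflen = -5` and `dlen = 3`, the first variant reports success in this model, while the second reports the overflow; the value left in `buflen` after a failure differs, which the commit message argues is harmless because callers discard it.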
2014-03-23vfs: Don't let __fdget_pos() get FMODE_PATH filesEric Biggers
Commit bd2a31d522344 ("get rid of fget_light()") introduced the __fdget_pos() function, which returns the resulting file pointer and fdput flags combined in an 'unsigned long'. However, it also changed the behavior to return files with FMODE_PATH set, which shouldn't happen because read(), write(), lseek(), etc. aren't allowed on such files. This commit restores the old behavior. This regression actually had no effect on read() and write() since FMODE_READ and FMODE_WRITE are not set on file descriptors opened with O_PATH, but it did cause lseek() on a file descriptor opened with O_PATH to fail with ESPIPE rather than EBADF. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-03-23vfs: atomic f_pos access in llseek()Eric Biggers
Commit 9c225f2655e36a4 ("vfs: atomic f_pos accesses as per POSIX") changed several system calls to use fdget_pos() instead of fdget(), but missed sys_llseek(). Fix it. Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-03-22Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Two oneliner 'perf bench' tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf bench: Fix NULL pointer dereference in "perf bench all"
  perf bench numa: Make no args mean 'run all tests'
2014-03-22Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Only two patches this time, one to fix ethernet probe order on at91 (better fix with proper device aliasing will be done for 3.15, this is stop-gap), and one update to MAINTAINERS due to Freescale moving their repo to kernel.org"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: at91: fix network interface ordering for sama5d36
  MAINTAINERS: update IMX kernel git tree
2014-03-20Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm fixes from Dave Airlie:
 "Some final few intel fixes, all regressions, all stable cc, and one exynos oops fixer. The biggest is probably the intel display error irqs one, but it seems to fix a few crashes on startup, and one use after free in drm core"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/exynos: Fix (more) freeing issues in exynos_drm_drv.c
  drm/i915: Disable stolen memory when DMAR is active
  Revert "drm/i915: don't touch the VDD when disabling the panel"
  drm: Fix use-after-free in the shadow-attache exit code
  drm/i915: Don't enable display error interrupts from the start
  drm/i915: Fix scanline counter fixup on BDW
  drm/i915: Add a workaround for HSW scanline counter weirdness
  drm/i915: Fix PSR programming
2014-03-20block: free q->flush_rq in blk_init_allocated_queue error pathsDave Jones
Commit 7982e90c3a57 ("block: fix q->flush_rq NULL pointer crash on dm-mpath flush") moved an allocation to blk_init_allocated_queue(), but neglected to free that allocation on the error paths that follow. Signed-off-by: Dave Jones <davej@fedoraproject.org> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20futex: revert back to the explicit waiter counting codeLinus Torvalds
Srikar Dronamraju reports that commit b0c29f79ecea ("futexes: Avoid taking the hb->lock if there's nothing to wake up") causes Java threads to get stuck on futexes when running specjbb on a power7 numa box. The cause appears to be that the powerpc spinlocks aren't using the same ticket lock model that we use on x86 (and other) architectures, which in turn results in the "spin_is_locked()" test in hb_waiters_pending() occasionally reporting an unlocked spinlock even when there are pending waiters. So this reinstates Davidlohr Bueso's original explicit waiter counting code, which I had convinced Davidlohr to drop in favor of figuring out the pending waiters by just using the existing state of the spinlock and the wait queue. Reported-and-tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Original-code-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20Merge tag 'trace-fixes-v3.14-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull trace fix from Steven Rostedt:
 "Vaibhav Nagarnaik discovered that since 3.10 a clean-up patch made the array index in the trace event format bogus. He supplied an elegant solution that uses __stringify() and also removes the need for the event_storage and event_storage_mutex and also cuts off a few K of overhead from the trace events"

* tag 'trace-fixes-v3.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix array size mismatch in format string
2014-03-20mm: fix swapops.h:131 bug if remap_file_pages raced migrationHugh Dickins
Add remove_linear_migration_ptes_from_nonlinear(), to fix an interesting little include/linux/swapops.h:131 BUG_ON(!PageLocked) found by trinity: indicating that remove_migration_ptes() failed to find one of the migration entries that was temporarily inserted. The problem comes from remap_file_pages()'s switch from vma_interval_tree (good for inserting the migration entry) to i_mmap_nonlinear list (no good for locating it again); but can only be a problem if the remap_file_pages() range does not cover the whole of the vma (zap_pte() clears the range). remove_migration_ptes() needs a file_nonlinear method to go down the i_mmap_nonlinear list, applying linear location to look for migration entries in those vmas too, just in case there was this race. The file_nonlinear method does need rmap_walk_control.arg to do this; but it never needed vma passed in - vma comes from its own iteration. Reported-and-tested-by: Dave Jones <davej@redhat.com> Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20Merge branch 'fixes' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch

Jesse Gross says:

====================
Open vSwitch

Four small fixes for net/3.14. I realize that these are late in the cycle - just got back from vacation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20ip6mr: fix mfc notification flagsNicolas Dichtel
Commit 812e44dd1829 ("ip6mr: advertise new mfc entries via rtnl") reuses the function ip6mr_fill_mroute() to notify mfc events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20ipmr: fix mfc notification flagsNicolas Dichtel
Commit 8cd3ac9f9b7b ("ipmr: advertise new mfc entries via rtnl") reuses the function ipmr_fill_mroute() to notify mfc events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20rtnetlink: fix fdb notification flagsNicolas Dichtel
Commit 3ff661c38c84 ("net: rtnetlink notify events for FDB NTF_SELF adds and deletes") reuses the function nlmsg_populate_fdb_fill() to notify fdb events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
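The effect described in the three notification fixes above can be modelled with a small receive loop. The sketch is hypothetical and only loosely resembles how a netlink library consumes messages: a message carrying NLM_F_MULTI announces that more parts follow until an NLMSG_DONE message arrives, so a lone notification with that flag set leaves the reader waiting. The flag and type values are the standard netlink constants; `fill_flags` and `read_reply` are invented names.

```python
NLM_F_MULTI = 0x2   # message is part of a multipart sequence
NLMSG_DONE = 0x3    # terminates a multipart sequence


def fill_flags(is_dump):
    """Flags a fill function should set: multipart only for dumps."""
    return NLM_F_MULTI if is_dump else 0


def read_reply(messages):
    """Consume (type, flags) pairs; return (messages read, complete)."""
    consumed = 0
    for msg_type, flags in messages:
        consumed += 1
        if msg_type == NLMSG_DONE:
            return consumed, True   # end of a multipart dump
        if not flags & NLM_F_MULTI:
            return consumed, True   # a single message is complete by itself
    return consumed, False          # still waiting for NLMSG_DONE
```

A notification built with `fill_flags(is_dump=False)` completes after one message, whereas the same notification with NLM_F_MULTI set is reported as incomplete, which corresponds to the library waiting forever.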
2014-03-20tcp: syncookies: do not use getnstimeofday()Eric Dumazet
While it is true that getnstimeofday() uses about 40 cycles if TSC is available, it can use 1600 cycles if hpet is the clocksource. Switch to get_jiffies_64(), as this is more than enough, and go back to 60-second periods. Fixes: 8c27bd75f04f ("tcp: syncookies: reduce cookie lifetime to 128 seconds") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
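A coarse counter with 60-second periods can be derived from a jiffies value by integer division. The sketch below shows the arithmetic only; `cookie_time`, `cookie_age_ok` and the `max_age` default of two periods are illustrative assumptions, not taken from the commit.

```python
def cookie_time(jiffies64, hz):
    """Counter that advances once every 60 seconds of jiffies."""
    return jiffies64 // (60 * hz)


def cookie_age_ok(issued_count, now_count, max_age=2):
    """Accept a cookie issued fewer than max_age periods ago (assumed)."""
    return 0 <= now_count - issued_count < max_age
```

With `hz = 1000`, jiffies values 0 through 59999 map to period 0 and 60000 maps to period 1, so no clocksource read is needed to age a cookie.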
2014-03-20netlink: fix setsockopt in mmap examples in documentationstephen hemminger
The documentation for how to use netlink mmap interface is incorrect. The calls to setsockopt() require an additional argument. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS fixes from Ralf Baechle:
 "Another set of five fixes. The most interesting one is a fix for a race condition in the local_irq_disable() implementation used by .S code for pre-MIPS R2 processors only. It leaves a race that's hard but not impossible to hit; the others are fairly obvious"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Make local_irq_disable macro safe for non-Mipsr2
  MIPS: Octeon: Fix warning in of_device_alloc on cn3xxx
  MIPS: ftrace: Tweak safe_load()/safe_store() macros
  MIPS: BCM47XX: Check all (32) GPIOs when looking for a pin
  MIPS: Fix possible build error with transparent hugepages enabled
2014-03-20Merge tag 'sound-3.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just two minor bug fixes: a fix for a regression in oxygen driver that was introduced in 3.14-rc1, and a stable fix for the return value of compress offload open callback"

* tag 'sound-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: compress: Pass through return value of open ops callback
  ALSA: oxygen: Xonar DG(X): fix Stereo Upmixing regression
2014-03-20openvswitch: Correctly report flow used times for first 5 minutes after boot.Ben Pfaff
The kernel starts out its "jiffies" timer as 5 minutes below zero, as shown in include/linux/jiffies.h:

    /*
     * Have the 32 bit jiffies value wrap 5 minutes after boot
     * so jiffies wrap bugs show up earlier.
     */
    #define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ))

The loop in ovs_flow_stats_get() starts out with 'used' set to 0, then takes any "later" time. This means that for the first five minutes after boot, flows will always be reported as never used, since 0 is greater than any time already seen.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
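The arithmetic can be reproduced with 32-bit wraparound in Python. This is a model, not the Open vSwitch code: it assumes a 32-bit jiffies counter and HZ = 1000, `time_after` mirrors the usual signed-difference comparison, and `latest_used_fixed` shows one possible repair (treating 0 as "unset") rather than the actual patch.

```python
MASK32 = 0xFFFFFFFF
HZ = 1000  # assumed tick rate

# 5 minutes below zero, as a 32-bit unsigned value
INITIAL_JIFFIES = (-300 * HZ) & MASK32


def time_after(a, b):
    """True if a is later than b, using a signed 32-bit difference."""
    return ((b - a) & MASK32) >= 0x80000000


def latest_used_buggy(per_cpu_used):
    """Start from 0 and keep any later time."""
    used = 0
    for t in per_cpu_used:
        if time_after(t, used):
            used = t
    return used


def latest_used_fixed(per_cpu_used):
    """Treat 0 as 'never used' instead of as a point in time."""
    used = 0
    for t in per_cpu_used:
        if t and (not used or time_after(t, used)):
            used = t
    return used
```

One and two seconds after boot the counter holds `INITIAL_JIFFIES + HZ` and `INITIAL_JIFFIES + 2 * HZ`; in this model the first loop still returns 0 for those samples, while the second returns the later of the two.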
2014-03-20tracing: Fix array size mismatch in format stringVaibhav Nagarnaik
In event format strings, the array size is reported in two locations: once in the array subscript and then via the "size:" attribute. The values reported there have a mismatch.

For example, in sched:sched_switch the prev_comm and next_comm character arrays have subscript values of [32] whereas the actual field size is 16.

name: sched_switch
ID: 301
format:
	field:unsigned short common_type; offset:0; size:2; signed:0;
	field:unsigned char common_flags; offset:2; size:1; signed:0;
	field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
	field:int common_pid; offset:4; size:4; signed:1;

	field:char prev_comm[32]; offset:8; size:16; signed:1;
	field:pid_t prev_pid; offset:24; size:4; signed:1;
	field:int prev_prio; offset:28; size:4; signed:1;
	field:long prev_state; offset:32; size:8; signed:1;
	field:char next_comm[32]; offset:40; size:16; signed:1;
	field:pid_t next_pid; offset:56; size:4; signed:1;
	field:int next_prio; offset:60; size:4; signed:1;

After bisection, the following commit was blamed:

92edca0 tracing: Use direct field, type and system names

This commit removes the duplication of strings for field->name and field->type assuming that all the strings passed in __trace_define_field() are immutable. This is not true for arrays, where the type string is created in event_storage variable and field->type for all array fields points to event_storage.

Use __stringify() to create a string constant for the type string.

Also, get rid of event_storage and event_storage_mutex that are not needed anymore.

Also, an added benefit is that this reduces the overhead of events a bit more:

   text    data     bss     dec     hex filename
8424787 2036472 1302528 11763787 b3804b vmlinux
8420814 2036408 1302528 11759750 b37086 vmlinux.patched

Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com
Cc: Laurent Chavey <chavey@google.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
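The mismatch shown in the format listing above can be detected mechanically. The following sketch is a hypothetical user-space checker, not part of the patch: it scans format text for array fields, multiplies the subscript by an element size and compares the product with the "size:" attribute. Only `char` elements are handled, which is an assumption made to keep the example short.

```python
import re

# Matches e.g. "field:char prev_comm[32]; offset:8; size:16;"
FIELD_RE = re.compile(
    r"field:(?P<type>[^;\[]+?)\s+(?P<name>\w+)\[(?P<count>\d+)\];\s*"
    r"offset:(?P<offset>\d+);\s*size:(?P<size>\d+);"
)

ELEM_SIZE = {"char": 1}  # element sizes known to this sketch


def array_mismatches(format_text):
    """Return (name, declared_bytes, size) for inconsistent array fields."""
    found = []
    for m in FIELD_RE.finditer(format_text):
        elem = ELEM_SIZE.get(m.group("type"))
        if elem is None:
            continue  # unknown element type: cannot judge
        declared = int(m.group("count")) * elem
        size = int(m.group("size"))
        if declared != size:
            found.append((m.group("name"), declared, size))
    return found
```

Run against the sched_switch listing, the checker would flag prev_comm and next_comm, each declaring 32 bytes against a size of 16.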
2014-03-20MIPS: Make local_irq_disable macro safe for non-Mipsr2Jim Quinlan
For non-mipsr2 processors, the local_irq_disable macro contains an mfc0-mtc0 pair with instructions in between. With preemption enabled, this sequence may get preempted, so that a stale value of CP0_STATUS is written back when the mtc0 instruction executes. This commit avoids this scenario by incrementing the preempt count before the mfc0 and decrementing it after the mtc0.

[ralf@linux-mips.org: This patch is sorting out the parts that were missed by e97c5b6098 [MIPS: Make irqflags.h functions preempt-safe for non-mipsr2 cpus.] I also re-enabled the inclusion of <asm/asm-offsets.h> at the top of <asm/asmmacro.h>].

Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: cernekee@gmail.com
Patchwork: https://patchwork.linux-mips.org/patch/6164/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-03-20Merge branch 'exynos-drm-fixes' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

Just fixes a resource release issue on open failure.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: Fix (more) freeing issues in exynos_drm_drv.c