summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2012-03-22ceph: eliminate some abusive castsAlex Elder
This fixes some spots where a type cast to (void *) was used as as a universal type hiding mechanism. Instead, properly cast the type to the intended target type. Signed-off-by: Alex Elder <elder@newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22ceph: eliminate some needless castsAlex Elder
This eliminates type casts in some places where they are not required. Signed-off-by: Alex Elder <elder@newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22ceph: kill addr_str_lock spinlock; use atomic insteadAlex Elder
A spinlock is used to protect a value used for selecting an array index for a string used for formatting a socket address for human consumption. The index is reset to 0 if it ever reaches the maximum index value. Instead, use an ever-increasing atomic variable as a sequence number, and compute the array index by masking off all but the sequence number's lowest bits. Make the number of entries in the array a power of two to allow the use of such a mask (to avoid jumps in the index value when the sequence number wraps). The length of these strings is somewhat arbitrarily set at 60 bytes. The worst-case length of a string produced is 54 bytes, for an IPv6 address that can't be shortened, e.g.: [1234:5678:9abc:def0:1111:2222:123.234.210.100]:32767 Change it so we arbitrarily use 64 bytes instead; if nothing else it will make the array of these line up better in hex dumps. Rename a few things to reinforce the distinction between the number of strings in the array and the length of individual strings. Signed-off-by: Alex Elder <elder@newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22ceph: make use of "else" where appropriateAlex Elder
Rearrange ceph_tcp_connect() a bit, making use of "else" rather than re-testing a value with consecutive "if" statements. Don't record a connection's socket pointer unless the connect operation is successful. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22ceph: use a shared zero page rather than one per messengerAlex Elder
Each messenger allocates a page to be used when writing zeroes out in the event of error or other abnormal condition. Instead, use the kernel ZERO_PAGE() for that purpose. Signed-off-by: Alex Elder <elder@dreamhost.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22libceph: fix overflow check in crush_decode()Xi Wang
The existing overflow check (n > ULONG_MAX / b) didn't work, because n = ULONG_MAX / b would both bypass the check and still overflow the allocation size a + n * b. The correct check should be (n > (ULONG_MAX - a) / b). Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22net/ceph: Only clear SOCK_NOSPACE when there is sufficient space in the ↵Jim Schutt
socket buffer The Ceph messenger would sometimes queue multiple work items to write data to a socket when the socket buffer was full. Fix this problem by making ceph_write_space() use SOCK_NOSPACE in the same way that net/core/stream.c:sk_stream_write_space() does, i.e., clearing it only when sufficient space is available in the socket buffer. Signed-off-by: Jim Schutt <jaschut@sandia.gov> Reviewed-by: Alex Elder <elder@dreamhost.com>
2012-03-22netfilter: xt_LOG: use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6Pablo Neira Ayuso
This fixes the following linking error: xt_LOG.c:(.text+0x789b1): undefined reference to `ip6t_ext_hdr' ifdefs have to use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6. Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-03-21l2tp: enable automatic module loading for l2tp_pppBenjamin LaHaise
When L2TP is configured as a module, requests for L2TP sockets do not result in the l2tp_ppp module being loaded. Fix this by adding the appropriate MODULE_ALIAS to be recognized by pppox's request_module() call. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-21net: fix napi_reuse_skb() skb reserveEric Dumazet
napi->skb is allocated in napi_get_frags() using netdev_alloc_skb_ip_align(), with a reserve of NET_SKB_PAD + NET_IP_ALIGN bytes. However, when such skb is recycled in napi_reuse_skb(), it ends with a reserve of NET_IP_ALIGN which is suboptimal. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-21Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 1 from Al Viro: "This is _not_ all; in particular, Miklos' and Jan's stuff is not there yet." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits) ext4: initialization of ext4_li_mtx needs to be done earlier debugfs-related mode_t whack-a-mole hfsplus: add an ioctl to bless files hfsplus: change finder_info to u32 hfsplus: initialise userflags qnx4: new helper - try_extent() qnx4: get rid of qnx4_bread/qnx4_getblk take removal of PF_FORKNOEXEC to flush_old_exec() trim includes in inode.c um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it um: embed ->stub_pages[] into mmu_context gadgetfs: list_for_each_safe() misuse ocfs2: fix leaks on failure exits in module_init ecryptfs: make register_filesystem() the last potential failure exit ntfs: forgets to unregister sysctls on register_filesystem() failure logfs: missing cleanup on register_filesystem() failure jfs: mising cleanup on register_filesystem() failure make configfs_pin_fs() return root dentry on success configfs: configfs_create_dir() has parent dentry in dentry->d_parent configfs: sanitize configfs_create() ...
2012-03-21Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates for 3.4 from James Morris: "The main addition here is the new Yama security module from Kees Cook, which was discussed at the Linux Security Summit last year. Its purpose is to collect miscellaneous DAC security enhancements in one place. This also marks a departure in policy for LSM modules, which were previously limited to being standalone access control systems. Chromium OS is using Yama, and I believe there are plans for Ubuntu, at least. This patchset also includes maintenance updates for AppArmor, TOMOYO and others." Fix trivial conflict in <net/sock.h> due to the jumo_label->static_key rename. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (38 commits) AppArmor: Fix location of const qualifier on generated string tables TOMOYO: Return error if fails to delete a domain AppArmor: add const qualifiers to string arrays AppArmor: Add ability to load extended policy TOMOYO: Return appropriate value to poll(). AppArmor: Move path failure information into aa_get_name and rename AppArmor: Update dfa matching routines. AppArmor: Minor cleanup of d_namespace_path to consolidate error handling AppArmor: Retrieve the dentry_path for error reporting when path lookup fails AppArmor: Add const qualifiers to generated string tables AppArmor: Fix oops in policy unpack auditing AppArmor: Fix error returned when a path lookup is disconnected KEYS: testing wrong bit for KEY_FLAG_REVOKED TOMOYO: Fix mount flags checking order. security: fix ima kconfig warning AppArmor: Fix the error case for chroot relative path name lookup AppArmor: fix mapping of META_READ to audit and quiet flags AppArmor: Fix underflow in xindex calculation AppArmor: Fix dropping of allowed operations that are force audited AppArmor: Add mising end of structure test to caps unpacking ...
2012-03-21Merge branch 'kmap_atomic' of git://github.com/congwang/linuxLinus Torvalds
Pull kmap_atomic cleanup from Cong Wang. It's been in -next for a long time, and it gets rid of the (no longer used) second argument to k[un]map_atomic(). Fix up a few trivial conflicts in various drivers, and do an "evil merge" to catch some new uses that have come in since Cong's tree. * 'kmap_atomic' of git://github.com/congwang/linux: (59 commits) feature-removal-schedule.txt: schedule the deprecated form of kmap_atomic() for removal highmem: kill all __kmap_atomic() [swarren@nvidia.com: highmem: Fix ARM build break due to __kmap_atomic rename] drbd: remove the second argument of k[un]map_atomic() zcache: remove the second argument of k[un]map_atomic() gma500: remove the second argument of k[un]map_atomic() dm: remove the second argument of k[un]map_atomic() tomoyo: remove the second argument of k[un]map_atomic() sunrpc: remove the second argument of k[un]map_atomic() rds: remove the second argument of k[un]map_atomic() net: remove the second argument of k[un]map_atomic() mm: remove the second argument of k[un]map_atomic() lib: remove the second argument of k[un]map_atomic() power: remove the second argument of k[un]map_atomic() kdb: remove the second argument of k[un]map_atomic() udf: remove the second argument of k[un]map_atomic() ubifs: remove the second argument of k[un]map_atomic() squashfs: remove the second argument of k[un]map_atomic() reiserfs: remove the second argument of k[un]map_atomic() ocfs2: remove the second argument of k[un]map_atomic() ntfs: remove the second argument of k[un]map_atomic() ...
2012-03-21xprtrdma: Remove assumption that each segment is <= PAGE_SIZETom Tucker
The xprtrdma FRMR mapping logic assumes that a segment is <= PAGE_SIZE. This is not true for NFS4. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21xprtrdma: The transport should not bug-check when a dup reply is receivedTom Tucker
The client side RDMA transport will bug check if it receives a duplicate reply, instead we should simply drop the duplicate reply. Signed-off-by: Tom Tucker <tom@ogc.us> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefinedTrond Myklebust
Stephen Rothwell reports: net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_mapping': net/sunrpc/rpcb_clnt.c:820:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getport': net/sunrpc/rpcb_clnt.c:837:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_set': net/sunrpc/rpcb_clnt.c:860:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_enc_getaddr': net/sunrpc/rpcb_clnt.c:892:19: warning: unused variable 'task' [-Wunused-variable] net/sunrpc/rpcb_clnt.c: In function 'rpcb_dec_getaddr': net/sunrpc/rpcb_clnt.c:914:19: warning: unused variable 'task' [-Wunused-variable] fs/lockd/svclock.c:49:20: warning: 'nlmdbg_cookie2a' declared 'static' but never defined [-Wunused-function] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "It's indeed trivial -- mostly documentation updates and a bunch of typo fixes from Masanari. There are also several linux/version.h include removals from Jesper." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits) kcore: fix spelling in read_kcore() comment constify struct pci_dev * in obvious cases Revert "char: Fix typo in viotape.c" init: fix wording error in mm_init comment usb: gadget: Kconfig: fix typo for 'different' Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c" writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header writeback: fix typo in the writeback_control comment Documentation: Fix multiple typo in Documentation tpm_tis: fix tis_lock with respect to RCU Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c" Doc: Update numastat.txt qla4xxx: Add missing spaces to error messages compiler.h: Fix typo security: struct security_operations kerneldoc fix Documentation: broken URL in libata.tmpl Documentation: broken URL in filesystems.tmpl mtd: simplify return logic in do_map_probe() mm: fix comment typo of truncate_inode_pages_range power: bq27x00: Fix typos in comment ...
2012-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking merge from David Miller: "1) Move ixgbe driver over to purely page based buffering on receive. From Alexander Duyck. 2) Add receive packet steering support to e1000e, from Bruce Allan. 3) Convert TCP MD5 support over to RCU, from Eric Dumazet. 4) Reduce cpu usage in handling out-of-order TCP packets on modern systems, also from Eric Dumazet. 5) Support the IP{,V6}_UNICAST_IF socket options, making the wine folks happy, from Erich Hoover. 6) Support VLAN trunking from guests in hyperv driver, from Haiyang Zhang. 7) Support byte-queue-limtis in r8169, from Igor Maravic. 8) Outline code intended for IP_RECVTOS in IP_PKTOPTIONS existed but was never properly implemented, Jiri Benc fixed that. 9) 64-bit statistics support in r8169 and 8139too, from Junchang Wang. 10) Support kernel side dump filtering by ctmark in netfilter ctnetlink, from Pablo Neira Ayuso. 11) Support byte-queue-limits in gianfar driver, from Paul Gortmaker. 12) Add new peek socket options to assist with socket migration, from Pavel Emelyanov. 13) Add sch_plug packet scheduler whose queue is controlled by userland daemons using explicit freeze and release commands. From Shriram Rajagopalan. 14) Fix FCOE checksum offload handling on transmit, from Yi Zou." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1846 commits) Fix pppol2tp getsockname() Remove printk from rds_sendmsg ipv6: fix incorrent ipv6 ipsec packet fragment cpsw: Hook up default ndo_change_mtu. net: qmi_wwan: fix build error due to cdc-wdm dependecy netdev: driver: ethernet: Add TI CPSW driver netdev: driver: ethernet: add cpsw address lookup engine support phy: add am79c874 PHY support mlx4_core: fix race on comm channel bonding: send igmp report for its master fs_enet: Add MPC5125 FEC support and PHY interface selection net: bpf_jit: fix BPF_S_LDX_B_MSH compilation net: update the usage of CHECKSUM_UNNECESSARY fcoe: use CHECKSUM_UNNECESSARY instead of CHECKSUM_PARTIAL on tx net: do not do gso for CHECKSUM_UNNECESSARY in netif_needs_gso ixgbe: Fix issues with SR-IOV loopback when flow control is disabled net/hyperv: Fix the code handling tx busy ixgbe: fix namespace issues when FCoE/DCB is not enabled rtlwifi: Remove unused ETH_ADDR_LEN defines igbvf: Use ETH_ALEN ... Fix up fairly trivial conflicts in drivers/isdn/gigaset/interface.c and drivers/net/usb/{Kconfig,qmi_wwan.c} as per David.
2012-03-20switch touch_atime to struct pathAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20switch unix_sock to struct pathAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20switch open-coded instances of d_make_root() to new helperAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-03-20Merge branch 'for-3.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup changes from Tejun Heo: "Out of the 8 commits, one fixes a long-standing locking issue around tasklist walking and others are cleanups." * 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list cgroup: Remove wrong comment on cgroup_enable_task_cg_list() cgroup: remove cgroup_subsys argument from callbacks cgroup: remove extra calls to find_existing_css_set cgroup: replace tasklist_lock with rcu_read_lock cgroup: simplify double-check locking in cgroup_attach_proc cgroup: move struct cgroup_pidlist out from the header file cgroup: remove cgroup_attach_task_current_cg()
2012-03-20Fix pppol2tp getsockname()Benjamin LaHaise
While testing L2TP functionality, I came across a bug in getsockname(). The IP address returned within the pppol2tp_addr's addr memember was not being set to the IP address in use. This bug is caused by using inet_sk() on the wrong socket (the L2TP socket rather than the underlying UDP socket), and was likely introduced during the addition of L2TPv3 support. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-20Remove printk from rds_sendmsgDave Jones
no socket layer outputs a message for this error and neither should rds. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-20Merge tag 'tty-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds
Pull TTY/serial patches from Greg KH: "tty and serial merge for 3.4-rc1 Here's the big serial and tty merge for the 3.4-rc1 tree. There's loads of fixes and reworks in here from Jiri for the tty layer, and a number of patches from Alan to help try to wrestle the vt layer into a sane model. Other than that, lots of driver updates and fixes, and other minor stuff, all detailed in the shortlog." * tag 'tty-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (132 commits) serial: pxa: add clk_prepare/clk_unprepare calls TTY: Wrong unicode value copied in con_set_unimap() serial: PL011: clear pending interrupts serial: bfin-uart: Don't access tty circular buffer in TX DMA interrupt after it is reset. vt: NULL dereference in vt_do_kdsk_ioctl() tty: serial: vt8500: fix annotations for probe/remove serial: remove back and forth conversions in serial_out_sync serial: use serial_port_in/out vs serial_in/out in 8250 serial: introduce generic port in/out helpers serial: reduce number of indirections in 8250 code serial: delete useless void casts in 8250.c serial: make 8250's serial_in shareable to other drivers. serial: delete last unused traces of pausing I/O in 8250 pch_uart: Add module parameter descriptions pch_uart: Use existing default_baud in setup_console pch_uart: Add user_uartclk parameter pch_uart: Add Fish River Island II uart clock quirks pch_uart: Use uartclk instead of base_baud mpc5200b/uart: select more tolerant uart prescaler on low baudrates tty: moxa: fix bit test in moxa_start() ...
2012-03-20Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events changes for v3.4 from Ingo Molnar: - New "hardware based branch profiling" feature both on the kernel and the tooling side, on CPUs that support it. (modern x86 Intel CPUs with the 'LBR' hardware feature currently.) This new feature is basically a sophisticated 'magnifying glass' for branch execution - something that is pretty difficult to extract from regular, function histogram centric profiles. The simplest mode is activated via 'perf record -b', and the result looks like this in perf report: $ perf record -b any_call,u -e cycles:u branchy $ perf report -b --sort=symbol 52.34% [.] main [.] f1 24.04% [.] f1 [.] f3 23.60% [.] f1 [.] f2 0.01% [k] _IO_new_file_xsputn [k] _IO_file_overflow 0.01% [k] _IO_vfprintf_internal [k] _IO_new_file_xsputn 0.01% [k] _IO_vfprintf_internal [k] strchrnul 0.01% [k] __printf [k] _IO_vfprintf_internal 0.01% [k] main [k] __printf This output shows from/to branch columns and shows the highest percentage (from,to) jump combinations - i.e. the most likely taken branches in the system. "branches" can also include function calls and any other synchronous and asynchronous transitions of the instruction pointer that are not 'next instruction' - such as system calls, traps, interrupts, etc. This feature comes with (hopefully intuitive) flat ascii and TUI support in perf report. - Various 'perf annotate' visual improvements for us assembly junkies. It will now recognize function calls in the TUI and by hitting enter you can follow the call (recursively) and back, amongst other improvements. - Multiple threads/processes recording support in perf record, perf stat, perf top - which is activated via a comma-list of PIDs: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 - Support for per UID views, via the --uid paramter to perf top, perf report, etc. For example 'perf top --uid mingo' will only show the tasks that I am running, excluding other users, root, etc. - Jump label restructurings and improvements - this includes the factoring out of the (hopefully much clearer) include/linux/static_key.h generic facility: struct static_key key = STATIC_KEY_INIT_FALSE; ... if (static_key_false(&key)) do unlikely code else do likely code ... static_key_slow_inc(); ... static_key_slow_inc(); ... The static_key_false() branch will be generated into the code with as little impact to the likely code path as possible. the static_key_slow_*() APIs flip the branch via live kernel code patching. This facility can now be used more widely within the kernel to micro-optimize hot branches whose likelihood matches the static-key usage and fast/slow cost patterns. - SW function tracer improvements: perf support and filtering support. - Various hardenings of the perf.data ABI, to make older perf.data's smoother on newer tool versions, to make new features integrate more smoothly, to support cross-endian recording/analyzing workflows better, etc. - Restructuring of the kprobes code, the splitting out of 'optprobes', and a corner case bugfix. - Allow the tracing of kernel console output (printk). - Improvements/fixes to user-space RDPMC support, allowing user-space self-profiling code to extract PMU counts without performing any system calls, while playing nice with the kernel side. - 'perf bench' improvements - ... and lots of internal restructurings, cleanups and fixes that made these features possible. And, as usual this list is incomplete as there were also lots of other improvements * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits) perf report: Fix annotate double quit issue in branch view mode perf report: Remove duplicate annotate choice in branch view mode perf/x86: Prettify pmu config literals perf report: Enable TUI in branch view mode perf report: Auto-detect branch stack sampling mode perf record: Add HEADER_BRANCH_STACK tag perf record: Provide default branch stack sampling mode option perf tools: Make perf able to read files from older ABIs perf tools: Fix ABI compatibility bug in print_event_desc() perf tools: Enable reading of perf.data files from different ABI rev perf: Add ABI reference sizes perf report: Add support for taken branch sampling perf record: Add support for sampling taken branch perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK x86/kprobes: Split out optprobe related code to kprobes-opt.c x86/kprobes: Fix a bug which can modify kernel code permanently x86/kprobes: Fix instruction recovery on optimized path perf: Add callback to flush branch_stack on context switch perf: Disable PERF_SAMPLE_BRANCH_* when not supported perf/x86: Add LBR software filter support for Intel CPUs ...
2012-03-20Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes for v3.4 from Ingo Molnar. The major features of this series are: - making RCU more aggressive about entering dyntick-idle mode in order to improve energy efficiency - converting a few more call_rcu()s to kfree_rcu()s - applying a number of rcutree fixes and cleanups to rcutiny - removing CONFIG_SMP #ifdefs from treercu - allowing RCU CPU stall times to be set via sysfs - adding CPU-stall capability to rcutorture - adding more RCU-abuse diagnostics - updating documentation - fixing yet more issues located by the still-ongoing top-to-bottom inspection of RCU, this time with a special focus on the CPU-hotplug code path. * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits) rcu: Stop spurious warnings from synchronize_sched_expedited rcu: Hold off RCU_FAST_NO_HZ after timer posted rcu: Eliminate softirq-mediated RCU_FAST_NO_HZ idle-entry loop rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit() rcu: Remove redundant check for rcu_head misalignment PTR_ERR should be called before its argument is cleared. rcu: Convert WARN_ON_ONCE() in rcu_lock_acquire() to lockdep rcu: Trace only after NULL-pointer check rcu: Call out dangers of expedited RCU primitives rcu: Rework detection of use of RCU by offline CPUs lockdep: Add CPU-idle/offline warning to lockdep-RCU splat rcu: No interrupt disabling for rcu_prepare_for_idle() rcu: Move synchronize_sched_expedited() to rcutree.c rcu: Check for illegal use of RCU from offlined CPUs rcu: Update stall-warning documentation rcu: Add CPU-stall capability to rcutorture rcu: Make documentation give more realistic rcutorture duration rcutorture: Permit holding off CPU-hotplug operations during boot rcu: Print scheduling-clock information on RCU CPU stall-warning messages ...
2012-03-20SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUGTrond Myklebust
This allows us to turn on/off the dprintk() debugging interfaces for those distributions that don't ship the 'rpcdebug' utility. It also allows us to add Kbuild dependencies. Specifically, we already know that dprintk() in general relies on CONFIG_SYSCTL. Now it turns out that the NFS dprintks depend on CONFIG_CRC32 after we added support for the filehandle hash. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20sunrpc: remove the second argument of k[un]map_atomic()Cong Wang
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20rds: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20net: remove the second argument of k[un]map_atomic()Cong Wang
Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20ipv6: fix incorrent ipv6 ipsec packet fragmentGao feng
Since commit 299b0767(ipv6: Fix IPsec slowpath fragmentation problem) In func ip6_append_data,after call skb_put(skb, fraglen + dst_exthdrlen) the skb->len contains dst_exthdrlen,and we don't reduce dst_exthdrlen at last This will make fraggap>0 in next "while cycle",and cause the size of skb incorrent Fix this by reserve headroom for dst_exthdrlen. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-19tcp: reduce out_of_order memory useEric Dumazet
With increasing receive window sizes, but speed of light not improved that much, out of order queue can contain a huge number of skbs, waiting to be moved to receive_queue when missing packets can fill the holes. Some devices happen to use fat skbs (truesize of 4096 + sizeof(struct sk_buff)) to store regular (MTU <= 1500) frames. This makes highly probable sk_rmem_alloc hits sk_rcvbuf limit, which can be 4Mbytes in many cases. When limit is hit, tcp stack calls tcp_collapse_ofo_queue(), a true latency killer and cpu cache blower. Doing the coalescing attempt each time we add a frame in ofo queue permits to keep memory use tight and in many cases avoid the tcp_collapse() thing later. Tested on various wireless setups (b43, ath9k, ...) known to use big skb truesize, this patch removed the "packets collapsed in receive queue due to low socket buffer" I had before. This also reduced average memory used by tcp sockets. With help from Neal Cardwell. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-19tcp: introduce tcp_data_queue_ofoEric Dumazet
Split tcp_data_queue() in two parts for better readability. tcp_data_queue_ofo() is responsible for queueing incoming skb into out of order queue. Change code layout so that the skb_set_owner_r() is performed only if skb is not dropped. This is a preliminary patch before "reduce out_of_order memory use" following patch. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-19SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()Trond Myklebust
The problem is that for the case of priority queues, we have to assume that __rpc_remove_wait_queue_priority will move new elements from the tk_wait.links lists into the queue->tasks[] list. We therefore cannot use list_for_each_entry_safe() on queue->tasks[], since that will skip these new tasks that __rpc_remove_wait_queue_priority is adding. Without this fix, rpc_wake_up and rpc_wake_up_status will both fail to wake up all functions on priority wait queues, which can result in some nasty hangs. Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2012-03-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2012-03-17netfilter: ctnetlink: fix race between delete and timeout expirationPablo Neira Ayuso
Kerin Millar reported hardlockups while running `conntrackd -c' in a busy firewall. That system (with several processors) was acting as backup in a primary-backup setup. After several tries, I found a race condition between the deletion operation of ctnetlink and timeout expiration. This patch fixes this problem. Tested-by: Kerin Millar <kerframil@gmail.com> Reported-by: Kerin Millar <kerframil@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16arp: allow arp processing to honor per interface arp_accept sysctlNeil Horman
I found recently that the arp_process function which handles all of our received arp frames, is using IPV4_DEVCONF_ALL macro to check the state of the arp_process flag. This seems wrong, as it implies that either none or all of the network interfaces accept gratuitous arps. This patch corrects that, allowing per-interface arp_accept configuration to deviate from the all setting. Note this also brings us into line with the way the arp_filter setting is handled during arp_process execution. Tested this myself on my home network, and confirmed it works as expected. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.RongQing.Li
ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't need to dev_hold(). With dev_hold(), not corresponding dev_put(), will lead to leak. [ bug introduced in 96b52e61be1 (ipv6: mcast: RCU conversions) ] Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem Conflicts: drivers/net/wireless/ath/ath9k/hw.c
2012-03-16sch_sfq: revert dont put new flow at the end of flowsEric Dumazet
This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of flows) As Jesper found out, patch sounded great but has bad side effects. In stress situation, pushing new flows in front of the queue can prevent old flows doing any progress. Packets can stay in SFQ queue for unlimited amount of time. It's possible to add heuristics to limit this problem, but this would add complexity outside of SFQ scope. A more sensible answer to Dave Taht concerns (who reported the issued I tried to solve in original commit) is probably to use a qdisc hierarchy so that high prio packets dont enter a potentially crowded SFQ qdisc. Reported-by: Jesper Dangaard Brouer <jdb@comx.dk> Cc: Dave Taht <dave.taht@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16ipv6: fix icmp6_dst_alloc()Eric Dumazet
commit 87a115783 ( ipv6: Move xfrm_lookup() call down into icmp6_dst_alloc().) forgot to convert one error path, leading to crashes in mld_sendpack() Many thanks to Dave Jones for providing a very complete bug report. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-15mac80211: make uapsd_* keys per-vifEliad Peller
uapsd_queues and uapsd_max_sp_len are relevant only for managed interfaces, and can be configured differently for each vif. Move them from the local struct to sdata->u.mgd, and update the debugfs functions accordingly. Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-15mac80211: add NULL terminator to debugfs_netdev write bufEliad Peller
Some debugfs write functions call kstrto* functions, which assume the string is null-terminated. Make it valid by changing ieee80211_if_write() to use static buffer instead of allocating one, and set the last char to NULL. (The write functions try to parse some integer/mac address, so 64 bytes buffer should be enough) Signed-off-by: Eliad Peller <eliad@wizery.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-15mac80211: Don't sample max throughput rate in minstrel_htHelmut Schaa
The current max throughput rate is known to be good as otherwise it wouldn't be the max throughput rate. Since rate sampling can introduce some overhead (by adding RTS for example or due to not aggregating the frame) don't sample the max throughput rate. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13mac80211: Don't let regulatory make us deafPaul Stewart
When regulatory information changes our HT behavior (e.g, when we get a country code from the AP we have just associated with), we should use this information to change the power with which we transmit, and what channels we transmit. Sometimes the channel parameters we derive from regulatory information contradicts the parameters we used in association. For example, we could have associated specifying HT40, but the regulatory rules we apply may forbid HT40 operation. In the situation above, we should reconfigure ourselves to transmit in HT20 only, however it makes no sense for us to disable receive in HT40, since if we associated with these parameters, the AP has every reason to expect we can and will receive packets this way. The code in mac80211 does not have the capability of sending the appropriate action frames to signal a change in HT behaviour so the AP has no clue we can no longer receive frames encoded this way. In some broken AP implementations, this can leave us effectively deaf if the AP never retries in lower HT rates. This change breaks up the channel_type parameter in the ieee80211_enable_ht function into a separate receive and transmit part. It honors the channel flags set by regulatory in order to configure the rate control algorithm, but uses the capability flags to configure the channel on the radio, since these were used in association to set the AP's transmit rate. Signed-off-by: Paul Stewart <pstew@chromium.org> Cc: Sam Leffler <sleffler@chromium.org> Cc: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Luis R Rodriguez <mcgrof@frijolero.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13mac80211: rename bss_conf timestamp to last_tsfJohannes Berg
This value is not really very useful by itself, yet some drivers (including iwlwifi until I can figure out what it should do) use it. At least rename it to "last_tsf" to indicate the meaning and add a note that it may be really old. I suspect the value may become useful combined with the rx_status->mactime, but we don't (yet) store that value and pass it to the driver. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13cfg80211: clarify timestamp in cfg80211_inform_bssJohannes Berg
This is intended to be the timestamp sent by the peer in the beacon/probe response, not any form of host timestamp. Clarify the documentation and variable names. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13mac80211: linearize SKBs as needed for cryptoJohannes Berg
Not linearizing every SKB will help actually pass non-linear SKBs all the way up when on an encrypted connection. For now, linearize TKIP completely as it is lower performance and I don't quite grok all the details. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-03-13mac80211: move RX WEP weak IV countingJohannes Berg
This is better done inside the WEP decrypt function where it doesn't have to check all the conditions any more since they've been tested already. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>