Age | Commit message (Collapse) | Author |
|
[ Upstream commit 30313a3d5794472c3548d7288e306a5492030370 ]
When bridge device is created with IFLA_ADDRESS, we are not calling
br_stp_change_bridge_id(), which leads to incorrect local fdb
management and bridge id calculation, and prevents us from receiving
frames on the bridge device.
Reported-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1c2658545816088477e91860c3a645053719cb54 ]
When the ipv6 fib changes during a table dump, the walk is
restarted and the number of nodes dumped are skipped. But the existing
code doesn't advance to the next node after a node is skipped. This can
cause the dump to loop or produce lots of duplicates when the fib
is modified during the dump.
This change advances the walk to the next node if the current node is
skipped after a restart.
Signed-off-by: Kumar Sundararajan <kumar@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c53864fd60227de025cb79e05493b13f69843971 ]
Since 115c9b81928360d769a76c632bae62d15206a94a (rtnetlink: Fix problem with
buffer allocation), RTM_NEWLINK messages only contain the IFLA_VFINFO_LIST
attribute if they were solicited by a GETLINK message containing an
IFLA_EXT_MASK attribute with the RTEXT_FILTER_VF flag.
That was done because some user programs broke when they received more data
than expected - because IFLA_VFINFO_LIST contains information for each VF
it can become large if there are many VFs.
However, the IFLA_VF_PORTS attribute, supplied for devices which implement
ndo_get_vf_port (currently the 'enic' driver only), has the same problem.
It supplies per-VF information and can therefore become large, but it is
not currently conditional on the IFLA_EXT_MASK value.
Worse, it interacts badly with the existing EXT_MASK handling. When
IFLA_EXT_MASK is not supplied, the buffer for netlink replies is fixed at
NLMSG_GOODSIZE. If the information for IFLA_VF_PORTS exceeds this, then
rtnl_fill_ifinfo() returns -EMSGSIZE on the first message in a packet.
netlink_dump() will misinterpret this as having finished the listing and
omit data for this interface and all subsequent ones. That can cause
getifaddrs(3) to enter an infinite loop.
This patch addresses the problem by only supplying IFLA_VF_PORTS when
IFLA_EXT_MASK is supplied with the RTEXT_FILTER_VF flag set.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 973462bbde79bb827824c73b59027a0aed5c9ca6 ]
Without IFLA_EXT_MASK specified, the information reported for a single
interface in response to RTM_GETLINK is expected to fit within a netlink
packet of NLMSG_GOODSIZE.
If it doesn't, however, things will go badly wrong, When listing all
interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
message in a packet as the end of the listing and omit information for
that interface and all subsequent ones. This can cause getifaddrs(3) to
enter an infinite loop.
This patch won't fix the problem, but it will WARN_ON() making it easier to
track down what's going wrong.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 78541c1dc60b65ecfce5a6a096fc260219d6784e ]
The caller needs capabilities on the namespace being queried, not on
their own namespace. This is a security bug, although it likely has
only a minor impact.
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b14878ccb7fac0242db82720b784ab62c467c0dc ]
Currently, it is possible to create an SCTP socket, then switch
auth_enable via sysctl setting to 1 and crash the system on connect:
Oops[#1]:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
[...]
Call Trace:
[<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80
[<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4
[<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c
[<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8
[<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214
[<ffffffff8043af68>] sctp_rcv+0x588/0x630
[<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24
[<ffffffff803acb50>] ip6_input+0x2c0/0x440
[<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564
[<ffffffff80310650>] process_backlog+0xb4/0x18c
[<ffffffff80313cbc>] net_rx_action+0x12c/0x210
[<ffffffff80034254>] __do_softirq+0x17c/0x2ac
[<ffffffff800345e0>] irq_exit+0x54/0xb0
[<ffffffff800075a4>] ret_from_irq+0x0/0x4
[<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48
[<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148
[<ffffffff805a88b0>] start_kernel+0x37c/0x398
Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0
03e00008 00000000
---[ end trace b530b0551467f2fd ]---
Kernel panic - not syncing: Fatal exception in interrupt
What happens while auth_enable=0 in that case is, that
ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
when endpoint is being created.
After that point, if an admin switches over to auth_enable=1,
the machine can crash due to NULL pointer dereference during
reception of an INIT chunk. When we enter sctp_process_init()
via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
the INIT verification succeeds and while we walk and process
all INIT params via sctp_process_param() we find that
net->sctp.auth_enable is set, therefore do not fall through,
but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
dereference what we have set to NULL during endpoint
initialization phase.
The fix is to make auth_enable immutable by caching its value
during endpoint initialization, so that its original value is
being carried along until destruction. The bug seems to originate
from the very first days.
Fix in joint work with Daniel Borkmann.
Reported-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The patch fixes a problem with dropped jumbo frames after usage of
'ethtool -G ... rx'.
Scenario:
1. ip link set eth0 up
2. ethtool -G eth0 rx N # <- This zeroes rx-jumbo
3. ip link set mtu 9000 dev eth0
The ethtool command set rx_jumbo_pending to zero so any received jumbo
packets are dropped and you need to use 'ethtool -G eth0 rx-jumbo N'
to workaround the issue.
The patch changes the logic so rx_jumbo_pending value is changed only if
jumbo frames are enabled (MTU > 1500).
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c674ac30c549596295eb0a5af7f4714c0b905b6f ]
Macvlan devices try to avoid stacking, but that's not always
successfull or even desired. As an example, the following
configuration is perefectly legal and valid:
eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1
However, this configuration produces the following lockdep
trace:
[ 115.620418] ======================================================
[ 115.620477] [ INFO: possible circular locking dependency detected ]
[ 115.620516] 3.15.0-rc1+ #24 Not tainted
[ 115.620540] -------------------------------------------------------
[ 115.620577] ip/1704 is trying to acquire lock:
[ 115.620604] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[ 115.620686]
but task is already holding lock:
[ 115.620723] (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[ 115.620795]
which lock already depends on the new lock.
[ 115.620853]
the existing dependency chain (in reverse order) is:
[ 115.620894]
-> #1 (&macvlan_netdev_addr_lock_key){+.....}:
[ 115.620935] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[ 115.620974] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[ 115.621019] [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q]
[ 115.621066] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[ 115.621105] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[ 115.621143] [<ffffffff815da6be>] __dev_open+0xde/0x140
[ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[ 115.621174]
-> #0 (&vlan_netdev_addr_lock_key/1){+.....}:
[ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140
[ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[ 115.621174]
other info that might help us debug this:
[ 115.621174] Possible unsafe locking scenario:
[ 115.621174] CPU0 CPU1
[ 115.621174] ---- ----
[ 115.621174] lock(&macvlan_netdev_addr_lock_key);
[ 115.621174] lock(&vlan_netdev_addr_lock_key/1);
[ 115.621174] lock(&macvlan_netdev_addr_lock_key);
[ 115.621174] lock(&vlan_netdev_addr_lock_key/1);
[ 115.621174]
*** DEADLOCK ***
[ 115.621174] 2 locks held by ip/1704:
[ 115.621174] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40
[ 115.621174] #1: (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[ 115.621174]
stack backtrace:
[ 115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24
[ 115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010
[ 115.621174] ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0
[ 115.621174] ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8
[ 115.621174] 0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230
[ 115.621174] Call Trace:
[ 115.621174] [<ffffffff816ee20c>] dump_stack+0x4d/0x66
[ 115.621174] [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e
[ 115.621174] [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[ 115.621174] [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0
[ 115.621174] [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[ 115.621174] [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[ 115.621174] [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[ 115.621174] [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[ 115.621174] [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[ 115.621174] [<ffffffff815da6be>] __dev_open+0xde/0x140
[ 115.621174] [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[ 115.621174] [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[ 115.621174] [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30
[ 115.621174] [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[ 115.621174] [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60
[ 115.621174] [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[ 115.621174] [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730
[ 115.621174] [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[ 115.621174] [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10
[ 115.621174] [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40
[ 115.621174] [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40
[ 115.621174] [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[ 115.621174] [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[ 115.621174] [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[ 115.621174] [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[ 115.621174] [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[ 115.621174] [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0
[ 115.621174] [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[ 115.621174] [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0
[ 115.621174] [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[ 115.621174] [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570
[ 115.621174] [<ffffffff810cfe9f>] ? up_read+0x1f/0x40
[ 115.621174] [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570
[ 115.621174] [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0
[ 115.621174] [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0
[ 115.621174] [<ffffffff8120a284>] ? mntput+0x24/0x40
[ 115.621174] [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[ 115.621174] [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[ 115.621174] [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
Fix this by correctly providing macvlan lockdep class.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d38569ab2bba6e6b3233acfc3a84cdbcfbd1f79f ]
This reverts commit dc8eaaa006350d24030502a4521542e74b5cb39f.
vlan: Fix lockdep warning when vlan dev handle notification
Instead we use the new new API to find the lock subclass of
our vlan device. This way we can support configurations where
vlans are interspersed with other devices:
bond -> vlan -> macvlan -> vlan
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 25175ba5c9bff9aaf0229df34bb5d54c81633ec3 ]
Currently netif_addr_lock_nested assumes that there can be only
a single nesting level between 2 devices. However, if we
have multiple devices of the same type stacked, this fails.
For example:
eth0 <-- vlan0.10 <-- vlan0.10.20
A more complicated configuration may stack more then one type of
device in different order.
Ex:
eth0 <-- vlan0.10 <-- macvlan0 <-- vlan1.10.20 <-- macvlan1
This patch adds an ndo_* function that allows each stackable
device to report its nesting level. If the device doesn't
provide this function default subclass of 1 is used.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 4085ebe8c31face855fd01ee40372cb4aab1df3a ]
Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.
For example:
eth0 <- vlan0.10 <- macvlan0 <- vlan1.20
The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit dc8eaaa006350d24030502a4521542e74b5cb39f ]
When I open the LOCKDEP config and run these steps:
modprobe 8021q
vconfig add eth2 20
vconfig add eth2.20 30
ifconfig eth2 xx.xx.xx.xx
then the Call Trace happened:
[32524.386288] =============================================
[32524.386293] [ INFO: possible recursive locking detected ]
[32524.386298] 3.14.0-rc2-0.7-default+ #35 Tainted: G O
[32524.386302] ---------------------------------------------
[32524.386306] ifconfig/3103 is trying to acquire lock:
[32524.386310] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0
[32524.386326]
[32524.386326] but task is already holding lock:
[32524.386330] (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40
[32524.386341]
[32524.386341] other info that might help us debug this:
[32524.386345] Possible unsafe locking scenario:
[32524.386345]
[32524.386350] CPU0
[32524.386352] ----
[32524.386354] lock(&vlan_netdev_addr_lock_key/1);
[32524.386359] lock(&vlan_netdev_addr_lock_key/1);
[32524.386364]
[32524.386364] *** DEADLOCK ***
[32524.386364]
[32524.386368] May be due to missing lock nesting notation
[32524.386368]
[32524.386373] 2 locks held by ifconfig/3103:
[32524.386376] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81431d42>] rtnl_lock+0x12/0x20
[32524.386387] #1: (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff8141af83>] dev_set_rx_mode+0x23/0x40
[32524.386398]
[32524.386398] stack backtrace:
[32524.386403] CPU: 1 PID: 3103 Comm: ifconfig Tainted: G O 3.14.0-rc2-0.7-default+ #35
[32524.386409] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[32524.386414] ffffffff81ffae40 ffff8800d9625ae8 ffffffff814f68a2 ffff8800d9625bc8
[32524.386421] ffffffff810a35fb ffff8800d8a8d9d0 00000000d9625b28 ffff8800d8a8e5d0
[32524.386428] 000003cc00000000 0000000000000002 ffff8800d8a8e5f8 0000000000000000
[32524.386435] Call Trace:
[32524.386441] [<ffffffff814f68a2>] dump_stack+0x6a/0x78
[32524.386448] [<ffffffff810a35fb>] __lock_acquire+0x7ab/0x1940
[32524.386454] [<ffffffff810a323a>] ? __lock_acquire+0x3ea/0x1940
[32524.386459] [<ffffffff810a4874>] lock_acquire+0xe4/0x110
[32524.386464] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0
[32524.386471] [<ffffffff814fc07a>] _raw_spin_lock_nested+0x2a/0x40
[32524.386476] [<ffffffff814275f4>] ? dev_mc_sync+0x64/0xb0
[32524.386481] [<ffffffff814275f4>] dev_mc_sync+0x64/0xb0
[32524.386489] [<ffffffffa0500cab>] vlan_dev_set_rx_mode+0x2b/0x50 [8021q]
[32524.386495] [<ffffffff8141addf>] __dev_set_rx_mode+0x5f/0xb0
[32524.386500] [<ffffffff8141af8b>] dev_set_rx_mode+0x2b/0x40
[32524.386506] [<ffffffff8141b3cf>] __dev_open+0xef/0x150
[32524.386511] [<ffffffff8141b177>] __dev_change_flags+0xa7/0x190
[32524.386516] [<ffffffff8141b292>] dev_change_flags+0x32/0x80
[32524.386524] [<ffffffff8149ca56>] devinet_ioctl+0x7d6/0x830
[32524.386532] [<ffffffff81437b0b>] ? dev_ioctl+0x34b/0x660
[32524.386540] [<ffffffff814a05b0>] inet_ioctl+0x80/0xa0
[32524.386550] [<ffffffff8140199d>] sock_do_ioctl+0x2d/0x60
[32524.386558] [<ffffffff81401a52>] sock_ioctl+0x82/0x2a0
[32524.386568] [<ffffffff811a7123>] do_vfs_ioctl+0x93/0x590
[32524.386578] [<ffffffff811b2705>] ? rcu_read_lock_held+0x45/0x50
[32524.386586] [<ffffffff811b39e5>] ? __fget_light+0x105/0x110
[32524.386594] [<ffffffff811a76b1>] SyS_ioctl+0x91/0xb0
[32524.386604] [<ffffffff815057e2>] system_call_fastpath+0x16/0x1b
========================================================================
The reason is that all of the addr_lock_key for vlan dev have the same class,
so if we change the status for vlan dev, the vlan dev and its real dev will
hold the same class of addr_lock_key together, so the warning happened.
we should distinguish the lock depth for vlan dev and its real dev.
v1->v2: Convert the vlan_netdev_addr_lock_key to an array of eight elements, which
could support to add 8 vlan id on a same vlan dev, I think it is enough for current
scene, because a netdev's name is limited to IFNAMSIZ which could not hold 8 vlan id,
and the vlan dev would not meet the same class key with its real dev.
The new function vlan_dev_get_lockdep_subkey() will return the subkey and make the vlan
dev could get a suitable class key.
v2->v3: According David's suggestion, I use the subclass to distinguish the lock key for vlan dev
and its real dev, but it make no sense, because the difference for subclass in the
lock_class_key doesn't mean that the difference class for lock_key, so I use lock_depth
to distinguish the different depth for every vlan dev, the same depth of the vlan dev
could have the same lock_class_key, I import the MAX_LOCK_DEPTH from the include/linux/sched.h,
I think it is enough here, the lockdep should never exceed that value.
v3->v4: Add a huge array of locking keys will waste static kernel memory and is not a appropriate method,
we could use _nested() variants to fix the problem, calculate the depth for every vlan dev,
and use the depth as the subclass for addr_lock_key.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 54d63f787b652755e66eb4dd8892ee6d3f5197fc ]
It's possible to remove the FB tunnel with the command 'ip link del ip6gre0' but
this is unsafe, the module always supposes that this device exists. For example,
ip6gre_tunnel_lookup() may use it unconditionally.
Let's add a rtnl handler for dellink, which will never remove the FB tunnel (we
let ip6gre_destroy_tunnels() do the job).
Introduced by commit c12b395a4664 ("gre: Support GRE over IPv6").
CC: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1e785f48d29a09b6cf96db7b49b6320dada332e1 ]
Sometimes, when the packet arrives at skb_mac_gso_segment()
its skb->mac_len already accounts for some of the mac lenght
headers in the packet. This seems to happen when forwarding
through and OpenSSL tunnel.
When we start looking for any vlan headers in skb_network_protocol()
we seem to ignore any of the already known mac headers and start
with an ETH_HLEN. This results in an incorrect offset, dropped
TSO frames and general slowness of the connection.
We can start counting from the known skb->mac_len
and return at least that much if all mac level headers
are known and accounted for.
Fixes: 53d6471cef17262d3ad1c7ce8982a234244f68ec (net: Account for all vlan headers in skb_mac_gso_segment)
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Borkman <dborkman@redhat.com>
Tested-by: Martin Filip <nexus+kernel@smoula.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
receiver's buffer"
[ Upstream commit 362d52040c71f6e8d8158be48c812d7729cb8df1 ]
This reverts commit ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management
to reflect real state of the receiver's buffer") as it introduced a
serious performance regression on SCTP over IPv4 and IPv6, though a not
as dramatic on the latter. Measurements are on 10Gbit/s with ixgbe NICs.
Current state:
[root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.241.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 17:56:21 GMT
Connecting to host 192.168.241.3, port 5201
Cookie: Lab200slot2.1397238981.812898.548918
[ 4] local 192.168.241.2 port 38616 connected to 192.168.241.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.09 sec 20.8 MBytes 161 Mbits/sec
[ 4] 1.09-2.13 sec 10.8 MBytes 86.8 Mbits/sec
[ 4] 2.13-3.15 sec 3.57 MBytes 29.5 Mbits/sec
[ 4] 3.15-4.16 sec 4.33 MBytes 35.7 Mbits/sec
[ 4] 4.16-6.21 sec 10.4 MBytes 42.7 Mbits/sec
[ 4] 6.21-6.21 sec 0.00 Bytes 0.00 bits/sec
[ 4] 6.21-7.35 sec 34.6 MBytes 253 Mbits/sec
[ 4] 7.35-11.45 sec 22.0 MBytes 45.0 Mbits/sec
[ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec
[ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec
[ 4] 11.45-11.45 sec 0.00 Bytes 0.00 bits/sec
[ 4] 11.45-12.51 sec 16.0 MBytes 126 Mbits/sec
[ 4] 12.51-13.59 sec 20.3 MBytes 158 Mbits/sec
[ 4] 13.59-14.65 sec 13.4 MBytes 107 Mbits/sec
[ 4] 14.65-16.79 sec 33.3 MBytes 130 Mbits/sec
[ 4] 16.79-16.79 sec 0.00 Bytes 0.00 bits/sec
[ 4] 16.79-17.82 sec 5.94 MBytes 48.7 Mbits/sec
(etc)
[root@Lab200slot2 ~]# iperf3 --sctp -6 -c 2001:db8:0:f101::1 -V -l 1400 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0 #1 SMP Thu Apr 3 23:18:29 EDT 2014 x86_64
Time: Fri, 11 Apr 2014 19:08:41 GMT
Connecting to host 2001:db8:0:f101::1, port 5201
Cookie: Lab200slot2.1397243321.714295.2b3f7c
[ 4] local 2001:db8:0:f101::2 port 55804 connected to 2001:db8:0:f101::1 port 5201
Starting Test: protocol: SCTP, 1 streams, 1400 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 169 MBytes 1.42 Gbits/sec
[ 4] 1.00-2.00 sec 201 MBytes 1.69 Gbits/sec
[ 4] 2.00-3.00 sec 188 MBytes 1.58 Gbits/sec
[ 4] 3.00-4.00 sec 174 MBytes 1.46 Gbits/sec
[ 4] 4.00-5.00 sec 165 MBytes 1.39 Gbits/sec
[ 4] 5.00-6.00 sec 199 MBytes 1.67 Gbits/sec
[ 4] 6.00-7.00 sec 163 MBytes 1.36 Gbits/sec
[ 4] 7.00-8.00 sec 174 MBytes 1.46 Gbits/sec
[ 4] 8.00-9.00 sec 193 MBytes 1.62 Gbits/sec
[ 4] 9.00-10.00 sec 196 MBytes 1.65 Gbits/sec
[ 4] 10.00-11.00 sec 157 MBytes 1.31 Gbits/sec
[ 4] 11.00-12.00 sec 175 MBytes 1.47 Gbits/sec
[ 4] 12.00-13.00 sec 192 MBytes 1.61 Gbits/sec
[ 4] 13.00-14.00 sec 199 MBytes 1.67 Gbits/sec
(etc)
After patch:
[root@Lab200slot2 ~]# iperf3 --sctp -4 -c 192.168.240.3 -V -l 1452 -t 60
iperf version 3.0.1 (10 January 2014)
Linux Lab200slot2 3.14.0+ #1 SMP Mon Apr 14 12:06:40 EDT 2014 x86_64
Time: Mon, 14 Apr 2014 16:40:48 GMT
Connecting to host 192.168.240.3, port 5201
Cookie: Lab200slot2.1397493648.413274.65e131
[ 4] local 192.168.240.2 port 50548 connected to 192.168.240.3 port 5201
Starting Test: protocol: SCTP, 1 streams, 1452 byte blocks, omitting 0 seconds, 60 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 240 MBytes 2.02 Gbits/sec
[ 4] 1.00-2.00 sec 239 MBytes 2.01 Gbits/sec
[ 4] 2.00-3.00 sec 240 MBytes 2.01 Gbits/sec
[ 4] 3.00-4.00 sec 239 MBytes 2.00 Gbits/sec
[ 4] 4.00-5.00 sec 245 MBytes 2.05 Gbits/sec
[ 4] 5.00-6.00 sec 240 MBytes 2.01 Gbits/sec
[ 4] 6.00-7.00 sec 240 MBytes 2.02 Gbits/sec
[ 4] 7.00-8.00 sec 239 MBytes 2.01 Gbits/sec
With the reverted patch applied, the SCTP/IPv4 performance is back
to normal on latest upstream for IPv4 and IPv6 and has same throughput
as 3.4.2 test kernel, steady and interval reports are smooth again.
Fixes: ef2820a735f7 ("net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer")
Reported-by: Peter Butler <pbutler@sonusnet.com>
Reported-by: Dongsheng Song <dongsheng.song@gmail.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Peter Butler <pbutler@sonusnet.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 05ab8f2647e4221cbdb3856dd7d32bd5407316b3 ]
The BPF_S_ANC_NLATTR and BPF_S_ANC_NLATTR_NEST extensions fail to check
for a minimal message length before testing the supplied offset to be
within the bounds of the message. This allows the subtraction of the nla
header to underflow and therefore -- as the data type is unsigned --
allowing far to big offset and length values for the search of the
netlink attribute.
The remainder calculation for the BPF_S_ANC_NLATTR_NEST extension is
also wrong. It has the minuend and subtrahend mixed up, therefore
calculates a huge length value, allowing to overrun the end of the
message while looking for the netlink attribute.
The following three BPF snippets will trigger the bugs when attached to
a UNIX datagram socket and parsing a message with length 1, 2 or 3.
,-[ PoC for missing size check in BPF_S_ANC_NLATTR ]--
| ld #0x87654321
| ldx #42
| ld #nla
| ret a
`---
,-[ PoC for the same bug in BPF_S_ANC_NLATTR_NEST ]--
| ld #0x87654321
| ldx #42
| ld #nlan
| ret a
`---
,-[ PoC for wrong remainder calculation in BPF_S_ANC_NLATTR_NEST ]--
| ; (needs a fake netlink header at offset 0)
| ld #0
| ldx #42
| ld #nlan
| ret a
`---
Fix the first issue by ensuring the message length fulfills the minimal
size constrains of a nla header. Fix the second bug by getting the math
for the remainder calculation right.
Fixes: 4738c1db15 ("[SKFILTER]: Add SKF_ADF_NLATTR instruction")
Fixes: d214c7537b ("filter: add SKF_AD_NLATTR_NEST to look for nested..")
Cc: Patrick McHardy <kaber@trash.net>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 91146153da2feab18efab2e13b0945b6bb704ded ]
Extend commit 13378cad02afc2adc6c0e07fca03903c7ada0b37
("ipv4: Change rt->rt_iif encoding.") from 3.6 to return valid
RTA_IIF on 'ip route get ... iif DEVICE' instead of rt_iif 0
which is displayed as 'iif *'.
inet_iif is not appropriate to use because skb_iif is not set.
Use the skb->dev->ifindex instead.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b04c46190219a4f845e46a459e3102137b7f6cac ]
Plug a group_info refcount leak in ping_init.
group_info is only needed during initialization and
the code failed to release the reference on exit.
While here move grabbing the reference to a place
where it is actually needed.
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Signed-off-by: Zhang Dongxing <dongxing.zhang@intel.com>
Signed-off-by: xiaoming wang <xiaoming.wang@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8d89dcdf80d88007647945a753821a06eb6cc5a5 ]
Before the patch, it was possible to add two times the same tunnel:
ip l a vti1 type vti remote 10.16.0.121 local 10.16.0.249 key 41
ip l a vti2 type vti remote 10.16.0.121 local 10.16.0.249 key 41
It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the
argument dev->type, which was set only later (when calling ndo_init handler
in register_netdevice()). Let's set this type in the setup handler, which is
called before newlink handler.
Introduced by commit b9959fd3b0fa ("vti: switch to new ip tunnel code").
CC: Cong Wang <amwang@redhat.com>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5a4552752d8f7f4cef1d98775ece7adb7616fde2 ]
Before the patch, it was possible to add two times the same tunnel:
ip l a gre1 type gre remote 10.16.0.121 local 10.16.0.249
ip l a gre2 type gre remote 10.16.0.121 local 10.16.0.249
It was possible, because ip_tunnel_newlink() calls ip_tunnel_find() with the
argument dev->type, which was set only later (when calling ndo_init handler
in register_netdevice()). Let's set this type in the setup handler, which is
called before newlink handler.
Introduced by commit c54419321455 ("GRE: Refactor GRE tunneling code.").
CC: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 30f78d8ebf7f514801e71b88a10c948275168518 ]
Francois reported that setting big mtu on loopback device could prevent
tcp sessions making progress.
We do not support (yet ?) IPv6 Jumbograms and cook corrupted packets.
We must limit the IPv6 MTU to (65535 + 40) bytes in theory.
Tested:
ifconfig lo mtu 70000
netperf -H ::1
Before patch : Throughput : 0.05 Mbits
After patch : Throughput : 35484 Mbits
Reported-by: Francois WELLENREITER <f.wellenreiter@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit eb7076182d1ae4bc4641534134ed707100d76acc ]
br_allowed_ingress() has two problems.
1. If br_allowed_ingress() is called by br_handle_frame_finish() and
vlan_untag() in br_allowed_ingress() fails, skb will be freed by both
vlan_untag() and br_handle_frame_finish().
2. If br_allowed_ingress() is called by br_dev_xmit() and
br_allowed_ingress() fails, the skb will not be freed.
Fix these two problems by freeing the skb in br_allowed_ingress()
if it fails.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit db29868653394937037d71dc3545768302dda643 ]
Remove the bonding debug_fs entries when the
module initialization fails. The debug_fs
entries should be removed together with all other
already allocated resources.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Jay Vosburgh <j.vosburgh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6d39d589bb76ee8a1c6cde6822006ae0053decff ]
In case of tcp, gso_size contains the tcpmss.
For UFO (udp fragmentation offloading) skbs, gso_size is the fragment
payload size, i.e. we must not account for udp header size.
Otherwise, when using virtio drivers, a to-be-forwarded UFO GSO packet
will be needlessly fragmented in the forward path, because we think its
individual segments are too large for the outgoing link.
Fixes: fe6cc55f3a9a053 ("net: ip, ipv6: handle gso skbs in forwarding path")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f34c4a35d87949fbb0e0f31eba3c054e9f8199ba ]
When l2tp driver tries to get PMTU for the tunnel destination, it uses
the pointer to struct sock that represents PPPoX socket, while it
should use the pointer that represents UDP socket of the tunnel.
Signed-off-by: Dmitry Petukhov <dmgenp@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1e1cdf8ac78793e0875465e98a648df64694a8d0 ]
In function sctp_wake_up_waiters(), we need to involve a test
if the association is declared dead. If so, we don't have any
reference to a possible sibling association anymore and need
to invoke sctp_write_space() instead, and normally walk the
socket's associations and notify them of new wmem space. The
reason for special casing is that otherwise, we could run
into the following issue when a sctp_primitive_SEND() call
from sctp_sendmsg() fails, and tries to flush an association's
outq, i.e. in the following way:
sctp_association_free()
`-> list_del(&asoc->asocs) <-- poisons list pointer
asoc->base.dead = true
sctp_outq_free(&asoc->outqueue)
`-> __sctp_outq_teardown()
`-> sctp_chunk_free()
`-> consume_skb()
`-> sctp_wfree()
`-> sctp_wake_up_waiters() <-- dereferences poisoned pointers
if asoc->ep->sndbuf_policy=0
Therefore, only walk the list in an 'optimized' way if we find
that the current association is still active. We could also use
list_del_init() in addition when we call sctp_association_free(),
but as Vlad suggests, we want to trap such bugs and thus leave
it poisoned as is.
Why is it safe to resolve the issue by testing for asoc->base.dead?
Parallel calls to sctp_sendmsg() are protected under socket lock,
that is lock_sock()/release_sock(). Only within that path under
lock held, we're setting skb/chunk owner via sctp_set_owner_w().
Eventually, chunks are freed directly by an association still
under that lock. So when traversing association list on destruction
time from sctp_wake_up_waiters() via sctp_wfree(), a different
CPU can't be running sctp_wfree() while another one calls
sctp_association_free() as both happens under the same lock.
Therefore, this can also not race with setting/testing against
asoc->base.dead as we are guaranteed for this to happen in order,
under lock. Further, Vlad says: the times we check asoc->base.dead
is when we've cached an association pointer for later processing.
In between cache and processing, the association may have been
freed and is simply still around due to reference counts. We check
asoc->base.dead under a lock, so it should always be safe to check
and not race against sctp_association_free(). Stress-testing seems
fine now, too.
Fixes: cd253f9f357d ("net: sctp: wake up all assocs if sndbuf policy is per socket")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 52c35befb69b005c3fc5afdaae3a5717ad013411 ]
SCTP charges chunks for wmem accounting via skb->truesize in
sctp_set_owner_w(), and sctp_wfree() respectively as the
reverse operation. If a sender runs out of wmem, it needs to
wait via sctp_wait_for_sndbuf(), and gets woken up by a call
to __sctp_write_space() mostly via sctp_wfree().
__sctp_write_space() is being called per association. Although
we assign sk->sk_write_space() to sctp_write_space(), which
is then being done per socket, it is only used if send space
is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
is set and therefore not invoked in sock_wfree().
Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf
again when transmitting packet") fixed an issue where in case
sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again
unless it is interrupted by a signal. However, a still
remaining issue is that if net.sctp.sndbuf_policy=0, that is
accounting per socket, and one-to-many sockets are in use,
the reclaimed write space from sctp_wfree() is 'unfairly'
handed back on the server to the association that is the lucky
one to be woken up again via __sctp_write_space(), while
the remaining associations are never be woken up again
(unless by a signal).
The effect disappears with net.sctp.sndbuf_policy=1, that
is wmem accounting per association, as it guarantees a fair
share of wmem among associations.
Therefore, if we have reclaimed memory in case of per socket
accounting, wake all related associations to a socket in a
fair manner, that is, traverse the socket association list
starting from the current neighbour of the association and
issue a __sctp_write_space() to everyone until we end up
waking ourselves. This guarantees that no association is
preferred over another and even if more associations are
taken into the one-to-many session, all receivers will get
messages from the server and are not stalled forever on
high load. This setting still leaves the advantage of per
socket accounting in touch as an association can still use
up global limits if unused by others.
Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9297ebf29ad9118edd6c0fedc84f03e35028827d upstream.
The TP_printk() should never dereference any pointers, because the ring
buffer can be read at some unknown time in the future. If a device no
longer exists, it can cause a kernel oops. This also makes this
event useless when saving the ring buffer in userspaces tools such as
perf and trace-cmd.
The i915_gem_evict_vm dereferences the vm pointer which may also not
exist when the ring buffer is read sometime in the future.
Link: http://lkml.kernel.org/r/1395095198-20034-3-git-send-email-artagnon@gmail.com
Reported-by: Ramkumar Ramachandra <artagnon@gmail.com>
Fixes: bcccff847d1f "drm/i915: trace vm eviction instead of everything"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[danvet: Try to make it actually compile]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e1f23f3dd817f53f622e486913ac662add46eeed upstream.
This is *not* bisected, but the likely regression is
commit c35614380d5c956bfda20eab2755b2f5a7d6f1e7
Author: Zhao Yakui <yakui.zhao@intel.com>
Date: Tue Nov 24 09:48:48 2009 +0800
drm/i915: Don't set up the TV port if it isn't in the BIOS table.
The commit does not check for all TV device types that might be present
in the VBT, disabling TV out for the missing ones. Add composite
S-video.
Reported-and-tested-by: Matthew Khouzam <matthew.khouzam@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73362
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f1553174a207f68a4ec19d436003097e0a4dc405 upstream.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a8947f576728a66bd3aac629bd8ca021a010c808 upstream.
Need to swap on BE.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 16086279353cbfecbb3ead474072dced17b97ddc upstream.
This needs to be done to update some of the fields in
the connector structure used by the audio code.
Noticed by several users on irc.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 06a139f7a0885fa2c84962300edd181821ddc2c9 upstream.
If the IB test fails we don't want to reset the card over
and over again, just accept that it isn't working.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=76501
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 41ccec352f3c823931a7d9d2a9c7880c14d7415a upstream.
This fixes a BUG_ON(bo->sync_obj != NULL); in ttm_bo_release_list.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cbd75e97a525e3819c02dc18bc2d67aa544c9e45 upstream.
We already check that the buffer object we're accessing is registered with
the file. Now also make sure that we can't DMA across buffer object boundaries.
v2: Code commenting update.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c8e5e010ef12df6707a1d711a5279a22f67a355e upstream.
The query buffers were reserved while holding the binding mutex, which
caused a circular locking dependency.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit aa6de142c901cd2d90ef08db30ae87da214bedcc upstream.
Previously, the vmwgfx_fb driver would allow users to call FBIOSET_VINFO, but it would not adjust
the FINFO properly, resulting in distorted screen rendering. The patch corrects that behaviour.
See https://bugs.gentoo.org/show_bug.cgi?id=494794 for examples.
Signed-off-by: Christopher Friedt <chrisfriedt@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c0da71ff4d2cbf113465bff9a7c413154be25a89 upstream.
Some fields are missing from the event mailbox
struct definitions, which cause issues when
trying to handle some events.
Add the missing fields in order to align the
struct size (without adding actual support
for the new fields).
Reported-and-tested-by: Imre Kaloz <kaloz@openwrt.org>
Fixes: 028e724 ("wl18xx: move to new firmware (wl18xx-fw-3.bin)")
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a2a4dc494a7b7135f460e38e788c4a58f65e4ac3 upstream.
Commit 9e30cc9595303b27b48 removed an internal mount. This
has the side-effect that rootfs now has FSID 0. Many
userspace utilities assume that st_dev in struct stat
is never 0, so this change breaks a number of tools in
early userspace.
Since we don't know how many userspace programs are affected,
make sure that FSID is at least 1.
References: http://article.gmane.org/gmane.linux.kernel/1666905
References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
Signed-off-by: Thomas Bächler <thomas@archlinux.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit c98235cb8584a72e95786e17d695a8e5fafcd766 upstream.
The mlx4 driver is triggering schedules while atomic inside
mlx4_en_netpoll:
spin_lock_irqsave(&cq->lock, flags);
napi_synchronize(&cq->napi);
^^^^^ msleep here
mlx4_en_process_rx_cq(dev, cq, 0);
spin_unlock_irqrestore(&cq->lock, flags);
This was part of a patch by Alexander Guller from Mellanox in 2011,
but it still isn't upstream.
Signed-off-by: Chris Mason <clm@fb.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d758c9c1b36b4d9a141c2146c70398d756167ed1 upstream.
The lack of pm_runtime_resume handling for the device state leads into
device wake-up interrupts not working after a while for runtime PM.
Also, serial-omap is confused about the use of device_may_wakeup.
The checks for device_may_wakeup should only be done for suspend and
resume, not for pm_runtime_suspend and pm_runtime_resume. The wake-up
events for PM runtime should always be enabled.
The lack of pm_runtime_resume handling leads into device wake-up
interrupts not working after a while for runtime PM.
Rather than try to patch over the issue of adding complex tests to
the pm_runtime_resume, let's fix the issues properly:
1. Make serial_omap_enable_wakeup deal with all internal PM state
handling so we don't need to test for up->wakeups_enabled elsewhere.
Later on once omap3 boots in device tree only mode we can also
remove the up->wakeups_enabled flag and rely on the wake-up
interrupt enable/disable state alone.
2. Do the device_may_wakeup checks in suspend and resume only,
for runtime PM the wake-up events need to be always enabled.
3. Finally just call serial_omap_enable_wakeup and make sure we
call it also in pm_runtime_resume.
4. Note that we also have to use disable_irq_nosync as serial_omap_irq
calls pm_runtime_get_sync.
Fixes: 2a0b965cfb6e (serial: omap: Add support for optional wake-up)
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 34f972d6156fe9eea2ab7bb418c71f9d1d5c8e7b upstream.
A number of older CMOTech modems are based on Qualcomm
chips. The blacklisted interfaces are QMI/wwan.
Reported-by: Lars Melin <larsm17@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit dd6b48ecec2ea7d15f28d5e5474388681899a5e1 upstream.
Device interface layout:
0: ff/ff/ff - serial
1: ff/00/00 - serial AT+PPP
2: ff/ff/ff - QMI/wwan
3: 08/06/50 - storage
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 533b3994610f316e5cd61b56d0c4daa15c830f89 upstream.
Device interface layout:
0: ff/ff/ff - serial
1: ff/ff/ff - serial AT+PPP
2: 08/06/50 - storage
3: ff/ff/ff - serial
4: ff/ff/ff - QMI/wwan
Reported-by: Julio Araujo <julio.araujo@wllctel.com.br>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bce4f588f19d59fc07fadfeb0b2a3a06c942827a upstream.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 70a3615fc07c2330ed7c1e922f3c44f4a67c0762 upstream.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a00986f81182a69dee4d2c48e8c19805bdf0f790 upstream.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5509076d1b4485ce9fb07705fcbcd2695907ab5b upstream.
During firmware download the device expects memory addresses in
big-endian byte order. As the wIndex parameter which hold the address is
sent in little-endian byte order regardless of host byte order, we need
to use swab16 rather than cpu_to_be16.
Also make sure to handle the struct ti_i2c_desc size parameter which is
returned in little-endian byte order.
Reported-by: Ludovic Drolez <ldrolez@debian.org>
Tested-by: Ludovic Drolez <ldrolez@debian.org>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 10164c2ad6d2c16809f6c09e278f946e47801b3a upstream.
Fix driver new_id sysfs-attribute removal deadlock by making sure to
not hold any locks that the attribute operations grab when removing the
attribute.
Specifically, usb_serial_deregister holds the table mutex when
deregistering the driver, which includes removing the new_id attribute.
This can lead to a deadlock as writing to new_id increments the
attribute's active count before trying to grab the same mutex in
usb_serial_probe.
The deadlock can easily be triggered by inserting a sleep in
usb_serial_deregister and writing the id of an unbound device to new_id
during module unload.
As the table mutex (in this case) is used to prevent subdriver unload
during probe, it should be sufficient to only hold the lock while
manipulating the usb-serial driver list during deregister. A racing
probe will then either fail to find a matching subdriver or fail to get
the corresponding module reference.
Since v3.15-rc1 this also triggers the following lockdep warning:
======================================================
[ INFO: possible circular locking dependency detected ]
3.15.0-rc2 #123 Tainted: G W
-------------------------------------------------------
modprobe/190 is trying to acquire lock:
(s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94
but task is already holding lock:
(table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (table_lock){+.+.+.}:
[<c0075f84>] __lock_acquire+0x1694/0x1ce4
[<c0076de8>] lock_acquire+0xb4/0x154
[<c03af3cc>] _raw_spin_lock+0x4c/0x5c
[<c02bbc24>] usb_store_new_id+0x14c/0x1ac
[<bf007eb4>] new_id_store+0x68/0x70 [usbserial]
[<c025f568>] drv_attr_store+0x30/0x3c
[<c01690e0>] sysfs_kf_write+0x5c/0x60
[<c01682c0>] kernfs_fop_write+0xd4/0x194
[<c010881c>] vfs_write+0xbc/0x198
[<c0108e4c>] SyS_write+0x4c/0xa0
[<c000f880>] ret_fast_syscall+0x0/0x48
-> #0 (s_active#4){++++.+}:
[<c03a7a28>] print_circular_bug+0x68/0x2f8
[<c0076218>] __lock_acquire+0x1928/0x1ce4
[<c0076de8>] lock_acquire+0xb4/0x154
[<c0166b70>] __kernfs_remove+0x254/0x310
[<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94
[<c0169fb8>] remove_files.isra.1+0x48/0x84
[<c016a2fc>] sysfs_remove_group+0x58/0xac
[<c016a414>] sysfs_remove_groups+0x34/0x44
[<c02623b8>] driver_remove_groups+0x1c/0x20
[<c0260e9c>] bus_remove_driver+0x3c/0xe4
[<c026235c>] driver_unregister+0x38/0x58
[<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial]
[<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial]
[<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial]
[<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra]
[<c009d6cc>] SyS_delete_module+0x184/0x210
[<c000f880>] ret_fast_syscall+0x0/0x48
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(table_lock);
lock(s_active#4);
lock(table_lock);
lock(s_active#4);
*** DEADLOCK ***
1 lock held by modprobe/190:
#0: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial]
stack backtrace:
CPU: 0 PID: 190 Comm: modprobe Tainted: G W 3.15.0-rc2 #123
[<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24)
[<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28)
[<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8)
[<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4)
[<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154)
[<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310)
[<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94)
[<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84)
[<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac)
[<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44)
[<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20)
[<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4)
[<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58)
[<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial])
[<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial])
[<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial])
[<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra])
[<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210)
[<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48)
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2e01280d2801c72878cf3a7119eac30077b463d5 upstream.
This reverts commit 1ebca9dad5abe8b2ed4dbd186cd657fb47c1f321.
This device was erroneously added to the sierra driver even though it's
not a Sierra device and was already handled by the option driver.
Cc: Richard Farina <sidhayn@gmail.com>
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|