aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox/mlx4
AgeCommit message (Collapse)Author
2017-08-08net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packetsDavide Caratti
if the NIC fails to validate the checksum on TCP/UDP, and validation of IP checksum is successful, the driver subtracts the pseudo-header checksum from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails. V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set Reported-by: Shuang Li <shuali@redhat.com> Fixes: f8c6455bb04b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: get rid of struct tc_to_netdevJiri Pirko
Get rid of struct tc_to_netdev which is now just unnecessary container and rather pass per-type structures down to drivers directly. Along with that, consolidate the naming of per-type structure variables in cls_*. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: change return value of ndo_setup_tc for driver supporting mqprio ↵Jiri Pirko
only Change the return value from -EINVAL to -EOPNOTSUPP. The rest of the drivers have it like that, so be aligned. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: push cls related args into cls_common structureJiri Pirko
As ndo_setup_tc is generic offload op for whole tc subsystem, does not really make sense to have cls-specific args. So move them under cls_common structurure which is embedded in all cls structs. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: make type an argument for ndo_setup_tcJiri Pirko
Since the type is always present, push it to be a separate argument to ndo_setup_tc. On the way, name the type enum and use it for arg type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-02net/mlx4_core: Fixes missing capability bit in flags2 capability dumpJack Morgenstein
The cited commit introduced the following new enum value in file include/linux/mlx4/device.h: QUERY_DEV_CAP_DIAG_RPRT_PER_PORT However, it failed to introduce a corresponding entry in function dump_dev_cap_flags2() for outputting a line in the message log when this capability bit is set. The change here fixes that omission. Fixes: c7c122ed67e4 ("net/mlx4: Add diagnostic counters capability bit") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-02net/mlx4_core: Fix namespace misalignment in QinQ VST support commitJack Morgenstein
The cited commit introduced the following new enum value in file include/linux/mlx4/device.h: MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP However the value of MLX4_DEV_CAP_FLAG2_SVLAN_BY_QP needs to stay consistent with the value used in another namespace in function dump_dev_cap_flags2(), which is manually kept in sync. The change here restores that consistency. Fixes: 7c3d21c8153c ("net/mlx4_core: Preparation for VF vlan protocol 802.1ad") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-02net/mlx4_core: Fix sl_to_vl_change bit offset in flags2 dumpJack Morgenstein
The index value in function dump_dev_cap_flags2() for outputting "sl to vl mapping table change event support" needs to be consistent with the value of the enumerated constant MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT defined in file include/linux/mlx4_device.h The change here restores that consistency. Fixes: fd10ed8e6f42 ("IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets") Reported-by: Mukesh Kacker <mukesh.kacker@oracle.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-02net/mlx4_en: Fix wrong indication of Wake-on-LAN (WoL) supportInbar Karmy
Currently when WoL is supported but disabled, ethtool reports: "Supports Wake-on: d". Fix the indication of Wol support, so that the indication remains "g" all the time if the NIC supports WoL. Tested: As accepted, when NIC supports WoL- ethtool reports: Supports Wake-on: g Wake-on: d when NIC doesn't support WoL- ethtool reports: Supports Wake-on: d Wake-on: d Fixes: 14c07b1358ed ("mlx4: Wake on LAN support") Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24mlx4_en: remove unnecessary error checkZhu Yanjun
The function mlx4_en_get_profile always return zero. So it is not necessary to check its return value. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24mlx4_en: remove unnecessary returned valueZhu Yanjun
The function mlx4_en_arm_cq always returns zero. So change the return type of the function mlx4_en_arm_cq to void. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24(IB, net)/mlx4: Add resource utilization supportMoshe Shemesh
Adding visibility of resource usage of QPs, CQs and counters used by virtual functions. This feature will be used to give the PF administrator more data while debugging VF status. Usage info was added to ALLOC_RES command, to notify the PF if the resource which is being reserved or allocated for the VF will be used by kernel driver or by user verbs. Updated reservation and allocation functions of QP, CQ and counter with additional usage parameter. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) BPF verifier signed/unsigned value tracking fix, from Daniel Borkmann, Edward Cree, and Josef Bacik. 2) Fix memory allocation length when setting up calls to ->ndo_set_mac_address, from Cong Wang. 3) Add a new cxgb4 device ID, from Ganesh Goudar. 4) Fix FIB refcount handling, we have to set it's initial value before the configure callback (which can bump it). From David Ahern. 5) Fix double-free in qcom/emac driver, from Timur Tabi. 6) A bunch of gcc-7 string format overflow warning fixes from Arnd Bergmann. 7) Fix link level headroom tests in ip_do_fragment(), from Vasily Averin. 8) Fix chunk walking in SCTP when iterating over error and parameter headers. From Alexander Potapenko. 9) TCP BBR congestion control fixes from Neal Cardwell. 10) Fix SKB fragment handling in bcmgenet driver, from Doug Berger. 11) BPF_CGROUP_RUN_PROG_SOCK_OPS needs to check for null __sk, from Cong Wang. 12) xmit_recursion in ppp driver needs to be per-device not per-cpu, from Gao Feng. 13) Cannot release skb->dst in UDP if IP options processing needs it. From Paolo Abeni. 14) Some netdev ioctl ifr_name[] NULL termination fixes. From Alexander Levin and myself. 15) Revert some rtnetlink notification changes that are causing regressions, from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits) net: bonding: Fix transmit load balancing in balance-alb mode rds: Make sure updates to cp_send_gen can be observed net: ethernet: ti: cpsw: Push the request_irq function to the end of probe ipv4: initialize fib_trie prior to register_netdev_notifier call. rtnetlink: allocate more memory for dev_set_mac_address() net: dsa: b53: Add missing ARL entries for BCM53125 bpf: more tests for mixed signed and unsigned bounds checks bpf: add test for mixed signed and unsigned bounds checks bpf: fix up test cases with mixed signed/unsigned bounds bpf: allow to specify log level and reduce it for test_verifier bpf: fix mixed signed/unsigned derived min/max value bounds ipv6: avoid overflow of offset in ip6_find_1stfragopt net: tehuti: don't process data if it has not been copied from userspace Revert "rtnetlink: Do not generate notifications for CHANGEADDR event" net: dsa: mv88e6xxx: Enable CMODE config support for 6390X dt-binding: ptp: Add SoC compatibility strings for dte ptp clock NET: dwmac: Make dwmac reset unconditional net: Zero terminate ifr_name in dev_ifname(). wireless: wext: terminate ifr name coming from userspace netfilter: fix netfilter_net_init() return ...
2017-07-17{net, IB}/mlx4: Remove gfp flags argumentLeon Romanovsky
The caller to the driver marks GFP_NOIO allocations with help of memalloc_noio-* calls now. This makes redundant to pass down to the driver gfp flags, which can be GFP_KERNEL only. The patch removes the gfp flags argument and updates all driver paths. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-15mlx4_en: remove unnecessary returned value checkZhu Yanjun
The function __mlx4_zone_remove_one_entry always returns zero. So it is not necessary to check it. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-03mlx4_en: make mlx4_log_num_mgm_entry_size staticZhu Yanjun
The variable mlx4_log_num_mgm_entry_size is only called in main.c. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29net/mlx4_en: Do not allocate redundant TX queues when TC is disabledInbar Karmy
Currently the number of TX queues that are allocated doesn't depend on the number of TCs, the module always loads with max num of UP per channel. In order to prevent the allocation of unnecessary memory, the module will load with minimum number of UPs per channel, and the user will be able to control the number of TX queues per channel by changing the number of TC to 8 using the tc command. The variable num_up will hold the information about the current number of UPs. Due to the change, needed to remove the lines that set the value of UP to be different than zero in the func "mlx4_en_select_queue", since now the num of TX queues that are allocated is only one per channel in default. In order not to force the UP to be zero in case of only one TC, added a condition before forcing it in the func "mlx4_en_fill_qp_context". Tested: After the module is loaded with minimum number of UP per channel, to increase num of TCs to 8, use: tc qdisc add dev ens8 root mqprio num_tc 8 In order to decrease the number of TCs to minimum number of UP per channel, use: tc qdisc del dev ens8 root Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Tarick Bedeir <tarick@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29net/mlx4_en: Add dynamic variable to hold the number of user priorities (UP)Inbar Karmy
Until this patch, the number of UPs was hard coded for eight. Replace this with a variable in struct "mlx4_en_port_profile". Currently, the variable will hold the maximum number of UP, as before. The patch creates an infrastructure to add an option for dynamic change of the actual number of TCs. Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: Tarick Bedeir <tarick@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-29net/mlx4: fix spelling mistake: "enforcment" -> "enforcement"Colin Ian King
Trivial fix to spelling mistake in mlx4_dbg debug message Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net/mlx4: fix spelling mistake: "coalesing" -> "coalescing"Colin Ian King
Trivial fix to spelling mistake in en_dbg debug message Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16bpf: mlx4: Report bpf_prog ID during XDP_QUERY_PROGMartin KaFai Lau
Add support to mlx4 to report bpf_prog ID during XDP_QUERY_PROG. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16networking: make skb_put & friends return void pointersJohannes Berg
It seems like a historic accident that these return unsigned char *, and in many places that means casts are required, more often than not. Make these functions (skb_put, __skb_put and pskb_put) return void * and remove all the casts across the tree, adding a (u8 *) cast only where the unsigned char pointer was used directly, all done with the following spatch: @@ expression SKB, LEN; typedef u8; identifier fn = { skb_put, __skb_put }; @@ - *(fn(SKB, LEN)) + *(u8 *)fn(SKB, LEN) @@ expression E, SKB, LEN; identifier fn = { skb_put, __skb_put }; type T; @@ - E = ((T *)(fn(SKB, LEN))) + E = fn(SKB, LEN) which actually doesn't cover pskb_put since there are only three users overall. A handful of stragglers were converted manually, notably a macro in drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many instances in net/bluetooth/hci_sock.c. In the former file, I also had to fix one whitespace problem spatch introduced. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Refactor mlx4_en_free_tx_descTariq Toukan
Some code re-ordering, functionally equivalent. - The !tx_info->inl check is evaluated anyway in both flows (common case/end case). Run it first, this might finish the flows earlier. - dma_unmap calls are identical in both flows, get it out of the if block into the common area. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Gain is too small to be measurable, no degradation sensed. Results are similar for IPv4 and IPv6. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Replace TXBB_SIZE multiplications with shift operationsTariq Toukan
Define LOG_TXBB_SIZE, log of TXBB_SIZE, and use it with a shift operation instead of a multiplication with TXBB_SIZE. Operations are equivalent as TXBB_SIZE is a power of two. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Gain is too small to be measurable, no degradation sensed. Results are similar for IPv4 and IPv6. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Increase default TX ring sizeTariq Toukan
Increase the default TX ring size (from 512 to 1024) to match the RX ring size. This gives the XDP TX ring a better chance to keep up with the rate of its RX ring in case of a high load of XDP_TX actions. Tested: Ethtool counter rx_xdp_tx_full used to increase, after applying this patch it stopped. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Poll XDP TX completion queue in RX NAPITariq Toukan
Instead of having their own NAPIs, XDP TX completion queues get polled within the corresponding RX NAPI. This prevents any possible race on TX ring prod/cons indices, between the context that issues the transmits (RX NAPI) and the context that handles the completions (was previously done in a separate NAPI). This also improves performance, as it decreases the number of NAPIs running on a CPU, saving the overhead of syncing and switching between the contexts. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Single queue no-RSS optimization ON. XDP_TX packet rate: ------------------------------------- | Before | After | Gain | IPv4 | 12.0 Mpps | 13.8 Mpps | 15% | IPv6 | 12.0 Mpps | 13.8 Mpps | 15% | ------------------------------------- Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Improve XDP xmit functionTariq Toukan
Several performance improvements in XDP TX datapath, including: - Ring a single doorbell for XDP TX ring per NAPI budget, instead of doing it per a lower threshold (was 8). This includes removing the flow of immediate doorbell ringing in case of a full TX ring. - Compiler branch predictor hints. - Calculate values in compile time rather than in runtime. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Single queue no-RSS optimization ON. XDP_TX packet rate: ------------------------------------- | Before | After | Gain | IPv4 | 10.3 Mpps | 12.0 Mpps | 17% | IPv6 | 10.3 Mpps | 12.0 Mpps | 17% | ------------------------------------- Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Improve stack xmit functionTariq Toukan
Several small code and performance improvements in stack TX datapath, including: - Compiler branch predictor hints. - Minimize variables scope. - Move tx_info non-inline flow handling to a separate function. - Calculate data_offset in compile time rather than in runtime (for !lso_header_size branch). - Avoid trinary-operator ("?") when value can be preset in a matching branch. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Gain is too small to be measurable, no degradation sensed. Results are similar for IPv4 and IPv6. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Improve transmit CQ pollingTariq Toukan
Several small performance improvements in TX CQ polling, including: - Compiler branch predictor hints. - Minimize variables scope. - More proper check of cq type. - Use boolean instead of int for a binary indication. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Packet-rate tests for both regular stack and XDP use cases: No noticeable gain, no degradation. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Improve receive data-pathTariq Toukan
Several small performance improvements in RX datapath, including: - Compiler branch predictor hints. - Replace a multiplication with a shift operation. - Minimize variables scope. - Write-prefetch for packet header. - Avoid trinary-operator ("?") when value can be preset in a matching branch. - Save a branch by updating RX ring doorbell within mlx4_en_refill_rx_buffers(), which now returns void. Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Single queue no-RSS optimization ON (enable by ethtool -L <interface> rx 1). XDP_DROP packet rate: Same (28.1 Mpps), lower CPU utilization (from ~100% to ~92%). Drop packets in TC: ------------------------------------- | Before | After | Gain | IPv4 | 4.14 Mpps | 4.18 Mpps | 1% | ------------------------------------- XDP_TX packet rate: ------------------------------------- | Before | After | Gain | IPv4 | 10.1 Mpps | 10.3 Mpps | 2% | IPv6 | 10.1 Mpps | 10.3 Mpps | 2% | ------------------------------------- Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Optimized single ring steeringSaeed Mahameed
Avoid touching RX QP RSS context when loading with only one RX ring, to allow optimized A0 RX steering. Enable by: - loading mlx4_core with module param: log_num_mgm_entry_size = -6. - then: ethtool -L <interface> rx 1 Performance tests: Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz XDP_DROP packet rate: ------------------------------------- | Before | After | Gain | IPv4 | 20.5 Mpps | 28.1 Mpps | 37% | IPv6 | 18.4 Mpps | 28.1 Mpps | 53% | ------------------------------------- Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15net/mlx4_en: Remove unused argument in TX datapath functionTariq Toukan
Remove owner argument, as it is obsolete and unused. This also saves the overhead of calculating its value in data-path. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Cc: kernel-team@fb.com Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08net: propagate tc filter chain index down the ndo_setup_tc callJiri Pirko
We need to push the chain index down to the drivers, so they have the information to which chain the rule belongs. For now, no driver supports multichain offload, so only chain 0 is supported. This is needed to prevent chain squashes during offload for now. Later this will be used to implement multichain offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-07net/mlx4_en: Bump driver versionTariq Toukan
Remove date and bump version for mlx4_en driver. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-07net/mlx4_core: Bump driver versionTariq Toukan
Remove date and bump version for mlx4_core driver. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Just some simple overlapping changes in marvell PHY driver and the DSA core code. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-05net/mlx4: Check if Granular QoS per VF has been enabled before updating QP ↵Ido Shamay
qos_vport The Granular QoS per VF feature must be enabled in FW before it can be used. Thus, the driver cannot modify a QP's qos_vport value (via the UPDATE_QP FW command) if the feature has not been enabled -- the FW returns an error if this is attempted. Fixes: 08068cd5683f ("net/mlx4: Added qos_vport QP configuration in VST mode") Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-04net/mlx4: Fix the check in attaching steering rulesTalat Batheesh
Our previous patch (cited below) introduced a regression for RAW Eth QPs. Fix it by checking if the QP number provided by user-space exists, hence allowing steering rules to be added for valid QPs only. Fixes: 89c557687a32 ("net/mlx4_en: Avoid adding steering rules with invalid ring") Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-21net: ethernet: update drivers to handle HWTSTAMP_FILTER_NTP_ALLMiroslav Lichvar
Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid filter and update drivers which can timestamp all packets, or which explicitly list unsupported filters instead of using a default case, to handle the filter. CC: Richard Cochran <richardcochran@gmail.com> CC: Willem de Bruijn <willemb@google.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15net/mlx4_core: Use min3 to select number of MSI-X vectorsyuval.shaia@oracle.com
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix multiqueue in stmmac driver on PCI, from Andy Shevchenko. 2) cdc_ncm doesn't actually fully zero out the padding area is allocates on TX, from Jim Baxter. 3) Don't leak map addresses in BPF verifier, from Daniel Borkmann. 4) If we randomize TCP timestamps, we have to do it everywhere including SYN cookies. From Eric Dumazet. 5) Fix "ethtool -S" crash in aquantia driver, from Pavel Belous. 6) Fix allocation size for ntp filter bitmap in bnxt_en driver, from Dan Carpenter. 7) Add missing memory allocation return value check to DSA loop driver, from Christophe Jaillet. 8) Fix XDP leak on driver unload in qed driver, from Suddarsana Reddy Kalluru. 9) Don't inherit MC list from parent inet connection sockets, another syzkaller spotted gem. Fix from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) dccp/tcp: do not inherit mc_list from parent qede: Split PF/VF ndos. qed: Correct doorbell configuration for !4Kb pages qed: Tell QM the number of tasks qed: Fix VF removal sequence qede: Fix XDP memory leak on unload net/mlx4_core: Reduce harmless SRIOV error message to debug level net/mlx4_en: Avoid adding steering rules with invalid ring net/mlx4_en: Change the error print to debug print drivers: net: wimax: i2400m: i2400m-usb: Use time_after for time comparison DECnet: Use container_of() for embedded struct Revert "ipv4: restore rt->fi for reference counting" net: mdio-mux: bcm-iproc: call mdiobus_free() in error path net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf net: cdc_ncm: Fix TX zero padding stmmac: pci: split out common_default_data() helper stmmac: pci: RX queue routing configuration stmmac: pci: TX and RX queue priority configuration stmmac: pci: set default number of rx and tx queues ...
2017-05-09net/mlx4_core: Reduce harmless SRIOV error message to debug levelJack Morgenstein
Under SRIOV resource management, extra counters are allocated to VFs from a free pool. If that pool is empty, the ALLOC_RES command for a counter resource fails -- and this generates a misleading error message in the message log. Under SRIOV, each VF is allocated (i.e., guaranteed) 2 counters -- one counter per port. For ETH ports, the RoCE driver requests an additional counter (above the guaranteed counters). If that request fails, the VF RoCE driver simply uses the default (i.e., guaranteed) counter for that port. Thus, failing to allocate an additional counter does not constitute a problem, and the error message on the PF when this occurs should be reduced to debug level. Finally, to identify the situation that the reason for the failure is that no resources are available to grant to the VF, we modified the error returned by mlx4_grant_resource to -EDQUOT (Quota exceeded), which more accurately describes the error. Fixes: c3abb51bdb0e ("IB/mlx4: Add RoCE/IB dedicated counters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-09net/mlx4_en: Avoid adding steering rules with invalid ringTalat Batheesh
Inserting steering rules with illegal ring is an invalid operation, block it. Fixes: 820672812f82 ('net/mlx4_en: Manage flow steering rules with ethtool') Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-09net/mlx4_en: Change the error print to debug printKamal Heib
The error print within mlx4_en_calc_rx_buf() should be a debug print. Fixes: 51151a16a60f ('mlx4: allow order-0 memory allocations in RX path') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-08treewide: use kv[mz]alloc* rather than opencoded variantsMichal Hocko
There are many code paths opencoding kvmalloc. Let's use the helper instead. The main difference to kvmalloc is that those users are usually not considering all the aspects of the memory allocator. E.g. allocation requests <= 32kB (with 4kB pages) are basically never failing and invoke OOM killer to satisfy the allocation. This sounds too disruptive for something that has a reasonable fallback - the vmalloc. On the other hand those requests might fallback to vmalloc even when the memory allocator would succeed after several more reclaim/compaction attempts previously. There is no guarantee something like that happens though. This patch converts many of those places to kv[mz]alloc* helpers because they are more conservative. Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390 Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim Acked-by: David Sterba <dsterba@suse.com> # btrfs Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4 Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5 Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Santosh Raspatur <santosh@chelsio.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Yishai Hadas <yishaih@mellanox.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: "Yan, Zheng" <zyan@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-20net/mlx4: suppress 'may be used uninitialized' warningGreg Thelen
gcc 4.8.4 complains that mlx4_SW2HW_MPT_wrapper() uses an uninitialized 'mpt' variable: drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_SW2HW_MPT_wrapper': drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:2802:12: warning: 'mpt' may be used uninitialized in this function [-Wmaybe-uninitialized] mpt->mtt = mtt; I think this warning is a false complaint. mpt is only used when mr_res_start_move_to() return zero, and in all such cases it initializes mpt. But apparently gcc cannot see that. Initialize mpt to avoid the warning. Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-06mlx4: trust shinfo->gso_segsEric Dumazet
mlx4 is the only driver in the tree making a point to recompute shinfo->gso_segs. Lets remove superfluous code. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/broadcom/genet/bcmmii.c drivers/net/hyperv/netvsc.c kernel/bpf/hashtab.c Almost entirely overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-16net/mlx4_core: Avoid delays during VF driver device shutdownJack Morgenstein
Some Hypervisors detach VFs from VMs by instantly causing an FLR event to be generated for a VF. In the mlx4 case, this will cause that VF's comm channel to be disabled before the VM has an opportunity to invoke the VF device's "shutdown" method. For such Hypervisors, there is a race condition between the VF's shutdown method and its internal-error detection/reset thread. The internal-error detection/reset thread (which runs every 5 seconds) also detects a disabled comm channel. If the internal-error detection/reset flow wins the race, we still get delays (while that flow tries repeatedly to detect comm-channel recovery). The cited commit fixed the command timeout problem when the internal-error detection/reset flow loses the race. This commit avoids the unneeded delays when the internal-error detection/reset flow wins. Fixes: d585df1c5ccf ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reported-by: Simon Xiao <sixiao@microsoft.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-15mqprio: Modify mqprio to pass user parameters via ndo_setup_tc.Amritha Nambiar
The configurable priority to traffic class mapping and the user specified queue ranges are used to configure the traffic class, overriding the hardware defaults when the 'hw' option is set to 0. However, when the 'hw' option is non-zero, the hardware QOS defaults are used. This patch makes it so that we can pass the data the user provided to ndo_setup_tc. This allows us to pull in the queue configuration if the user requested it as well as any additional hardware offload type requested by using a value other than 1 for the hw value. Finally it also provides a means for the device driver to return the level supported for the offload type via the qopt->hw value. Previously we were just always assuming the value to be 1, in the future values beyond just 1 may be supported. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>