summaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2021-04-15ch_ktls: do not send snd_una update to TCB in middleVinay Kumar Yadav
snd_una update should not be done when the same skb is being sent out.chcr_short_record_handler() sends it again even though SND_UNA update is already sent for the skb in chcr_ktls_xmit(), which causes mismatch in un-acked TCP seq number, later causes problem in sending out complete record. Fixes: 429765a149f1 ("chcr: handle partial end part of a record") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15ch_ktls: tcb close causes tls connection failureVinay Kumar Yadav
HW doesn't need marking TCB closed. This TCB state change sometimes causes problem to the new connection which gets the same tid. Fixes: 34aba2c45024 ("cxgb4/chcr : Register to tls add and del callback") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15ch_ktls: fix device connection closeVinay Kumar Yadav
When sge queue is full and chcr_ktls_xmit_wr_complete() returns failure, skb is not freed if it is not the last tls record in this skb, causes refcount never gets freed and tls_dev_del() never gets called on this connection. Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15ch_ktls: Fix kernel panicVinay Kumar Yadav
Taking page refcount is not ideal and causes kernel panic sometimes. It's better to take tx_ctx lock for the complete skb transmit, to avoid page cleanup if ACK received in middle. Fixes: 5a4b9fe7fece ("cxgb4/chcr: complete record tx handling") Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15Merge tag 'mlx5-fixes-2021-04-14' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-04-14 This series provides 3 small fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15i40e: fix the panic when running bpf in xdpdrv modeJason Xing
Fix this panic by adding more rules to calculate the value of @rss_size_max which could be used in allocating the queues when bpf is loaded, which, however, could cause the failure and then trigger the NULL pointer of vsi->rx_rings. Prio to this fix, the machine doesn't care about how many cpus are online and then allocates 256 queues on the machine with 32 cpus online actually. Once the load of bpf begins, the log will go like this "failed to get tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI failed". Thus, I attach the key information of the crash-log here. BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e] Call Trace: [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e] [2160294.717666] dev_xdp_install+0x4f/0x70 [2160294.718036] dev_change_xdp_fd+0x11f/0x230 [2160294.718380] ? dev_disable_lro+0xe0/0xe0 [2160294.718705] do_setlink+0xac7/0xe70 [2160294.719035] ? __nla_parse+0xed/0x120 [2160294.719365] rtnl_newlink+0x73b/0x860 Fixes: 41c445ff0f48 ("i40e: main driver core") Co-developed-by: Shujin Li <lishujin@kuaishou.com> Signed-off-by: Shujin Li <lishujin@kuaishou.com> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_metawenxu
In the nft_offload there is the mate flow_dissector with no ingress_ifindex but with ingress_iftype that only be used in the software. So if the mask of ingress_ifindex in meta is 0, this meta check should be bypass. Fixes: 6d65bc64e232 ("net/mlx5e: Add mlx5e_flower_parse_meta support") Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-14net/mlx5e: Fix setting of RS FEC modeAya Levin
Change register setting from bit number to bit mask. Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-14net/mlx5: Fix setting of devlink traps in switchdev modeAya Levin
Prevent setting of devlink traps on the uplink while in switchdev mode. In this mode, it is the SW switch responsibility to handle both packets with a mismatch in destination MAC or VLAN ID. Therefore, there are no flow steering tables to trap undesirable packets and driver crashes upon setting a trap. Fixes: 241dc159391f ("net/mlx5: Notify on trap action by blocking event") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-14Merge branch '10GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-04-14 This series contains updates to ixgbe and ice drivers. Alex Duyck fixes a NULL pointer dereference for ixgbe. Yongxin Liu fixes an unbalanced enable/disable which was causing a call trace with suspend for ixgbe. Colin King fixes a potential infinite loop for ice. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14Revert "net: stmmac: re-init rx buffers when mac resume back"Thierry Reding
This reverts commit 9c63faaa931e443e7abbbee9de0169f1d4710546, which introduces a suspend/resume regression on Jetson TX2 boards that can be reproduced every time. Given that the issue that this was supposed to fix only occurs very sporadically the safest course of action is to revert before v5.12 and then we can have another go at fixing the more rare issue in the next release (and perhaps backport it if necessary). The root cause of the observed problem seems to be that when the system is suspended, some packets are still in transit. When the descriptors for these buffers are cleared on resume, the descriptors become invalid and cause a fatal bus error. Link: https://lore.kernel.org/r/708edb92-a5df-ecc4-3126-5ab36707e275@nvidia.com/ Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14cavium/liquidio: Fix duplicate argumentWan Jiabing
Fix the following coccicheck warning: ./drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h:413:6-28: duplicated argument to & or | The CN6XXX_INTR_M1UPB0_ERR here is duplicate. Here should be CN6XXX_INTR_M1UNB0_ERR. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14net: macb: fix the restore of cmp registersClaudiu Beznea
Commit a14d273ba159 ("net: macb: restore cmp registers on resume path") introduces the restore of CMP registers on resume path. In case the IP doesn't support type 2 screeners (zero on DCFG8 register) the struct macb::rx_fs_list::list is not initialized and thus the list_for_each_entry(item, &bp->rx_fs_list.list, list) loop introduced in commit a14d273ba159 ("net: macb: restore cmp registers on resume path") will access an uninitialized list leading to crash. Thus, initialize the struct macb::rx_fs_list::list without taking into account if the IP supports type 2 screeners or not. Fixes: a14d273ba159 ("net: macb: restore cmp registers on resume path") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14ibmvnic: remove duplicate napi_schedule call in open functionLijun Pan
Remove the unnecessary napi_schedule() call in __ibmvnic_open() since interrupt_rx() calls napi_schedule_prep/__napi_schedule during every receive interrupt. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14ibmvnic: remove duplicate napi_schedule call in do_reset functionLijun Pan
During adapter reset, do_reset/do_hard_reset calls ibmvnic_open(), which will calls napi_schedule if previous state is VNIC_CLOSED (i.e, the reset case, and "ifconfig down" case). So there is no need for do_reset to call napi_schedule again at the end of the function though napi_schedule will neglect the request if napi is already scheduled. Fixes: ed651a10875f ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14ibmvnic: avoid calling napi_disable() twiceLijun Pan
__ibmvnic_open calls napi_disable without checking whether NAPI polling has already been disabled or not. This could cause napi_disable being called twice, which could generate deadlock. For example, the first napi_disable will spin until NAPI_STATE_SCHED is cleared by napi_complete_done, then set it again. When napi_disable is called the second time, it will loop infinitely because no dev->poll will be running to clear NAPI_STATE_SCHED. To prevent above scenario from happening, call ibmvnic_napi_disable() which checks if napi is disabled or not before calling napi_disable. Fixes: bfc32f297337 ("ibmvnic: Move resource initialization to its own routine") Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-14r8169: don't advertise pause in jumbo modeHeiner Kallweit
It has been reported [0] that using pause frames in jumbo mode impacts performance. There's no available chip documentation, but vendor drivers r8168 and r8125 don't advertise pause in jumbo mode. So let's do the same, according to Roman it fixes the issue. [0] https://bugzilla.kernel.org/show_bug.cgi?id=212617 Fixes: 9cf9b84cc701 ("r8169: make use of phy_set_asym_pause") Reported-by: Roman Mamedov <rm+bko@romanrm.net> Tested-by: Roman Mamedov <rm+bko@romanrm.net> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-13ibmvnic: correctly use dev_consume/free_skb_irqLijun Pan
It is more correct to use dev_kfree_skb_irq when packets are dropped, and to use dev_consume_skb_irq when packets are consumed. Fixes: 0d973388185d ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls") Suggested-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-13ice: Fix potential infinite loop when using u8 loop counterColin Ian King
A for-loop is using a u8 loop counter that is being compared to a u32 cmp_dcbcfg->numapp to check for the end of the loop. If cmp_dcbcfg->numapp is larger than 255 then the counter j will wrap around to zero and hence an infinite loop occurs. Fix this by making counter j the same type as cmp_dcbcfg->numapp. Addresses-Coverity: ("Infinite loop") Fixes: aeac8ce864d9 ("ice: Recognize 860 as iSCSI port in CEE mode") Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-13ixgbe: fix unbalanced device enable/disable in suspend/resumeYongxin Liu
pci_disable_device() called in __ixgbe_shutdown() decreases dev->enable_cnt by 1. pci_enable_device_mem() which increases dev->enable_cnt by 1, was removed from ixgbe_resume() in commit 6f82b2558735 ("ixgbe: use generic power management"). This caused unbalanced increase/decrease. So add pci_enable_device_mem() back. Fix the following call trace. ixgbe 0000:17:00.1: disabling already-disabled device Call Trace: __ixgbe_shutdown+0x10a/0x1e0 [ixgbe] ixgbe_suspend+0x32/0x70 [ixgbe] pci_pm_suspend+0x87/0x160 ? pci_pm_freeze+0xd0/0xd0 dpm_run_callback+0x42/0x170 __device_suspend+0x114/0x460 async_suspend+0x1f/0xa0 async_run_entry_fn+0x3c/0xf0 process_one_work+0x1dd/0x410 worker_thread+0x34/0x3f0 ? cancel_delayed_work+0x90/0x90 kthread+0x14c/0x170 ? kthread_park+0x90/0x90 ret_from_fork+0x1f/0x30 Fixes: 6f82b2558735 ("ixgbe: use generic power management") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-13ixgbe: Fix NULL pointer dereference in ethtool loopback testAlexander Duyck
The ixgbe driver currently generates a NULL pointer dereference when performing the ethtool loopback test. This is due to the fact that there isn't a q_vector associated with the test ring when it is setup as interrupts are not normally added to the test rings. To address this I have added code that will check for a q_vector before returning a napi_id value. If a q_vector is not present it will return a value of 0. Fixes: b02e5a0ebb17 ("xsk: Propagate napi_id to XDP socket Rx path") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-11net: davicom: Fix regulator not turned off on failed probeChristophe JAILLET
When the probe fails, we must disable the regulator that was previously enabled. This patch is a follow-up to commit ac88c531a5b3 ("net: davicom: Fix regulator not turned off on failed probe") which missed one case. Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-09net: hns3: Trivial spell fix in hns3 driverSalil Mehta
Some trivial spelling mistakes which caught my eye during the review of the code. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Link: https://lore.kernel.org/r/20210409074223.32480-1-salil.mehta@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09lan743x: fix ethernet frame cutoff issueSven Van Asbroeck
The ethernet frame length is calculated incorrectly. Depending on the value of RX_HEAD_PADDING, this may result in ethernet frames that are too short (cut off at the end), or too long (garbage added to the end). Fix by calculating the ethernet frame length correctly. For added clarity, use the ETH_FCS_LEN constant in the calculation. Many thanks to Heiner Kallweit for suggesting this solution. Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: 3e21a10fdea3 ("lan743x: trim all 4 bytes of the FCS; not just 2") Link: https://lore.kernel.org/lkml/20210408172353.21143-1-TheSven73@gmail.com/ Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Reviewed-by: George McCollister <george.mccollister@gmail.com> Tested-by: George McCollister <george.mccollister@gmail.com> Link: https://lore.kernel.org/r/20210409003904.8957-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-08ice: fix memory leak of aRFS after resuming from suspendYongxin Liu
In ice_suspend(), ice_clear_interrupt_scheme() is called, and then irq_free_descs() will be eventually called to free irq and its descriptor. In ice_resume(), ice_init_interrupt_scheme() is called to allocate new irqs. However, in ice_rebuild_arfs(), struct irq_glue and struct cpu_rmap maybe cannot be freed, if the irqs that released in ice_suspend() were reassigned to other devices, which makes irq descriptor's affinity_notify lost. So call ice_free_cpu_rx_rmap() before ice_clear_interrupt_scheme(), which can make sure all irq_glue and cpu_rmap can be correctly released before corresponding irq and descriptor are released. Fix the following memory leak. unreferenced object 0xffff95bd951afc00 (size 512): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 18 00 00 00 18 00 18 00 70 fc 1a 95 bd 95 ff ff ........p....... 00 00 ff ff 01 00 ff ff 02 00 ff ff 03 00 ff ff ................ backtrace: [<0000000072e4b914>] __kmalloc+0x336/0x540 [<0000000054642a87>] alloc_cpu_rmap+0x3b/0xb0 [<00000000f220deec>] ice_set_cpu_rx_rmap+0x6a/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 unreferenced object 0xffff95bd81b0a2a0 (size 96): comm "kworker/0:1", pid 134, jiffies 4294684283 (age 13051.958s) hex dump (first 32 bytes): 38 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 8............... b0 a2 b0 81 bd 95 ff ff b0 a2 b0 81 bd 95 ff ff ................ backtrace: [<00000000582dd5c5>] kmem_cache_alloc_trace+0x31f/0x4c0 [<000000002659850d>] irq_cpu_rmap_add+0x25/0xe0 [<00000000495a3055>] ice_set_cpu_rx_rmap+0xb4/0x110 [ice] [<000000002370a632>] ice_probe+0x941/0x1180 [ice] [<00000000d692edba>] local_pci_probe+0x47/0xa0 [<00000000503934f0>] work_for_cpu_fn+0x1a/0x30 [<00000000555a9e4a>] process_one_work+0x1dd/0x410 [<000000002c4b414a>] worker_thread+0x221/0x3f0 [<00000000bb2b556b>] kthread+0x14c/0x170 [<00000000ad2cf1cd>] ret_from_fork+0x1f/0x30 Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse warning: missing error code 'err'Arkadiusz Kubalewski
Set proper return values inside error checking if-statements. Previously following warning was produced when compiling against sparse. i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err' Fixes: 4ff0ee1af0169 ("i40e: Introduce recovery mode support") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse error: 'vsi->netdev' could be nullArkadiusz Kubalewski
Remove vsi->netdev->name from the trace. This is redundant information. With the devinfo trace, the adapter is already identifiable. Previously following error was produced when compiling against sparse. i40e_main.c:2571 i40e_sync_vsi_filters() error: we previously assumed 'vsi->netdev' could be null (see line 2323) Fixes: b603f9dc20af ("i40e: Log info when PF is entering and leaving Allmulti mode.") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse error: uninitialized symbol 'ring'Arkadiusz Kubalewski
Init pointer with NULL in default switch case statement. Previously the error was produced when compiling against sparse. i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'. Fixes: 44ea803e2fa7 ("i40e: introduce new dump desc XDP command") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix sparse errors in i40e_txrx.cArkadiusz Kubalewski
Remove error handling through pointers. Instead use plain int to return value from i40e_run_xdp(...). Previously: - sparse errors were produced during compilation: i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR() - sk_buff* was used to return value, but it has never had valid pointer to sk_buff. Returned value was always int handled as a pointer. Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions") Fixes: 2e6893123830 ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-08i40e: Fix parameters in aq_get_phy_register()Grzegorz Siwik
Change parameters order in aq_get_phy_register() due to wrong statistics in PHY reported by ethtool. Previously all PHY statistics were exactly the same for all interfaces Now statistics are reported correctly - different for different interfaces Fixes: 0514db37dd78 ("i40e: Extend PHY access with page change flag") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-07ethtool: Remove link_mode param and derive link params from driverDanielle Ratson
Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06net/mlx5: fix kfree mismatch in indir_table.cXiaoming Ni
Memory allocated by kvzalloc() should be freed by kvfree(). Fixes: 34ca65352ddf2 ("net/mlx5: E-Switch, Indirect table infrastructur") Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06net/mlx5: Fix HW spec violation configuring uplinkEli Cohen
Make sure to modify uplink port to follow only if the uplink_follow capability is set as required by the HW spec. Failure to do so causes traffic to the uplink representor net device to cease after switching to switchdev mode. Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down") Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06net: hns3: clear VF down state bit before request link statusGuangbin Huang
Currently, the VF down state bit is cleared after VF sending link status request command. There is problem that when VF gets link status replied from PF, the down state bit may still set as 1. In this case, the link status replied from PF will be ignored and always set VF link status to down. To fix this problem, clear VF down state bit before VF requests link status. Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06pcnet32: Use pci_resource_len to validate PCI resourceGuenter Roeck
pci_resource_start() is not a good indicator to determine if a PCI resource exists or not, since the resource may start at address 0. This is seen when trying to instantiate the driver in qemu for riscv32 or riscv64. pci 0000:00:01.0: reg 0x10: [io 0x0000-0x001f] pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000001f] ... pcnet32: card has no PCI IO resources, aborting Use pci_resouce_len() instead. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06net: broadcom: bcm4908enet: Fix a double free in bcm4908_enet_dma_allocLv Yunlong
In bcm4908_enet_dma_alloc, if callee bcm4908_dma_alloc_buf_descs() failed, it will free the ring->cpu_addr by dma_free_coherent() and return error. Then bcm4908_enet_dma_free() will be called, and free the same cpu_addr by dma_free_coherent() again. My patch set ring->cpu_addr to NULL after it is freed in bcm4908_dma_alloc_buf_descs() to avoid the double free. Fixes: 4feffeadbcb2e ("net: broadcom: bcm4908enet: add BCM4908 controller driver") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-05net: hns3: Remove un-necessary 'else-if' in the hclge_reset_event()Salil Mehta
Code to defer the reset(which caps the frequency of the reset) schedules the timer and returns. Hence, following 'else-if' looks un-necessary. Fixes: 9de0b86f6444 ("net: hns3: Prevent to request reset frequently") Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-05net: hns3: Remove the left over redundant check & assignmentSalil Mehta
This removes the left over check and assignment which is no longer used anywhere in the function and should have been removed as part of the below mentioned patch. Fixes: 012fcb52f67c ("net: hns3: activate reset timer when calling reset_event") Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02net: macb: restore cmp registers on resume pathClaudiu Beznea
Restore CMP screener registers on resume path. Fixes: c1e85c6ce57ef ("net: macb: save/restore the remaining registers and features") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01i40e: Fix display statistics for veb_tcEryk Rybak
If veb-stats was enabled, the ethtool stats triggered a warning due to invalid size: 'unexpected stat size for veb.tc_%u_tx_packets'. This was due to an incorrect structure definition for the statistics. Structures and functions have been improved in line with requirements for the presentation of statistics, in particular for the functions: 'i40e_add_ethtool_stats' and 'i40e_add_stat_strings'. Fixes: 1510ae0be2a4 ("i40e: convert VEB TC stats to use an i40e_stats array") Signed-off-by: Eryk Rybak <eryk.roch.rybak@intel.com> Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01i40e: fix receiving of single packets in xsk zero-copy modeMagnus Karlsson
Fix so that single packets are received immediately instead of in batches of 8. If you sent 1 pps to a system, you received 8 packets every 8 seconds instead of 1 packet every second. The problem behind this was that the work_done reporting from the Tx part of the driver was broken. The work_done reporting in i40e controls not only the reporting back to the napi logic but also the setting of the interrupt throttling logic. When Tx or Rx reports that it has more to do, interrupts are throttled or coalesced and when they both report that they are done, interrupts are armed right away. If the wrong work_done value is returned, the logic will start to throttle interrupts in a situation where it should have just enabled them. This leads to the undesired batching behavior seen in user-space. Fix this by returning the correct boolean value from the Tx xsk zero-copy path. Return true if there is nothing to do or if we got fewer packets to process than we asked for. Return false if we got as many packets as the budget since there might be more packets we can process. Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance") Reported-by: Sreedevi Joshi <sreedevi.joshi@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-01i40e: Fix inconsistent indentingArkadiusz Kubalewski
Fixed new static analysis findings: "warn: inconsistent indenting" - introduced lately, reported with lkp and smatch build. Fixes: 4b208eaa8078 ("i40e: Add init and default config of software based DCB") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31net/mlx5e: Guarantee room for XSK wakeup NOP on async ICOSQTariq Toukan
XSK wakeup flow triggers an IRQ by posting a NOP WQE and hitting the doorbell on the async ICOSQ. It maintains its state so that it doesn't issue another NOP WQE if it has an outstanding one already. For this flow to work properly, the NOP post must not fail. Make sure to reserve room for the NOP WQE in all WQE posts to the async ICOSQ. Fixes: 8d94b590f1e4 ("net/mlx5e: Turn XSK ICOSQ into a general asynchronous one") Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5e: Consider geneve_opts for encap contextsDima Chumak
Current algorithm for encap keys is legacy from initial vxlan implementation and doesn't take into account all possible fields of a tunnel. For example, for a Geneve tunnel, which may have additional TLV options, they are ignored when comparing encap keys and a rule can be attached to an incorrect encap entry. Fix that by introducing encap_info_equal() operation in struct mlx5e_tc_tunnel. Geneve tunnel type uses custom implementation, which extends generic algorithm and considers options if they are set. Fixes: 7f1a546e3222 ("net/mlx5e: Consider tunnel type for encap contexts") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5: Don't request more than supported EQsDaniel Jurgens
Calculating the number of compeltion EQs based on the number of available IRQ vectors doesn't work now that all async EQs share one IRQ. Thus the max number of EQs can be exceeded on systems with more than approximately 256 CPUs. Take this into account when calculating the number of available completion EQs. Fixes: 81bfa206032a ("net/mlx5: Use a single IRQ for all async EQs") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5e: kTLS, Fix RX counters atomicityTariq Toukan
Some TLS RX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the RQ stats into the global atomic TLS stats. In this patch, we touch 'rx_tls_ctx/del' that count the number of device-offloaded RX TLS connections added/deleted. These counters are updated in the add/del callbacks, out of the fast data-path. This change is not needed for counters that increment only in NAPI context, as they are protected by the NAPI mechanism. Keep them as tls_* counters under 'struct mlx5e_rq_stats'. Fixes: 76c1e1ac2aae ("net/mlx5e: kTLS, Add kTLS RX stats") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5e: kTLS, Fix TX counters atomicityTariq Toukan
Some TLS TX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the SQ stats into the global atomic TLS stats. In this patch, we touch a single counter 'tx_tls_ctx' that counts the number of device-offloaded TX TLS connections added. Now that this counter can be increased without the for having the SQ context in hand, move it to the mlx5e_ktls_add_tx() callback where it really belongs, out of the fast data-path. This change is not needed for counters that increment only in NAPI context or under the TX lock, as they are already protected. Keep them as tls_* counters under 'struct mlx5e_sq_stats'. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5: E-switch, Create vport miss group only if src rewrite is supportedMaor Dickman
Create send to vport miss group was added in order to support traffic recirculation to root table with metadata source rewrite. This group is created also in case source rewrite isn't supported. Fixed by creating send to vport miss group only if source rewrite is supported by FW. Fixes: 8e404fefa58b ("net/mlx5e: Match recirculated packet miss in slow table using reg_c1") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5e: Fix ethtool indication of connector typeAya Levin
Use connector_type read from PTYS register when it's valid, based on corresponding capability bit. Fixes: 5b4793f81745 ("net/mlx5e: Add support for reading connector type from PTYS") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31net/mlx5: Delete auxiliary bus driver eth-rep firstMaor Dickman
Delete auxiliary bus drivers flow deletes the eth driver first and then the eth-reps driver but eth-reps devices resources are depend on eth device. Fixed by changing the delete order of auxiliary bus drivers to delete the eth-rep driver first and after it the eth driver. Fixes: 601c10c89cbb ("net/mlx5: Delete custom device management logic") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>