aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net
AgeCommit message (Collapse)Author
43 hoursMerge tag 'v6.1.96' into v6.1/standard/basev6.1/standard/xilinx-zynqmpv6.1/standard/tiny/x86v6.1/standard/tiny/common-pcv6.1/standard/tiny/basev6.1/standard/ti-am335xv6.1/standard/qemuarm64v6.1/standard/nxp-ls20xxv6.1/standard/edgerouterv6.1/standard/beaglebonev6.1/standard/baseBruce Ashfield
This is the 6.1.96 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmZ9UTsACgkQONu9yGCS # aT46xxAAz4DR7XcjUDBRhOgA8NM6PVqHX1ZPzIAhUXYTtHcppvZXFvXk2nx0WWU2 # h/BWc9jaokteozI1+uMnlsye/t2gHYozE0300fFrFQkT0erVe6sPlmyKIOqikHi8 # 64SC4A3SoUxKo380iEtkLnFAhAjC8fppI2hjzofU71mFTQURM0Eeel+W6RSsZy09 # /oieXg9JeQF1nF/LfF8HindqyZ5q8GndEq7xD93zdDSimP1WpEfKJxqlDRzQLIUG # P1fvsam8+HClG5iN12WLfo3M9QjlaeDK8R0bplyKnRugeDCVCzRjk5MDucbODNSy # VQVihKV5QSTkwK3xD1tQ2Oko7/WXw1nWYJXwetPbRuVadqo+7jnYe0fL+QvzSGte # cliNWxiEGeCGBPm3YTP9f3oU+cBZVXZhs6p90dT8JNkfmySCxyXByHvEl8EidwHc # WvE4ntv9lgh3Q7q58yFGBSVtfkF6XxWbaEdF3WvskNTsXDvdnVyX1qI3O9Hj2o+o # NL+jHAqghReNgxKBxkG/hbYr8jkJ4zrPI+1fpipQfBR0Ytz5B/roiwQ/LozdW3BN # eCjUnjgU4z3feOm5bpEyOkuptV8zW3OTmA5Jnr+xq8TE9xfrDCuBrouj85UgUV8m # n9Q+BYd0WOe5lYNuS9Fq5eVE4r5BfXKUSgTfY6ZdQJ0fMPvGoRU= # =EnG7 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 27 Jun 2024 07:47:07 AM EDT # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
3 daysnet: usb: ax88179_178a: improve reset checkJose Ignacio Tornos Martinez
commit 7be4cb7189f747b4e5b6977d0e4387bde3204e62 upstream. After ecf848eb934b ("net: usb: ax88179_178a: fix link status when link is set to down/up") to not reset from usbnet_open after the reset from usbnet_probe at initialization stage to speed up this, some issues have been reported. It seems to happen that if the initialization is slower, and some time passes between the probe operation and the open operation, the second reset from open is necessary too to have the device working. The reason is that if there is no activity with the phy, this is "disconnected". In order to improve this, the solution is to detect when the phy is "disconnected", and we can use the phy status register for this. So we will only reset the device from reset operation in this situation, that is, only if necessary. The same bahavior is happening when the device is stopped (link set to down) and later is restarted (link set to up), so if the phy keeps working we only need to enable the mac again, but if enough time passes between the device stop and restart, reset is necessary, and we can detect the situation checking the phy status register too. cc: stable@vger.kernel.org # 6.6+ Fixes: ecf848eb934b ("net: usb: ax88179_178a: fix link status when link is set to down/up") Reported-by: Yongqin Liu <yongqin.liu@linaro.org> Reported-by: Antje Miederhöfer <a.miederhoefer@gmx.de> Reported-by: Arne Fitzenreiter <arne_f@ipfire.org> Tested-by: Yongqin Liu <yongqin.liu@linaro.org> Tested-by: Antje Miederhöfer <a.miederhoefer@gmx.de> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 daysnet: stmmac: Assign configured channel value to EXTTS eventOleksij Rempel
commit 8851346912a1fa33e7a5966fe51f07313b274627 upstream. Assign the configured channel value to the EXTTS event in the timestamp interrupt handler. Without assigning the correct channel, applications like ts2phc will refuse to accept the event, resulting in errors such as: ... ts2phc[656.834]: config item end1.ts2phc.pin_index is 0 ts2phc[656.834]: config item end1.ts2phc.channel is 3 ts2phc[656.834]: config item end1.ts2phc.extts_polarity is 2 ts2phc[656.834]: config item end1.ts2phc.extts_correction is 0 ... ts2phc[656.862]: extts on unexpected channel ts2phc[658.141]: extts on unexpected channel ts2phc[659.140]: extts on unexpected channel Fixes: f4da56529da60 ("net: stmmac: Add support for external trigger timestamping") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/20240618073821.619751-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 daysnet: usb: rtl8150 fix unintiatilzed variables in rtl8150_get_link_ksettingsOliver Neukum
[ Upstream commit fba383985354e83474f95f36d7c65feb75dba19d ] This functions retrieves values by passing a pointer. As the function that retrieves them can fail before touching the pointers, the variables must be initialized. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+5186630949e3c55f0799@syzkaller.appspotmail.com Signed-off-by: Oliver Neukum <oneukum@suse.com> Link: https://lore.kernel.org/r/20240619132816.11526-1-oneukum@suse.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysbnxt_en: Restore PTP tx_avail count in case of skb_pad() errorPavan Chebbi
[ Upstream commit 1e7962114c10957fe4d10a15eb714578a394e90b ] The current code only restores PTP tx_avail count when we get DMA mapping errors. Fix it so that the PTP tx_avail count will be restored for both DMA mapping errors and skb_pad() errors. Otherwise PTP TX timestamp will not be available after a PTP packet hits the skb_pad() error. Fixes: 83bb623c968e ("bnxt_en: Transmit and retrieve packet timestamps") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240618215313.29631-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysice: Fix VSI list rule with ICE_SW_LKUP_LAST typeMarcin Szycik
[ Upstream commit 74382aebc9035470ec4c789bdb0d09d8c14f261e ] Adding/updating VSI list rule, as well as allocating/freeing VSI list resource are called several times with type ICE_SW_LKUP_LAST, which fails because ice_update_vsi_list_rule() and ice_aq_alloc_free_vsi_list() consider it invalid. Allow calling these functions with ICE_SW_LKUP_LAST. This fixes at least one issue in switchdev mode, where the same rule with different action cannot be added, e.g.: tc filter add dev $PF1 ingress protocol arp prio 0 flower skip_sw \ dst_mac ff:ff:ff:ff:ff:ff action mirred egress redirect dev $VF1_PR tc filter add dev $PF1 ingress protocol arp prio 0 flower skip_sw \ dst_mac ff:ff:ff:ff:ff:ff action mirred egress redirect dev $VF2_PR Fixes: 0f94570d0cae ("ice: allow adding advanced rules") Suggested-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240618210206.981885-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysocteontx2-pf: Add error handling to VLAN unoffload handlingSimon Horman
[ Upstream commit b95a4afe2defd6f46891985f9436a568cd35a31c ] otx2_sq_append_skb makes used of __vlan_hwaccel_push_inside() to unoffload VLANs - push them from skb meta data into skb data. However, it omitts a check for __vlan_hwaccel_push_inside() returning NULL. Found by inspection based on [1] and [2]. Compile tested only. [1] Re: [PATCH net-next v1] net: stmmac: Enable TSO on VLANs https://lore.kernel.org/all/ZmrN2W8Fye450TKs@shell.armlinux.org.uk/ [2] Re: [PATCH net-next v2] net: stmmac: Enable TSO on VLANs https://lore.kernel.org/all/CANn89i+11L5=tKsa7V7Aeyxaj6nYGRwy35PAbCRYJ73G+b25sg@mail.gmail.com/ Fixes: fd9d7859db6c ("octeontx2-pf: Implement ingress/egress VLAN offload") Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysvirtio_net: checksum offloading handling fixHeng Qi
[ Upstream commit 604141c036e1b636e2a71cf6e1aa09d1e45f40c2 ] In virtio spec 0.95, VIRTIO_NET_F_GUEST_CSUM was designed to handle partially checksummed packets, and the validation of fully checksummed packets by the device is independent of VIRTIO_NET_F_GUEST_CSUM negotiation. However, the specification erroneously stated: "If VIRTIO_NET_F_GUEST_CSUM is not negotiated, the device MUST set flags to zero and SHOULD supply a fully checksummed packet to the driver." This statement is inaccurate because even without VIRTIO_NET_F_GUEST_CSUM negotiation, the device can still set the VIRTIO_NET_HDR_F_DATA_VALID flag. Essentially, the device can facilitate the validation of these packets' checksums - a process known as RX checksum offloading - removing the need for the driver to do so. This scenario is currently not implemented in the driver and requires correction. The necessary specification correction[1] has been made and approved in the virtio TC vote. [1] https://lists.oasis-open.org/archives/virtio-comment/202401/msg00011.html Fixes: 4f49129be6fa ("virtio-net: Set RXCSUM feature if GUEST_CSUM is available") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysnet: stmmac: No need to calculate speed divider when offload is disabledXiaolei Wang
[ Upstream commit b8c43360f6e424131fa81d3ba8792ad8ff25a09e ] commit be27b8965297 ("net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters") introduced a problem. When deleting, it prompts "Invalid portTransmitRate 0 (idleSlope - sendSlope)" and exits. Add judgment on cbs.enable. Only when offload is enabled, speed divider needs to be calculated. Fixes: be27b8965297 ("net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240617013922.1035854-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysnet: phy: mxl-gpy: Remove interrupt mask clearing from config_initRaju Lakkaraju
[ Upstream commit c44d3ffd85db03ebcc3090e55589e10d5af9f3a9 ] When the system resumes from sleep, the phy_init_hw() function invokes config_init(), which clears all interrupt masks and causes wake events to be lost in subsequent wake sequences. Remove interrupt mask clearing from config_init() and preserve relevant masks in config_intr(). Fixes: 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysnet: phy: mxl-gpy: enhance delay time required by loopback disable functionXu Liang
[ Upstream commit 0ba13995be9b416ea1d3daaf3ba871a67f45899b ] GPY2xx devices need 3 seconds to fully switch out of loopback mode before it can safely re-enter loopback mode. Implement timeout mechanism to guarantee 3 seconds waited before re-enter loopback mode. Signed-off-by: Xu Liang <lxu@maxlinear.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: c44d3ffd85db ("net: phy: mxl-gpy: Remove interrupt mask clearing from config_init") Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysnet: lan743x: Support WOL at both the PHY and MAC appropriatelyRaju Lakkaraju
[ Upstream commit 8c248cd836014339498486f14f435c0e344183a7 ] Prevent options not supported by the PHY from being requested to it by the MAC Whenever a WOL option is supported by both, the PHY is given priority since that usually leads to better power savings. Fixes: e9e13b6adc33 ("lan743x: fix for potential NULL pointer dereference with bare card") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysnet: lan743x: disable WOL upon resume to restore full data path operationRaju Lakkaraju
[ Upstream commit 7725363936a88351b71495774c1e0e852ae4cdca ] When Wake-on-LAN (WoL) is active and the system is in suspend mode, triggering a system event can wake the system from sleep, which may block the data path. To restore normal data path functionality after waking, disable all wake-up events. Furthermore, clear all Write 1 to Clear (W1C) status bits by writing 1's to them. Fixes: 4d94282afd95 ("lan743x: Add power management support") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysqca_spi: Make interrupt remembering atomicStefan Wahren
[ Upstream commit 2d7198278ece01818cd95a3beffbdf8b2a353fa0 ] The whole mechanism to remember occurred SPI interrupts is not atomic, which could lead to unexpected behavior. So fix this by using atomic bit operations instead. Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://lore.kernel.org/r/20240614145030.7781-1-wahrenst@gmx.net Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysice: avoid IRQ collision to fix init failure on ACPI S3 resumeEn-Wei Wu
[ Upstream commit bc69ad74867dba1377abe14356c94a946d9837a3 ] A bug in https://bugzilla.kernel.org/show_bug.cgi?id=218906 describes that irdma would break and report hardware initialization failed after suspend/resume with Intel E810 NIC (tested on 6.9.0-rc5). The problem is caused due to the collision between the irq numbers requested in irdma and the irq numbers requested in other drivers after suspend/resume. The irq numbers used by irdma are derived from ice's ice_pf->msix_entries which stores mappings between MSI-X index and Linux interrupt number. It's supposed to be cleaned up when suspend and rebuilt in resume but it's not, causing irdma using the old irq numbers stored in the old ice_pf->msix_entries to request_irq() when resume. And eventually collide with other drivers. This patch fixes this problem. On suspend, we call ice_deinit_rdma() to clean up the ice_pf->msix_entries (and free the MSI-X vectors used by irdma if we've dynamically allocated them). On resume, we call ice_init_rdma() to rebuild the ice_pf->msix_entries (and allocate the MSI-X vectors if we would like to dynamically allocate them). Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Tested-by: Cyrus Lien <cyrus.lien@canonical.com> Signed-off-by: En-Wei Wu <en-wei.wu@canonical.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysice: move RDMA init to ice_idc.cMichal Swiatkowski
[ Upstream commit 2b8db6afbc95258175da69f31c9bfbea539aaa74 ] Simplify probe flow by moving all RDMA related code to ice_init_rdma(). Unroll irq allocation if RDMA initialization fails. Implement ice_deinit_rdma() and use it in remove flow. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Acked-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: bc69ad74867d ("ice: avoid IRQ collision to fix init failure on ACPI S3 resume") Signed-off-by: Sasha Levin <sashal@kernel.org>
3 dayswifi: mt76: mt7921s: fix potential hung tasks during chip recoveryLeon Yen
[ Upstream commit ecf0b2b8a37c8464186620bef37812a117ff6366 ] During chip recovery (e.g. chip reset), there is a possible situation that kernel worker reset_work is holding the lock and waiting for kernel thread stat_worker to be parked, while stat_worker is waiting for the release of the same lock. It causes a deadlock resulting in the dumping of hung tasks messages and possible rebooting of the device. This patch prevents the execution of stat_worker during the chip recovery. Signed-off-by: Leon Yen <leon.yen@mediatek.com> Signed-off-by: Ming Yen Hsieh <MingYen.Hsieh@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 daysnet: dsa: realtek: keep default LED state in rtl8366rbLuiz Angelo Daros de Luca
[ Upstream commit 5edc6585aafefa3d44fb8a84adf241d90227f7a3 ] This switch family supports four LEDs for each of its six ports. Each LED group is composed of one of these four LEDs from all six ports. LED groups can be configured to display hardware information, such as link activity, or manually controlled through a bitmap in registers RTL8366RB_LED_0_1_CTRL_REG and RTL8366RB_LED_2_3_CTRL_REG. After a reset, the default LED group configuration for groups 0 to 3 indicates, respectively, link activity, link at 1000M, 100M, and 10M, or RTL8366RB_LED_CTRL_REG as 0x5432. These configurations are commonly used for LED indications. However, the driver was replacing that configuration to use manually controlled LEDs (RTL8366RB_LED_FORCE) without providing a way for the OS to control them. The default configuration is deemed more useful than fixed, uncontrollable turned-on LEDs. The driver was enabling/disabling LEDs during port_enable/disable. However, these events occur when the port is administratively controlled (up or down) and are not related to link presence. Additionally, when a port N was disabled, the driver was turning off all LEDs for group N, not only the corresponding LED for port N in any of those 4 groups. In such cases, if port 0 was brought down, the LEDs for all ports in LED group 0 would be turned off. As another side effect, the driver was wrongly warning that port 5 didn't have an LED ("no LED for port 5"). Since showing the administrative state of ports is not an orthodox way to use LEDs, it was not worth it to fix it and all this code was dropped. The code to disable LEDs was simplified only changing each LED group to the RTL8366RB_LED_OFF state. Registers RTL8366RB_LED_0_1_CTRL_REG and RTL8366RB_LED_2_3_CTRL_REG are only used when the corresponding LED group is configured with RTL8366RB_LED_FORCE and they don't need to be cleaned. The code still references an LED controlled by RTL8366RB_INTERRUPT_CONTROL_REG, but as of now, no test device has actually used it. Also, some magic numbers were replaced by macros. Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
3 dayswifi: ath9k: work around memset overflow warningArnd Bergmann
[ Upstream commit 61752ac69b69ed2e04444d090f6917c77ab36d42 ] gcc-9 and some other older versions produce a false-positive warning for zeroing two fields In file included from include/linux/string.h:369, from drivers/net/wireless/ath/ath9k/main.c:18: In function 'fortify_memset_chk', inlined from 'ath9k_ps_wakeup' at drivers/net/wireless/ath/ath9k/main.c:140:3: include/linux/fortify-string.h:462:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 462 | __write_overflow_field(p_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Using a struct_group seems to reliably avoid the warning and not make the code much uglier. The combined memset() should even save a couple of cpu cycles. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240328135509.3755090-3-arnd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
6 daysMerge tag 'v6.1.95' into v6.1/standard/baseBruce Ashfield
This is the 6.1.95 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmZ1c7gACgkQONu9yGCS # aT6k6w//dJLH5VWQ2Gc3OCzKYKjNGa5vAdMpsqHZmyMUiK8Rhv/38cb9NrAYcFHQ # wlxm6sftbPPCHet6GpSiqDB/HDemGtqjCzHng8AywSsaiPhpvki4wvpt6p+hlUmf # Lck1qyvyJitoUoUikdWskD4HU02kBXRLHj5k4TxBVTT9FmyG5rjbXRD2pZjI6897 # PVni/94hJhBcTV/rqAufLOivKE4gwiYVfa7et9WMD3xiAmN7i5hcYUeBCCXlaE4B # wMbYTBcMZF2rz3q5LP1ZC/qotnaT04xSwm8bW21FTji6tsyVAK8XmYIenLunTOgw # 3om5GJBYciZ0bHrxugUEz5UenORFS2bMuMQJ0q0fuo5vkVHoSRZt5DnOmtxq4fGN # t2/TYFDKp7V52z0FJ3moWwwxBdfCEeZ/+XVKtwzb/rT1fFAfemMrY7TnkAZxbmhv # 2PTw+i59hlwrOgw3VeiZDA9gxNDEas9SbwZYFDndTmMA9KtYLj2OGKH1K1rCEDkp # tOTWTzWR4Tkd255J3z4WBvw6VCRsun5tSSEBUkDen8QgWtk57XmIzGu+479zINkQ # SUqXJA9dFiAPBHSsVGgcz7RsQMi2ubVjM9ZWa9i2CK7IkwAP1qISFqI8Wdthr4M8 # UK/SI0RF/M+mMWlxqr6Kyfj7JBZmljqzu+vwiX8D6KgXNp2npbk= # =RFmj # -----END PGP SIGNATURE----- # gpg: Signature made Fri 21 Jun 2024 08:36:08 AM EDT # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
9 daysbnxt_en: Adjust logging of firmware messages in case of released token in ↵Aleksandr Mishin
__hwrm_send() [ Upstream commit a9b9741854a9fe9df948af49ca5514e0ed0429df ] In case of token is released due to token->state == BNXT_HWRM_DEFERRED, released token (set to NULL) is used in log messages. This issue is expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But this error code is returned by recent firmware. So some firmware may not return it. This may lead to NULL pointer dereference. Adjust this issue by adding token pointer check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 8fa4219dba8e ("bnxt_en: add dynamic debug support for HWRM messages") Suggested-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysionic: fix use after netif_napi_del()Taehee Yoo
[ Upstream commit 79f18a41dd056115d685f3b0a419c7cd40055e13 ] When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues and only 3 queues are used for the current configuration, only 3 queues' napi should be registered and enabled. The ionic_qcq_enable() checks whether the .poll pointer is not NULL for enabling only the using queue' napi. Unused queues' napi will not be registered by netif_napi_add(), so the .poll pointer indicates NULL. But it couldn't distinguish whether the napi was unregistered or not because netif_napi_del() doesn't reset the .poll pointer to NULL. So, ionic_qcq_enable() calls napi_enable() for the queue, which was unregistered by netif_napi_del(). Reproducer: ethtool -L <interface name> rx 1 tx 1 combined 0 ethtool -L <interface name> rx 0 tx 0 combined 1 ethtool -L <interface name> rx 0 tx 0 combined 4 Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240612060446.1754392-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs ↵Xiaolei Wang
parameters [ Upstream commit be27b896529787e23a35ae4befb6337ce73fcca0 ] The current cbs parameter depends on speed after uplinking, which is not needed and will report a configuration error if the port is not initially connected. The UAPI exposed by tc-cbs requires userspace to recalculate the send slope anyway, because the formula depends on port_transmit_rate (see man tc-cbs), which is not an invariant from tc's perspective. Therefore, we use offload->sendslope and offload->idleslope to derive the original port_transmit_rate from the CBS formula. Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysgve: ignore nonrelevant GSO type bits when processing TSO headersJoshua Washington
[ Upstream commit 1b9f756344416e02b41439bf2324b26aa25e141c ] TSO currently fails when the skb's gso_type field has more than one bit set. TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case. The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type |= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification. This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set. Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrei Vagin <avagin@gmail.com> v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs. v3 - Add back unrelated empty line removal. Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packetsGal Pressman
[ Upstream commit 791b4089e326271424b78f2fae778b20e53d071b ] Move the vxlan_features_check() call to after we verified the packet is a tunneled VXLAN packet. Without this, tunneled UDP non-VXLAN packets (for ex. GENENVE) might wrongly not get offloaded. In some cases, it worked by chance as GENEVE header is the same size as VXLAN, but it is obviously incorrect. Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysgeneve: Fix incorrect inner network header offset when innerprotoinherit is setGal Pressman
[ Upstream commit c6ae073f5903f6c6439d0ac855836a4da5c0a701 ] When innerprotoinherit is set, the tunneled packets do not have an inner Ethernet header. Change 'maclen' to not always assume the header length is ETH_HLEN, as there might not be a MAC header. This resolves issues with drivers (e.g. mlx5, in mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset to be correct, and use it for TX offloads. Fixes: d8a6213d70ac ("geneve: fix header validation in geneve[6]_xmit_skb") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysliquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packetAleksandr Mishin
[ Upstream commit c44711b78608c98a3e6b49ce91678cd0917d5349 ] In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value, but then it is unconditionally passed to skb_add_rx_frag() which looks strange and could lead to null pointer dereference. lio_vf_rep_copy_packet() call trace looks like: octeon_droq_process_packets octeon_droq_fast_process_packets octeon_droq_dispatch_pkt octeon_create_recv_info ...search in the dispatch_list... ->disp_fn(rdisp->rinfo, ...) lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...) In this path there is no code which sets pg_info->page to NULL. So this check looks unneeded and doesn't solve potential problem. But I guess the author had reason to add a check and I have no such card and can't do real test. In addition, the code in the function liquidio_push_packet() in liquidio/lio_core.c does exactly the same. Based on this, I consider the most acceptable compromise solution to adjust this issue by moving skb_add_rx_frag() into conditional scope. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 1f233f327913 ("liquidio: switchdev support for LiquidIO NIC") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet: hns3: add cond_resched() to hns3 ring buffer init processJie Wang
[ Upstream commit 968fde83841a8c23558dfbd0a0c69d636db52b55 ] Currently hns3 ring buffer init process would hold cpu too long with big Tx/Rx ring depth. This could cause soft lockup. So this patch adds cond_resched() to the process. Then cpu can break to run other tasks instead of busy looping. Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet: hns3: fix kernel crash problem in concurrent scenarioYonglong Liu
[ Upstream commit 12cda920212a49fa22d9e8b9492ac4ea013310a4 ] When link status change, the nic driver need to notify the roce driver to handle this event, but at this time, the roce driver may uninit, then cause kernel crash. To fix the problem, when link status change, need to check whether the roce registered, and when uninit, need to wait link update finish. Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet: sfp: Always call `sfp_sm_mod_remove()` on removeCsókás, Bence
[ Upstream commit e96b2933152fd87b6a41765b2f58b158fde855b6 ] If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will not be run. As a consequence, `sfp_hwmon_remove()` is not getting run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()` itself checks `sfp->sm_mod_state` anyways, so this check was not really needed in the first place. Fixes: d2e816c0293f ("net: sfp: handle module remove outside state machine") Signed-off-by: "Csókás, Bence" <csokas.bence@prolan.hu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysgve: Clear napi->skb before dev_kfree_skb_any()Ziwei Xiao
commit 6f4d93b78ade0a4c2cafd587f7b429ce95abb02e upstream. gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer. Fix this by clearing napi->skb before the skb is freed. Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path") Cc: stable@vger.kernel.org Reported-by: Shailend Chand <shailend@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 dayswifi: ath10k: fix QCOM_RPROC_COMMON dependencyDmitry Baryshkov
[ Upstream commit 21ae74e1bf18331ae5e279bd96304b3630828009 ] If ath10k_snoc is built-in, while Qualcomm remoteprocs are built as modules, compilation fails with: /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_init': drivers/net/wireless/ath/ath10k/snoc.c:1534: undefined reference to `qcom_register_ssr_notifier' /usr/bin/aarch64-linux-gnu-ld: drivers/net/wireless/ath/ath10k/snoc.o: in function `ath10k_modem_deinit': drivers/net/wireless/ath/ath10k/snoc.c:1551: undefined reference to `qcom_unregister_ssr_notifier' Add corresponding dependency to ATH10K_SNOC Kconfig entry so that it's built as module if QCOM_RPROC_COMMON is built as module too. Fixes: 747ff7d3d742 ("ath10k: Don't always treat modem stop events as crashes") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://msgid.link/20240511-ath10k-snoc-dep-v1-1-9666e3af5c27@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet: wwan: iosm: Fix tainted pointer delete is case of region creation failAleksandr Mishin
[ Upstream commit b0c9a26435413b81799047a7be53255640432547 ] In case of region creation fail in ipc_devlink_create_region(), previously created regions delete process starts from tainted pointer which actually holds error code value. Fix this bug by decreasing region index before delete. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 4dcd183fbd67 ("net: wwan: iosm: devlink registration") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240604082500.20769-1-amishin@t-argos.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysice: remove af_xdp_zc_qps bitmapLarysa Zaremba
[ Upstream commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd ] Referenced commit has introduced a bitmap to distinguish between ZC and copy-mode AF_XDP queues, because xsk_get_pool_from_qid() does not do this for us. The bitmap would be especially useful when restoring previous state after rebuild, if only it was not reallocated in the process. This leads to e.g. xdpsock dying after changing number of queues. Instead of preserving the bitmap during the rebuild, remove it completely and distinguish between ZC and copy-mode queues based on the presence of a device associated with the pool. Fixes: e102db780e1c ("ice: track AF_XDP ZC enabled queues in bitmap") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-3-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysice: remove null checks before devm_kfree() callsPrzemek Kitszel
[ Upstream commit ad667d626825383b626ad6ed38d6205618abb115 ] We all know they are redundant. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap") Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysice: Introduce new parameters in ice_sched_nodeMichal Wilczynski
[ Upstream commit 16dfa49406bc5e1f4cbb115027cbd719d7e6c930 ] To support new devlink-rate API ice_sched_node struct needs to store a number of additional parameters. This includes tx_max, tx_share, tx_weight, and tx_priority. Add new fields to ice_sched_node struct. Add new functions to configure the hardware with new parameters. Introduce new xarray to identify nodes uniquely. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: adbf5a42341f ("ice: remove af_xdp_zc_qps bitmap") Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysice: fix iteration of TLVs in Preserved Fields AreaJacob Keller
[ Upstream commit 03e4a092be8ce3de7c1baa7ae14e68b64e3ea644 ] The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value structures in the Preserved Fields Area (PFA) of the NVM. This is used by the driver to access data such as the Part Board Assembly identifier. The function uses simple logic to iterate over the PFA. First, the pointer to the PFA in the NVM is read. Then the total length of the PFA is read from the first word. A pointer to the first TLV is initialized, and a simple loop iterates over each TLV. The pointer is moved forward through the NVM until it exceeds the PFA area. The logic seems sound, but it is missing a key detail. The Preserved Fields Area length includes one additional final word. This is documented in the device data sheet as a dummy word which contains 0xFFFF. All NVMs have this extra word. If the driver tries to scan for a TLV that is not in the PFA, it will read past the size of the PFA. It reads and interprets the last dummy word of the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA as a length. The PFA resides within the Shadow RAM portion of the NVM, which is relatively small. All of its offsets are within a 16-bit size. The PFA pointer and TLV pointer are stored by the driver as 16-bit values. In almost all cases, the word following the PFA will be such that interpreting it as a length will result in 16-bit arithmetic overflow. Once overflowed, the new next_tlv value is now below the maximum offset of the PFA. Thus, the driver will continue to iterate the data as TLVs. In the worst case, the driver hits on a sequence of reads which loop back to reading the same offsets in an endless loop. To fix this, we need to correct the loop iteration check to account for this extra word at the end of the PFA. This alone is sufficient to resolve the known cases of this issue in the field. However, it is plausible that an NVM could be misconfigured or have corrupt data which results in the same kind of overflow. Protect against this by using check_add_overflow when calculating both the maximum offset of the TLVs, and when calculating the next_tlv offset at the end of each loop iteration. This ensures that the driver will not get stuck in an infinite loop when scanning the PFA. Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get") Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-1-e3563aa89b0c@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet/mlx5: Fix tainted pointer delete is case of flow rules creation failAleksandr Mishin
[ Upstream commit 229bedbf62b13af5aba6525ad10b62ad38d9ccb5 ] In case of flow rule creation fail in mlx5_lag_create_port_sel_table(), instead of previously created rules, the tainted pointer is deleted deveral times. Fix this bug by using correct flow rules pointers. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 352899f384d4 ("net/mlx5: Lag, use buckets in hash mode") Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20240604100552.25201-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet/mlx5: Always stop health timer during driver removalShay Drory
[ Upstream commit c8b3f38d2dae0397944814d691a419c451f9906f ] Currently, if teardown_hca fails to execute during driver removal, mlx5 does not stop the health timer. Afterwards, mlx5 continue with driver teardown. This may lead to a UAF bug, which results in page fault Oops[1], since the health timer invokes after resources were freed. Hence, stop the health monitor even if teardown_hca fails. [1] mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: cleanup mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup BUG: unable to handle page fault for address: ffffa26487064230 PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 RIP: 0010:ioread32be+0x34/0x60 RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230 RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8 R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? exc_page_fault+0x175/0x180 ? asm_exc_page_fault+0x26/0x30 ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? ioread32be+0x34/0x60 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] poll_health+0x42/0x230 [mlx5_core] ? __next_timer_interrupt+0xbc/0x110 ? __pfx_poll_health+0x10/0x10 [mlx5_core] call_timer_fn+0x21/0x130 ? __pfx_poll_health+0x10/0x10 [mlx5_core] __run_timers+0x222/0x2c0 run_timer_softirq+0x1d/0x40 __do_softirq+0xc9/0x2c8 __irq_exit_rcu+0xa6/0xc0 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:cpuidle_enter_state+0xcc/0x440 ? cpuidle_enter_state+0xbd/0x440 cpuidle_enter+0x2d/0x40 do_idle+0x20d/0x270 cpu_startup_entry+0x2a/0x30 rest_init+0xd0/0xd0 arch_call_rest_init+0xe/0x30 start_kernel+0x709/0xa90 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x96/0xa0 secondary_startup_64_no_verify+0x18f/0x19b ---[ end trace 0000000000000000 ]--- Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet/mlx5: Split function_setup() to enable and open functionsShay Drory
[ Upstream commit 2059cf51f318681a4cdd3eb1a01a2d62b6a9c442 ] mlx5_cmd_init_hca() is taking ~0.2 seconds. In case of a user who desire to disable some of the SF aux devices, and with large scale-1K SFs for example, this user will waste more than 3 minutes on mlx5_cmd_init_hca() which isn't needed at this stage. Downstream patch will change SFs which are probe over the E-switch, local SFs, to be probed without any aux dev. In order to support this, split function_setup() to avoid executing mlx5_cmd_init_hca(). Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Stable-dep-of: c8b3f38d2dae ("net/mlx5: Always stop health timer during driver removal") Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet/mlx5: Stop waiting for PCI if pci channel is offlineMoshe Shemesh
[ Upstream commit 33afbfcc105a572159750f2ebee834a8a70fdd96 ] In case pci channel becomes offline the driver should not wait for PCI reads during health dump and recovery flow. The driver has timeout for each of these loops trying to read PCI, so it would fail anyway. However, in case of recovery waiting till timeout may cause the pci error_detected() callback fail to meet pci_dpc_recovered() wait timeout. Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysnet/mlx5: Stop waiting for PCI up if teardown was triggeredMoshe Shemesh
[ Upstream commit 8ff38e730c3f5ee717f25365ef8aa4739562d567 ] If driver teardown is called while PCI is turned off, there is a race between health recovery and teardown. If health recovery already started it will wait 60 sec trying to see if PCI gets back and it can recover, but actually there is no need to wait anymore once teardown was called. Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break waiting for PCI up. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20230314054234.267365-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 33afbfcc105a ("net/mlx5: Stop waiting for PCI if pci channel is offline") Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysvxlan: Fix regression when dropping packets due to invalid src addressesDaniel Borkmann
[ Upstream commit 1cd4bc987abb2823836cbb8f887026011ccddc8a ] Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint. Before commit f58f45c1e5b9 vxlan was similarly behaving as geneve in that it passed through whichever macs were set in the L2 header. It turns out that this change in behavior breaks setups, for example, Cilium with netkit in L3 mode for Pods as well as tunnel mode has been passing before the change in f58f45c1e5b9 for both vxlan and geneve. After mentioned change it is only passing for geneve as in case of vxlan packets are dropped due to vxlan_set_mac() returning false as source and destination macs are zero which for E/W traffic via tunnel is totally fine. Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors. Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Bauer <mail@david-bauer.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 daysocteontx2-af: Always allocate PF entries from low prioriy zoneSubbaraya Sundeep
[ Upstream commit 8b0f7410942cdc420c4557eda02bfcdf60ccec17 ] PF mcam entries has to be at low priority always so that VF can install longest prefix match rules at higher priority. This was taken care currently but when priority allocation wrt reference entry is requested then entries are allocated from mid-zone instead of low priority zone. Fix this and always allocate entries from low priority zone for PFs. Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 dayswifi: iwlwifi: mvm: don't read past the mfuart notifcationEmmanuel Grumbach
[ Upstream commit 4bb95f4535489ed830cf9b34b0a891e384d1aee4 ] In case the firmware sends a notification that claims it has more data than it has, we will read past that was allocated for the notification. Remove the print of the buffer, we won't see it by default. If needed, we can see the content with tracing. This was reported by KFENCE. Fixes: bdccdb854f2f ("iwlwifi: mvm: support MFUART dump in case of MFUART assert") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240513132416.ba82a01a559e.Ia91dd20f5e1ca1ad380b95e68aebf2794f553d9b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 dayswifi: iwlwifi: mvm: check n_ssids before accessing the ssidsMiri Korenblit
[ Upstream commit 60d62757df30b74bf397a2847a6db7385c6ee281 ] In some versions of cfg80211, the ssids poinet might be a valid one even though n_ssids is 0. Accessing the pointer in this case will cuase an out-of-bound access. Fix this by checking n_ssids first. Fixes: c1a7515393e4 ("iwlwifi: mvm: add adaptive dwell support") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://msgid.link/20240513132416.6e4d1762bf0d.I5a0e6cc8f02050a766db704d15594c61fe583d45@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 dayswifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdefShahar S Matityahu
[ Upstream commit 87821b67dea87addbc4ab093ba752753b002176a ] The driver should call iwl_dbg_tlv_free even if debugfs is not defined since ini mode does not depend on debugfs ifdef. Fixes: 68f6f492c4fa ("iwlwifi: trans: support loading ini TLVs from external file") Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.c8e3723f55b0.I5e805732b0be31ee6b83c642ec652a34e974ff10@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
9 dayswifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64Johannes Berg
[ Upstream commit 4a7aace2899711592327463c1a29ffee44fcc66e ] We don't actually support >64 even for HE devices, so revert back to 64. This fixes an issue where the session is refused because the queue is configured differently from the actual session later. Fixes: 514c30696fbc ("iwlwifi: add support for IEEE802.11ax") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Liad Kaufman <liad.kaufman@intel.com> Reviewed-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240510170500.52f7b4cf83aa.If47e43adddf7fe250ed7f5571fbb35d8221c7c47@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
13 daysMerge tag 'v6.1.94' into v6.1/standard/baseBruce Ashfield
This is the 6.1.94 stable release # -----BEGIN PGP SIGNATURE----- # # iQIyBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmZuz8EACgkQONu9yGCS # aT4YXw/4jfUDI5QTQiD7h176b8YZQlDG0XqFl2YcfZfk3QlP8kvBOxFreUCORI8P # lzsI0azmlNPbkIQt/B4PZ2g+TWOdty/qsNWX5rWKe1H4/S6rBqgC7J4ppTDn+rGw # MeUiFCSqBz0Twku+EOxGWqVS4gs9+OPll2fqCGSzN8+6Ku4uhmrKvz0sNHgej5ct # B8KqvznVN8sitBquVwoTcW2fUWAt/98Dhh/+0E0Q28P2KhVyqrt5TwkI1rsNXVA3 # ZYHONT4k4ItPMGMw2gyeEo9avlyjIdd4BQ696ePrqFaGWHTUIEbw/ZLw/cb8AbWn # YVv3PcgAZoX10vHZmwqBEklt8cM6B3L55o5gHoO1bSKKqmJQO/a4WTdzgGNFdOLe # hvN+chrt6y3R7naT8Q5xdt5DVa59f+drw4AX2vnK3Siv3u1J7k4Iq6WT8Ij/FSUH # cL5UtQ9/iWPeEg1221WidoCttFySwFfUixg+Fc8sX71PFkraAOvJmV/Eiqo79d3t # VbCY1MWoTkieeu2AGffA3IXWEL6yTdVTw2zfPkd5kKv584yCwlvsV4GhV16ygqJK # e08I3vjiCO9hLjpplrYkFFxE97gqq2IQuwZbxgOhdKfpo+tTTZrS7p00AZRPIpvL # RQK5ogSTkNQr15pjDY1SdqXpOntZxp5RFaPXwLJ28qUByZsUGw== # =nofp # -----END PGP SIGNATURE----- # gpg: Signature made Sun 16 Jun 2024 07:42:57 AM EDT # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
13 daysMerge tag 'v6.1.93' into v6.1/standard/baseBruce Ashfield
This is the 6.1.93 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmZpZN8ACgkQONu9yGCS # aT4/2RAApnfpBVZKYq3SNotcHVSJwMp0lY2L1Hy714QtQ1xCu8c4LgxcytKemBZo # riduGbSV3t3BOHcRyVwJLLSjRMMdESdp6jvieXItHroNJLmoLBhlWg0ifbceCIBe # M+IuNIB0kDs5dVmVc0x/E9h7Zloj6ikLfyaJrwedXbpTTRep6BRjkkUcBs6AqvrL # Ndo2GoTfkQ8VrGsj4aNn+YcbTpaokP2ikaNntTsUc6PfdK7C7k8pEgUQKq2CM7zl # 7p5TQEOjS+lEAisA4Dswz+fhh/Gvke0TStDQzRQRfFzgB3tVb4gxGYmLKOGqidya # ttLa3D6bEJgqg75NNuNg7skgrYjzUs6z3/YlNxrUwpK+fUId/EYyJjIQKDmoQSgr # bBUiT6FWb5b+sJ2zOjlJSCptTaq4m/oT0nXXQtVldPkEWn3biKxSwB9+ya6mzVi7 # dlbONp0OBpTFxKHhwE3V/tYrowqq6Aqfc9GafJRHXFYi6bp+F6Ghha47UqYJ/Vpk # xD31HequQOh7V3HpMQqfiJwWj5ykvGZch6q8FxYI66MLhFfcob3nCvvb80E77q27 # 2nvLEdNOdK26ypwKDJRQRJnudf1lFoFqq5NaY3bJDvGDB2Qe3MbR1ZhWTht73cMO # NNMk4IDUl4zOvpDdkeYe4gG6ncBkKLW3GhFDfM1T6yTIHFPjB0Y= # =9a7Q # -----END PGP SIGNATURE----- # gpg: Signature made Wed 12 Jun 2024 05:05:35 AM EDT # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key