aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox
AgeCommit message (Collapse)Author
2020-07-11net/mlx5e: IPoIB, Drop multicast packets that this interface sentErez Shitrit
commit 8b46d424a743ddfef8056d5167f13ee7ebd1dcad upstream. After enabled loopback packets for IPoIB, we need to drop these packets that this HCA has replicated and came back to the same interface that sent them. Fixes: 4c6c615e3f30 ("net/mlx5e: IPoIB, Add PKEY child interface nic profile") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-07-07net/mlx5: Avoid disabling RoCE when uninitializedMaor Gottlieb
commit 3a6ef5158d4a01a2ddd88c3c978e6de0d10ee759 upstream. Move the check if RoCE steering is initialized to the disable RoCE function, it will ensure that we disable RoCE only if we succeeded in enabling it before. Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-07-07net/mlx5: drain health workqueue in case of driver load errorShay Drory
commit 42ea9f1b5c625fad225d4ac96a7e757dd4199d9c upstream. In case there is a work in the health WQ when we teardown the driver, in driver load error flow, the health work will try to read dev->iseg, which was already unmap in mlx5_pci_close(). Fix it by draining the health workqueue first thing in mlx5_pci_close(). Trace of the error: BUG: unable to handle page fault for address: ffffb5b141c18014 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016 Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core] RIP: 0010:ioread32be+0x30/0x40 Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03 RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292 RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014 RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0 R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20 FS: 0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core] mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core] devlink_health_reporter_recover+0x1c/0x50 devlink_health_report+0xfb/0x240 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core] process_one_work+0x1fb/0x4e0 ? process_one_work+0x16b/0x4e0 worker_thread+0x4f/0x3d0 kthread+0x10d/0x140 ? process_one_work+0x4e0/0x4e0 ? kthread_cancel_delayed_work_sync+0x20/0x20 ret_from_fork+0x1f/0x30 Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi CR2: ffffb5b141c18014 ---[ end trace b12c5503157cad24 ]--- RIP: 0010:ioread32be+0x30/0x40 Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03 RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292 RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014 RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000 R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0 R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20 FS: 0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38 in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2 INFO: lockdep is turned off. CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G D 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016 Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core] Call Trace: dump_stack+0x63/0x88 ___might_sleep+0x10a/0x130 __might_sleep+0x4a/0x80 exit_signals+0x33/0x230 ? blocking_notifier_call_chain+0x16/0x20 do_exit+0xb1/0xc30 ? kthread+0x10d/0x140 ? process_one_work+0x4e0/0x4e0 Fixes: 52c368dc3da7 ("net/mlx5: Move health and page alloc init to mdev_init") Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-24net/mlx5: potential error pointer dereference in error handlingDan Carpenter
commit 6cc070bdf07c8f6d5955d43da0560c9e5fd203b1 upstream. The error handling was a bit flipped around. If the mlx5_create_flow_group() function failed then it would have resulted in dereferencing "fg" when it was an error pointer. Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-24net/mlx5: Fix crash upon suspend/resumeMark Bloch
commit 8fc3e29be9248048f449793502c15af329f35c6e upstream. Currently a Linux system with the mlx5 NIC always crashes upon hibernation - suspend/resume. Add basic callbacks so the NIC could be suspended and resumed. Fixes: 9603b61de1ee ("mlx5: Move pci device handling from mlx5_ib to mlx5_core") Tested-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08mlxsw: spectrum: Fix use-after-free of split/unsplit/type_set in case reload ↵Jiri Pirko
fails commit 4340f42f207eacb81e7a6b6bb1e3b6afad9a2e26 upstream. In case of reload fail, the mlxsw_sp->ports contains a pointer to a freed memory (either by reload_down() or reload_up() error path). Fix this by initializing the pointer to NULL and checking it before dereferencing in split/unsplit/type_set callpaths. Fixes: 24cc68ad6c46 ("mlxsw: core: Add support for reload") Reported-by: Danielle Ratson <danieller@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08net/mlx4_core: fix a memory leak bug.Qiushi Wu
commit febfd9d3c7f74063e8e630b15413ca91b567f963 upstream. In function mlx4_opreq_action(), pointer "mailbox" is not released, when mlx4_cmd_box() return and error, causing a memory leak bug. Fix this issue by going to "out" label, mlx4_free_cmd_mailbox() can free this pointer. Fixes: fe6f700d6cbb ("net/mlx4_core: Respond to operation request by firmware") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08net/mlx5: Fix error flow in case of function_setup failureShay Drory
commit 4f7400d5cbaef676e00cdffb0565bf731c6bb09e upstream. Currently, if an error occurred during mlx5_function_setup(), we keep dev->state as DEVICE_STATE_UP. Fixing it by adding a goto label. Fixes: e161105e58da ("net/mlx5: Function setup/teardown procedures") Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08net/mlx5e: Update netdev txq on completions during closureMoshe Shemesh
commit 5e911e2c06bd8c17df29147a5e2d4b17fafda024 upstream. On sq closure when we free its descriptors, we should also update netdev txq on completions which would not arrive. Otherwise if we reopen sqs and attach them back, for example on fw fatal recovery flow, we may get tx timeout. Fixes: 29429f3300a3 ("net/mlx5e: Timeout if SQ doesn't flush during close") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08net/mlx5: Fix memory leak in mlx5_events_initMoshe Shemesh
commit df14ad1eccb04a4a28c90389214dbacab085b244 upstream. Fix memory leak in mlx5_events_init(), in case create_single_thread_workqueue() fails, events struct should be freed. Fixes: 5d3c537f9070 ("net/mlx5: Handle event of power detection in the PCIE slot") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08net/mlx5e: Fix inner tirs handlingRoi Dayan
commit a16b8e0dcf7043bee46174bed0553cc9e36b63a5 upstream. In the cited commit inner_tirs argument was added to create and destroy inner tirs, and no indication was added to mlx5e_modify_tirs_hash() function. In order to have a consistent handling, use inner_indir_tir[0].tirn in tirs destroy/modify function as an indication to whether inner tirs are created. Inner tirs are not created for representors and before this commit, a call to mlx5e_modify_tirs_hash() was sending HW commands to modify non-existent inner tirs. Fixes: 46dc933cee82 ("net/mlx5e: Provide explicit directive if to create inner indirect tirs") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-08net/mlx5: Add command entry handling completionMoshe Shemesh
commit 17d00e839d3b592da9659c1977d45f85b77f986a upstream. When FW response to commands is very slow and all command entries in use are waiting for completion we can have a race where commands can get timeout before they get out of the queue and handled. Timeout completion on uninitialized command will cause releasing command's buffers before accessing it for initialization and then we will get NULL pointer exception while trying access it. It may also cause releasing buffers of another command since we may have timeout completion before even allocating entry index for this command. Add entry handling completion to avoid this race. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04net/mlx5: Fix command entry leak in Internal Error StateMoshe Shemesh
commit cece6f432cca9f18900463ed01b97a152a03600a upstream. Processing commands by cmd_work_handler() while already in Internal Error State will result in entry leak, since the handler process force completion without doorbell. Forced completion doesn't release the entry and event completion will never arrive, so entry should be released. Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04net/mlx5: Fix forced completion access non initialized command entryMoshe Shemesh
commit f3cb3cebe26ed4c8036adbd9448b372129d3c371 upstream. mlx5_cmd_flush() will trigger forced completions to all valid command entries. Triggered by an asynch event such as fast teardown it can happen at any stage of the command, including command initialization. It will trigger forced completion and that can lead to completion on an uninitialized command entry. Setting MLX5_CMD_ENT_STATE_PENDING_COMP only after command entry is initialized will ensure force completion is treated only if command entry is initialized. Fixes: 73dd3a4839c1 ("net/mlx5: Avoid using pending command interface slots") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04net/mlx4_core: Fix use of ENOSPC around mlx4_counter_alloc()Tariq Toukan
commit 40e473071dbad04316ddc3613c3a3d1c75458299 upstream. When ENOSPC is set the idx is still valid and gets set to the global MLX4_SINK_COUNTER_INDEX. However gcc's static analysis cannot tell that ENOSPC is impossible from mlx4_cmd_imm() and gives this warning: drivers/net/ethernet/mellanox/mlx4/main.c:2552:28: warning: 'idx' may be used uninitialized in this function [-Wmaybe-uninitialized] 2552 | priv->def_counter[port] = idx; Also, when ENOSPC is returned mlx4_allocate_default_counters should not fail. Fixes: 6de5f7f6a1fa ("net/mlx4_core: Allocate default counter per port") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04mlxsw: spectrum_acl_tcam: Position vchunk in a vregion list properlyJiri Pirko
commit 6ef4889fc0b3aa6ab928e7565935ac6f762cee6e upstream. Vregion helpers to get min and max priority depend on the correct ordering of vchunks in the vregion list. However, the current code always adds new chunk to the end of the list, no matter what the priority is. Fix this by finding the correct place in the list and put vchunk there. Fixes: 22a677661f56 ("mlxsw: spectrum: Introduce ACL core with simple TCAM implementation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04net/mlx5: Fix failing fw tracer allocation on s390Niklas Schnelle
commit a019b36123aec9700b21ae0724710f62928a8bc1 upstream. On s390 FORCE_MAX_ZONEORDER is 9 instead of 11, thus a larger kzalloc() allocation as done for the firmware tracer will always fail. Looking at mlx5_fw_tracer_save_trace(), it is actually the driver itself that copies the debug data into the trace array and there is no need for the allocation to be contiguous in physical memory. We can therefor use kvzalloc() instead of kzalloc() and get rid of the large contiguous allcoation. Fixes: f53aaa31cce7 ("net/mlx5: FW tracer, implement tracer logic") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04net/mlx5e: Get the latest values from counters in switchdev modeZhu Yanjun
commit dcdf4ce0ff4ba206fc362e149c8ae81d6a2f849c upstream. In the switchdev mode, when running "cat /sys/class/net/NIC/statistics/tx_packets", the ppcnt register is accessed to get the latest values. But currently this command can not get the correct values from ppcnt. From firmware manual, before getting the 802_3 counters, the 802_3 data layout should be set to the ppcnt register. When the command "cat /sys/class/net/NIC/statistics/tx_packets" is run, before updating 802_3 data layout with ppcnt register, the monitor counters are tested. The test result will decide the 802_3 data layout is updated or not. Actually the monitor counters do not support to monitor rx/tx stats of 802_3 in switchdev mode. So the rx/tx counters change will not trigger monitor counters. So the 802_3 data layout will not be updated in ppcnt register. Finally this command can not get the latest values from ppcnt register with 802_3 data layout. Fixes: 5c7e8bbb0257 ("net/mlx5e: Use monitor counters for update stats") Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04net/mlx4_en: avoid indirect call in TX completionEric Dumazet
commit 310660a14b74c380b0ef5c12b66933d6a3d1b59f upstream. Commit 9ecc2d86171a ("net/mlx4_en: add xdp forwarding and data write support") brought another indirect call in fast path. Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call when/if CONFIG_RETPOLINE=y Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-04mlxsw: Fix some IS_ERR() vs NULL bugsDan Carpenter
commit c391eb8366ae052d571bb2841f1ccb4d39f3ceb8 upstream. The mlxsw_sp_acl_rulei_create() function is supposed to return an error pointer from mlxsw_afa_block_create(). The problem is that these functions both return NULL instead of error pointers. Half the callers expect NULL and half expect error pointers so it could lead to a NULL dereference on failure. This patch changes both of them to return error pointers and changes all the callers which checked for NULL to check for IS_ERR() instead. Fixes: 4cda7d8d7098 ("mlxsw: core: Introduce flexible actions support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-01net/mlx5e: Enforce setting of a single FEC modeAya Levin
commit 4bd9d5070b92da012f2715cf8e4859acb78b8f35 upstream. Ethtool command allow setting of several FEC modes in a single set command. The driver can only set a single FEC mode at a time. With this patch driver will reply not-supported on setting several FEC modes. Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-01mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_VLAN_MANGLEPetr Machata
commit ccfc569347f870830e7c7cf854679a06cf9c45b5 upstream. The handler for FLOW_ACTION_VLAN_MANGLE ends by returning whatever the lower-level function that it calls returns. If there are more actions lined up after this action, those are never offloaded. Fix by only bailing out when the called function returns an error. Fixes: a150201a70da ("mlxsw: spectrum: Add support for vlan modify TC action") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-21mlxsw: spectrum_mr: Fix list iteration in error pathIdo Schimmel
commit f6bf1bafdc2152bb22aff3a4e947f2441a1d49e2 upstream. list_for_each_entry_from_reverse() iterates backwards over the list from the current position, but in the error path we should start from the previous position. Fix this by using list_for_each_entry_continue_reverse() instead. This suppresses the following error from coccinelle: drivers/net/ethernet/mellanox/mlxsw//spectrum_mr.c:655:34-38: ERROR: invalid reference to the index variable of the iterator on line 636 Fixes: c011ec1bbfd6 ("mlxsw: spectrum: Add the multicast routing offloading logic") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-15net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepaHuy Nguyen
commit 3d9c5e023a0dbf3e117bb416cfefd9405bf5af0c upstream. rtnl_bridge_getlink is protected by rcu lock, so mlx5_eswitch_get_vepa cannot take mutex lock. Two possible issues can happen: 1. User at the same time change vepa mode via RTM_SETLINK command. 2. User at the same time change the switchdev mode via devlink netlink interface. Case 1 cannot happen because rtnl executes one message in order. Case 2 can happen but we do not expect user to change the switchdev mode when changing vepa. Even if a user does it, so he will read a value which is no longer valid. Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-15mlxsw: spectrum_dpipe: Add missing error pathIdo Schimmel
commit 3a99cbb6fa7bca1995586ec2dc21b0368aad4937 upstream. In case devlink_dpipe_entry_ctx_prepare() failed, release RTNL that was previously taken and free the memory allocated by mlxsw_sp_erif_entry_prepare(). Fixes: 2ba5999f009d ("mlxsw: spectrum: Add Support for erif table entries access") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-15mlx5: work around high stack usage with gccArnd Bergmann
commit 42ae1a5c76691928ed217c7e40269db27f5225e9 upstream. In some configurations, gcc tries too hard to optimize this code: drivers/net/ethernet/mellanox/mlx5/core/en_stats.c: In function 'mlx5e_grp_sw_update_stats': drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:302:1: error: the frame size of 1336 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] As was stated in the bug report, the reason is that gcc runs into a corner case in the register allocator that is rather hard to fix in a good way. As there is an easy way to work around it, just add a comment and the barrier that stops gcc from trying to overoptimize the function. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92657 Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-04net/mlx5: Fix deadlock in fs_coreMaor Gottlieb
commit c1948390d78b5183ee9b7dd831efd7f6ac496ab0 upstream. free_match_list could be called when the flow table is already locked. We need to pass this notation to tree_put_node. It fixes the following lockdep warnning: [ 1797.268537] ============================================ [ 1797.276837] WARNING: possible recursive locking detected [ 1797.285101] 5.5.0-rc5+ #10 Not tainted [ 1797.291641] -------------------------------------------- [ 1797.299917] handler10/9296 is trying to acquire lock: [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at: tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.319694] [ 1797.319694] but task is already holding lock: [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.344707] [ 1797.344707] other info that might help us debug this: [ 1797.356952] Possible unsafe locking scenario: [ 1797.356952] [ 1797.368333] CPU0 [ 1797.373357] ---- [ 1797.378364] lock(&node->lock); [ 1797.384222] lock(&node->lock); [ 1797.390031] [ 1797.390031] *** DEADLOCK *** [ 1797.390031] [ 1797.403003] May be due to missing lock nesting notation [ 1797.403003] [ 1797.414691] 3 locks held by handler10/9296: [ 1797.421465] #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at: tc_setup_cb_add+0x70/0x250 [ 1797.432810] #1: ffff88a030081490 (&comp->sem){++++}, at: mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core] [ 1797.445829] #2: ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.459913] [ 1797.459913] stack backtrace: [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not tainted 5.5.0-rc5+ #10 [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 [ 1797.491480] Call Trace: [ 1797.496701] dump_stack+0x96/0xe0 [ 1797.502864] __lock_acquire.cold.63+0xf8/0x212 [ 1797.510301] ? lockdep_hardirqs_on+0x250/0x250 [ 1797.517701] ? mark_held_locks+0x55/0xa0 [ 1797.524547] ? quarantine_put+0xb7/0x160 [ 1797.531422] ? lockdep_hardirqs_on+0x17d/0x250 [ 1797.538913] lock_acquire+0xd6/0x1f0 [ 1797.545529] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.553701] down_write+0x94/0x140 [ 1797.560206] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.568464] ? down_write_killable_nested+0x170/0x170 [ 1797.576925] ? del_hw_flow_group+0xde/0x1f0 [mlx5_core] [ 1797.585629] tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.593891] ? free_match_list.part.25+0x147/0x170 [mlx5_core] [ 1797.603389] free_match_list.part.25+0xe0/0x170 [mlx5_core] [ 1797.612654] _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core] [ 1797.621838] ? lock_acquire+0xd6/0x1f0 [ 1797.629028] ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core] [ 1797.637981] ? alloc_insert_flow_group+0x420/0x420 [mlx5_core] [ 1797.647459] ? try_to_wake_up+0x4c7/0xc70 [ 1797.654881] ? lock_downgrade+0x350/0x350 [ 1797.662271] ? __mutex_unlock_slowpath+0xb1/0x3f0 [ 1797.670396] ? find_held_lock+0xac/0xd0 [ 1797.677540] ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.686467] mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.695134] ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core] [ 1797.704270] ? irq_exit+0xa5/0x170 [ 1797.710764] ? retint_kernel+0x10/0x10 [ 1797.717698] ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230 [mlx5_core] [ 1797.728708] mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core] [ 1797.738713] ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core] [ 1797.748384] ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core] [ 1797.757400] mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core] [ 1797.767665] mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core] [ 1797.776886] ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core] [ 1797.785562] ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core] [ 1797.795353] __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core] [ 1797.804558] ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0 [mlx5_core] [ 1797.815093] ? wait_for_completion+0x260/0x260 [ 1797.823272] mlx5e_configure_flower+0xe94/0x1620 [mlx5_core] [ 1797.832792] ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core] [ 1797.842096] ? down_read+0x11a/0x2e0 [ 1797.849090] ? down_write+0x140/0x140 [ 1797.856142] ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core] [ 1797.866027] tc_setup_cb_add+0x11a/0x250 [ 1797.873339] fl_hw_replace_filter+0x25e/0x320 [cls_flower] [ 1797.882385] ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower] [ 1797.891607] fl_change+0x1d54/0x1fb6 [cls_flower] [ 1797.899772] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.910728] ? lock_downgrade+0x350/0x350 [ 1797.918187] ? __radix_tree_lookup+0xa5/0x130 [ 1797.926046] ? fl_set_key+0x1590/0x1590 [cls_flower] [ 1797.934611] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.945673] tc_new_tfilter+0xcd1/0x1240 [ 1797.953138] ? tc_del_tfilter+0xb10/0xb10 [ 1797.960688] ? avc_has_perm_noaudit+0x92/0x320 [ 1797.968721] ? avc_has_perm_noaudit+0x1df/0x320 [ 1797.976816] ? avc_has_extended_perms+0x990/0x990 [ 1797.985090] ? mark_lock+0xaa/0x9e0 [ 1797.991988] ? match_held_lock+0x1b/0x240 [ 1797.999457] ? match_held_lock+0x1b/0x240 [ 1798.006859] ? find_held_lock+0xac/0xd0 [ 1798.014045] ? symbol_put_addr+0x40/0x40 [ 1798.021317] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.029460] ? tc_del_tfilter+0xb10/0xb10 [ 1798.036810] rtnetlink_rcv_msg+0x4d5/0x620 [ 1798.044236] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.052034] ? lockdep_hardirqs_on+0x250/0x250 [ 1798.059837] ? match_held_lock+0x1b/0x240 [ 1798.067146] ? find_held_lock+0xac/0xd0 [ 1798.074246] netlink_rcv_skb+0xc6/0x1f0 [ 1798.081339] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.089104] ? netlink_ack+0x440/0x440 [ 1798.096061] netlink_unicast+0x2d4/0x3b0 [ 1798.103189] ? netlink_attachskb+0x3f0/0x3f0 [ 1798.110724] ? _copy_from_iter_full+0xda/0x370 [ 1798.118415] netlink_sendmsg+0x3ba/0x6a0 [ 1798.125478] ? netlink_unicast+0x3b0/0x3b0 [ 1798.132705] ? netlink_unicast+0x3b0/0x3b0 [ 1798.139880] sock_sendmsg+0x94/0xa0 [ 1798.146332] ____sys_sendmsg+0x36c/0x3f0 [ 1798.153251] ? copy_msghdr_from_user+0x165/0x230 [ 1798.160941] ? kernel_sendmsg+0x30/0x30 [ 1798.167738] ___sys_sendmsg+0xeb/0x150 [ 1798.174411] ? sendmsg_copy_msghdr+0x30/0x30 [ 1798.181649] ? lock_downgrade+0x350/0x350 [ 1798.188559] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.196239] ? __fget+0x21d/0x320 [ 1798.202335] ? do_dup2+0x2a0/0x2a0 [ 1798.208499] ? lock_downgrade+0x350/0x350 [ 1798.215366] ? __fget_light+0xd6/0xf0 [ 1798.221808] ? syscall_trace_enter+0x369/0x5d0 [ 1798.229112] __sys_sendmsg+0xd3/0x160 [ 1798.235511] ? __sys_sendmsg_sock+0x60/0x60 [ 1798.242478] ? syscall_trace_enter+0x233/0x5d0 [ 1798.249721] ? syscall_slow_exit_work+0x280/0x280 [ 1798.257211] ? do_syscall_64+0x1e/0x2e0 [ 1798.263680] do_syscall_64+0x72/0x2e0 [ 1798.269950] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-04net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctxRaed Salem
commit 08db2cf577487f5123aebcc2f913e0b8a2c14b43 upstream. SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx, however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function nullifies sa_ctx pointer without freeing the memory allocated, hence the memory leak. Fix by free SA context when the SA is released. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-05-04net/mlx5: IPsec, Fix esp modify function attributeRaed Salem
commit 0dc2c534f17c05bed0622b37a744bc38b48ca88a upstream. The function mlx5_fpga_esp_validate_xfrm_attrs is wrongly used with negative negation as zero value indicates success but it used as failure return value instead. Fix by remove the unary not negation operator. Fixes: 05564d0ae075 ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-16mlxsw: minimal: Fix an error handling path in 'mlxsw_m_port_create()'Christophe JAILLET
commit 6dd4b4f3936e17fedea1308bc70e9716f68bf232 upstream. An 'alloc_etherdev()' called is not ballanced by a corresponding 'free_netdev()' call in one error handling path. Slighly reorder the error handling code to catch the missed case. Fixes: c100e47caa8e ("mlxsw: minimal: Add ethtool support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-16mlxsw: switchx2: Do not modify cloned SKBs during xmitIdo Schimmel
commit 63963d0f9d17be83d0e419e03282847ecc2c3715 upstream. The driver needs to prepend a Tx header to each packet it is transmitting. The header includes information such as the egress port and traffic class. The addition of the header requires the driver to modify the SKB's header and therefore it must not be shared. Otherwise, we risk hitting various race conditions. For example, when a packet is flooded (cloned) by the bridge driver to two switch ports swp1 and swp2: t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with swp1's port number t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with swp2's port number, overwriting swp1's port number t2 - The device processes data buffer from t0. Packet is transmitted via swp2 t3 - The device processes data buffer from t1. Packet is transmitted via swp2 Usually, the device is fast enough and transmits the packet before its Tx header is overwritten, but this is not the case in emulated environments. Fix this by making sure the SKB's header is writable by calling skb_cow_head(). Since the function ensures we have headroom to push the Tx header, the check further in the function can be removed. v2: * Use skb_cow_head() instead of skb_unshare() as suggested by Jakub * Remove unnecessary check regarding headroom Fixes: 31557f0f9755 ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-16net/mlx5: Update the list of the PCI supported devicesMeir Lichtinger
commit 505a7f5478062c6cd11e22022d9f1bf64cd8eab3 upstream. Add the upcoming ConnectX-7 device ID. Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-16net/mlx5: Fix lowest FDB pool sizePaul Blakey
commit 93b8a7ecb7287cc9b0196f12a25b57c2462d11dc upstream. The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-16mlxsw: spectrum_acl: Fix use-after-free during reloadIdo Schimmel
commit 971de2e572118c1128bff295341e37b6c8b8f108 upstream. During reload (or module unload), the router block is de-initialized. Among other things, this results in the removal of a default multicast route from each active virtual router (VRF). These default routes are configured during initialization to trap packets to the CPU. In Spectrum-2, unlike Spectrum-1, multicast routes are implemented using ACL rules. Since the router block is de-initialized before the ACL block, it is possible that the ACL rules corresponding to the default routes are deleted while being accessed by the ACL delayed work that queries rules' activity from the device. This can result in a rare use-after-free [1]. Fix this by protecting the rules list accessed by the delayed work with a lock. We cannot use a spinlock as the activity read operation is blocking. [1] [ 123.331662] ================================================================== [ 123.339920] BUG: KASAN: use-after-free in mlxsw_sp_acl_rule_activity_update_work+0x330/0x3b0 [ 123.349381] Read of size 8 at addr ffff8881f3bb4520 by task kworker/0:2/78 [ 123.357080] [ 123.358773] CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 5.5.0-rc5-custom-33108-gf5df95d3ef41 #2209 [ 123.368898] Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018 [ 123.378456] Workqueue: mlxsw_core mlxsw_sp_acl_rule_activity_update_work [ 123.385970] Call Trace: [ 123.388734] dump_stack+0xc6/0x11e [ 123.392568] print_address_description.constprop.4+0x21/0x340 [ 123.403236] __kasan_report.cold.8+0x76/0xb1 [ 123.414884] kasan_report+0xe/0x20 [ 123.418716] mlxsw_sp_acl_rule_activity_update_work+0x330/0x3b0 [ 123.444034] process_one_work+0xb06/0x19a0 [ 123.453731] worker_thread+0x91/0xe90 [ 123.467348] kthread+0x348/0x410 [ 123.476847] ret_from_fork+0x24/0x30 [ 123.480863] [ 123.482545] Allocated by task 73: [ 123.486273] save_stack+0x19/0x80 [ 123.490000] __kasan_kmalloc.constprop.6+0xc1/0xd0 [ 123.495379] mlxsw_sp_acl_rule_create+0xa7/0x230 [ 123.500566] mlxsw_sp2_mr_tcam_route_create+0xf6/0x3e0 [ 123.506334] mlxsw_sp_mr_tcam_route_create+0x5b4/0x820 [ 123.512102] mlxsw_sp_mr_table_create+0x3b5/0x690 [ 123.517389] mlxsw_sp_vr_get+0x289/0x4d0 [ 123.521797] mlxsw_sp_fib_node_get+0xa2/0x990 [ 123.526692] mlxsw_sp_router_fib4_event_work+0x54c/0x2d60 [ 123.532752] process_one_work+0xb06/0x19a0 [ 123.537352] worker_thread+0x91/0xe90 [ 123.541471] kthread+0x348/0x410 [ 123.545103] ret_from_fork+0x24/0x30 [ 123.549113] [ 123.550795] Freed by task 518: [ 123.554231] save_stack+0x19/0x80 [ 123.557958] __kasan_slab_free+0x125/0x170 [ 123.562556] kfree+0xd7/0x3a0 [ 123.565895] mlxsw_sp_acl_rule_destroy+0x63/0xd0 [ 123.571081] mlxsw_sp2_mr_tcam_route_destroy+0xd5/0x130 [ 123.576946] mlxsw_sp_mr_tcam_route_destroy+0xba/0x260 [ 123.582714] mlxsw_sp_mr_table_destroy+0x1ab/0x290 [ 123.588091] mlxsw_sp_vr_put+0x1db/0x350 [ 123.592496] mlxsw_sp_fib_node_put+0x298/0x4c0 [ 123.597486] mlxsw_sp_vr_fib_flush+0x15b/0x360 [ 123.602476] mlxsw_sp_router_fib_flush+0xba/0x470 [ 123.607756] mlxsw_sp_vrs_fini+0xaa/0x120 [ 123.612260] mlxsw_sp_router_fini+0x137/0x384 [ 123.617152] mlxsw_sp_fini+0x30a/0x4a0 [ 123.621374] mlxsw_core_bus_device_unregister+0x159/0x600 [ 123.627435] mlxsw_devlink_core_bus_device_reload_down+0x7e/0xb0 [ 123.634176] devlink_reload+0xb4/0x380 [ 123.638391] devlink_nl_cmd_reload+0x610/0x700 [ 123.643382] genl_rcv_msg+0x6a8/0xdc0 [ 123.647497] netlink_rcv_skb+0x134/0x3a0 [ 123.651904] genl_rcv+0x29/0x40 [ 123.655436] netlink_unicast+0x4d4/0x700 [ 123.659843] netlink_sendmsg+0x7c0/0xc70 [ 123.664251] __sys_sendto+0x265/0x3c0 [ 123.668367] __x64_sys_sendto+0xe2/0x1b0 [ 123.672773] do_syscall_64+0xa0/0x530 [ 123.676892] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 123.682552] [ 123.684238] The buggy address belongs to the object at ffff8881f3bb4500 [ 123.684238] which belongs to the cache kmalloc-128 of size 128 [ 123.698261] The buggy address is located 32 bytes inside of [ 123.698261] 128-byte region [ffff8881f3bb4500, ffff8881f3bb4580) [ 123.711303] The buggy address belongs to the page: [ 123.716682] page:ffffea0007ceed00 refcount:1 mapcount:0 mapping:ffff888236403500 index:0x0 [ 123.725958] raw: 0200000000000200 dead000000000100 dead000000000122 ffff888236403500 [ 123.734646] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 [ 123.743315] page dumped because: kasan: bad access detected [ 123.749562] [ 123.751241] Memory state around the buggy address: [ 123.756620] ffff8881f3bb4400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 123.764716] ffff8881f3bb4480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 123.772812] >ffff8881f3bb4500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 123.780904] ^ [ 123.785697] ffff8881f3bb4580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 123.793793] ffff8881f3bb4600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 123.801883] ================================================================== Fixes: cf7221a4f5a5 ("mlxsw: spectrum_router: Add Multicast routing support for Spectrum-2") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-12mlxsw: spectrum_qdisc: Include MC TCs in Qdisc countersPetr Machata
commit 85005b82e59fa7bb7388b12594ab2067bf73d66c upstream. mlxsw configures Spectrum in such a way that BUM traffic is passed not through its nominal traffic class TC, but through its MC counterpart TC+8. However, when collecting statistics, Qdiscs only look at the nominal TC and ignore the MC TC. Add two helpers to compute the value for logical TC from the constituents, one for backlog, the other for tail drops. Use them throughout instead of going through the xstats pointer directly. Counters for TX bytes and packets are deduced from packet priority counters, and therefore already include BUM traffic. wred_drop counter is irrelevant on MC TCs, because RED is not enabled on them. Fixes: 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-12mlxsw: spectrum: Wipe xstats.backlog of down portsPetr Machata
commit ca7609ff3680c51d6c29897f3117aa2ad904f92a upstream. Per-port counter cache used by Qdiscs is updated periodically, unless the port is down. The fact that the cache is not updated for down ports is no problem for most counters, which are relative in nature. However, backlog is absolute in nature, and if there is a non-zero value in the cache around the time that the port goes down, that value just stays there. This value then leaks to offloaded Qdiscs that report non-zero backlog even if there (obviously) is no traffic. The HW does not keep backlog of a downed port, so do likewise: as the port goes down, wipe the backlog value from xstats. Fixes: 075ab8adaf4e ("mlxsw: spectrum: Collect tclass related stats periodically") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-12mlxsw: spectrum: Do not modify cloned SKBs during xmitIdo Schimmel
commit 2da51ce75d86ab1f7770ac1391a9a1697ddaa60c upstream. The driver needs to prepend a Tx header to each packet it is transmitting. The header includes information such as the egress port and traffic class. The addition of the header requires the driver to modify the SKB's header and therefore it must not be shared. Otherwise, we risk hitting various race conditions. For example, when a packet is flooded (cloned) by the bridge driver to two switch ports swp1 and swp2: t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with swp1's port number t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with swp2's port number, overwriting swp1's port number t2 - The device processes data buffer from t0. Packet is transmitted via swp2 t3 - The device processes data buffer from t1. Packet is transmitted via swp2 Usually, the device is fast enough and transmits the packet before its Tx header is overwritten, but this is not the case in emulated environments. Fix this by making sure the SKB's header is writable by calling skb_cow_head(). Since the function ensures we have headroom to push the Tx header, the check further in the function can be removed. v2: * Use skb_cow_head() instead of skb_unshare() as suggested by Jakub * Remove unnecessary check regarding headroom Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-12mlxsw: core: Add support for using SKB control bufferPetr Machata
commit d7cd206dbfb25efc5f06ea3c595074a51d48d00a upstream. The SKB control buffer is useful (and used) for bookkeeping of information related to that SKB. Add helpers so that the mlxsw driver(s) can safely use the buffer as well. The structure is currently empty, individual users will add members to it as necessary. Note that SKB allocation functions already clear the buffer, so the cleanup is only necessary when ndo_start_xmit is called. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-12net/mlx5e: Initialize on stack link modes bitmapAya Levin
commit 926b37f76fb0a22fe93c8873c819fd167180e85c upstream. Initialize link modes bitmap on stack before using it, otherwise the outcome of ethtool set link ksettings might have unexpected values. Fixes: 4b95840a6ced ("net/mlx5e: Fix matching of speed to PRM link modes") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-04-12net/mlx5e: ethtool, Fix analysis of speed settingAya Levin
commit 3d7cadae51f1b7f28358e36d0a1ce3f0ae2eee60 upstream. When setting speed to 100G via ethtool (AN is set to off), only 25G*4 is configured while the user, who has an advanced HW which supports extended PTYS, expects also 50G*2 to be configured. With this patch, when extended PTYS mode is available, configure PTYS via extended fields. Fixes: 4b95840a6ced ("net/mlx5e: Fix matching of speed to PRM link modes") Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-03-20net/mlx5e: Fix hairpin RSS table sizeEli Cohen
commit 6412bb396a63f28de994b1480edf8e4caf4aa494 upstream. Set hairpin table size to the corret size, based on the groups that would be created in it. Groups are laid out on the table such that a group occupies a range of entries in the table. This implies that the group ranges should have correspondence to the table they are laid upon. The patch cited below made group 1's size to grow hence causing overflow of group range laid on the table. Fixes: a795d8db2a6d ("net/mlx5e: Support RSS for IP-in-IP and IPv6 tunneled packets") Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-03-20mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFOPetr Machata
commit 3971a535b839489e4ea31796cc086e6ce616318c upstream. The following patch will change PRIO to replace a removed Qdisc with an invisible FIFO, instead of NOOP. mlxsw will see this replacement due to the graft message that is generated. But because FIFO does not issue its own REPLACE message, when the graft operation takes place, the Qdisc that mlxsw tracks under the indicated band is still the old one. The child handle (0:0) therefore does not match, and mlxsw rejects the graft operation, which leads to an extack message: Warning: Offloading graft operation failed. Fix by ignoring the invisible children in the PRIO graft handler. The DESTROY message of the removed Qdisc is going to follow shortly and handle the removal. Fixes: 32dc5efc6cb4 ("mlxsw: spectrum: qdiscs: prio: Handle graft command") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-03-07mlxsw: spectrum_router: Skip loopback RIFs during MAC validationAmit Cohen
commit 314bd842d98e1035cc40b671a71e07f48420e58f upstream. When a router interface (RIF) is created the MAC address of the backing netdev is verified to have the same MSBs as existing RIFs. This is required in order to avoid changing existing RIF MAC addresses that all share the same MSBs. Loopback RIFs are special in this regard as they do not have a MAC address, given they are only used to loop packets from the overlay to the underlay. Without this change, an error is returned when trying to create a RIF after the creation of a GRE tunnel that is represented by a loopback RIF. 'rif->dev->dev_addr' points to the GRE device's local IP, which does not share the same MSBs as physical interfaces. Adding an IP address to any physical interface results in: Error: mlxsw_spectrum: All router interface MAC addresses must have the same prefix. Fix this by skipping loopback RIFs during MAC validation. Fixes: 74bc99397438 ("mlxsw: spectrum_router: Veto unsupported RIF MAC addresses") Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-03-07net/mlxfw: Fix out-of-memory error in mfa2 flash burningVladyslav Tarasiuk
commit a5bcd72e054aabb93ddc51ed8cde36a5bfc50271 upstream. The burning process requires to perform internal allocations of large chunks of memory. This memory doesn't need to be contiguous and can be safely allocated by vzalloc() instead of kzalloc(). This patch changes such allocation to avoid possible out-of-memory failure. Fixes: 410ed13cae39 ("Add the mlxfw module for Mellanox firmware flash process") Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-02-24net/mlx5e: Verify that rule has at least one fwd/drop actionVlad Buslov
commit ae2741e2b6ce2bf1b656b1152c4ef147ff35b096 upstream. Currently, mlx5 tc layer doesn't verify that rule has at least one forward or drop action which leads to following firmware syndrome when user tries to offload such action: [ 1824.860501] mlx5_core 0000:81:00.0: mlx5_cmd_check:753:(pid 29458): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x144b7a) Add check at the end of parse_tc_fdb_actions() that verifies that resulting attribute has action fwd or drop flag set. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookupSabrina Dubroca
commit 6c8991f41546c3c472503dff1ea9daaddf9331c2 upstream. ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net/mlx5e: Query global pause state before setting prio2bufferHuy Nguyen
commit 73e6551699a32fac703ceea09214d6580edcf2d5 upstream. When the user changes prio2buffer mapping while global pause is enabled, mlx5 driver incorrectly sets all active buffers (buffer that has at least one priority mapped) to lossy. Solution: If global pause is enabled, set all the active buffers to lossless in prio2buffer command. Also, add error message when buffer size is not enough to meet xoff threshold. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net/mlx5e: Fix eswitch debug print of max fdb flowRoi Dayan
commit f382b0df6946d48fae80a2201ccff43b41382099 upstream. The value is already the calculation so remove the log prefix. Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net/mlx5: Update the list of the PCI supported devicesShani Shapp
commit b7eca940322f47fd30dafb70da04d193a0154090 upstream. Add the upcoming ConnectX-6 LX device ID. Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Shani Shapp <shanish@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net/mlx5e: Do not use non-EXT link modes in EXT modeEran Ben Elisha
commit 24960574505c49b102ca1dfa6bf109669bca2a66 upstream. On some old Firmwares, connector type value was not supported, and value read from FW was 0. For those, driver used link mode in order to set connector type in link_ksetting. After FW exposed the connector type, driver translated the value to ethtool definitions. However, as 0 is a valid value, before returning PORT_OTHER, driver run the check of link mode in order to maintain backward compatibility. Cited patch added support to EXT mode. With both features (connector type and EXT link modes) ,if connector_type read from FW is 0 and EXT mode is set, driver mistakenly compare EXT link modes to non-EXT link mode. Fixed that by skipping this comparison if we are in EXT mode, as connector type value is valid in this scenario. Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>