aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw
AgeCommit message (Collapse)Author
2021-12-29IB/qib: Fix memory leak in qib_user_sdma_queue_pkts()José Expósito
[ Upstream commit bee90911e0138c76ee67458ac0d58b38a3190f65 ] The wrong goto label was used for the error case and missed cleanup of the pkt allocation. Fixes: d39bf40e55e6 ("IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields") Link: https://lore.kernel.org/r/20211208175238.29983-1-jose.exposito89@gmail.com Addresses-Coverity-ID: 1493352 ("Resource leak") Signed-off-by: José Expósito <jose.exposito89@gmail.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-14IB/hfi1: Correct guard on eager buffer deallocationMike Marciniszyn
commit 9292f8f9a2ac42eb320bced7153aa2e63d8cc13a upstream. The code tests the dma address which legitimately can be 0. The code should test the kernel logical address to avoid leaking eager buffer allocations that happen to map to a dma address of 0. Fixes: 60368186fd85 ("IB/hfi1: Fix user-space buffers mapping with IOMMU enabled") Link: https://lore.kernel.org/r/20211129191952.101968.17137.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-26RDMA/mlx4: Return missed an error if device doesn't support steeringLeon Romanovsky
[ Upstream commit f4e56ec4452f48b8292dcf0e1c4bdac83506fb8b ] The error flow fixed in this patch is not possible because all kernel users of create QP interface check that device supports steering before set IB_QP_CREATE_NETIF_QP flag. Fixes: c1c98501121e ("IB/mlx4: Add support for steerable IB UD QPs") Link: https://lore.kernel.org/r/91c61f6e60eb0240f8bbc321fda7a1d2986dd03c.1634023677.git.leonro@nvidia.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-26RDMA/qedr: Fix NULL deref for query_qp on the GSI QPAlok Prasad
commit 4f960393a0ee9a39469ceb7c8077ae8db665cc12 upstream. This patch fixes a crash caused by querying the QP via netlink, and corrects the state of GSI qp. GSI qp's have a NULL qed_qp. The call trace is generated by: $ rdma res show BUG: kernel NULL pointer dereference, address: 0000000000000034 Hardware name: Dell Inc. PowerEdge R720/0M1GCR, BIOS 1.2.6 05/10/2012 RIP: 0010:qed_rdma_query_qp+0x33/0x1a0 [qed] RSP: 0018:ffffba560a08f580 EFLAGS: 00010206 RAX: 0000000200000000 RBX: ffffba560a08f5b8 RCX: 0000000000000000 RDX: ffffba560a08f5b8 RSI: 0000000000000000 RDI: ffff9807ee458090 RBP: ffffba560a08f5a0 R08: 0000000000000000 R09: ffff9807890e7048 R10: ffffba560a08f658 R11: 0000000000000000 R12: 0000000000000000 R13: ffff9807ee458090 R14: ffff9807f0afb000 R15: ffffba560a08f7ec FS: 00007fbbf8bfe740(0000) GS:ffff980aafa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000034 CR3: 00000001720ba001 CR4: 00000000000606f0 Call Trace: qedr_query_qp+0x82/0x360 [qedr] ib_query_qp+0x34/0x40 [ib_core] ? ib_query_qp+0x34/0x40 [ib_core] fill_res_qp_entry_query.isra.26+0x47/0x1d0 [ib_core] ? __nla_put+0x20/0x30 ? nla_put+0x33/0x40 fill_res_qp_entry+0xe3/0x120 [ib_core] res_get_common_dumpit+0x3f8/0x5d0 [ib_core] ? fill_res_cm_id_entry+0x1f0/0x1f0 [ib_core] nldev_res_get_qp_dumpit+0x1a/0x20 [ib_core] netlink_dump+0x156/0x2f0 __netlink_dump_start+0x1ab/0x260 rdma_nl_rcv+0x1de/0x330 [ib_core] ? nldev_res_get_cm_id_dumpit+0x20/0x20 [ib_core] netlink_unicast+0x1b8/0x270 netlink_sendmsg+0x33e/0x470 sock_sendmsg+0x63/0x70 __sys_sendto+0x13f/0x180 ? setup_sgl.isra.12+0x70/0xc0 __x64_sys_sendto+0x28/0x30 do_syscall_64+0x3a/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: stable@vger.kernel.org Fixes: cecbcddf6461 ("qedr: Add support for QP verbs") Link: https://lore.kernel.org/r/20211027184329.18454-1-palok@marvell.com Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: Shai Malin <smalin@marvell.com> Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-12IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fieldsMike Marciniszyn
commit d39bf40e55e666b5905fdbd46a0dced030ce87be upstream. Overflowing either addrlimit or bytes_togo can allow userspace to trigger a buffer overflow of kernel memory. Check for overflows in all the places doing math on user controlled buffers. Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters") Link: https://lore.kernel.org/r/20211012175519.7298.77738.stgit@awfm-01.cornelisnetworks.com Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-12IB/qib: Use struct_size() helperGustavo A. R. Silva
commit 829ca44ecf60e9b6f83d0161a6ef10c1304c5060 upstream. Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes, in particular in the context in which this code is being used. So, replace the following form: sizeof(*pkt) + sizeof(pkt->addr[0])*n with: struct_size(pkt, addr, n) Also, notice that variable size is unnecessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-03IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()Tuo Li
[ Upstream commit cbe71c61992c38f72c2b625b2ef25916b9f0d060 ] kmalloc_array() is called to allocate memory for tx->descp. If it fails, the function __sdma_txclean() is called: __sdma_txclean(dd, tx); However, in the function __sdma_txclean(), tx-descp is dereferenced if tx->num_desc is not zero: sdma_unmap_desc(dd, &tx->descp[0]); To fix this possible null-pointer dereference, assign the return value of kmalloc_array() to a local variable descp, and then assign it to tx->descp if it is not NULL. Otherwise, go to enomem. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-20RDMA/cxgb4: Fix missing error code in create_qp()Jiapeng Chong
[ Upstream commit aeb27bb76ad8197eb47890b1ff470d5faf8ec9a5 ] The error code is missing in this code scenario so 0 will be returned. Add the error code '-EINVAL' to the return value 'ret'. Eliminates the follow smatch warning: drivers/infiniband/hw/cxgb4/qp.c:298 create_qp() warn: missing error code 'ret'. Link: https://lore.kernel.org/r/1622545669-20625-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-22RDMA/i40iw: Avoid panic when reading back the IRQ affinity hintAndrew Boyer
commit 43731753c4b7d832775cf6b2301dd0447a5a1851 upstream. The current code sets an affinity hint with a cpumask_t stored on the stack. This value can then be accessed through /proc/irq/*/affinity_hint/, causing a segfault or returning corrupt data. Move the cpumask_t into struct i40iw_msix_vector so it is available later. Backtrace: BUG: unable to handle kernel paging request at ffffb16e600e7c90 IP: irq_affinity_hint_proc_show+0x60/0xf0 PGD 17c0c6d067 PUD 17c0c6e067 PMD 15d4a0e067 PTE 0 Oops: 0000 [#1] SMP Modules linked in: ... CPU: 3 PID: 172543 Comm: grep Tainted: G OE ... #1 Hardware name: ... task: ffff9a5caee08000 task.stack: ffffb16e659d8000 RIP: 0010:irq_affinity_hint_proc_show+0x60/0xf0 RSP: 0018:ffffb16e659dbd20 EFLAGS: 00010086 RAX: 0000000000000246 RBX: ffffb16e659dbd20 RCX: 0000000000000000 RDX: ffffb16e600e7c90 RSI: 0000000000000003 RDI: 0000000000000046 RBP: ffffb16e659dbd88 R08: 0000000000000038 R09: 0000000000000001 R10: 0000000070803079 R11: 0000000000000000 R12: ffff9a59d1d97a00 R13: ffff9a5da47a6cd8 R14: ffff9a5da47a6c00 R15: ffff9a59d1d97a00 FS: 00007f946c31d740(0000) GS:ffff9a5dc1800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffb16e600e7c90 CR3: 00000016a4339000 CR4: 00000000007406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: seq_read+0x12d/0x430 ? sched_clock_cpu+0x11/0xb0 proc_reg_read+0x48/0x70 __vfs_read+0x37/0x140 ? security_file_permission+0xa0/0xc0 vfs_read+0x96/0x140 SyS_read+0x58/0xc0 do_syscall_64+0x5a/0x190 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f946bbc97e0 RSP: 002b:00007ffdd0c4ae08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 000000000096b000 RCX: 00007f946bbc97e0 RDX: 000000000096b000 RSI: 00007f946a2f0000 RDI: 0000000000000004 RBP: 0000000000001000 R08: 00007f946a2ef011 R09: 000000000000000a R10: 0000000000001000 R11: 0000000000000246 R12: 00007f946a2f0000 R13: 0000000000000004 R14: 0000000000000000 R15: 00007f946a2f0000 Code: b9 08 00 00 00 49 89 c6 48 89 df 31 c0 4d 8d ae d8 00 00 00 f3 48 ab 4c 89 ef e8 6c 9a 56 00 49 8b 96 30 01 00 00 48 85 d2 74 3f <48> 8b 0a 48 89 4d 98 48 8b 4a 08 48 89 4d a0 48 8b 4a 10 48 89 RIP: irq_affinity_hint_proc_show+0x60/0xf0 RSP: ffffb16e659dbd20 CR2: ffffb16e600e7c90 Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> CC: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-22RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one failsSindhu Devale
[ Upstream commit 783a11bf2400e5d5c42a943c3083dc0330751842 ] When i40iw_hmc_sd_one fails, chunk is freed without the deletion of chunk entry in the PBLE info list. Fix it by adding the chunk entry to the PBLE info list only after successful addition of SD in i40iw_hmc_sd_one. This fixes a static checker warning reported here: https://lore.kernel.org/linux-rdma/YHV4CFXzqTm23AOZ@mwanda/ Fixes: 9715830157be ("i40iw: add pble resource files") Link: https://lore.kernel.org/r/20210416002104.323-1-shiraz.saleem@intel.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-22IB/hfi1: Fix error return code in parse_platform_config()Wang Wensheng
[ Upstream commit 4c7d9c69adadfc31892c7e8e134deb3546552106 ] Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20210408113140.103032-1-wangwensheng4@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-16RDMA/cxgb4: check for ipv6 address properly while destroying listenerPotnuri Bharat Teja
[ Upstream commit 603c4690b01aaffe3a6c3605a429f6dac39852ae ] ipv6 bit is wrongly set by the below which causes fatal adapter lookup engine errors for ipv4 connections while destroying a listener. Fix it to properly check the local address for ipv6. Fixes: 3408be145a5d ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server") Link: https://lore.kernel.org/r/20210331135715.30072-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-30RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening serverPotnuri Bharat Teja
[ Upstream commit 3408be145a5d6418ff955fe5badde652be90e700 ] Not setting the ipv6 bit while destroying ipv6 listening servers may result in potential fatal adapter errors due to lookup engine memory hash errors. Therefore always set ipv6 field while destroying ipv6 listening servers. Fixes: 830662f6f032 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address") Link: https://lore.kernel.org/r/20210324190453.8171-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03RDMA/cxgb4: Fix the reported max_recv_sge valueKamal Heib
[ Upstream commit a372173bf314d374da4dd1155549d8ca7fc44709 ] The max_recv_sge value is wrongly reported when calling query_qp, This is happening due to a typo when assigning the max_recv_sge value, the value of sq_max_sges was assigned instead of rq_max_sges. Fixes: 3e5c02c9ef9a ("iw_cxgb4: Support query_qp() verb") Link: https://lore.kernel.org/r/20210114191423.423529-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-23RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grpDinghao Liu
commit a306aba9c8d869b1fdfc8ad9237f1ed718ea55e6 upstream. If usnic_ib_qp_grp_create() fails at the first call, dev_list will not be freed on error, which leads to memleak. Fixes: e3cf00d0a87f ("IB/usnic: Add Cisco VIC low-level hardware driver") Link: https://lore.kernel.org/r/20201226074248.2893-1-dinghao.liu@zju.edu.cn Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-29RDMA/cxgb4: Validate the number of CQEsKamal Heib
[ Upstream commit 6d8285e604e0221b67bd5db736921b7ddce37d00 ] Before create CQ, make sure that the requested number of CQEs is in the supported range. Fixes: cfdda9d76436 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Link: https://lore.kernel.org/r/20201108132007.67537-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-29RDMa/mthca: Work around -Wenum-conversion warningArnd Bergmann
[ Upstream commit fbb7dc5db6dee553b5a07c27e86364a5223e244c ] gcc points out a suspicious mixing of enum types in a function that converts from MTHCA_OPCODE_* values to IB_WC_* values: drivers/infiniband/hw/mthca/mthca_cq.c: In function 'mthca_poll_one': drivers/infiniband/hw/mthca/mthca_cq.c:607:21: warning: implicit conversion from 'enum <anonymous>' to 'enum ib_wc_opcode' [-Wenum-conversion] 607 | entry->opcode = MTHCA_OPCODE_INVALID; Nothing seems to ever check for MTHCA_OPCODE_INVALID again, no idea if this is meaningful, but it seems harmless as it deals with an invalid input. Remove MTHCA_OPCODE_INVALID and set the ib_wc_opcode to 0xFF, which is still bogus, but at least doesn't make compiler warnings. Fixes: 2a4443a69934 ("[PATCH] IB/mthca: fill in opcode field for send completions") Link: https://lore.kernel.org/r/20201026211311.3887003-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-29RDMA/bnxt_re: Set queue pair state when being queriedKamal Heib
[ Upstream commit 53839b51a7671eeb3fb44d479d541cf3a0f2dd45 ] The API for ib_query_qp requires the driver to set cur_qp_state on return, add the missing set. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://lore.kernel.org/r/20201021114952.38876-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-08RDMA/i40iw: Address an mmap handler exploit in i40iwShiraz Saleem
commit 2ed381439e89fa6d1a0839ef45ccd45d99d8e915 upstream. i40iw_mmap manipulates the vma->vm_pgoff to differentiate a push page mmap vs a doorbell mmap, and uses it to compute the pfn in remap_pfn_range without any validation. This is vulnerable to an mmap exploit as described in: https://lore.kernel.org/r/20201119093523.7588-1-zhudi21@huawei.com The push feature is disabled in the driver currently and therefore no push mmaps are issued from user-space. The feature does not work as expected in the x722 product. Remove the push module parameter and all VMA attribute manipulations for this feature in i40iw_mmap. Update i40iw_mmap to only allow DB user mmapings at offset = 0. Check vm_pgoff for zero and if the mmaps are bound to a single page. Cc: <stable@kernel.org> Fixes: d37498417947 ("i40iw: add files for iwarp interface") Link: https://lore.kernel.org/r/20201125005616.1800-2-shiraz.saleem@intel.com Reported-by: Di Zhu <zhudi21@huawei.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-02IB/mthca: fix return value of error branch in mthca_init_cq()Xiongfeng Wang
[ Upstream commit 6830ff853a5764c75e56750d59d0bbb6b26f1835 ] We return 'err' in the error branch, but this variable may be set as zero by the above code. Fix it by setting 'err' as a negative value before we goto the error label. Fixes: 74c2174e7be5 ("IB uverbs: add mthca user CQ support") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/1605837422-42724-1-git-send-email-wangxiongfeng2@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/hns: Set the unsupported wr opcodeLijun Ou
[ Upstream commit 22d3e1ed2cc837af87f76c3c8a4ccf4455e225c5 ] hip06 does not support IB_WR_LOCAL_INV, so the ps_opcode should be set to an invalid value instead of being left uninitialized. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Fixes: a2f3d4479fe9 ("RDMA/hns: Avoid unncessary initialization") Link: https://lore.kernel.org/r/1600350615-115217-1-git-send-email-oulijun@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29RDMA/qedr: Fix use of uninitialized fieldMichal Kalderon
[ Upstream commit a379ad54e55a12618cae7f6333fd1b3071de9606 ] dev->attr.page_size_caps was used uninitialized when setting device attributes Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init") Link: https://lore.kernel.org/r/20200902165741.8355-4-michal.kalderon@marvell.com Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29IB/mlx4: Adjust delayed work when a dup is observedHåkon Bugge
[ Upstream commit 785167a114855c5aa75efca97000e405c2cc85bf ] When scheduling delayed work to clean up the cache, if the entry already has been scheduled for deletion, we adjust the delay. Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization") Link: https://lore.kernel.org/r/20200803061941.1139994-7-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-29IB/mlx4: Fix starvation in paravirt mux/demuxHåkon Bugge
[ Upstream commit 7fd1507df7cee9c533f38152fcd1dd769fcac6ce ] The mlx4 driver will proxy MAD packets through the PF driver. A VM or an instantiated VF will send its MAD packets to the PF driver using loop-back. The PF driver will be informed by an interrupt, but defer the handling and polling of CQEs to a worker thread running on an ordered work-queue. Consider the following scenario: the VMs will in short proximity in time, for example due to a network event, send many MAD packets to the PF driver. Lets say there are K VMs, each sending N packets. The interrupt from the first VM will start the worker thread, which will poll N CQEs. A common case here is where the PF driver will multiplex the packets received from the VMs out on the wire QP. But before the wire QP has returned a send CQE and associated interrupt, the other K - 1 VMs have sent their N packets as well. The PF driver has to multiplex K * N packets out on the wire QP. But the send-queue on the wire QP has a finite capacity. So, in this scenario, if K * N is larger than the send-queue capacity of the wire QP, we will get MAD packets dropped on the floor with this dynamic debug message: mlx4_ib_multiplex_mad: failed sending GSI to wire on behalf of slave 2 (-11) and this despite the fact that the wire send-queue could have capacity, but the PF driver isn't aware, because the wire send CQEs have not yet been polled. We can also have a similar scenario inbound, with a wire recv-queue larger than the tunnel QP's send-queue. If many remote peers send MAD packets to the very same VM, the tunnel send-queue destined to the VM could allegedly be construed to be full by the PF driver. This starvation is fixed by introducing separate work queues for the wire QPs vs. the tunnel QPs. With this fix, using a dual ported HCA, 8 VFs instantiated, we could run cmtime on each of the 18 interfaces towards a similar configured peer, each cmtime instance with 800 QPs (all in all 14400 QPs) without a single CM packet getting lost. Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization") Link: https://lore.kernel.org/r/20200803061941.1139994-5-haakon.bugge@oracle.com Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01RDMA/iw_cgxb4: Fix an error handling path in 'c4iw_connect()'Christophe JAILLET
[ Upstream commit 9067f2f0b41d7e817fc8c5259bab1f17512b0147 ] We should jump to fail3 in order to undo the 'xa_insert_irq()' call. Link: https://lore.kernel.org/r/20190923190746.10964-1-christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01RDMA/i40iw: Fix potential use after freePan Bian
[ Upstream commit da046d5f895fca18d63b15ac8faebd5bf784e23a ] Release variable dst after logging dst->error to avoid possible use after free. Link: https://lore.kernel.org/r/1573022651-37171-1-git-send-email-bianpan2016@163.com Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe()Qiushi Wu
[ Upstream commit db857e6ae548f0f4f4a0f63fffeeedf3cca21f9d ] In function pvrdma_pci_probe(), pdev was not disabled in one error path. Thus replace the jump target “err_free_device” by "err_disable_pdev". Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver") Link: https://lore.kernel.org/r/20200523030457.16160-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03IB/qib: Call kobject_put() when kobject_init_and_add() failsKaike Wan
[ Upstream commit a35cd6447effd5c239b564c80fa109d05ff3d114 ] When kobject_init_and_add() returns an error in the function qib_create_port_files(), the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. In addition, the ppd->diagc_kobj is released along with other kobjects when the sysfs is unregistered. Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters") Link: https://lore.kernel.org/r/20200512031328.189865.48627.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> Suggested-by: Lin Yi <teroincn@gmail.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20IB/mlx4: Test return value of calls to ib_get_cached_pkeyJack Morgenstein
[ Upstream commit 6693ca95bd4330a0ad7326967e1f9bcedd6b0800 ] In the mlx4_ib_post_send() flow, some functions call ib_get_cached_pkey() without checking its return value. If ib_get_cached_pkey() returns an error code, these functions should return failure. Fixes: 1ffeb2eb8be9 ("IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support") Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Fixes: e622f2f4ad21 ("IB: split struct ib_send_wr") Link: https://lore.kernel.org/r/20200426075921.130074-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-20i40iw: Fix error handling in i40iw_manage_arp_cache()Dan Carpenter
[ Upstream commit 37e31d2d26a4124506c24e95434e9baf3405a23a ] The i40iw_arp_table() function can return -EOVERFLOW if i40iw_alloc_resource() fails so we can't just test for "== -1". Fixes: 4e9042e647ff ("i40iw: add hw and utils files") Link: https://lore.kernel.org/r/20200422092211.GA195357@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-05RDMA/mlx4: Initialize ib_spec on the stackAlaa Hleihel
commit c08cfb2d8d78bfe81b37cc6ba84f0875bddd0d5c upstream. Initialize ib_spec on the stack before using it, otherwise we will have garbage values that will break creating default rules with invalid parsing error. Fixes: a37a1a428431 ("IB/mlx4: Add mechanism to support flow steering over IB links") Link: https://lore.kernel.org/r/20200413132235.930642-1-leon@kernel.org Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-05RDMA/mlx5: Set GRH fields in query QP on RoCEAharon Landau
commit 2d7e3ff7b6f2c614eb21d0dc348957a47eaffb57 upstream. GRH fields such as sgid_index, hop limit, et. are set in the QP context when QP is created/modified. Currently, when query QP is performed, we fill the GRH fields only if the GRH bit is set in the QP context, but this bit is not set for RoCE. Adjust the check so we will set all relevant data for the RoCE too. Since this data is returned to userspace, the below is an ABI regression. Fixes: d8966fcd4c25 ("IB/core: Use rdma_ah_attr accessor functions") Link: https://lore.kernel.org/r/20200413132028.930109-1-leon@kernel.org Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13IB/hfi1: Fix memory leaks in sysfs registration and unregistrationKaike Wan
commit 5c15abc4328ad696fa61e2f3604918ed0c207755 upstream. When the hfi1 driver is unloaded, kmemleak will report the following issue: unreferenced object 0xffff8888461a4c08 (size 8): comm "kworker/0:0", pid 5, jiffies 4298601264 (age 2047.134s) hex dump (first 8 bytes): 73 64 6d 61 30 00 ff ff sdma0... backtrace: [<00000000311a6ef5>] kvasprintf+0x62/0xd0 [<00000000ade94d9f>] kobject_set_name_vargs+0x1c/0x90 [<0000000060657dbb>] kobject_init_and_add+0x5d/0xb0 [<00000000346fe72b>] 0xffffffffa0c5ecba [<000000006cfc5819>] 0xffffffffa0c866b9 [<0000000031c65580>] 0xffffffffa0c38e87 [<00000000e9739b3f>] local_pci_probe+0x41/0x80 [<000000006c69911d>] work_for_cpu_fn+0x16/0x20 [<00000000601267b5>] process_one_work+0x171/0x380 [<0000000049a0eefa>] worker_thread+0x1d1/0x3f0 [<00000000909cf2b9>] kthread+0xf8/0x130 [<0000000058f5f874>] ret_from_fork+0x35/0x40 This patch fixes the issue by: - Releasing dd->per_sdma[i].kobject in hfi1_unregister_sysfs(). - This will fix the memory leak. - Calling kobject_put() to unwind operations only for those entries in dd->per_sdma[] whose operations have succeeded (including the current one that has just failed) in hfi1_verbs_register_sysfs(). Cc: <stable@vger.kernel.org> Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Link: https://lore.kernel.org/r/20200326163807.21129.27371.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-13IB/hfi1: Call kobject_put() when kobject_init_and_add() failsKaike Wan
commit dfb5394f804ed4fcea1fc925be275a38d66712ab upstream. When kobject_init_and_add() returns an error in the function hfi1_create_port_files(), the function kobject_put() is not called for the corresponding kobject, which potentially leads to memory leak. This patch fixes the issue by calling kobject_put() even if kobject_init_and_add() fails. Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200326163813.21129.44280.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-02RDMA/mlx5: Block delay drop to unprivileged usersMaor Gottlieb
commit ba80013fba656b9830ef45cd40a6a1e44707f47a upstream. It has been discovered that this feature can globally block the RX port, so it should be allowed for highly privileged users only. Fixes: 03404e8ae652("IB/mlx5: Add support to dropless RQ") Link: https://lore.kernel.org/r/20200322124906.1173790-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11IB/hfi1, qib: Ensure RCU is locked when accessing listDennis Dalessandro
commit 817a68a6584aa08e323c64283fec5ded7be84759 upstream. The packet handling function, specifically the iteration of the qp list for mad packet processing misses locking RCU before running through the list. Not only is this incorrect, but the list_for_each_entry_rcu() call can not be called with a conditional check for lock dependency. Remedy this by invoking the rcu lock and unlock around the critical section. This brings MAD packet processing in line with what is done for non-MAD packets. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200225195445.140896.41873.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28IB/hfi1: Add software counter for ctxt0 seq dropMike Marciniszyn
[ Upstream commit 5ffd048698ea5139743acd45e8ab388a683642b8 ] All other code paths increment some form of drop counter. This was missed in the original implementation. Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0") Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28IB/hfi1: Close window for pq and request colidingMike Marciniszyn
commit be8638344c70bf492963ace206a9896606b6922d upstream. Cleaning up a pq can result in the following warning and panic: WARNING: CPU: 52 PID: 77418 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0 list_del corruption, ffff88cb2c6ac068->next is LIST_POISON1 (dead000000000100) Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 Call Trace: [<ffffffff90365ac0>] dump_stack+0x19/0x1b [<ffffffff8fc98b78>] __warn+0xd8/0x100 [<ffffffff8fc98bff>] warn_slowpath_fmt+0x5f/0x80 [<ffffffff8ff970c3>] __list_del_entry+0x63/0xd0 [<ffffffff8ff9713d>] list_del+0xd/0x30 [<ffffffff8fddda70>] kmem_cache_destroy+0x50/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 PGD 2cdab19067 PUD 2f7bfdb067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) 8021q garp mrp ib_isert iscsi_target_mod target_core_mod crc_t10dif crct10dif_generic opa_vnic rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_ipoib(OE) bridge stp llc iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crct10dif_pclmul crct10dif_common crc32_pclmul ghash_clmulni_intel ast aesni_intel ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect sysimgblt fb_sys_fops drm pcspkr joydev lpc_ich mei_me drm_panel_orientation_quirks i2c_i801 mei wmi ipmi_si ipmi_devintf ipmi_msghandler nfit libnvdimm acpi_power_meter acpi_pad hfi1(OE) rdmavt(OE) rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core binfmt_misc numatools(OE) xpmem(OE) ip_tables nfsv3 nfs_acl nfs lockd grace sunrpc fscache igb ahci i2c_algo_bit libahci dca ptp libata pps_core crc32c_intel [last unloaded: i2c_algo_bit] CPU: 52 PID: 77418 Comm: pvbatch Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.38.3.el7.x86_64 #1 Hardware name: HPE.COM HPE SGI 8600-XA730i Gen10/X11DPT-SB-SG007, BIOS SBED1229 01/22/2019 task: ffff88cc26db9040 ti: ffff88b5393a8000 task.ti: ffff88b5393a8000 RIP: 0010:[<ffffffff8fe1f93e>] [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP: 0018:ffff88b5393abd60 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffff88cb2c6ac000 RCX: 0000000000000003 RDX: 0000000000000400 RSI: 0000000000000400 RDI: ffffffff9095b800 RBP: ffff88b5393abdb0 R08: ffffffff9095b808 R09: ffffffff8ff77c19 R10: ffff88b73ce1f160 R11: ffffddecddde9800 R12: ffff88cb2c6ac000 R13: 000000000000000c R14: ffff88cf3fdca780 R15: 0000000000000000 FS: 00002aaaaab52500(0000) GS:ffff88b73ce00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000002d27664000 CR4: 00000000007607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: [<ffffffff8fe20d44>] __kmem_cache_shutdown+0x14/0x80 [<ffffffff8fddda78>] kmem_cache_destroy+0x58/0x110 [<ffffffffc0328130>] hfi1_user_sdma_free_queues+0xf0/0x200 [hfi1] [<ffffffffc02e2350>] hfi1_file_close+0x70/0x1e0 [hfi1] [<ffffffff8fe4519c>] __fput+0xec/0x260 [<ffffffff8fe453fe>] ____fput+0xe/0x10 [<ffffffff8fcbfd1b>] task_work_run+0xbb/0xe0 [<ffffffff8fc2bc65>] do_notify_resume+0xa5/0xc0 [<ffffffff90379134>] int_signal+0x12/0x17 Code: 00 00 ba 00 04 00 00 0f 4f c2 3d 00 04 00 00 89 45 bc 0f 84 e7 01 00 00 48 63 45 bc 49 8d 04 c4 48 89 45 b0 48 8b 80 c8 00 00 00 <48> 8b 78 10 48 89 45 c0 48 83 c0 10 48 89 45 d0 48 8b 17 48 39 RIP [<ffffffff8fe1f93e>] kmem_cache_close+0x7e/0x300 RSP <ffff88b5393abd60> CR2: 0000000000000010 The panic is the result of slab entries being freed during the destruction of the pq slab. The code attempts to quiesce the pq, but looking for n_req == 0 doesn't account for new requests. Fix the issue by using SRCU to get a pq pointer and adjust the pq free logic to NULL the fd pq pointer prior to the quiesce. Fixes: e87473bc1b6c ("IB/hfi1: Only set fd pointer when base context is completely initialized") Link: https://lore.kernel.org/r/20200210131033.87408.81174.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-14IB/mlx5: Fix outstanding_pi index for GSI qpsPrabhath Sajeepa
commit b5671afe5e39ed71e94eae788bacdcceec69db09 upstream. Commit b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") changed the way outstanding WRs are tracked for the GSI QP. But the fix did not cover the case when a call to ib_post_send() fails and updates index to track outstanding. Since the prior commmit outstanding_pi should not be bounded otherwise the loop generate_completions() will fail. Fixes: b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") Link: https://lore.kernel.org/r/1576195889-23527-1-git-send-email-psajeepa@purestorage.com Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-27RDMA/hns: Fixs hw access invalid dma memory errorXi Wang
[ Upstream commit ec5bc2cc69b4fc494e04d10fc5226f6f9cf67c56 ] When smmu is enable, if execute the perftest command and then use 'kill -9' to exit, follow this operation repeatedly, the kernel will have a high probability to print the following smmu event: arm-smmu-v3 arm-smmu-v3.1.auto: event 0x10 received: arm-smmu-v3 arm-smmu-v3.1.auto: 0x00007d0000000010 arm-smmu-v3 arm-smmu-v3.1.auto: 0x0000020900000080 arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000 arm-smmu-v3 arm-smmu-v3.1.auto: 0x00000000f47cf000 This is because the hw will periodically refresh the qpc cache until the next reset. This patch fixed it by removing the action that release qpc memory in the 'hns_roce_qp_free' function. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/qedr: Fix incorrect device rate.Sagiv Ozeri
[ Upstream commit 69054666df0a9b4e8331319f98b6b9a88bc3fcc4 ] Use the correct enum value introduced in commit 12113a35ada6 ("IB/core: Add HDR speed enum") Prior to this change a 50Gbps port would show 40Gbps. This patch also cleaned up the redundant redefiniton of ib speeds for qedr. Fixes: 12113a35ada6 ("IB/core: Add HDR speed enum") Signed-off-by: Sagiv Ozeri <sagiv.ozeri@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27IB/mlx5: Add missing XRC options to QP optional params maskJack Morgenstein
[ Upstream commit 8f4426aa19fcdb9326ac44154a117b1a3a5ae126 ] The QP transition optional parameters for the various transition for XRC QPs are identical to those for RC QPs. Many of the XRC QP transition optional parameter bits are missing from the QP optional mask table. These omissions caused failures when doing XRC QP state transitions. For example, when trying to change the response timer of an XRC receive QP via the RTS2RTS transition, the new timer value was ignored because MLX5_QP_OPTPAR_RNR_TIMEOUT bit was missing from the optional params mask for XRC qps for the RTS2RTS transition. Fix this by adding the missing XRC optional parameters for all QP transitions to the opt_mask table. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Fixes: a4774e9095de ("IB/mlx5: Fix opt param mask according to firmware spec") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27iw_cxgb4: use tos when finding ipv6 routesSteve Wise
[ Upstream commit c8a7eb554a83214c3d8ee5cb322da8c72810d2dc ] When IPv6 support was added, the correct tos was not passed to cxgb_find_route6(). This potentially results in the wrong route entry. Fixes: 830662f6f032 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27iw_cxgb4: use tos when importing the endpointSteve Wise
[ Upstream commit cb3ba0bde881f0cb7e3945d2a266901e2bd18c92 ] import_ep() is passed the correct tos, but doesn't use it correctly. Fixes: ac8e4c69a021 ("cxgb4/iw_cxgb4: TOS support") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/iw_cxgb4: Fix the unchecked ep dereferenceRaju Rangoju
[ Upstream commit 3352976c892301fd576a2e9ff0ac7337b2e2ca48 ] The patch 944661dd97f4: "RDMA/iw_cxgb4: atomically lookup ep and get a reference" from May 6, 2016, leads to the following Smatch complaint: drivers/infiniband/hw/cxgb4/cm.c:2953 terminate() error: we previously assumed 'ep' could be null (see line 2945) Fixes: 944661dd97f4 ("RDMA/iw_cxgb4: atomically lookup ep and get a reference") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/qedr: Fix out of bounds index check in query pkeyGal Pressman
[ Upstream commit dbe30dae487e1a232158c24b432d45281c2805b7 ] The pkey table size is QEDR_ROCE_PKEY_TABLE_LEN, index should be tested for >= QEDR_ROCE_PKEY_TABLE_LEN instead of > QEDR_ROCE_PKEY_TABLE_LEN. Fixes: a7efd7773e31 ("qedr: Add support for PD,PKEY and CQ verbs") Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27RDMA/ocrdma: Fix out of bounds index check in query pkeyGal Pressman
[ Upstream commit b188940796c7be31c1b8c25a9a0e0842c2e7a49e ] The pkey table size is one element, index should be tested for > 0 instead of > 1. Fixes: fe2caefcdf58 ("RDMA/ocrdma: Add driver for Emulex OneConnect IBoE RDMA adapter") Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27IB/usnic: Fix out of bounds index check in query pkeyGal Pressman
[ Upstream commit 4959d5da5737dd804255c75b8cea0a2929ce279a ] The pkey table size is one element, index should be tested for > 0 instead of > 1. Fixes: e3cf00d0a87f ("IB/usnic: Add Cisco VIC low-level hardware driver") Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Parvi Kaustubhi <pkaustub@cisco.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27IB/hfi1: Add mtu check for operational data VLsAlex Estrin
[ Upstream commit eb50130964e8c1379f37c3d3bab33a411ec62e98 ] Since Virtual Lanes BCT credits and MTU are set through separate MADs, we have to ensure both are valid, and data VLs are ready for transmission before we allow port transition to Armed state. Fixes: 5e2d6764a729 ("IB/hfi1: Verify port data VLs credits on transition to Armed") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Alex Estrin <alex.estrin@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-17RDMA/mlx5: Return proper error valueLeon Romanovsky
commit 546d30099ed204792083f043cd7e016de86016a3 upstream. Returned value from mlx5_mr_cache_alloc() is checked to be error or real pointer. Return proper error code instead of NULL which is not checked later. Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/20191029055721.7192-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>