aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw/qedr/verbs.c
AgeCommit message (Collapse)Author
2020-09-04RDMA/qedr: SRQ's bug fixesYuval Basson
commit acca72e2b031b9fbb4184511072bd246a0abcebc upstream. QP's with the same SRQ, working on different CQs and running in parallel on different CPUs could lead to a race when maintaining the SRQ consumer count, and leads to FW running out of SRQs. Update the consumer atomically. Make sure the wqe_prod is updated after the sge_prod due to FW requirements. Fixes: 3491c9e799fb ("qedr: Add support for kernel mode SRQ's") Link: https://lore.kernel.org/r/20200708195526.31040-1-ybason@marvell.com Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Yuval Basson <ybason@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-09RDMA/qedr: Fix synchronization methods and memory leaks in qedrMichal Kalderon
commit 82af6d19d8d9227c22a53ff00b40fb2a4f9fce69 upstream. Re-design of the iWARP CM related objects reference counting and synchronization methods, to ensure operations are synchronized correctly and that memory allocated for "ep" is properly released. Also makes sure QP memory is not released before ep is finished accessing it. Where as the QP object is created/destroyed by external operations, the ep is created/destroyed by internal operations and represents the tcp connection associated with the QP. QP destruction flow: - needs to wait for ep establishment to complete (either successfully or with error) - needs to wait for ep disconnect to be fully posted to avoid a race condition of disconnect being called after reset. - both the operations above don't always happen, so we use atomic flags to indicate whether the qp destruction flow needs to wait for these completions or not, if the destroy is called before these operations began, the flows will check the flags and not execute them ( connect / disconnect). We use completion structure for waiting for the completions mentioned above. The QP refcnt was modified to kref object. The EP has a kref added to it to handle additional worker thread accessing it. Memory Leaks - https://www.spinics.net/lists/linux-rdma/msg83762.html Concurrency not managed correctly - https://www.spinics.net/lists/linux-rdma/msg67949.html Fixes: de0089e692a9 ("RDMA/qedr: Add iWARP connection management qp related callbacks") Link: https://lore.kernel.org/r/20191027200451.28187-4-michal.kalderon@marvell.com Reported-by: Chuck Lever <chuck.lever@oracle.com> Reported-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-06-09RDMA/qedr: Fix qpids xarray api usedMichal Kalderon
commit 5fdff18b4dc64e2d1e912ad2b90495cd487f791b upstream. The qpids xarray isn't accessed from irq context and therefore there is no need to use the xa_XXX_irq version of the apis. Remove the _irq. Fixes: b6014f9e5f39 ("qedr: Convert qpidr to XArray") Link: https://lore.kernel.org/r/20191027200451.28187-3-michal.kalderon@marvell.com Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-02-24RDMA/qedr: Fix memory leak in user qp and mrMichal Kalderon
commit 24e412c1e00ebfe73619e6b88cbc26c2c7d41b85 upstream. User QPs pbl's weren't freed properly. MR pbls weren't freed properly. Fixes: e0290cce6ac0 ("qedr: Add support for memory registeration verbs") Link: https://lore.kernel.org/r/20191027200451.28187-5-michal.kalderon@marvell.com Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "This has been a smaller cycle than normal. One new driver was accepted, which is unusual, and at least one more driver remains in review on the list. Summary: - Driver fixes for hns, hfi1, nes, rxe, i40iw, mlx5, cxgb4, vmw_pvrdma - Many patches from MatthewW converting radix tree and IDR users to use xarray - Introduction of tracepoints to the MAD layer - Build large SGLs at the start for DMA mapping and get the driver to split them - Generally clean SGL handling code throughout the subsystem - Support for restricting RDMA devices to net namespaces for containers - Progress to remove object allocation boilerplate code from drivers - Change in how the mlx5 driver shows representor ports linked to VFs - mlx5 uapi feature to access the on chip SW ICM memory - Add a new driver for 'EFA'. This is HW that supports user space packet processing through QPs in Amazon's cloud" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (186 commits) RDMA/ipoib: Allow user space differentiate between valid dev_port IB/core, ipoib: Do not overreact to SM LID change event RDMA/device: Don't fire uevent before device is fully initialized lib/scatterlist: Remove leftover from sg_page_iter comment RDMA/efa: Add driver to Kconfig/Makefile RDMA/efa: Add the efa module RDMA/efa: Add EFA verbs implementation RDMA/efa: Add common command handlers RDMA/efa: Implement functions that submit and complete admin commands RDMA/efa: Add the ABI definitions RDMA/efa: Add the com service API definitions RDMA/efa: Add the efa_com.h file RDMA/efa: Add the efa.h header file RDMA/efa: Add EFA device definitions RDMA: Add EFA related definitions RDMA/umem: Remove hugetlb flag RDMA/bnxt_re: Use core helpers to get aligned DMA address RDMA/i40iw: Use core helpers to get aligned DMA address within a supported page size RDMA/verbs: Add a DMA iterator to return aligned contiguous memory blocks RDMA/umem: Add API to find best driver supported page size in an MR ...
2019-05-03RDMA: Introduce and use GID attr helper to read RoCE L2 fieldsParav Pandit
Instead of RoCE drivers figuring out vlan, smac fields while working on QP/AH, provide a helper routine to read the L2 fields such as vlan_id and source mac address. This moves logic from mlx5 driver to core for wider usage for RoCE ports. This is a preparation patch to allow detaching netdev in subsequent patch. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08RDMA: Handle SRQ allocations by IB/coreLeon Romanovsky
Convert SRQ allocation from drivers to be in the IB/core Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08RDMA: Handle AH allocations by IB/coreLeon Romanovsky
Simplify drivers by ensuring lifetime of ib_ah object. The changes in .create_ah() go hand in hand with relevant update in .destroy_ah(). We will use this opportunity and convert .destroy_ah() to don't fail, as it was suggested a long time ago, because there is nothing to do in case of failure during destroy. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08drivers: Remove explicit invocations of mmiowb()Will Deacon
mmiowb() is now implied by spin_unlock() on architectures that require it, so there is no reason to call it from driver code. This patch was generated using coccinelle: @mmiowb@ @@ - mmiowb(); and invoked as: $ for d in drivers include/linux/qed sound; do \ spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done NOTE: mmiowb() has only ever guaranteed ordering in conjunction with spin_unlock(). However, pairing each mmiowb() removal in this patch with the corresponding call to spin_unlock() is not at all trivial, so there is a small chance that this change may regress any drivers incorrectly relying on mmiowb() to order MMIO writes between CPUs using lock-free synchronisation. If you've ended up bisecting to this commit, you can reintroduce the mmiowb() calls using wmb() instead, which should restore the old behaviour on all architectures other than some esoteric ia64 systems. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-08drivers: Remove useless trailing comments from mmiowb() invocationsWill Deacon
In preparation for using coccinelle to remove all mmiowb() instances from drivers, remove all trailing comments since they won't be picked up by spatch later on and will end up being preserved in the code. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-01IB: Pass only ib_udata in function prototypesShamir Rabinovitch
Now when ib_udata is passed to all the driver's object create/destroy APIs the ib_udata will carry the ib_ucontext for every user command. There is no need to also pass the ib_ucontext via the functions prototypes. Make ib_udata the only argument psssed. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01IB: Remove 'uobject->context' dependency in object destroy APIsShamir Rabinovitch
Now that we have the udata passed to all the ib_xxx object destroy APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally start to remove the dependency of the drivers in the ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01IB: Pass uverbs_attr_bundle down ib_x destroy pathShamir Rabinovitch
The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers destroy path from the dependency in 'uobject->context' as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29qedr: Convert srqidr to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-29qedr: Convert qpidr to XArrayMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-22RDMA: Handle ucontext allocations by IB/coreLeon Romanovsky
Following the PD conversion patch, do the same for ucontext allocations. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIsShamir Rabinovitch
Now when we have the udata passed to all the ib_xxx object creation APIs and the additional macro 'rdma_udata_to_drv_context' to get the ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally start to remove the dependency of the drivers in the ib_xxx->uobject->context. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-11RDMA/qedr: Use for_each_sg_dma_page iterator on umem SGLShiraz, Saleem
Use the for_each_sg_dma_page iterator variant to walk the umem DMA-mapped SGL and get the page DMA address. This avoids the extra loop to iterate pages in the SGE when for_each_sg iterator is used. Additionally, purge umem->page_shift usage in the driver as its only relevant for ODP MRs. Use system page size and shift instead. Signed-off-by: Shiraz, Saleem <shiraz.saleem@intel.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA: Handle PD allocations by IB/coreLeon Romanovsky
The PD allocations in IB/core allows us to simplify drivers and their error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand in had with relevant update in .dealloc_pd(). We will use this opportunity and convert .dealloc_pd() to don't fail, as it was suggested a long time ago, failures are not happening as we have never seen a WARN_ON print. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29Merge branch 'devx-async' into k.o/for-nextJason Gunthorpe
Yishai Hadas says: Enable DEVX asynchronous query commands This series enables querying a DEVX object in an asynchronous mode. The userspace application won't block when calling the firmware and it will be able to get the response back once that it will be ready. To enable the above functionality: - DEVX asynchronous command completion FD object was introduced. - The applicable file operations were implemented to enable using it by the user application. - Query asynchronous method was added to the DEVX object, it will call the firmware asynchronously and manages the response on the given input FD. - Hot unplug support was added for the FD to work properly upon unbind/disassociate. - mlx5 core fence for asynchronous commands was implemented and used to prevent racing upon unbind/disassociate. This branch is based on mlx5-next & v5.0-rc2 due to dependencies, from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux * branch 'devx-async': IB/mlx5: Implement DEVX hot unplug for async command FD IB/mlx5: Implement the file ops of DEVX async command FD IB/mlx5: Introduce async DEVX obj query API IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-18IB/hw: Remove unneeded semicolonsYueHaibing
Remove unneeded semicolons. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-18RDMA/qedr: remove set but not used variable 'ib_ctx'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/qedr/verbs.c: In function 'qedr_create_srq': drivers/infiniband/hw/qedr/verbs.c:1436:22: warning: variable 'ib_ctx' set but not used [-Wunused-but-set-variable] drivers/infiniband/hw/qedr/verbs.c: In function 'qedr_create_user_qp': drivers/infiniband/hw/qedr/verbs.c:1701:22: warning: variable 'ib_ctx' set but not used [-Wunused-but-set-variable] Fixes: b0ea0fa5435f ("IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udata") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-10RDMA/qedr: Fix out of bounds index check in query pkeyGal Pressman
The pkey table size is QEDR_ROCE_PKEY_TABLE_LEN, index should be tested for >= QEDR_ROCE_PKEY_TABLE_LEN instead of > QEDR_ROCE_PKEY_TABLE_LEN. Fixes: a7efd7773e31 ("qedr: Add support for PD,PKEY and CQ verbs") Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-10IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udataJason Gunthorpe
ib_umem_get() can only be called in a method callback, which always has a udata parameter. This allows ib_umem_get() to derive the ucontext pointer directly from the udata without requiring the drivers to find it in some way or another. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
2019-01-08cross-tree: phase out dma_zalloc_coherent()Luis Chamberlain
We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-19RDMA: Mark if destroy address handle is in a sleepable contextGal Pressman
Introduce a 'flags' field to destroy address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19RDMA: Mark if create address handle is in a sleepable contextGal Pressman
Introduce a 'flags' field to create address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18RDMA: Cleanup undesired pd->uobject usageShamir Rabinovitch
Drivers should be using udata to determine if a method is invoked from user space or kernel space. A pd does not necessarily say a different objects is kernel or user. Transforming the tests to use udata eliminates a large number of uobject references from the drivers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-11IB/{mlx5,ocrdma,qedr,rxe}: Omit port validation from IB verbsYuval Shaia
RDMA core layer already make sure port is valid, no need to check it here again. For the pkey validation this depends on commit b3ac5742fead ("RDMA/core: Validate port number in query_pkey verb") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-03RDMA: Remove unused parameter from ib_modify_qp_is_ok()Kamal Heib
The ll parameter is not used in ib_modify_qp_is_ok(), so remove it. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05RDMA/qedr: remove set but not used variable 'ctx'YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning: drivers/infiniband/hw/qedr/verbs.c: In function 'qedr_create_srq': drivers/infiniband/hw/qedr/verbs.c:1450:24: warning: variable 'ctx' set but not used [-Wunused-but-set-variable] Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Rahul Verma <rahul.verma@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16Merge tag 'v4.18' into rdma.git for-nextJason Gunthorpe
Resolve merge conflicts from the -rc cycle against the rdma.git tree: Conflicts: drivers/infiniband/core/uverbs_cmd.c - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next - Merge removal of file->ucontext in for-next with new code in -rc drivers/infiniband/core/uverbs_main.c - for-next removed code from ib_uverbs_write() that was modified in for-rc Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-14qedr: Add user space support for SRQYuval Bason
This patch adds support for SRQ's created in user space and update qedr_affiliated_event to deal with general SRQ events. Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Yuval Bason <yuval.bason@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-14qedr: Add support for kernel mode SRQ'sYuval Bason
Implement the SRQ specific verbs and update the poll_cq verb to deal with SRQ completions. Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Yuval Bason <yuval.bason@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-14qedr: Add wrapping generic structure for qpidr and adjust idr routines.Yuval Bason
Today, we are using idr mechanism for QP's only. This patch prepares the qedr_idr stuctures and the idr routines for both QP's and SRQ's. Signed-off-by: Yuval Bason <yuval.bason@cavium.com> Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments constBart Van Assche
Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send|recv|srq_recv) really do not modify the posted work request. To make this possible, only one cast had to be introduce that casts away constness, namely in rpcrdma_post_recvs(). The only way I can think of to avoid that cast is to introduce an additional loop in that function or to change the data type of bad_wr from struct ib_recv_wr ** into int (an index that refers to an element in the work request list). However, both approaches would require even more extensive changes than this patch. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30RDMA: Constify the argument of the work request conversion functionsBart Van Assche
When posting a send work request, the work request that is posted is not modified by any of the RDMA drivers. Make this explicit by constifying most ib_send_wr pointers in RDMA transport drivers. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-10RDMA: Fix storage of PortInfo CapabilityMask in the kernelJason Gunthorpe
The internal flag IP_BASED_GIDS was added to a field that was being used to hold the port Info CapabilityMask without considering the effects this will have. Since most drivers just use the value from the HW MAD it means IP_BASED_GIDS will also become set on any HW that sets the IBA flag IsOtherLocalChangesNoticeSupported - which is not intended. Fix this by keeping port_cap_flags only for the IBA CapabilityMask value and store unrelated flags externally. Move the bit definitions for this to ib_mad.h to make it clear what is happening. To keep the uAPI unchanged define a new set of flags in the uapi header that are only used by ib_uverbs_query_port_resp.port_cap_flags which match the current flags supported in rdma-core, and the values exposed by the current kernel. Fixes: b4a26a27287a ("IB: Report using RoCE IP based gids in port caps") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Here are eight fairly small fixes collected over the last two weeks. Regression and crashing bug fixes: - mlx4/5: Fixes for issues found from various checkers - A resource tracking and uverbs regression in the core code - qedr: NULL pointer regression found during testing - rxe: Various small bugs" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/rxe: Fix missing completion for mem_reg work requests RDMA/core: Save kernel caller name when creating CQ using ib_create_cq() IB/uverbs: Fix ordering of ucontext check in ib_uverbs_write IB/mlx4: Fix an error handling path in 'mlx4_ib_rereg_user_mr()' RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CM IB/mlx5: Fix return value check in flow_counters_set_data() IB/mlx5: Fix memory leak in mlx5_ib_create_flow IB/rxe: avoid double kfree skb
2018-06-18IB/core: add max_send_sge and max_recv_sge attributesSteve Wise
This patch replaces the ib_device_attr.max_sge with max_send_sge and max_recv_sge. It allows ulps to take advantage of devices that have very different send and recv sge depths. For example cxgb4 has a max_recv_sge of 4, yet a max_send_sge of 16. Splitting out these attributes allows much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW API. Consider a large RDMA WRITE that has 16 scattergather entries. With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of 16, it can be done with 1 WRITE WR. Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18RDMA: Convert drivers to use sgid_attr instead of sgid_indexParav Pandit
The core code now ensures that all driver callbacks that receive an rdma_ah_attrs will have a sgid_attr's pointer if there is a GRH present. Drivers can use this pointer instead of calling a query function with sgid_index. This simplifies the drivers and also avoids races where a gid_index lookup may return different data if it is changed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18IB{cm, core}: Introduce and use ah_attr copy, move, replace APIsJason Gunthorpe
Introduce AH attribute copy, move and replace APIs to be used by core and provider drivers. In CM code flow when ah attribute might be re-initialized twice while processing incoming request, or initialized once while from path record while sending out CM requests. Therefore use rdma_move_ah_attr API to handle such scenarios instead of memcpy(). Provider drivers keeps a copy ah_attr during the lifetime of the ah. Therefore, use rdma_replace_ah_attr() which conditionally release reference to old ah_attr and holds reference to new attribute whose referrence is released when the AH is freed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-12treewide: kzalloc() -> kcalloc()Kees Cook
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-11RDMA/qedr: Fix NULL pointer dereference when running over iWARP without RDMA-CMKalderon, Michal
Some RoCE specific code in qedr_modify_qp was run over an iWARP device when running perftest benchmarks without the -R option. The commit 3e44e0ee0893 ("IB/providers: Avoid null netdev check for RoCE") exposed this. Dropping the check for NULL pointer on ndev in qedr_modify_qp lead to a null pointer dereference when running over iWARP. Before the code would identify ndev as being NULL and return an error. Fixes: 3e44e0ee0893 ("IB/providers: Avoid null netdev check for RoCE") Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "This has been a quiet cycle for RDMA, the big bulk is the usual smallish driver updates and bug fixes. About four new uAPI related things. Not as much Szykaller patches this time, the bugs it finds are getting harder to fix. Summary: - More work cleaning up the RDMA CM code - Usual driver bug fixes and cleanups for qedr, qib, hfi1, hns, i40iw, iw_cxgb4, mlx5, rxe - Driver specific resource tracking and reporting via netlink - Continued work for name space support from Parav - MPLS support for the verbs flow steering uAPI - A few tricky IPoIB fixes improving robustness - HFI1 driver support for the '16B' management packet format - Some auditing to not print kernel pointers via %llx or similar - Mark the entire 'UCM' user-space interface as BROKEN with the intent to remove it entirely. The user space side of this was long ago replaced with RDMA-CM and syzkaller is finding bugs in the residual UCM interface nobody wishes to fix because nobody uses it. - Purge more bogus BUG_ON's from Leon - 'flow counters' verbs uAPI - T10 fixups for iser/isert, these are Acked by Martin but going through the RDMA tree due to dependencies" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (138 commits) RDMA/mlx5: Update SPDX tags to show proper license RDMA/restrack: Change SPDX tag to properly reflect license IB/hfi1: Fix comment on default hdr entry size IB/hfi1: Rename exp_lock to exp_mutex IB/hfi1: Add bypass register defines and replace blind constants IB/hfi1: Remove unused variable IB/hfi1: Ensure VL index is within bounds IB/hfi1: Fix user context tail allocation for DMA_RTAIL IB/hns: Use zeroing memory allocator instead of allocator/memset infiniband: fix a possible use-after-free bug iw_cxgb4: add INFINIBAND_ADDR_TRANS dependency IB/isert: use T10-PI check mask definitions from core layer IB/iser: use T10-PI check mask definitions from core layer RDMA/core: introduce check masks for T10-PI offload IB/isert: fix T10-pi check mask setting IB/mlx5: Add counters read support IB/mlx5: Add flow counters read support IB/mlx5: Add flow counters binding support IB/mlx5: Add counters create and destroy support IB/uverbs: Add support for flow counters ...
2018-06-05qed*: Utilize FW 8.37.2.0Michal Kalderon
This FW contains several fixes and features. RDMA - Several modifications and fixes for Memory Windows - drop vlan and tcp timestamp from mss calculation in driver for this FW - Fix SQ completion flow when local ack timeout is infinite - Modifications in t10dif support ETH - Fix aRFS for tunneled traffic without inner IP. - Fix chip configuration which may fail under heavy traffic conditions. - Support receiving any-VNI in VXLAN and GENEVE RX classification. iSCSI / FcoE - Fix iSCSI recovery flow - Drop vlan and tcp timestamp from mss calc for fw 8.37.2.0 Misc - Several registers (split registers) won't read correctly with ethtool -d Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-31RDMA/qedr: fix spelling mistake: "adrresses" -> "addresses"Colin Ian King
Trivial fix to spelling mistake in DP_ERR error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-28Merge branch 'mr_fix' into ↵Jason Gunthorpe
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma for-next Update mlx4 to support user MR creation against read-only memory, previously it required the memory to be writable. Based on rdma for-rc due to dependencies. * mr_fix: (2 commits) IB/mlx4: Mark user MR as writable if actual virtual memory is writable IB/core: Make testing MR flags for writability a static inline function
2018-05-23RDMA/qedr: Fix doorbell bar mapping for dpi > 1Kalderon, Michal
Each user_context receives a separate dpi value and thus a different address on the doorbell bar. The qedr_mmap function needs to validate the address and map the doorbell bar accordingly. The current implementation always checked against dpi=0 doorbell range leading to a wrong mapping for doorbell bar. (It entered an else case that mapped the address differently). qedr_mmap should only be used for doorbells, so the else was actually wrong in the first place. This only has an affect on arm architecture and not an issue on a x86 based architecture. This lead to doorbells not occurring on arm based systems and left applications that use more than one dpi (or several applications run simultaneously ) to hang. Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs") Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-01RDMA/qedr: fix spelling mistake: "failes" -> "fails"Colin Ian King
Trivial fix to spelling mistake in DP_ERR error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Doug Ledford <dledford@redhat.com>