path: root/drivers/block/nvme-core.c
Age | Commit message | Author
2015-10-15 | NVMe: Fix memory leak on retried commands | Keith Busch
Resources are reallocated for requeued commands, so unmap and release the iod for the failed command. It's a pretty bad memory leak and causes a kernel hang if you remove a drive because of a busy dma pool. You'll get messages spewing like this: nvme 0000:xx:xx.x: dma_pool_destroy prp list 256, ffff880420dec000 busy and lock up pci and the driver since removal never completes while holding a lock. Cc: stable@vger.kernel.org Cc: <stable@vger.kernel.org> # 4.0.x- Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-15 | nvme: use an integer value to Linux errno values | Christoph Hellwig
Use a separate integer variable to hold the signed Linux errno values we pass back to the block layer. Note that for pass through commands those might still be NVMe values, but those fit into the int as well. Fixes: f4829a9b7a61: ("blk-mq: fix racy updates of rq->errors") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-12 | nvme: fix 32-bit build warning | Arnd Bergmann
Compiling the nvme driver on 32-bit warns about a cast from a __u64 variable to a pointer: drivers/block/nvme-core.c: In function 'nvme_submit_io': drivers/block/nvme-core.c:1847:4: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] (void __user *)io.addr, length, NULL, 0); The cast here is intentional and safe, so we can shut up the gcc warning by adding an intermediate cast to 'uintptr_t'. I had previously submitted a patch to fix this problem in the nvme driver, but it was accepted on the same day that two new warnings got added. For clarification, I also change the third instance of this cast to use uintptr_t instead of unsigned long now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: d29ec8241c10e ("nvme: submit internal commands through the block layer") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
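The intermediate cast described in the message above can be sketched in user-space C. This is an illustration only, not the driver's code: `struct demo_io` and `demo_io_ptr` are hypothetical names, and `uint64_t` stands in for the kernel's `__u64`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ioctl argument that carries a user
 * pointer in a fixed-width 64-bit field. */
struct demo_io {
	uint64_t addr;
};

/* Convert the 64-bit field to a pointer by going through uintptr_t.
 * The integer is first narrowed (on 32-bit) to a pointer-sized integer,
 * so the final integer-to-pointer cast is between same-sized types and
 * gcc has nothing to warn about. */
static inline void *demo_io_ptr(const struct demo_io *io)
{
	assert(io != NULL);
	return (void *)(uintptr_t)io->addr;
}
```

Storing `(uint64_t)(uintptr_t)&some_object` in `addr` and passing the struct to `demo_io_ptr()` gives back the original address on both 32-bit and 64-bit builds.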
2015-10-01 | blk-mq: fix racy updates of rq->errors | Christoph Hellwig
blk_mq_complete_request may be a no-op if the request has already been completed by other means (e.g. a timeout or cancellation), but currently drivers have to set rq->errors before calling blk_mq_complete_request, which might leave us with the wrong error value. Add an error parameter to blk_mq_complete_request so that we can defer setting rq->errors until we know we won the race to complete the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-23 | NVMe: Set affinity after allocating request queues | Keith Busch
The asynchronous namespace scanning caused affinity hints to be set before the tagset was initialized, so there was no cpu mask to set the hint. This patch moves the affinity hint setting to after namespaces are scanned. Reported-by: 김경산 <ks0204.kim@samsung.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-09-02 | Merge branch 'for-4.3/drivers' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block driver updates from Jens Axboe:
"On top of the 4.3 core block IO changes, here are the driver related changes for 4.3. Basically just NVMe and nbd this time around:
 - NVMe:
    - PRACT PI improvement from Alok Pandey.
    - Cleanups and improvements on submission queue doorbell and writing, using CMB if available. From Jon Derrick.
    - From Keith, support for setting queue maximum segments, and reset support.
    - Also from Jon, fixup of u64 division issue on 32-bit archs and wiring up of the reset support through an ioctl.
    - Two small cleanups from Matias and Sunad
 - Various code cleanups and fixes from Markus Pargmann"
* 'for-4.3/drivers' of git://git.kernel.dk/linux-block:
  NVMe: Using PRACT bit to generate and verify PI by controller
  NVMe:Remove unreachable code in nvme_abort_req
  NVMe: Add nvme subsystem reset IOCTL
  NVMe: Add nvme subsystem reset support
  NVMe: removed unused nn var from nvme_dev_add
  NVMe: Set queue max segments
  nbd: flags is a u32 variable
  nbd: Rename functions for clearness of recv/send path
  nbd: Change 'disconnect' to be boolean
  nbd: Add debugfs entries
  nbd: Remove variable 'pid'
  nbd: Move clear queue debug message
  nbd: Remove 'harderror' and propagate error properly
  nbd: restructure sock_shutdown
  nbd: sock_shutdown, remove conditional lock
  nbd: Fix timeout detection
  nvme: Fixes u64 division which breaks i386 builds
  NVMe: Use CMB for the IO SQes if available
  NVMe: Unify SQ entry writing and doorbell ringing
2015-08-26 | NVMe: Using PRACT bit to generate and verify PI by controller | Alok Pandey
This patch enables the PRCHK and reftag support when PRACT bit is set, and block layer integrity is disabled. Signed-off-by: Alok Pandey <pandey.alok@samsung.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-19 | NVMe:Remove unreachable code in nvme_abort_req | Sunad Bhandary
Removing unreachable code from nvme_abort_req as nvme_submit_cmd has no failure status to return. Signed-off-by: Sunad Bhandary <sunad.s@samsung.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-19 | block: Replace SG_GAPS with new queue limits mask | Keith Busch
The SG_GAPS queue flag caused checks for bio vector alignment against PAGE_SIZE, but the device may have different constraints. This patch adds a queue limit that a driver with such constraints can set to allow requests that would otherwise have been unnecessarily split. The new gaps check takes the request_queue as a parameter to simplify the logic around invoking this function. This new limit makes the queue flag redundant, so this patch removes it and all of its usage. Device-mappers will inherit the correct settings through blk_stack_limits(). Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-18 | NVMe: Add nvme subsystem reset IOCTL | Jon Derrick
Controllers can perform optional subsystem resets as introduced in NVMe 1.1. This patch adds an IOCTL to trigger the subsystem reset by writing "NVMe" to the NSSR register. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-18 | NVMe: Add nvme subsystem reset support | Keith Busch
Controllers that are part of an NVMe subsystem may be reset by any other controller in the subsystem. If the device is capable of subsystem resets, this patch adds detection for such events and performs appropriate controller initialization upon subsystem reset detection. The register bit is a RW1C type, so the driver needs to write a 1 to the status bit to clear the subsystem reset occurred bit during initialization. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
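The write-1-to-clear (RW1C) behaviour mentioned above can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: the register is simulated as a plain integer, and `DEMO_STATUS_RESET_OCCURRED` with its bit position is a hypothetical name chosen for illustration, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit for a "reset occurred" status flag. */
#define DEMO_STATUS_RESET_OCCURRED (1u << 4)

/* Model a write to a register whose bits are all RW1C: every 1 in the
 * written value clears that bit, every 0 leaves the bit unchanged. */
static inline uint32_t demo_rw1c_write(uint32_t reg, uint32_t written)
{
	return reg & ~written;
}

/* Acknowledge the event by writing a 1 to its status bit. */
static inline uint32_t demo_ack_reset(uint32_t reg)
{
	uint32_t after = demo_rw1c_write(reg, DEMO_STATUS_RESET_OCCURRED);

	assert((after & DEMO_STATUS_RESET_OCCURRED) == 0);
	return after;
}
```

Writing 0 to an RW1C bit is harmless, which is why a driver can acknowledge one event without disturbing other status bits in the same register.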
2015-08-18 | NVMe: removed unused nn var from nvme_dev_add | Matias Bjørling
The logic in nvme_dev_add to enumerate namespaces was moved to nvme_dev_scan. When moved, the nn variable is no longer used. This patch removes it. Fixes: a5768aai ("NVMe: Automatic namespace rescan") Signed-off-by: Matias Bjørling <m@bjorling.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-17 | NVMe: Set queue max segments | Keith Busch
This sets the queue's max segments to match the device's capabilities. The default of 128 is usable until a device's transfer capability exceeds 512k, assuming a device page size of 4k. Many nvme devices exceed that transfer limit, so this lets the block layer know what kind of commands to allow to form rather than unnecessarily splitting them. One additional segment is added to account for a transfer that may start in the middle of a page. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
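The arithmetic in the message above can be checked with a small user-space sketch. The function name, the parameter names, and the choice of 512-byte sectors as the unit for the transfer limit are assumptions made for illustration; this is not presented as the driver's exact expression.

```c
#include <assert.h>
#include <stdint.h>

/* Number of page-sized segments needed to cover the largest transfer,
 * plus one extra segment for a transfer that starts mid-page.
 * max_hw_sectors is assumed to be in 512-byte sectors. */
static inline unsigned int demo_max_segments(unsigned int max_hw_sectors,
					      unsigned int page_size)
{
	uint64_t max_bytes = (uint64_t)max_hw_sectors << 9;

	assert(page_size != 0);
	return (unsigned int)(max_bytes / page_size) + 1;
}
```

With a 512k transfer limit (1024 sectors) and 4k pages this gives 128 + 1 = 129 segments, which matches the point at which the default of 128 stops being sufficient.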
2015-07-21 | nvme: Fixes u64 division which breaks i386 builds | Jon Derrick
Uses div_u64 for u64 division and round_down, a bitwise operation, instead of rounddown, which uses a modulus. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
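The difference between the two rounding styles named above can be sketched in user-space C. The `demo_` functions are hypothetical stand-ins, not the kernel macros themselves; the point is only that the bitwise form needs no division, which matters when a 64-bit modulus would otherwise need a helper routine on a 32-bit target.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise round-down: valid only when align is a power of two. */
static inline uint64_t demo_round_down(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Modulus-based round-down: works for any non-zero step, but the
 * 64-bit '%' is a division the compiler must implement somehow. */
static inline uint64_t demo_rounddown_mod(uint64_t x, uint64_t step)
{
	assert(step != 0);
	return x - (x % step);
}
```

For power-of-two alignments both functions agree, so the bitwise form is a drop-in replacement in that case.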
2015-07-21 | NVMe: Use CMB for the IO SQes if available | Jon Derrick
Some controllers have a controller-side memory buffer available for use for submissions, completions, lists, or data. If a CMB is available, the entire CMB will be ioremapped and it will attempt to map the IO SQes onto the CMB. The queues will be shrunk as needed. The CMB will not be used if the queue depth is shrunk below some threshold where it may have reduced performance over a larger queue in system memory. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-21 | NVMe: Unify SQ entry writing and doorbell ringing | Jon Derrick
This patch changes sq_cmd writers to instead create their command on the stack. __nvme_submit_cmd copies the sq entry to the queue and writes the doorbell. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-17 | block: have drivers use blk_queue_max_discard_sectors() | Jens Axboe
Some drivers use it now, others just set the limits field manually. But in preparation for splitting this into a hard and soft limit, ensure that they all call the proper function for setting the hw limit for discards. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-15 | NVMe: Reread partitions on metadata formats | Keith Busch
This patch has the driver automatically reread partitions if a namespace has a separate metadata format. Previously revalidating a disk was sufficient to get the correct capacity set on such formatted drives, but partitions that may exist would not have been surfaced. Reported-by: Paul Grabinar <paul.grabinar@ranbarg.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Matthew Wilcox <willy@linux.intel.com> Tested-by: Paul Grabinar <paul.grabinar@ranbarg.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-02 | NVMe: Fix irq freeing when queue_request_irq fails | Jon Derrick
Fixes an issue when queue_request_irq fails in nvme_setup_io_queues. This patch initializes all vectors to -1 and resets the vector to -1 in the case of a failure in queue_request_irq. This avoids the free_irq in nvme_suspend_queue if the queue did not get an irq. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-27 | drivers/block/nvme-core.c: fix build with gcc-4.4.4 | Andrew Morton
gcc-4.4.4 (and possibly other versions) fail the compile when initializers are used with anonymous unions. Work around this. drivers/block/nvme-core.c: In function 'nvme_identify_ctrl': drivers/block/nvme-core.c:1163: error: unknown field 'identify' specified in initializer drivers/block/nvme-core.c:1163: warning: missing braces around initializer drivers/block/nvme-core.c:1163: warning: (near initialization for 'c.<anonymous>') drivers/block/nvme-core.c:1164: error: unknown field 'identify' specified in initializer drivers/block/nvme-core.c:1164: warning: excess elements in struct initializer drivers/block/nvme-core.c:1164: warning: (near initialization for 'c') ... This patch has no effect on text size with gcc-4.8.2. Fixes: d29ec8241c10eac ("nvme: submit internal commands through the block layer") Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-27 | NVMe: Fix filesystem deadlock on removal | Keith Busch
Move gendisk deletion before controller shutdown so filesystem may sync dirty pages. Before, this would deadlock trying to allocate requests on frozen queues that are about to be deleted. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-27 | NVMe: Failed controller initialization fixes | Keith Busch
This fixes an infinite device reset loop that may occur on devices that fail initialization. If the drive fails to become ready for any reason that does not involve an admin command timeout, the probe task should assume the drive is unavailable and remove it from the topology. In the case an admin command times out during device probing, the driver's existing reset action will handle removing the drive. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-27 | NVMe: Unify controller probe and resume | Keith Busch
This unifies probe and resume so they both may be scheduled in the same way. This is necessary for error handling that may occur during device initialization since the task to cleanup the device wouldn't be able to run if it is blocked on device initialization. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-27 | NVMe: Don't use fake status on cancelled command | Keith Busch
Synchronized commands do different things for timed out commands vs. controller returned errors. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-27 | NVMe: Fix device cleanup on initialization failure | Keith Busch
Don't release block queue and tagging resources if the driver never got them in the first place. This can happen if the controller fails to become ready, if memory wasn't available to allocate a tagset or admin queue, or if the resources were released as part of error recovery. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-25 | Merge branch 'for-4.2/drivers' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block driver updates from Jens Axboe:
"This contains:
 - a few race fixes for null_blk, from Akinobu Mita.
 - a series of fixes for mtip32xx, from Asai Thambi and Selvan Mani at Micron.
 - NVMe:
    * Fix for missing error return on allocation failure, from Axel Lin.
    * Code consolidation and cleanups from Christoph.
    * Memory barrier addition, syncing queue count and queue pointers. From Jon Derrick.
    * Various fixes from Keith, an addition to support user issue reset from sysfs or ioctl, and automatic namespace rescan.
    * Fix from Matias, avoiding losing some request flags when marking the request failfast.
 - small cleanups and sparse fixups for ps3vram. From Geert Uytterhoeven and Geoff Lavand.
 - s390/dasd dead code removal, from Jarod Wilson.
 - a set of fixes and optimizations for loop, from Ming Lei.
 - conversion to blkdev_reread_part() of loop, dasd, nbd. From Ming Lei.
 - updates to cciss. From Tomas Henzl"
* 'for-4.2/drivers' of git://git.kernel.dk/linux-block: (44 commits)
  mtip32xx: Fix accessing freed memory
  block: nvme-scsi: Catch kcalloc failure
  NVMe: Fix IO for extended metadata formats
  nvme: don't overwrite req->cmd_flags on sync cmd
  mtip32xx: increase wait time for hba reset
  mtip32xx: fix minor number
  mtip32xx: remove unnecessary sleep in mtip_ftl_rebuild_poll()
  mtip32xx: fix crash on surprise removal of the drive
  mtip32xx: Abort I/O during secure erase operation
  mtip32xx: fix incorrectly setting MTIP_DDF_SEC_LOCK_BIT
  mtip32xx: remove unused variable 'port->allocated'
  mtip32xx: fix rmmod issue
  MAINTAINERS: Update ps3vram block driver
  block/ps3vram: Remove obsolete reference to MTD
  block/ps3vram: Fix sparse warnings
  NVMe: Automatic namespace rescan
  NVMe: Memory barrier before queue_count is incremented
  NVMe: add sysfs and ioctl controller reset
  null_blk: restart request processing on completion handler
  null_blk: prevent timer handler running on a different CPU where started
  ...
2015-06-19 | NVMe: Fix IO for extended metadata formats | Keith Busch
This fixes io submit ioctl handling when using extended metadata formats. When these formats are used, the user provides a single virtually contiguous buffer containing both the block and metadata interleaved, so the metadata size needs to be added to the total length and not mapped as a separate transfer. The command is also driver generated, so this patch does not require that blk-integrity extensions provide the metadata buffer. Reported-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-17 | nvme: don't overwrite req->cmd_flags on sync cmd | Matias Bjørling
In __nvme_submit_sync_cmd, the request direction is overwritten when the REQ_FAILFAST_DRIVER flag is set. Signed-off-by: Matias Bjørling <m@bjorling.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 75619bfa904d0 ("NVMe: End sync requests immediately on failure") Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-05 | NVMe: Automatic namespace rescan | Keith Busch
Namespaces may be dynamically allocated and deleted or attached and detached. This has the driver rescan the device for namespace changes after each device reset or namespace change asynchronous event. There could potentially be many detached namespaces that we don't want polluting /dev/ with unusable block handles, so this will delete disks if the namespace is not active as indicated by the response from identify namespace. This also skips adding the disk if no capacity is provisioned to the namespace in the first place. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-05 | NVMe: Memory barrier before queue_count is incremented | Jon Derrick
Protects against reordering and/or preempting which would allow the kthread to access the queue descriptor before it is set up. Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-05 | NVMe: add sysfs and ioctl controller reset | Keith Busch
We need the ability to perform an nvme controller reset as discussed on the mailing list thread: http://lists.infradead.org/pipermail/linux-nvme/2015-March/001585.html This adds a sysfs entry that, when written to, will perform an NVMe controller reset if the controller was successfully initialized in the first place. This also adds locking around resetting the device in the async probe method so the driver can't schedule two resets. Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Brandon Schultz <brandon.schulz@hgst.com> Cc: David Sariel <david.sariel@pmcs.com> Updated by Jens to: 1) Merge this with the ioctl reset patch from David Sariel. The ioctl path now shares the reset code from the sysfs path. 2) Don't flush work if we fail issuing the reset. Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-01 | NVMe: Remove hctx reliance for multi-namespace | Keith Busch
The driver needs to track shared tags to support multiple namespaces that may be dynamically allocated or deleted. Relying on the first request_queue's hctx's is not appropriate as we cannot clear outstanding tags for all namespaces using this handle, nor can the driver easily track all request_queue's hctx as namespaces are attached/detached. Instead, this patch uses the nvme_dev's tagset to get the shared tag resources instead of through a request_queue hctx. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-29 | NVMe: End sync requests immediately on failure | Keith Busch
Do not retry failed sync commands so the original status may be seen without issuing unnecessary retries. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-29 | NVMe: Use requested sync command timeout | Keith Busch
Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-29 | NVMe: fix type warning on 32-bit | Arnd Bergmann
A recent change to the ioctl handling caused a new harmless warning in the NVMe driver on all 32-bit machines: drivers/block/nvme-core.c: In function 'nvme_submit_io': drivers/block/nvme-core.c:1794:29: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] In order to shut up that warning, this introduces a new temporary variable that uses a double cast to extract the pointer from an __u64 structure member. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: a67a95134ff ("NVMe: Meta data handling through submit io ioctl") Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-22 | NVMe: Fix obtaining command result | Keith Busch
Replaces req->sense_len usage, which is not owned by the LLD, with req->special to contain the command result for driver created commands, and sets the result unconditionally on completion. Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@fb.com> Fixes: d29ec8241c10 ("nvme: submit internal commands through the block layer") Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-22 | nvme: submit internal commands through the block layer | Christoph Hellwig
Use block layer queues with an internal cmd_type to submit internally generated NVMe commands. This both simplifies the code a lot and allows for a better structure. For example, now the LightNVM code can construct commands without knowing the details of the underlying I/O descriptors. Or a future NVMe over network target could inject commands, as well as could the SCSI translation and ioctl code be reused for such a beast. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-22 | nvme: store a struct device pointer in struct nvme_dev | Christoph Hellwig
Most users want the generic device, so store that in struct nvme_dev instead of the pci_dev. This also happens to be a nice step towards making some code reusable for non-PCI transports. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-22 | nvme: consolidate synchronous command submission helpers | Christoph Hellwig
Note that we keep the unused timeout argument, but allow callers to pass 0 instead of a timeout if they want the default. This will allow adding a timeout to the pass through path later on. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-19 | nvme: disable irqs in nvme_freeze_queues | Christoph Hellwig
The queue_lock needs to be taken with irqs disabled. This is mostly due to the old pre blk-mq usage pattern, but we've also picked it up in most of the few places where we use the queue_lock with blk-mq. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-04-16 | Merge branch 'for-4.1/drivers' of git://git.kernel.dk/linux-block | Linus Torvalds
Pull block driver updates from Jens Axboe:
"This is the block driver pull request for 4.1. As with the core bits, this is a relatively slow round. This pull request contains:
 - Various fixes and cleanups for NVMe, from Alexey Khoroshilov, Chong Yuan, myself, Keith Busch, and Murali Iyer.
 - Documentation and code cleanups for nbd from Markus Pargmann.
 - Change of brd maintainer to me, from Ross Zwisler. At least the email doesn't bounce anymore then.
 - Two xen-blkback fixes from Tao Chen"
* 'for-4.1/drivers' of git://git.kernel.dk/linux-block: (23 commits)
  NVMe: Meta data handling through submit io ioctl
  NVMe: Add translation for block limits
  NVMe: Remove check for null
  NVMe: Fix error handling of class_create("nvme")
  xen-blkback: define pr_fmt macro to avoid the duplication of DRV_PFX
  xen-blkback: enlarge the array size of blkback name
  nbd: Return error pointer directly
  nbd: Return error code directly
  nbd: Remove fixme that was already fixed
  nbd: Restructure debugging prints
  nbd: Fix device bytesize type
  nbd: Replace kthread_create with kthread_run
  nbd: Remove kernel internal header
  Documentation: nbd: Add list of module parameters
  Documentation: nbd: Reformat to allow more documentation
  NVMe: increase depth of admin queue
  nvme: Fix PRP list calculation for non-4k system page size
  NVMe: Fix blk-mq hot cpu notification
  NVMe: embedded iod mask cleanup
  NVMe: Freeze admin queue on device failure
  ...
2015-04-07 | NVMe: Meta data handling through submit io ioctl | Keith Busch
This adds support for the extended metadata formats through the submit IO ioctl, and simplifies the rest when using a separate metadata format. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-04-07 | NVMe: Remove check for null | Keith Busch
Checking fails static analysis due to additional arithmetic prior to the NULL check. Mapping doesn't return NULL here anyway, so removing the check. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-04-07 | NVMe: Fix error handling of class_create("nvme") | Alexey Khoroshilov
class_create() returns ERR_PTR on failure, so IS_ERR() should be used instead of check for NULL. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
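Why a NULL check misses this kind of failure can be shown with a user-space imitation of the error-pointer convention. The `demo_` helpers below are hypothetical stand-ins written for this illustration, not the kernel's own `ERR_PTR()`/`IS_ERR()` definitions; the limit of 4095 mirrors the kernel's `MAX_ERRNO`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer near the top of the
 * address space. */
static inline void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* True when the pointer falls in the range reserved for error codes. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}
```

An encoded error such as `demo_err_ptr(-ENOMEM)` is never NULL, so a caller that only tests for NULL would treat the failure as success and go on to use an invalid pointer.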
2015-03-31 | NVMe: increase depth of admin queue | Jens Axboe
Usually the admin queue depth of 64 is plenty, but for some use cases we really need it larger. Examples are use cases like MAT, where you have to touch all of NAND for init/format like purposes. In those cases, we see a good 2x increase with an increased queue depth. Signed-off-by: Jens Axboe <axboe@fb.com> Acked-by: Keith Busch <keith.busch@intel.com>
2015-03-31 | nvme: Fix PRP list calculation for non-4k system page size | Murali Iyer
PRP list calculation is supposed to be based on the device's page size. Systems with a page size larger than the device's page size cause corruption to the namespace as well as system memory without this fix. Systems like x86 might not experience this issue because they use a PAGE_SIZE of 4K, whereas powerpc uses a PAGE_SIZE of 64k, while the NVMe device's page size varies depending upon the vendor. Signed-off-by: Murali Iyer <mniyer@us.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-03-31 | NVMe: Fix blk-mq hot cpu notification | Keith Busch
The driver may issue commands to a device that may never return, so its request_queue could always have active requests while the controller is running. Waiting for the queue to freeze could block forever, which is what blk-mq's hot cpu notification handler was doing when nvme drives were in use. This has the nvme driver make the asynchronous event command's tag reserved and does not keep the request active. We can't have more than one since the request is released back to the request_queue before the command is completed. Having only one avoids potential tag collisions, and reserving the tag for this purpose prevents other admin tasks from reusing the tag. I also couldn't think of a scenario where issuing AEN requests single depth is worse than issuing them in batches, so I don't think we lose anything with this change. As an added bonus, doing it this way removes "Cancelling I/O" warnings observed when unbinding the nvme driver from a device. Reported-by: Yigal Korman <yigal@plexistor.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-03-31 | NVMe: embedded iod mask cleanup | Chong Yuan
Remove unused mask in nvme_alloc_iod. Signed-off-by: Chong Yuan <chong.yuan@memblaze.com> Reviewed-by: Wenbo Wang <wenbo.wang@memblaze.com> Acked-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-03-31 | NVMe: Freeze admin queue on device failure | Keith Busch
This fixes a race accessing an invalid address when a controller's admin queue is in use while a reset for failure or hot removal occurs. The admin queue will be frozen to prevent new users from entering prior to the doorbell queue being unmapped. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-03-23 | NVMe: Initialize device list head before starting | Keith Busch
Driver recovery requires the device's list node to have been initialized. Fixes: https://lkml.org/lkml/2015/3/22/262 Reported-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Keith Busch <keith.busch@intel.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Jens Axboe <axboe@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>