aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/scsi/hisi_sas/hisi_sas_main.c
AgeCommit message (Collapse)Author
2024-01-25scsi: hisi_sas: Replace with standard error code return valueYihang Li
[ Upstream commit d34ee535705eb43885bc0f561c63046f697355ad ] In function hisi_sas_controller_prereset(), -ENOSYS (Function not implemented) should be returned if the driver does not support .soft_reset. Returns -EPERM (Operation not permitted) if HISI_SAS_RESETTING_BIT is already be set. In function _suspend_v3_hw(), returns -EPERM (Operation not permitted) if HISI_SAS_RESETTING_BIT is already be set. Fixes: 4522204ab218 ("scsi: hisi_sas: tidy host controller reset function a bit") Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1702525516-51258-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-07scsi: hisi_sas: Check sas_port before using itXiang Chen
[ Upstream commit 8c39673d5474b95374df2104dc1f65205c5278b8 ] Need to check the structure sas_port before using it. Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()Xiang Chen
[ Upstream commit 550c0d89d52d3bec5c299f69b4ed5d2ee6b8a9a6 ] For IOs from upper layer, preemption may be disabled as it may be called by function __blk_mq_delay_run_hw_queue which will call get_cpu() (it disables preemption). So if flags HISI_SAS_REJECT_CMD_BIT is set in function hisi_sas_task_exec(), it may disable preempt twice after down() and up() which will cause following call trace: BUG: scheduling while atomic: fio/60373/0x00000002 Call trace: dump_backtrace+0x0/0x150 show_stack+0x24/0x30 dump_stack+0xa0/0xc4 __schedule_bug+0x68/0x88 __schedule+0x4b8/0x548 schedule+0x40/0xd0 schedule_timeout+0x200/0x378 __down+0x78/0xc8 down+0x54/0x70 hisi_sas_task_exec.isra.10+0x598/0x8d8 [hisi_sas_main] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main] sas_queuecommand+0x168/0x1b0 [libsas] scsi_queue_rq+0x2ac/0x980 blk_mq_dispatch_rq_list+0xb0/0x550 blk_mq_do_dispatch_sched+0x6c/0x110 blk_mq_sched_dispatch_requests+0x114/0x1d8 __blk_mq_run_hw_queue+0xb8/0x130 __blk_mq_delay_run_hw_queue+0x1c0/0x220 blk_mq_run_hw_queue+0xb0/0x128 blk_mq_sched_insert_requests+0xdc/0x208 blk_mq_flush_plug_list+0x1b4/0x3a0 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x3c/0x50 blkdev_direct_IO+0x404/0x550 generic_file_read_iter+0x9c/0x848 blkdev_read_iter+0x50/0x78 aio_read+0xc8/0x170 io_submit_one+0x1fc/0x8d8 __arm64_sys_io_submit+0xdc/0x280 el0_svc_common.constprop.0+0xe0/0x1e0 el0_svc_handler+0x34/0x90 el0_svc+0x10/0x14 ... To solve the issue, check preemptible() to avoid disabling preempt multiple when flag HISI_SAS_REJECT_CMD_BIT is set. Link: https://lore.kernel.org/r/1571926105-74636-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17scsi: hisi_sas: Reject setting programmed minimum linkrate > 1.5GLuo Jiaxing
[ Upstream commit eb44e4d7b5a3090f0114927f42ae575c29664a09 ] The SAS controller cannot support a programmed minimum linkrate of > 1.5G (it will always negotiate to 1.5G at least), so just reject it. This solves a strange situation where the PHY negotiated linkrate may be less than the programmed minimum linkrate. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17scsi: hisi_sas: send primitive NOTIFY to SSP situation onlyXiang Chen
[ Upstream commit 569eddcf3a0f4efff4ef96a7012010e0f7daa8b4 ] Send primitive NOTIFY to SSP situation only, or it causes underflow issue when sending IO. Also rename hisi_sas_hw.sl_notify() to hisi_sas_hw. sl_notify_ssp(). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01scsi: hisi_sas: Fix NULL pointer dereferenceGustavo A. R. Silva
[ Upstream commit f4445bb93d82a984657b469e63118c2794a4c3d3 ] There is a NULL pointer dereference in case *slot* happens to be NULL at lines 1053 and 1878: struct hisi_sas_cq *cq = &hisi_hba->cq[slot->dlvry_queue]; Notice that *slot* is being NULL checked at lines 1057 and 1881: if (slot), which implies it may be NULL. Fix this by placing the declaration and definition of variable cq, which contains the pointer dereference slot->dlvry_queue, after slot has been properly NULL checked. Addresses-Coverity-ID: 1474515 ("Dereference before null check") Addresses-Coverity-ID: 1474520 ("Dereference before null check") Fixes: 584f53fe5f52 ("scsi: hisi_sas: Fix the race between IO completion and timeout for SMP/internal IO") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01scsi: hisi_sas: Fix the race between IO completion and timeout for ↵Xiang Chen
SMP/internal IO [ Upstream commit 584f53fe5f529d877968c711a095923c1ed12307 ] If SMP/internal IO times out, we will possibly free the task immediately. However if the IO actually completes at the same time, the IO completion may refer to task which has been freed. So to solve the issue, flush the tasklet to finish IO completion before free'ing slot/task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01scsi: hisi_sas: Feed back linkrate(max/min) when re-attachedLuo Jiaxing
[ Upstream commit 5a54691f874ab29ec82f08bc6936866a3ccdaa91 ] At directly attached situation, if the user modifies the sysfs interface of maximum_linkrate and minimum_linkrate to renegotiate the linkrate between SAS controller and target, the value of both files mentioned above should have change to user setting after renegotiate is over, but it remains unchanged. To fix this bug, maximum_linkrate and minimum_linkrate will be directly fed back to relevant sas_phy structure. Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05scsi: hisi_sas: Fix a timeout race of driver internal and SMP IOXiang Chen
[ Upstream commit 4790595723d4b833b18c994973d39f9efb842887 ] For internal IO and SMP IO, there is a time-out timer for them. In the timer handler, it checks whether IO is done according to the flag task->task_state_lock. There is an issue which may cause system suspended: internal IO or SMP IO is sent, but at that time because of hardware exception (such as inject 2Bit ECC error), so IO is not completed and also not timeout. But, at that time, the SAS controller reset occurs to recover system. It will release the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO timeout, it will never complete the completion of IO and wait for ever. [ 729.123632] Call trace: [ 729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8 [ 729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc [ 729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c [ 729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc [ 729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0 [ 729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34 [ 729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main] [ 729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main] [ 729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8 [ 729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398 [ 729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0 [ 729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138 [ 729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18 To solve the issue, callback function task_done of those IOs need to be called when on SAS controller reset. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05scsi: hisi_sas: Set PHY linkrate when disconnectedJohn Garry
[ Upstream commit efdcad62e7b8a02fcccc5ccca57806dce1482ac8 ] When the PHY comes down, we currently do not set the negotiated linkrate: root@(none)$ pwd /sys/class/sas_phy/phy-0:0 root@(none)$ more enable 1 root@(none)$ more negotiated_linkrate 12.0 Gbit root@(none)$ echo 0 > enable root@(none)$ more negotiated_linkrate 12.0 Gbit root@(none)$ This patch fixes the driver code to set it properly when the PHY comes down. If the PHY had been enabled, then set unknown; otherwise, flag as disabled. The logical place to set the negotiated linkrate for this scenario is PHY down routine, which is called from the PHY down ISR. However, it is not possible to know if the PHY comes down due to PHY disable or loss of link, as sas_phy.enabled member is not set until after the transport disable routine is complete, which races with the PHY down ISR. As an imperfect solution, use sas_phy_data.enable as the flag to know if the PHY is down due to disable. It's imperfect, as sas_phy_data is internal to libsas. I can't see another way without adding a new field to hisi_sas_phy and managing it, or changing SCSI SAS transport. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-07-19scsi: hisi_sas: add memory barrier in task delivery functionXiaofei Tan
In task start delivery function, we need to add a memory barrier to prevent re-ordering of reading memory by hardware. Because the slot data is set in task prepare function and it could be running in another CPU. This patch adds an memory barrier after s->ready is read in the task start delivery function, and uses WRITE_ONCE() in the places where s->ready is set to ensure that the compiler does not re-order. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-19scsi: hisi_sas: Tidy hisi_sas_task_prep()Xiang Chen
To decrease the usage of spinlock during delivery IO, relocate some code in hisi_sas_task_prep(). Also an invalid comment is removed. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-19scsi: hisi_sas: tidy host controller reset function a bitXiaofei Tan
This patch tidies host controller reset function by putting some code to two new functions, and exports these two functions out, so that they could be used by FLR feature to be realised. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-19scsi: hisi_sas: Drop hisi_sas_slot_abort()John Garry
For some time now we have not used hisi_sas_slot_abort() to handle erroring slots, apart from in archaic v1 hw. As such, remove this function and associated code. For v1 hw, move error handling to same scheme as other hw revisions, where we allow erroring commands to timeout. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Add missing PHY spinlock initJohn Garry
The init is missed for hisi_sas_phy spinlock, so add it. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Pre-allocate slot DMA buffersXiang Chen
Currently the driver spends much time allocating and freeing the slot DMA buffer for command delivery/completion. To boost the performance, pre-allocate the buffers for all IPTT. The downside of this approach is that we are reallocating all buffer memory upfront, so hog memory which we may not need. However, the current method - DMA buffer pool - also caches all buffers and does not free them until the pool is destroyed, so is not exactly efficient either. On top of this, since the slot DMA buffer is slightly bigger than a 4K page, we need to allocate 2x4K pages per buffer (for 4K page kernel), which is quite wasteful. For 64K page size this is not such an issue. So, for the 4K page case, in order to make memory usage more efficient, pre-allocating larger blocks of DMA memory for the buffers can be more efficient. To make DMA memory usage most efficient, we would choose a single contiguous DMA memory block, but this could use up all the DMA memory in the system (when CMA enabled and no IOMMU), or we may just not be able to allocate a DMA buffer large enough when no CMA or IOMMU. To decide the block size we use the LCM (least common multiple) of the buffer size and the page size. We roundup(64) to ensure the LCM is not too large, even though a little memory may be wasted per block. So, with this, the total memory requirement is about is about 17MB for 4096 max IPTT. Previously (for 4K pages case), it would be 32MB (for all slots allocated). With this change, the relative increase of IOPS for bs=4K read when PAGE_SIZE=4K and PAGE_SIZE=64K is as follows: IODEPTH 4K PAGE_SIZE 64K PAGE_SIZE 32 56% 47% 64 53% 44% 128 64% 43% 256 67% 45% Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Release all remaining resources in clear nexus haXiaofei Tan
In host reset, we use TMF or soft-reset to re-init device, and if success, we will release all LLDD resources of this device. If the init fails - maybe because the device was removed or link has not come up - then do not release the LLDD resources, but rather rely on SCSI EH to handle the timeout for these resources later on. But if clear nexus ha calls host reset, which is the last effort of SCSI EH, we should release all LLDD remain resources. Because SCSI EH will release all tasks after clear nexus ha. Before release, we do I_T nexus reset to try to clear target remain IOs. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Add a flag to filter PHY events during resetXiaofei Tan
During reset, we don't want PHY events reported to libsas for PHYs which were previously attached prior to reset. So check hisi_hba->flags for HISI_SAS_RESET_BIT to filter PHY events during reset. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Adjust task reject period during host resetXiaofei Tan
After soft_reset() for host reset, we should not be allowed to send commands to the HW before the PHYs have come up and the port ids have been refreshed. Prior to this point, any commands cannot be successfully completed. This exclusion is achieved by grabbing the host reset semaphore. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Fix the conflict between dev gone and host resetXiaofei Tan
There is a possible conflict when a device is removed and host reset occurs concurrently. The reason is that then the device is notified as gone, we try to clear the ITCT, which is notified via an interrupt. The dev gone function pends on this event with a completion, which is completed when the ITCT interrupt occurs. But host reset will disable all interrupts, the wait_for_completion() may wait indefinitely. This patch adds an semaphore to synchronise this two processes. The semaphore is taken by the host reset as the basis of synchronising. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-06-19scsi: hisi_sas: Use dmam_alloc_coherent()Xiang Chen
This patch replaces the usage of dma_alloc_coherent() with the managed version, dmam_alloc_coherent(), hereby reducing replicated code. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by; John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Mark PHY as in reset for nexus resetXiang Chen
When issuing a nexus reset for directly attached device, we want to ignore the PHY down events so libsas will not deform and reform the port. In the case that the attached SAS changes for the reset, libsas will deform and form a port. For scenario that the PHY does not come up after a timeout period, then report the PHY down to libsas. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Fix return value when get_free_slot() failedXiaofei Tan
It is an step of executing task to get free slot. If the step fails, we will cleanup LLDD resources and should return failure to upper layer or internal caller to abort task execution of this time. But in the current code, the caller of get_free_slot() doesn't return failure when get_free_slot() failed. This patch is to fix it. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Terminate STP reject quickly for v2 hwXiaofei Tan
For v2 hw, STP link from target is rejected after host reset because of a SoC bug. The STP reject will be terminated after we have sent IO from each PHY of a port. This is not an problem before, as we don't need to setup STP link from target immediately after host reset. But now, it is. Because we want to send soft-reset immediately after host reset. In order to terminate STP reject quickly, this patch send ATA reset command through each PHY of a port. Notes: ATA reset command don't need target's response. Besides, we do abort dev for each device before terminating STP reject. This is a quirk of v2 hw. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Include TMF elements in struct hisi_sas_slotXiaofei Tan
In future scenarios we will want to use the TMF struct for more task types than SSP. As such, we can add struct hisi_sas_tmf_task directly into struct hisi_sas_slot, and this will mean we can remove the TMF parameters from the task prep functions. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Try wait commands before before controller resetXiaofei Tan
We may reset the controller in many scenarios, such as SCSI EH and HW errors. There should be no IO which returns from target when SCSI EH is active. But for other scenarios, there may be. It is not necessary to make such IOs fail. This patch adds an function of trying to wait for any commands, or IO, to complete before host reset. If no more CQ returned from host controller in 100ms, we assume no more IO can return, and then stop waiting. We wait 5s at most. The HW has a register CQE_SEND_CNT to indicate the total number of CQs that has been reported to driver. We can use this register and it is reliable to resd this register in such scenarios that require host reset. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Init disks after controller resetXiaofei Tan
After the controller is reset, it is possible that the disks attached still have outstanding IO to complete. Thus, when the PHYs come back up after controller reset, it is possible that these IOs complete at some unknown point later. We want to ensure that all IOs are complete after the controller reset so that all associated IPTT and other resources can be recycled safely. To achieve this, re-init the disks by TMF or softreset (in case of ATA devices). If the init fails - maybe because the device was removed or link has not come up - then do not release the device resources, but rather rely on SCSI EH to handle the timeout for these resources later on. This patch also does some cleanup to hisi_sas_init_disk(), including removing superfluous cases in the switch statement. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Create a scsi_host_template per HW moduleXiang Chen
When a SCSI host is registered, the SCSI mid-layer takes a reference to a module in Scsi_host.hostt.module. In doing this, we are prevented from removing the driver module for the host in dangerous scenario, like when a disk is mounted. Currently there is only one scsi_host_template (sht) for all HW versions, and this is the main.c module. So this means that we can possibly remove the HW module in this dangerous scenario, as SCSI mid-layer is only referencing the main.c module. To fix this, create a sht per module, referencing that same module to create the Scsi host. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Reset disks when discoveredXiang Chen
When a disk is discovered, it may be in an error state, or there may be residual commands remaining in the disk. To ensure any disk is in good state after discovery, reset via TMF (for SAS disk) or softreset (for a SATA disk). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Change common allocation mode of device idXiang Chen
To reduce possibility of hitting unknown SoC bugs and aid debugging and test, change allocation mode of device id from last used device id instead of lowest available index. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: change slot index allocation modeXiang Chen
Currently we find the lowest available empty bit in the IPTT bitmap to allocate the IPTT for a command. To reduce possibility of hitting unknown SoC bugs and also aid in the debugging of those same bugs, change the allocation mode. The next allocation method is to use the next free slot adjacent to the most recently allocated slot, in a round-robin fashion. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()John Garry
There is much common code and functionality between the HW versions to set the PHY linkrate. As such, this patch factors out the common code into a generic function hisi_sas_phy_set_linkrate(). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28scsi: hisi_sas: fix a typo in hisi_sas_task_prep()Wei Yongjun
Fix a typo in hisi_sas_task_prep(). Fixes: 7eee4b921822 ("scsi: hisi_sas: relocate smp sg map") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: hisi_sas: add check of device in hisi_sas_task_exec()Xiaofei Tan
Currently we don't check that device is not gone before dereferencing its elements in the function hisi_sas_task_exec() (specifically, the DQ pointer). This patch fixes this issue by filling in the DQ pointer in hisi_sas_task_prep() after we check that the device pointer is still safe to reference. [mkp: typo] Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: hisi_sas: Use device lock to protect slot alloc/freeXiang Chen
The IPTT of a slot is unique, and we currently use hisi_hba lock to protect it. Now slot is managed on hisi_sas_device.list, so use DQ lock to protect for allocating and freeing the slot. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: hisi_sas: Don't lock DQ for complete task sendingXiang Chen
Currently we lock the DQ to protect whole delivery process. So this stops us building slots for the same queue in parallel, and can affect performance. To optimise it, only lock the DQ during special periods, specifically when allocating a slot from the DQ and when delivering a slot to the HW. This approach is now safe, thanks to the previous patches to ensure that we always deliver a slot to the HW once allocated. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: hisi_sas: allocate slot buffer earlierXiang Chen
Currently we allocate the slot's memory buffer after allocating the DQ slot. To aid DQ lockout reduction, and allow slots to be built in parallel, move this step (which can fail) prior to allocating the slot. Also a stray spin_unlock_irqrestore() is removed from internal task exec function. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: hisi_sas: make return type of prep functions voidXiang Chen
Since the task prep functions now should not fail, adjust the return types to void. In addition, some checks in the task prep functions are relocated to the main module; this is specifically the check for the number of elements in an sg list exceeded the HW SGE limit. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-18scsi: hisi_sas: relocate smp sg mapXiang Chen
Currently we use DQ lock to protect delivery of DQ entry one by one. To optimise to allow more than one slot to be built for a single DQ in parallel, we need to remove the DQ lock when preparing slots, prior to delivery. To achieve this, we rearrange the slot build order to ensure that once we allocate a slot for a task, we do cannot fail to deliver the task. In this patch, we rearrange the slot building for SMP tasks to ensure that sg mapping part (which can fail) happens before we allocate the slot in the DQ. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08scsi: hisi_sas: update PHY linkrate after a controller resetXiang Chen
After the controller is reset, we currently may not honour the PHY max linkrate set via sysfs, in that after a reset we always revert to max linkrate of 12Gbps, ignoring the value set via sysfs. This patch modifies to policy to set the programmed PHY linkrate, honouring the max linkrate programmed via sysfs. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08scsi: hisi_sas: stop controller timer for resetJohn Garry
We should only have the timer enabled after PHY up after controller reset, so disable prior to reset. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08scsi: hisi_sas: check sas_dev gone earlier in hisi_sas_abort_task()Xiang Chen
It is possible to dereference a NULL-pointer in hisi_sas_abort_task() in special scenario when the device has been removed. If an SMP task times-out, it will call hisi_sas_abort_task() to recover. And currently there is a check in hisi_sas_abort_task() to avoid the situation of processing the abort for the removed device. However we have an ordering problem, in that we may reference a task for the removed device before checking if the device has been removed. Fix this by only referencing the sas_dev after we know it is still present. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08scsi: hisi_sas: check host frozen before calling "done" functionXiang Chen
When the host is frozen in SCSI EH state, at any point after the LLDD sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free the task; see sas_scsi_find_task(). This puts the LLDD in a difficult position, in that once it sets SAS_TASK_STATE_DONE for the task state it should not reference the sas_task again. But the LLDD needs will check the sas_task indirectly in calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done() (to check if the host is frozen state actually). And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after task->task_done() is called (as the sas_task is free'd at this point). This situation would seem to be a problem made by libsas. To work around, check in the LLDD whether the host is in frozen state to ensure it is ok to call task->task_done() function. If in the frozen state, we rely on SCSI EH and libsas to free the sas_task directly. We do not do this for the following IO types: - SMP - they are managed in libsas directly, outside SCSI EH - Any internally originated IO, for similar reason Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twiceXiang Chen
If the SCSI host enters EH, any pending IO will be processed by SCSI EH. However it is possible that SCSI EH will try to abort the IO and also at the same time the IO completes in the driver. In this situation there is a small chance of freeing the sas_task twice. Then if another IO re-uses freed sas_task before the second time of free'ing sas_task, it is possible to free incorrect sas_task. To avoid this situation, add some checks to increase reliability. The sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually protect the LLDD and libsas freeing the task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: hisi_sas: remove some unneeded structure membersJohn Garry
This patch removes unneeded structure elements: - hisi_sas_phy.dev_sas_addr: only ever written - Also remove associated function which writes it, hisi_sas_init_add(). - hisi_sas_device.attached_phy: only ever written - Also remove code to set it in hisi_sas_dev_found() Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: hisi_sas: consolidate command check in hisi_sas_get_ata_protocol()Xiaofei Tan
Currently we check the fis->command value in 2 locations in hisi_sas_get_ata_protocol() switch statement. Fix this by consolidating the check for fis->command value to 1 location only. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: hisi_sas: use dma_zalloc_coherent()Xiang Chen
This is a warning coming from Coccinelle, and need to use new interface dma_zalloc_coherent() instead of dma_alloc_coherent()/memset(). Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18scsi: hisi_sas: delete timer when removing hisi_sas driverXiang Chen
Delete timer for v1 and v3 hw when removing hisi_sas driver. Signed-off-by: Xiang chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12scsi: hisi_sas: Code cleanup and minor bug fixesXiang Chen
The patch does some code cleanup and fixes some small bugs: - Correct return status of phy_up_v3_hw() and phy_bcast_v3_hw() - Add static for function phy_get_max_linkrate_v3_hw() - Change exception return status when no reset method - Change magic value to ts->stat in slot_complete_vx_hw() - Remove unnecessary check for dev_is_sata() - Fix some issues of alignment and indents (Authored by Xiaofei Tan in another patch, but added here to be practical) Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12scsi: hisi_sas: fix return value of hisi_sas_task_prep()Xiaofei Tan
It is an implicit regulation that error code that function returned should be negative. But hisi_sas_task_prep() doesn't follow this. This may cause problems in the upper layer code. For example, in sas_expander.c of libsas, smp_execute_task_sg() may return the number of bytes of underrun. It will be conflicted with the scenaio lldd_execute_task() return an positive error code. This patch change the return value from SAS_PHY_DOWN to -ECOMM in hisi_sas_task_prep(). Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>