aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/scsi
AgeCommit message (Collapse)Author
2019-09-16scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()Varun Prakash
commit cc555759117e8349088e0c5d19f2f2a500bafdbd upstream. ip_dev_find() can return NULL so add a check for NULL pointer. Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-09-16scsi: bnx2fc: fix incorrect cast to u64 on shift operationColin Ian King
commit d0c0d902339249c75da85fd9257a86cbb98dfaa5 upstream. Currently an int is being shifted and the result is being cast to a u64 which leads to undefined behaviour if the shift is more than 31 bits. Fix this by casting the integer value 1 to u64 before the shift operation. Addresses-Coverity: ("Bad shift operation") Fixes: 7b594769120b ("[SCSI] bnx2fc: Handle REC_TOV error code from firmware") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-09-16scsi: lpfc: add check for loss of ndlp when sending RRQJames Smart
commit c8cb261a072c88ca1aff0e804a30db4c7606521b upstream. There was a missing qualification of a valid ndlp structure when calling to send an RRQ for an abort. Add the check. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-09-16scsi: qedi: remove set but not used variables 'cdev' and 'udev'YueHaibing
commit d0adee5d12752256ff0c87ad7f002f21fe49d618 upstream. Fixes gcc '-Wunused-but-set-variable' warning: drivers/scsi/qedi/qedi_iscsi.c: In function 'qedi_ep_connect': drivers/scsi/qedi/qedi_iscsi.c:813:23: warning: variable 'udev' set but not used [-Wunused-but-set-variable] drivers/scsi/qedi/qedi_iscsi.c:812:18: warning: variable 'cdev' set but not used [-Wunused-but-set-variable] These have never been used since introduction. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-09-16scsi: qedi: remove memset/memcpy to nfunc and use func insteadYueHaibing
commit c09581a52765a85f19fc35340127396d5e3379cc upstream. KASAN reports this: BUG: KASAN: global-out-of-bounds in qedi_dbg_err+0xda/0x330 [qedi] Read of size 31 at addr ffffffffc12b0ae0 by task syz-executor.0/2429 CPU: 0 PID: 2429 Comm: syz-executor.0 Not tainted 5.0.0-rc7+ #45 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xfa/0x1ce lib/dump_stack.c:113 print_address_description+0x1c4/0x270 mm/kasan/report.c:187 kasan_report+0x149/0x18d mm/kasan/report.c:317 memcpy+0x1f/0x50 mm/kasan/common.c:130 qedi_dbg_err+0xda/0x330 [qedi] ? 0xffffffffc12d0000 qedi_init+0x118/0x1000 [qedi] ? 0xffffffffc12d0000 ? 0xffffffffc12d0000 ? 0xffffffffc12d0000 do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462e99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f2d57e55c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000000000073bfa0 RCX: 0000000000462e99 RDX: 0000000000000000 RSI: 00000000200003c0 RDI: 0000000000000003 RBP: 00007f2d57e55c70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2d57e566bc R13: 00000000004bcefb R14: 00000000006f7030 R15: 0000000000000004 The buggy address belongs to the variable: __func__.67584+0x0/0xffffffffffffd520 [qedi] Memory state around the buggy address: ffffffffc12b0980: fa fa fa fa 00 04 fa fa fa fa fa fa 00 00 05 fa ffffffffc12b0a00: fa fa fa fa 00 00 04 fa fa fa fa fa 00 05 fa fa > ffffffffc12b0a80: fa fa fa fa 00 06 fa fa fa fa fa fa 00 02 fa fa ^ ffffffffc12b0b00: fa fa fa fa 00 00 04 fa fa fa fa fa 00 00 03 fa ffffffffc12b0b80: fa fa fa fa 00 00 02 fa fa fa fa fa 00 00 04 fa Currently the qedi_dbg_* family of functions can overrun the end of the source string if it is less than the destination buffer length because of the use of a fixed sized memcpy. Remove the memset/memcpy calls to nfunc and just use func instead as it is always a null terminated string. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: ace7f46ba5fd ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-09-16scsi: qla2xxx: Reset the FCF_ASYNC_{SENT|ACTIVE} flagsGiridhar Malavali
commit 0257eda08e806b82ee1fc90ef73583b6f022845c upstream. Driver maintains state machine for processing and completing switch commands. This patch resets FCF_ASYNC_{SENT|ACTIVE} flag to indicate if the previous command is active or sent, in order for next GPSC command to advance the state machine. [mkp: commit desc typo] Signed-off-by: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: lpfc: Fix SLI3 commands being issued on SLI4 devicesJames Smart
commit c95a3b4b0fb8d351e2329a96f87c4fc96a149505 upstream. During debug, it was seen that the driver is issuing commands specific to SLI3 on SLI4 devices. Although the adapter correctly rejected the command, this should not be done. Revise the code to stop sending these commands on a SLI4 adapter. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: lpfc: Fix fc4type information for FDMIJames Smart
commit 32a80c093b524a0682f1c6166c910387b116ffce upstream. The driver is reporting support for NVME even when not configured for NVME operation. Fix (and make more readable) when NVME protocol support is indicated. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: lpfc: avoid uninitialized variable warningArnd Bergmann
commit faf5a744f4f8d76e7c03912b5cd381ac8045f6ec upstream. clang -Wuninitialized incorrectly sees a variable being used without initialization: drivers/scsi/lpfc/lpfc_nvme.c:2102:37: error: variable 'localport' is uninitialized when used here [-Werror,-Wuninitialized] lport = (struct lpfc_nvme_lport *)localport->private; ^~~~~~~~~ drivers/scsi/lpfc/lpfc_nvme.c:2059:38: note: initialize the variable 'localport' to silence this warning struct nvme_fc_local_port *localport; ^ = NULL 1 error generated. This is clearly in dead code, as the condition leading up to it is always false when CONFIG_NVME_FC is disabled, and the variable is always initialized when nvme_fc_register_localport() got called successfully. Change the preprocessor conditional to the equivalent C construct, which makes the code more readable and gets rid of the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: qla4xxx: avoid freeing unallocated dma memoryArnd Bergmann
commit 608f729c31d4caf52216ea00d20092a80959256d upstream. Clang -Wuninitialized notices that on is_qla40XX we never allocate any DMA memory in get_fw_boot_info() but attempt to free it anyway: drivers/scsi/qla4xxx/ql4_os.c:5915:7: error: variable 'buf_dma' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (!(val & 0x07)) { ^~~~~~~~~~~~~ drivers/scsi/qla4xxx/ql4_os.c:5985:47: note: uninitialized use occurs here dma_free_coherent(&ha->pdev->dev, size, buf, buf_dma); ^~~~~~~ drivers/scsi/qla4xxx/ql4_os.c:5915:3: note: remove the 'if' if its condition is always true if (!(val & 0x07)) { ^~~~~~~~~~~~~~~~~~~ drivers/scsi/qla4xxx/ql4_os.c:5885:20: note: initialize the variable 'buf_dma' to silence this warning dma_addr_t buf_dma; ^ = 0 Skip the call to dma_free_coherent() here. Fixes: 2a991c215978 ("[SCSI] qla4xxx: Boot from SAN support for open-iscsi") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: qedf: Add missing return in qedf_post_io_req() in the fcport offload checkChad Dupuis
commit c5e06ba2f76809ad1492fdad312e81335df46bc5 upstream. Fixes the following crash as the return was missing from the check if an fcport is offloaded. If we hit this code we continue to try to post an invalid task which can lead to the crash: [30259.616411] [0000:61:00.3]:[qedf_post_io_req:989]:3: Session not offloaded yet. [30259.616413] [0000:61:00.3]:[qedf_upload_connection:1340]:3: Uploading connection port_id=490020. [30259.623769] BUG: unable to handle kernel NULL pointer dereference at 0000000000000198 [30259.631645] IP: [<ffffffffc035b1ed>] qedf_init_task.isra.16+0x3d/0x450 [qedf] [30259.638816] PGD 0 [30259.640841] Oops: 0000 [#1] SMP [30259.644098] Modules linked in: fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter vfat fat ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_ucm ib_umad dm_service_time skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel rpcrdma sunrpc rdma_ucm ib_uverbs lrw gf128mul ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi qedr(OE) glue_helper ablk_helper cryptd ib_core dm_round_robin joydev pcspkr ipmi_ssif ses enclosure ipmi_si ipmi_devintf ipmi_msghandler mei_me [30259.715529] mei sg hpilo hpwdt shpchp wmi lpc_ich acpi_power_meter dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic uas usb_storage mgag200 qedf(OE) i2c_algo_bit libfcoe drm_kms_helper libfc syscopyarea sysfillrect scsi_transport_fc qede(OE) sysimgblt fb_sys_fops ptp ttm pps_core drm qed(OE) smartpqi crct10dif_pclmul crct10dif_common crc32c_intel i2c_core scsi_transport_sas scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [30259.754237] CPU: 9 PID: 977 Comm: kdmwork-253:7 Kdump: loaded Tainted: G W OE ------------ 3.10.0-862.el7.x86_64 #1 [30259.765664] Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 04/04/2018 [30259.775000] task: ffff8c801efd0000 ti: ffff8c801efd8000 task.ti: ffff8c801efd8000 [30259.782505] RIP: 0010:[<ffffffffc035b1ed>] [<ffffffffc035b1ed>] qedf_init_task.isra.16+0x3d/0x450 [qedf] [30259.792116] RSP: 0018:ffff8c801efdbbb0 EFLAGS: 00010046 [30259.797444] RAX: 0000000000000000 RBX: ffffa7f1450948d8 RCX: ffff8c7fe5bc40c8 [30259.804600] RDX: ffff8c800715b300 RSI: ffffa7f1450948d8 RDI: ffff8c80169c2480 [30259.811755] RBP: ffff8c801efdbc30 R08: 00000000000000ae R09: ffff8c800a314540 [30259.818911] R10: ffff8c7fe5bc40c8 R11: ffff8c801efdb8ae R12: 0000000000000000 [30259.826068] R13: ffff8c800715b300 R14: ffff8c80169c2480 R15: ffff8c8005da28e0 [30259.833223] FS: 0000000000000000(0000) GS:ffff8c803f840000(0000) knlGS:0000000000000000 [30259.841338] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [30259.847100] CR2: 0000000000000198 CR3: 000000081242e000 CR4: 00000000007607e0 [30259.854256] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [30259.861412] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [30259.868568] PKRU: 00000000 [30259.871278] Call Trace: [30259.873737] [<ffffffffc035c948>] qedf_post_io_req+0x148/0x680 [qedf] [30259.880201] [<ffffffffc035d070>] qedf_queuecommand+0x1f0/0x240 [qedf] [30259.886749] [<ffffffffa329b050>] scsi_dispatch_cmd+0xb0/0x240 [30259.892600] [<ffffffffa32a45bc>] scsi_request_fn+0x4cc/0x680 [30259.898364] [<ffffffffa3118ad9>] __blk_run_queue+0x39/0x50 [30259.903954] [<ffffffffa3114393>] __elv_add_request+0xd3/0x260 [30259.909805] [<ffffffffa311baf0>] blk_insert_cloned_request+0xf0/0x1b0 [30259.916358] [<ffffffffc010b622>] map_request+0x142/0x220 [dm_mod] [30259.922560] [<ffffffffc010b716>] map_tio_request+0x16/0x40 [dm_mod] [30259.928932] [<ffffffffa2ebb1f5>] kthread_worker_fn+0x85/0x180 [30259.934782] [<ffffffffa2ebb170>] ? kthread_stop+0xf0/0xf0 [30259.940284] [<ffffffffa2ebae31>] kthread+0xd1/0xe0 [30259.945176] [<ffffffffa2ebad60>] ? insert_kthread_work+0x40/0x40 [30259.951290] [<ffffffffa351f61d>] ret_from_fork_nospec_begin+0x7/0x21 [30259.957750] [<ffffffffa2ebad60>] ? insert_kthread_work+0x40/0x40 [30259.963860] Code: fe 41 55 49 89 d5 41 54 53 48 89 f3 48 83 ec 58 4c 8b 67 28 4c 8b 4e 18 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 4c 8b 7e 58 <49> 8b 84 24 98 01 00 00 48 8b 00 f6 80 31 01 00 00 10 0f 85 0b [30259.983372] RIP [<ffffffffc035b1ed>] qedf_init_task.isra.16+0x3d/0x450 [qedf] [30259.990630] RSP <ffff8c801efdbbb0> [30259.994127] CR2: 0000000000000198 Signed-off-by: Chad Dupuis <cdupuis@marvell.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: ufs: Avoid configuring regulator with undefined voltage rangeStanley Chu
commit 3b141e8cfd54ba3e5c610717295b2a02aab26a05 upstream. For regulators used by UFS, vcc, vccq and vccq2 will have voltage range initialized by ufshcd_populate_vreg(), however other regulators may have undefined voltage range if dt-bindings have no such definition. In above undefined case, both "min_uV" and "max_uV" fields in ufs_vreg struct will be zero values and these values will be configured on regulators in different power modes. Currently this may have no harm if both "min_uV" and "max_uV" always keep "zero values" because regulator_set_voltage() will always bypass such invalid values and return "good" results. However improper values shall be fixed to avoid potential bugs. Simply bypass voltage configuration if voltage range is not defined. Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Acked-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: ufs: Fix regulator load and icc-level configurationStanley Chu
commit 0487fff76632ec023d394a05b82e87a971db8c03 upstream. Currently if a regulator has "<name>-fixed-regulator" property in device tree, it will skip current limit initialization. This lead to a zero "max_uA" value in struct ufs_vreg. However, "regulator_set_load" operation shall be required on regulators which have valid current limits, otherwise a zero "max_uA" set by "regulator_set_load" may cause unexpected behavior when this regulator is enabled or set as high power mode. Similarly, in device's icc_level configuration flow, the target icc_level shall be updated if regulator also has valid current limit, otherwise a wrong icc_level will be calculated by zero "max_uA" and thus causes unexpected results after it is written to device. Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Acked-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-17scsi: libsas: Do discovery on empty PHY to update PHY infoJohn Garry
commit d8649fc1c5e40e691d589ed825998c36a947491c upstream. When we discover the PHY is empty in sas_rediscover_dev(), the PHY information (like negotiated linkrate) is not updated. As such, for a user examining sysfs for that PHY, they would see incorrect values: root@(none)$ cd /sys/class/sas_phy/phy-0:0:20 root@(none)$ more negotiated_linkrate 3.0 Gbit root@(none)$ echo 0 > enable root@(none)$ more negotiated_linkrate 3.0 Gbit So fix this, simply discover the PHY again, even though we know it's empty; in the above example, this gives us: root@(none)$ more negotiated_linkrate Phy disabled We must do this after unregistering the device associated with the PHY (in sas_unregister_devs_sas_addr()). Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02scsi: qedi: Abort ep termination if offload not scheduledManish Rangankar
commit f848bfd8e167210a29374e8a678892bed591684f upstream. Sometimes during connection recovery when there is a failure to resolve ARP, and offload connection was not issued, driver tries to flush pending offload connection work which was not queued up. kernel: WARNING: CPU: 19 PID: 10110 at kernel/workqueue.c:3030 __flush_work.isra.34+0x19c/0x1b0 kernel: CPU: 19 PID: 10110 Comm: iscsid Tainted: G W 5.1.0-rc4 #11 kernel: Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.9.1 12/04/2018 kernel: RIP: 0010:__flush_work.isra.34+0x19c/0x1b0 kernel: Code: 8b fb 66 0f 1f 44 00 00 31 c0 eb ab 48 89 ef c6 07 00 0f 1f 40 00 fb 66 0f 1f 44 00 00 31 c0 eb 96 e8 08 16 fe ff 0f 0b eb 8d <0f> 0b 31 c0 eb 87 0f 1f 40 00 66 2e 0f 1 f 84 00 00 00 00 00 0f 1f kernel: RSP: 0018:ffffa6b4054dba68 EFLAGS: 00010246 kernel: RAX: 0000000000000000 RBX: ffff91df21c36fc0 RCX: 0000000000000000 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff91df21c36fc0 kernel: RBP: ffff91df21c36ef0 R08: 0000000000000000 R09: 0000000000000000 kernel: R10: 0000000000000038 R11: ffffa6b4054dbd60 R12: ffffffffc05e72c0 kernel: R13: ffff91db10280820 R14: 0000000000000048 R15: 0000000000000000 kernel: FS: 00007f5d83cc1740(0000) GS:ffff91df2f840000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000001cc5000 CR3: 0000000465450002 CR4: 00000000001606e0 kernel: Call Trace: kernel: ? try_to_del_timer_sync+0x4d/0x80 kernel: qedi_ep_disconnect+0x3b/0x410 [qedi] kernel: ? 0xffffffffc083c000 kernel: ? klist_iter_exit+0x14/0x20 kernel: ? class_find_device+0x93/0xf0 kernel: iscsi_if_ep_disconnect.isra.18+0x58/0x70 [scsi_transport_iscsi] kernel: iscsi_if_recv_msg+0x10e2/0x1510 [scsi_transport_iscsi] kernel: ? copyout+0x22/0x30 kernel: ? _copy_to_iter+0xa0/0x430 kernel: ? _cond_resched+0x15/0x30 kernel: ? __kmalloc_node_track_caller+0x1f9/0x270 kernel: iscsi_if_rx+0xa5/0x1e0 [scsi_transport_iscsi] kernel: netlink_unicast+0x17f/0x230 kernel: netlink_sendmsg+0x2d2/0x3d0 kernel: sock_sendmsg+0x36/0x50 kernel: ___sys_sendmsg+0x280/0x2a0 kernel: ? timerqueue_add+0x54/0x80 kernel: ? enqueue_hrtimer+0x38/0x90 kernel: ? hrtimer_start_range_ns+0x19f/0x2c0 kernel: __sys_sendmsg+0x58/0xa0 kernel: do_syscall_64+0x5b/0x180 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02scsi: qla2xxx: Fix hardirq-unsafe lockingBart Van Assche
commit 300ec7415c1fed5c73660f50c8e14a67e236dc0a upstream. Since fc_remote_port_delete() must be called with interrupts enabled, do not disable interrupts when calling that function. Remove the lockin calls from around the put_sess() call. This is safe because the function that is called when the final reference is dropped, qlt_unreg_sess(), grabs the proper locks. This patch avoids that lockdep reports the following: WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected kworker/2:1/62 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 0000000009e679b3 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0 and this task is already holding: 00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst] which would create a new lock dependency: (&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.} but this new dependency connects a HARDIRQ-irq-safe lock: (&(&ha->tgt.sess_lock)->rlock){-...} ... which became HARDIRQ-irq-safe at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst] qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x2a8/0x320 secondary_startup_64+0xa4/0xb0 to a HARDIRQ-irq-unsafe lock: (&(&k->k_lock)->rlock){+.+.} ... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&k->k_lock)->rlock); local_irq_disable(); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&k->k_lock)->rlock); <Interrupt> lock(&(&ha->tgt.sess_lock)->rlock); *** DEADLOCK *** 3 locks held by kworker/2:1/62: #0: 00000000a4319c16 ((wq_completion)"qla2xxx_wq"){+.+.}, at: process_one_work+0x437/0xa80 #1: 00000000ffa34c42 ((work_completion)(&sess->del_work)){+.+.}, at: process_one_work+0x437/0xa80 #2: 00000000a033b71c (&(&ha->tgt.sess_lock)->rlock){-...}, at: qla24xx_delete_sess_fn+0x55/0xf0 [qla2xxx_scst] the dependencies between HARDIRQ-irq-safe lock and the holding lock: -> (&(&ha->tgt.sess_lock)->rlock){-...} ops: 8 { IN-HARDIRQ-W at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst] qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x2a8/0x320 secondary_startup_64+0xa4/0xb0 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla24xx_report_id_acquisition+0xa69/0xe30 [qla2xxx_scst] qla24xx_process_response_queue+0x69e/0x1270 [qla2xxx_scst] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx_scst] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x2a8/0x320 secondary_startup_64+0xa4/0xb0 } ... key at: [<ffffffffa0c0d080>] __key.85462+0x0/0xfffffffffff7df80 [qla2xxx_scst] ... acquired at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst] qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst] qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst] process_one_work+0x511/0xa80 worker_thread+0x67/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50 the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: -> (&(&k->k_lock)->rlock){+.+.} ops: 13831 { HARDIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 SOFTIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7e1/0xb50 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 } ... key at: [<ffffffff83ed8780>] __key.15491+0x0/0x40 ... acquired at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst] qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst] qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst] process_one_work+0x511/0xa80 worker_thread+0x67/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50 stack backtrace: CPU: 2 PID: 62 Comm: kworker/2:1 Tainted: G O 5.0.7-dbg+ #8 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: qla2xxx_wq qla24xx_delete_sess_fn [qla2xxx_scst] Call Trace: dump_stack+0x86/0xca check_usage.cold.52+0x473/0x563 __lock_acquire+0x11c0/0x23e0 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0xa0b/0xa30 [qla2xxx_scst] qlt_unreg_sess+0x1c6/0x380 [qla2xxx_scst] qla24xx_delete_sess_fn+0xe6/0xf0 [qla2xxx_scst] process_one_work+0x511/0xa80 worker_thread+0x67/0x5b0 kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50 Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02scsi: qla2xxx: Avoid that lockdep complains about unsafe locking in ↵Bart Van Assche
tcm_qla2xxx_close_session() commit d4023db71108375e4194e92730ba0d32d7f07813 upstream. This patch avoids that lockdep reports the following warning: ===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 5.1.0-rc1-dbg+ #11 Tainted: G W ----------------------------------------------------- rmdir/1478 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 00000000e7ac4607 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0 and this task is already holding: 00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx] which would create a new lock dependency: (&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.} but this new dependency connects a HARDIRQ-irq-safe lock: (&(&ha->tgt.sess_lock)->rlock){-...} ... which became HARDIRQ-irq-safe at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_fcport_event_handler+0x1f3d/0x22b0 [qla2xxx] qla2x00_async_login_sp_done+0x1dc/0x1f0 [qla2xxx] qla24xx_process_response_queue+0xa37/0x10e0 [qla2xxx] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x24d/0x2d0 secondary_startup_64+0xa4/0xb0 to a HARDIRQ-irq-unsafe lock: (&(&k->k_lock)->rlock){+.+.} ... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&k->k_lock)->rlock); local_irq_disable(); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&k->k_lock)->rlock); <Interrupt> lock(&(&ha->tgt.sess_lock)->rlock); *** DEADLOCK *** 4 locks held by rmdir/1478: #0: 000000002c7f1ba4 (sb_writers#10){.+.+}, at: mnt_want_write+0x32/0x70 #1: 00000000c85eb147 (&default_group_class[depth - 1]#2/1){+.+.}, at: do_rmdir+0x217/0x2d0 #2: 000000002b164d6f (&sb->s_type->i_mutex_key#13){++++}, at: vfs_rmdir+0x7e/0x1d0 #3: 00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx] the dependencies between HARDIRQ-irq-safe lock and the holding lock: -> (&(&ha->tgt.sess_lock)->rlock){-...} ops: 127 { IN-HARDIRQ-W at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_fcport_event_handler+0x1f3d/0x22b0 [qla2xxx] qla2x00_async_login_sp_done+0x1dc/0x1f0 [qla2xxx] qla24xx_process_response_queue+0xa37/0x10e0 [qla2xxx] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x24d/0x2d0 secondary_startup_64+0xa4/0xb0 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_loop_resync+0xb3d/0x2690 [qla2xxx] qla2x00_do_dpc+0xcee/0xf30 [qla2xxx] kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50 } ... key at: [<ffffffffa125f700>] __key.62804+0x0/0xfffffffffff7e900 [qla2xxx] ... acquired at: __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: -> (&(&k->k_lock)->rlock){+.+.} ops: 14568 { HARDIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 SOFTIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 } ... key at: [<ffffffff83f3d900>] __key.15805+0x0/0x40 ... acquired at: __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe stack backtrace: CPU: 7 PID: 1478 Comm: rmdir Tainted: G W 5.1.0-rc1-dbg+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x86/0xca check_usage.cold.59+0x473/0x563 check_prev_add.constprop.43+0x1f1/0x1170 __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02scsi: qla2xxx: Fix abort handling in tcm_qla2xxx_write_pending()Bart Van Assche
commit e209783d66bca04b5fce4429e59338517ffc1a0b upstream. Implementations of the .write_pending() callback functions must guarantee that an appropriate LIO core callback function will be called immediately or at a later time. Make sure that this guarantee is met for aborted SCSI commands. [mkp: typo] Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Fixes: 694833ee00c4 ("scsi: tcm_qla2xxx: Do not allow aborted cmd to advance.") # v4.13. Fixes: a07100e00ac4 ("qla2xxx: Fix TMR ABORT interaction issue between qla2xxx and TCM") # v4.5. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02scsi: qla2xxx: Fix a qla24xx_enable_msix() error pathBart Van Assche
commit 24afabdbd0b3553963a2bbf465895492b14d1107 upstream. Make sure that the allocated interrupts are freed if allocating memory for the msix_entries array fails. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02Revert "scsi: sd: Keep disk read-only when re-reading partition"Martin K. Petersen
commit 8acf608e602f6ec38b7cc37b04c80f1ce9a1a6cc upstream. This reverts commit 20bd1d026aacc5399464f8328f305985c493cde3. This patch introduced regressions for devices that come online in read-only state and subsequently switch to read-write. Given how the partition code is currently implemented it is not possible to persist the read-only flag across a device revalidate call. This may need to get addressed in the future since it is common for user applications to proactively call BLKRRPART. Reverting this commit will re-introduce a regression where a device-initiated revalidate event will cause the admin state to be forgotten. A separate patch will address this issue. Fixes: 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading partition") Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30scsi: aic7xxx: fix EISA supportChristoph Hellwig
commit 144ec97493af34efdb77c5aba146e9c7de8d0a06 upstream. Instead of relying on the now removed NULL argument to pci_alloc_consistent, switch to the generic DMA API, and store the struct device so that we can pass it. Fixes: 4167b2ad5182 ("PCI: Remove NULL device handling from PCI DMA API") Reported-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Matthew Whitehead <tedheadster@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30scsi: qla2xxx: Fix device staying in blocked stateQuinn Tran
commit 2137490f2147a8d0799b72b9a1023efb012d40c7 upstream. This patch fixes issue reported by some of the customers, who discovered that after cable pull scenario the devices disappear and path seems to remain in blocked state. Once the device reappears, driver does not seem to update path to online. This issue appears because of the defer flag creating race condition where the same session reappears. This patch fixes this issue by indicating SCSI-ML of device lost when qlt_free_session_done() is called from qlt_unreg_sess(). Fixes: 41dc529a4602a ("qla2xxx: Improve RSCN handling in driver") Signed-off-by: Quinn Tran <qtran@marvell.com> Cc: stable@vger.kernel.org #4.19 Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> [PG: (... 0, 0) ---> (... 1, 1) for older code base of 4.18.x] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30scsi: qla2xxx: Fix incorrect region-size setting in optrom SYSFS routinesAndrew Vasquez
commit 5cbdae10bf11f96e30b4d14de7b08c8b490e903c upstream. Commit e6f77540c067 ("scsi: qla2xxx: Fix an integer overflow in sysfs code") incorrectly set 'optrom_region_size' to 'start+size', which can overflow option-rom boundaries when 'start' is non-zero. Continue setting optrom_region_size to the proper adjusted value of 'size'. Fixes: e6f77540c067 ("scsi: qla2xxx: Fix an integer overflow in sysfs code") Cc: stable@vger.kernel.org Signed-off-by: Andrew Vasquez <andrewv@marvell.com> Signed-off-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30scsi: lpfc: change snprintf to scnprintf for possible overflowSilvio Cesare
commit e7f7b6f38a44697428f5a2e7c606de028df2b0e3 upstream. Change snprintf to scnprintf. There are generally two cases where using snprintf causes problems. 1) Uses of size += snprintf(buf, SIZE - size, fmt, ...) In this case, if snprintf would have written more characters than what the buffer size (SIZE) is, then size will end up larger than SIZE. In later uses of snprintf, SIZE - size will result in a negative number, leading to problems. Note that size might already be too large by using size = snprintf before the code reaches a case of size += snprintf. 2) If size is ultimately used as a length parameter for a copy back to user space, then it will potentially allow for a buffer overflow and information disclosure when size is greater than SIZE. When the size is used to index the buffer directly, we can have memory corruption. This also means when size = snprintf... is used, it may also cause problems since size may become large. Copying to userspace is mitigated by the HARDENED_USERCOPY kernel configuration. The solution to these issues is to use scnprintf which returns the number of characters actually written to the buffer, so the size variable will never exceed SIZE. Signed-off-by: Silvio Cesare <silvio.cesare@gmail.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: James Smart <james.smart@broadcom.com> Cc: Dick Kennedy <dick.kennedy@broadcom.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Kees Cook <keescook@chromium.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [PG: use 4.19.x backport version here for v4.18.40] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30scsi: csiostor: fix missing data copy in csio_scsi_err_handler()Varun Prakash
commit 5c2442fd78998af60e13aba506d103f7f43f8701 upstream. If scsi cmd sglist is not suitable for DDP then csiostor driver uses preallocated buffers for DDP, because of this data copy is required from DDP buffer to scsi cmd sglist before calling ->scsi_done(). Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30scsi: libsas: fix a race condition when smp task timeoutJason Yan
commit b90cd6f2b905905fb42671009dc0e27c310a16ae upstream. When the lldd is processing the complete sas task in interrupt and set the task stat as SAS_TASK_STATE_DONE, the smp timeout timer is able to be triggered at the same time. And smp_task_timedout() will complete the task wheter the SAS_TASK_STATE_DONE is set or not. Then the sas task may freed before lldd end the interrupt process. Thus a use-after-free will happen. Fix this by calling the complete() only when SAS_TASK_STATE_DONE is not set. And remove the check of the return value of the del_timer(). Once the LLDD sets DONE, it must call task->done(), which will call smp_task_done()->complete() and the task will be completed and freed correctly. Reported-by: chenxiang <chenxiang66@hisilicon.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> CC: Hannes Reinecke <hare@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23scsi: storvsc: Fix calculation of sub-channel countMichael Kelley
commit 382e06d11e075a40b4094b6ef809f8d4bcc7ab2a upstream. When the number of sub-channels offered by Hyper-V is >= the number of CPUs in the VM, calculate the correct number of sub-channels. The current code produces one too many. This scenario arises only when the number of CPUs is artificially restricted (for example, with maxcpus=<n> on the kernel boot line), because Hyper-V normally offers a sub-channel count < number of CPUs. While the current code doesn't break, the extra sub-channel is unbalanced across the CPUs (for example, a total of 5 channels on a VM with 4 CPUs). Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23scsi: core: add new RDAC LENOVO/DE_Series deviceXose Vazquez Perez
commit 1cb1d2c64e812928fe0a40b8f7e74523d0283dbe upstream. Blacklist "Universal Xport" LUN. It's used for in-band storage array management. Also add model to the rdac dh family. Cc: Martin Wilck <mwilck@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com> Cc: Christophe Varoqui <christophe.varoqui@opensvc.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: SCSI ML <linux-scsi@vger.kernel.org> Cc: DM ML <dm-devel@redhat.com> Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23scsi: qla4xxx: fix a potential NULL pointer dereferenceKangjie Lu
commit fba1bdd2a9a93f3e2181ec1936a3c2f6b37e7ed6 upstream. In case iscsi_lookup_endpoint fails, the fix returns -EINVAL to avoid NULL pointer dereference. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Acked-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23scsi: aacraid: Insure we don't access PCIe space during AER/EEHDave Carroll
commit b6554cfe09e1f610aed7d57164ab7760be57acd9 upstream. There are a few windows during AER/EEH when we can access PCIe I/O mapped registers. This will harden the access to insure we do not allow PCIe access during errors Signed-off-by: Dave Carroll <david.carroll@microsemi.com> Reviewed-by: Sagar Biradar <sagar.biradar@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23scsi: mpt3sas: Fix kernel panic during expander resetSreekanth Reddy
commit c2fe742ff6e77c5b4fe4ad273191ddf28fdea25e upstream. During expander reset handling, the driver invokes kernel function scsi_host_find_tag() to obtain outstanding requests associated with the scsi host managed by the driver. Driver loops from tag value zero to hba queue depth to obtain the outstanding scmds. But when blk-mq is enabled, the block layer may return stale entry for one or more requests. This may lead to kernel panic if the returned value is inaccessible or the memory pointed by the returned value is reused. Reference of upstream discussion: https://patchwork.kernel.org/patch/10734933/ Instead of calling scsi_host_find_tag() API for each and every smid (smid is tag +1) from one to shost->can_queue, now driver will call this API (to obtain the outstanding scmd) only for those smid's which are outstanding at the driver level. Driver will determine whether this smid is outstanding at driver level by looking into it's corresponding MPI request frame, if its MPI request frame is empty, then it means that this smid is free and does not need to call scsi_host_find_tag() for it. By doing this, driver will invoke scsi_host_find_tag() for only those tags which are outstanding at the driver level. Driver will check whether particular MPI request frame is empty or not by looking into the "DevHandle" field. If this field is zero then it means that this MPI request is empty. For active MPI request DevHandle must be non-zero. Also driver will memset the MPI request frame once the corresponding scmd is processed (i.e. just before calling scmd->done function). Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-15Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO"Saurav Kashyap
commit 0228034d8e5915b98c33db35a98f5e909e848ae9 upstream. This patch clears FC_RP_STARTED flag during logoff, because of this re-login(flogi) didn't happen to the switch. This reverts commit 1550ec458e0cf1a40a170ab1f4c46e3f52860f65. Fixes: 1550ec458e0c ("scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO") Cc: <stable@vger.kernel.org> # v4.18+ Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Reviewed-by: Hannes Reinecke <hare@#suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-15scsi: core: set result when the command cannot be dispatchedJaesoo Lee
commit be549d49115422f846b6d96ee8fd7173a5f7ceb0 upstream. When SCSI blk-mq is enabled, there is a bug in handling errors in scsi_queue_rq. Specifically, the bug is not setting result field of scsi_request correctly when the dispatch of the command has been failed. Since the upper layer code including the sg_io ioctl expects to receive any error status from result field of scsi_request, the error is silently ignored and this could cause data corruptions for some applications. Fixes: d285203cf647 ("scsi: add support for a blk-mq based I/O path.") Cc: <stable@vger.kernel.org> Signed-off-by: Jaesoo Lee <jalee@purestorage.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-05scsi: core: Avoid that system resume triggers a kernel warningBart Van Assche
commit 388b4e6a00bb3097278ed1648ac5a1cb48c894e6 upstream. scsi_device_quiesce() and scsi_device_resume() are called during system-wide suspend and resume. scsi_device_quiesce() only succeeds for SCSI devices that are in one of the RUNNING, OFFLINE or TRANSPORT_OFFLINE states (see also scsi_set_device_state()). This patch avoids that the following warning is triggered when resuming a system for which quiescing a SCSI device failed: WARNING: CPU: 2 PID: 11303 at drivers/scsi/scsi_lib.c:2600 scsi_device_resume+0x4f/0x58 CPU: 2 PID: 11303 Comm: kworker/u8:70 Not tainted 5.0.0-rc1+ #50 Hardware name: LENOVO 80E3/Lancer 5B2, BIOS A2CN45WW(V2.13) 08/04/2016 Workqueue: events_unbound async_run_entry_fn Call Trace: scsi_dev_type_resume+0x2e/0x60 async_run_entry_fn+0x32/0xd8 process_one_work+0x1f4/0x420 worker_thread+0x28/0x3c0 kthread+0x118/0x130 ret_from_fork+0x22/0x40 Cc: Przemek Socha <soprwa@gmail.com> Reported-by: Przemek Socha <soprwa@gmail.com> Fixes: 3a0a529971ec ("block, scsi: Make SCSI quiesce and resume work reliably") # v4.15 Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-05scsi: iscsi: flush running unbind operations when removing a sessionMaurizio Lombardi
commit 165aa2bfb42904b1bec4bf2fa257c8c603c14a06 upstream. In some cases, the iscsi_remove_session() function is called while an unbind_work operation is still running. This may cause a situation where sysfs objects are removed in an incorrect order, triggering a kernel warning. [ 605.249442] ------------[ cut here ]------------ [ 605.259180] sysfs group 'power' not found for kobject 'target2:0:0' [ 605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 sysfs_remove_group+0x76/0x80 [ 605.341266] Modules linked in: dm_service_time target_core_user target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi [ 605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 4.18.0-60.el8.x86_64 #1 [ 605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014 [ 605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session [scsi_transport_iscsi] [ 605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80 [ 605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55 [ 606.122304] RSP: 0018:ffffbadcc8d1bda8 EFLAGS: 00010286 [ 606.218492] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 606.326381] RDX: ffff98bdfe85eb40 RSI: ffff98bdfe856818 RDI: ffff98bdfe856818 [ 606.514498] RBP: ffffffffa7ab73e0 R08: 0000000000000268 R09: 0000000000000007 [ 606.529469] R10: 0000000000000000 R11: ffffffffa860d9ad R12: ffff98bdf978e838 [ 606.630535] R13: ffff98bdc2cd4010 R14: ffff98bdc2cd3ff0 R15: ffff98bdc2cd4000 [ 606.824707] FS: 0000000000000000(0000) GS:ffff98bdfe840000(0000) knlGS:0000000000000000 [ 607.018333] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 607.117844] CR2: 00007f84b78ac024 CR3: 000000002c00a003 CR4: 00000000003606e0 [ 607.117844] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 607.420926] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 607.524236] Call Trace: [ 607.530591] device_del+0x56/0x350 [ 607.624393] ? ata_tlink_match+0x30/0x30 [libata] [ 607.727805] ? attribute_container_device_trigger+0xb4/0xf0 [ 607.829911] scsi_target_reap_ref_release+0x39/0x50 [ 607.928572] scsi_remove_target+0x1a2/0x1d0 [ 608.017350] __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi] [ 608.117435] process_one_work+0x1a7/0x360 [ 608.132917] worker_thread+0x30/0x390 [ 608.222900] ? pwq_unbound_release_workfn+0xd0/0xd0 [ 608.323989] kthread+0x112/0x130 [ 608.418318] ? kthread_bind+0x30/0x30 [ 608.513821] ret_from_fork+0x35/0x40 [ 608.613909] ---[ end trace 0b98c310c8a6138c ]--- Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-12scsi: fcoe: make use of fip_mode enum completeSedat Dilek
commit 8beb90aaf334a6efa3e924339926b5f93a234dbb upstream. commit 1917d42d14b7 ("fcoe: use enum for fip_mode") introduces a separate enum for the fip_mode that shall be used during initialisation handling until it is passed to fcoe_ctrl_link_up to set the initial fip_state. That change was incomplete and gcc quietly converted in various places between the fip_mode and the fip_state enum values with implicit enum conversions, which fortunately cannot cause any issues in the actual code's execution. clang however warns about these implicit enum conversions in the scsi drivers. This commit consolidates the use of the two enums, guided by clang's enum-conversion warnings. This commit now completes the use of the fip_mode: It expects and uses fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it calls fcoe_ctrl_set_set() with the correct values in fcoe_ctlr_link_up(). It also breaks the association between FIP_MODE_AUTO and FIP_ST_AUTO to indicate these two enums are distinct. Link: https://github.com/ClangBuiltLinux/linux/issues/151 Fixes: 1917d42d14b7 ("fcoe: use enum for fip_mode") Reported-by: Dmitry Golovin <dima@golovin.in> Original-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> CC: Lukas Bulwahn <lukas.bulwahn@gmail.com> CC: Nick Desaulniers <ndesaulniers@google.com> CC: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Suggested-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-12scsi: megaraid_sas: return error when create DMA pool failedJason Yan
commit bcf3b67d16a4c8ffae0aa79de5853435e683945c upstream. when create DMA pool for cmd frames failed, we should return -ENOMEM, instead of 0. In some case in: megasas_init_adapter_fusion() -->megasas_alloc_cmds() -->megasas_create_frame_pool create DMA pool failed, --> megasas_free_cmds() [1] -->megasas_alloc_cmds_fusion() failed, then goto fail_alloc_cmds. -->megasas_free_cmds() [2] we will call megasas_free_cmds twice, [1] will kfree cmd_list, [2] will use cmd_list.it will cause a problem: Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = ffffffc000f70000 [00000000] *pgd=0000001fbf893003, *pud=0000001fbf893003, *pmd=0000001fbf894003, *pte=006000006d000707 Internal error: Oops: 96000005 [#1] SMP Modules linked in: CPU: 18 PID: 1 Comm: swapper/0 Not tainted task: ffffffdfb9290000 ti: ffffffdfb923c000 task.ti: ffffffdfb923c000 PC is at megasas_free_cmds+0x30/0x70 LR is at megasas_free_cmds+0x24/0x70 ... Call trace: [<ffffffc0005b779c>] megasas_free_cmds+0x30/0x70 [<ffffffc0005bca74>] megasas_init_adapter_fusion+0x2f4/0x4d8 [<ffffffc0005b926c>] megasas_init_fw+0x2dc/0x760 [<ffffffc0005b9ab0>] megasas_probe_one+0x3c0/0xcd8 [<ffffffc0004a5abc>] local_pci_probe+0x4c/0xb4 [<ffffffc0004a5c40>] pci_device_probe+0x11c/0x14c [<ffffffc00053a5e4>] driver_probe_device+0x1ec/0x430 [<ffffffc00053a92c>] __driver_attach+0xa8/0xb0 [<ffffffc000538178>] bus_for_each_dev+0x74/0xc8 [<ffffffc000539e88>] driver_attach+0x28/0x34 [<ffffffc000539a18>] bus_add_driver+0x16c/0x248 [<ffffffc00053b234>] driver_register+0x6c/0x138 [<ffffffc0004a5350>] __pci_register_driver+0x5c/0x6c [<ffffffc000ce3868>] megasas_init+0xc0/0x1a8 [<ffffffc000082a58>] do_one_initcall+0xe8/0x1ec [<ffffffc000ca7be8>] kernel_init_freeable+0x1c8/0x284 [<ffffffc0008d90b8>] kernel_init+0x1c/0xe4 Signed-off-by: Jason Yan <yanaijie@huawei.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-12scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.cBenjamin Block
commit 1749ef00f7312679f76d5e9104c5d1e22a829038 upstream. We had a test-report where, under memory pressure, adding LUNs to the systems would fail (the tests add LUNs strictly in sequence): [ 5525.853432] scsi 0:0:1:1088045124: Direct-Access IBM 2107900 .148 PQ: 0 ANSI: 5 [ 5525.853826] scsi 0:0:1:1088045124: alua: supports implicit TPGS [ 5525.853830] scsi 0:0:1:1088045124: alua: device naa.6005076303ffd32700000000000044da port group 0 rel port 43 [ 5525.853931] sd 0:0:1:1088045124: Attached scsi generic sg10 type 0 [ 5525.854075] sd 0:0:1:1088045124: [sdk] Disabling DIF Type 1 protection [ 5525.855495] sd 0:0:1:1088045124: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB) [ 5525.855606] sd 0:0:1:1088045124: [sdk] Write Protect is off [ 5525.855609] sd 0:0:1:1088045124: [sdk] Mode Sense: ed 00 00 08 [ 5525.855795] sd 0:0:1:1088045124: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 5525.857838] sdk: sdk1 [ 5525.859468] sd 0:0:1:1088045124: [sdk] Attached SCSI disk [ 5525.865073] sd 0:0:1:1088045124: alua: transition timeout set to 60 seconds [ 5525.865078] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.015070] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.015213] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA [ 5526.587439] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured [ 5526.588562] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured Looking at the code of scsi_alloc_sdev(), and all the calling contexts, there seems to be no reason to use GFP_ATMOIC here. All the different call-contexts use a mutex at some point, and nothing in between that requires no sleeping, as far as I could see. Additionally, the code that later allocates the block queue for the device (scsi_mq_alloc_queue()) already uses GFP_KERNEL. There are similar allocations in two other functions: scsi_probe_and_add_lun(), and scsi_add_lun(),; that can also be done with GFP_KERNEL. Here is the contexts for the three functions so far: scsi_alloc_sdev() scsi_probe_and_add_lun() scsi_sequential_lun_scan() __scsi_scan_target() scsi_scan_target() mutex_lock() scsi_scan_channel() scsi_scan_host_selected() mutex_lock() scsi_report_lun_scan() __scsi_scan_target() ... __scsi_add_device() mutex_lock() __scsi_scan_target() ... scsi_report_lun_scan() ... scsi_get_host_dev() mutex_lock() scsi_probe_and_add_lun() ... scsi_add_lun() scsi_probe_and_add_lun() ... So replace all these, and give them a bit of a better chance to succeed, with more chances of reclaim. Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-12scsi: hisi_sas: Fix a timeout race of driver internal and SMP IOXiang Chen
commit 4790595723d4b833b18c994973d39f9efb842887 upstream. For internal IO and SMP IO, there is a time-out timer for them. In the timer handler, it checks whether IO is done according to the flag task->task_state_lock. There is an issue which may cause system suspended: internal IO or SMP IO is sent, but at that time because of hardware exception (such as inject 2Bit ECC error), so IO is not completed and also not timeout. But, at that time, the SAS controller reset occurs to recover system. It will release the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO timeout, it will never complete the completion of IO and wait for ever. [ 729.123632] Call trace: [ 729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8 [ 729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc [ 729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c [ 729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc [ 729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0 [ 729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34 [ 729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main] [ 729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main] [ 729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8 [ 729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398 [ 729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0 [ 729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138 [ 729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18 To solve the issue, callback function task_done of those IOs need to be called when on SAS controller reset. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-12scsi: hisi_sas: Set PHY linkrate when disconnectedJohn Garry
commit efdcad62e7b8a02fcccc5ccca57806dce1482ac8 upstream. When the PHY comes down, we currently do not set the negotiated linkrate: root@(none)$ pwd /sys/class/sas_phy/phy-0:0 root@(none)$ more enable 1 root@(none)$ more negotiated_linkrate 12.0 Gbit root@(none)$ echo 0 > enable root@(none)$ more negotiated_linkrate 12.0 Gbit root@(none)$ This patch fixes the driver code to set it properly when the PHY comes down. If the PHY had been enabled, then set unknown; otherwise, flag as disabled. The logical place to set the negotiated linkrate for this scenario is PHY down routine, which is called from the PHY down ISR. However, it is not possible to know if the PHY comes down due to PHY disable or loss of link, as sas_phy.enabled member is not set until after the transport disable routine is complete, which races with the PHY down ISR. As an imperfect solution, use sas_phy_data.enable as the flag to know if the PHY is down due to disable. It's imperfect, as sas_phy_data is internal to libsas. I can't see another way without adding a new field to hisi_sas_phy and managing it, or changing SCSI SAS transport. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-07scsi: sd: Quiesce warning if device does not report optimal I/O sizeMartin K. Petersen
commit 1d5de5bd311be7cd54f02f7cd164f0349a75c876 upstream. Commit a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple of physical block size") split one conditional into several separate statements in an effort to provide more accurate warning messages when a device reports a nonsensical value. However, this reorganization accidentally dropped the precondition of the reported value being larger than zero. This lead to a warning getting emitted on devices that do not report an optimal I/O size at all. Remain silent if a device does not report an optimal I/O size. Fixes: a83da8a4509d ("scsi: sd: Optimal I/O size should be a multiple of physical block size") Cc: Randy Dunlap <rdunlap@infradead.org> Cc: <stable@vger.kernel.org> Reported-by: Hussam Al-Tayeb <ht990332@gmx.com> Tested-by: Hussam Al-Tayeb <ht990332@gmx.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-07scsi: sd: Fix a race between closing an sd device and sd I/OBart Van Assche
commit c14a57264399efd39514a2329c591a4b954246d8 upstream. The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and hence needs the disk->private_data pointer. Avoid that that pointer is cleared before all affected I/O requests have finished. This patch avoids that the following crash occurs: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Call trace: scsi_mq_uninit_cmd+0x1c/0x30 scsi_end_request+0x7c/0x1b8 scsi_io_completion+0x464/0x668 scsi_finish_command+0xbc/0x160 scsi_eh_flush_done_q+0x10c/0x170 sas_scsi_recover_host+0x84c/0xa98 [libsas] scsi_error_handler+0x140/0x5b0 kthread+0x100/0x12c ret_from_fork+0x10/0x18 Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Jason Yan <yanaijie@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reported-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-07scsi: ibmvscsi: Fix empty event pool access during host removalTyrel Datwyler
commit 7f5203c13ba8a7b7f9f6ecfe5a4d5567188d7835 upstream. The event pool used for queueing commands is destroyed fairly early in the ibmvscsi_remove() code path. Since, this happens prior to the call so scsi_remove_host() it is possible for further calls to queuecommand to be processed which manifest as a panic due to a NULL pointer dereference as seen here: PANIC: "Unable to handle kernel paging request for data at address 0x00000000" Context process backtrace: DSISR: 0000000042000000 ????Syscall Result: 0000000000000000 4 [c000000002cb3820] memcpy_power7 at c000000000064204 [Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4 5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable) 6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi] 7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod] 8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod] 9 [c000000002cb3be0] __blk_run_queue at c000000000429860 10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec 11 [c000000002cb3c40] process_one_work at c0000000000dac30 12 [c000000002cb3cd0] worker_thread at c0000000000db110 13 [c000000002cb3d80] kthread at c0000000000e3378 14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c The kernel buffer log is overfilled with this log: [11261.952732] ibmvscsi: found no event struct in pool! This patch reorders the operations during host teardown. Start by calling the SRP transport and Scsi_Host remove functions to flush any outstanding work and set the host offline. LLDD teardown follows including destruction of the event pool, freeing the Command Response Queue (CRQ), and unmapping any persistent buffers. The event pool destruction is protected by the scsi_host lock, and the pool is purged prior of any requests for which we never received a response. Finally, move the removal of the scsi host from our global list to the end so that the host is easily locatable for debugging purposes during teardown. Cc: <stable@vger.kernel.org> # v2.6.12+ Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-07scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaitonTyrel Datwyler
commit 7205981e045e752ccf96cf6ddd703a98c59d4339 upstream. For each ibmvscsi host created during a probe or destroyed during a remove we either add or remove that host to/from the global ibmvscsi_head list. This runs the risk of concurrent modification. This patch adds a simple spinlock around the list modification calls to prevent concurrent updates as is done similarly in the ibmvfc driver and ipr driver. Fixes: 32d6e4b6e4ea ("scsi: ibmvscsi: add vscsi hosts to global list_head") Cc: <stable@vger.kernel.org> # v4.10+ Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-26scsi: sd: Optimal I/O size should be a multiple of physical block sizeMartin K. Petersen
commit a83da8a4509d3ebfe03bb7fffce022e4d5d4764f upstream. It was reported that some devices report an OPTIMAL TRANSFER LENGTH of 0xFFFF blocks. That looks bogus, especially for a device with a 4096-byte physical block size. Ignore OPTIMAL TRANSFER LENGTH if it is not a multiple of the device's reported physical block size. To make the sanity checking conditionals more readable--and to facilitate printing warnings--relocate the checking to a helper function. No functional change aside from the printks. Cc: <stable@vger.kernel.org> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199759 Reported-by: Christoph Anton Mitterer <calestyo@scientia.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-26scsi: aacraid: Fix performance issue on logical drivesSagar Biradar
commit 0015437cc046e5ec2b57b00ff8312b8d432eac7c upstream. Fix performance issue where the queue depth for SmartIOC logical volumes is set to 1, and allow the usual logical volume code to be executed Fixes: a052865fe287 (aacraid: Set correct Queue Depth for HBA1000 RAW disks) Cc: stable@vger.kernel.org Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-26scsi: virtio_scsi: don't send sc payload with tmfsFelipe Franciosi
commit 3722e6a52174d7c3a00e6f5efd006ca093f346c1 upstream. The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of device-readable records and a single device-writable response entry: struct virtio_scsi_ctrl_tmf { // Device-readable part le32 type; le32 subtype; u8 lun[8]; le64 id; // Device-writable part u8 response; } The above should be organised as two descriptor entries (or potentially more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64 id" or after "u8 response". The Linux driver doesn't respect that, with virtscsi_abort() and virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf(). It results in the original scsi command payload (or writable buffers) added to the tmf. This fixes the problem by leaving cmd->sc zeroed out, which makes virtscsi_kick_cmd() add the tmf to the control vq without any payload. Cc: stable@vger.kernel.org Signed-off-by: Felipe Franciosi <felipe@nutanix.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-26scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_taskAnoob Soman
commit 79edd00dc6a96644d76b4a1cb97d94d49e026768 upstream. When a target sends Check Condition, whilst initiator is busy xmiting re-queued data, could lead to race between iscsi_complete_task() and iscsi_xmit_task() and eventually crashing with the following kernel backtrace. [3326150.987523] ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078 [3326150.987549] ALERT: IP: [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987571] WARN: PGD 569c8067 PUD 569c9067 PMD 0 [3326150.987582] WARN: Oops: 0002 [#1] SMP [3326150.987593] WARN: Modules linked in: tun nfsv3 nfs fscache dm_round_robin [3326150.987762] WARN: CPU: 2 PID: 8399 Comm: kworker/u32:1 Tainted: G O 4.4.0+2 #1 [3326150.987774] WARN: Hardware name: Dell Inc. PowerEdge R720/0W7JN5, BIOS 2.5.4 01/22/2016 [3326150.987790] WARN: Workqueue: iscsi_q_13 iscsi_xmitworker [libiscsi] [3326150.987799] WARN: task: ffff8801d50f3800 ti: ffff8801f5458000 task.ti: ffff8801f5458000 [3326150.987810] WARN: RIP: e030:[<ffffffffa05ce70d>] [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.987825] WARN: RSP: e02b:ffff8801f545bdb0 EFLAGS: 00010246 [3326150.987831] WARN: RAX: 00000000ffffffc3 RBX: ffff880282d2ab20 RCX: ffff88026b6ac480 [3326150.987842] WARN: RDX: 0000000000000000 RSI: 00000000fffffe01 RDI: ffff880282d2ab20 [3326150.987852] WARN: RBP: ffff8801f545bdc8 R08: 0000000000000000 R09: 0000000000000008 [3326150.987862] WARN: R10: 0000000000000000 R11: 000000000000fe88 R12: 0000000000000000 [3326150.987872] WARN: R13: ffff880282d2abe8 R14: ffff880282d2abd8 R15: ffff880282d2ac08 [3326150.987890] WARN: FS: 00007f5a866b4840(0000) GS:ffff88028a640000(0000) knlGS:0000000000000000 [3326150.987900] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 [3326150.987907] WARN: CR2: 0000000000000078 CR3: 0000000070244000 CR4: 0000000000042660 [3326150.987918] WARN: Stack: [3326150.987924] WARN: ffff880282d2ad58 ffff880282d2ab20 ffff880282d2abe8 ffff8801f545be18 [3326150.987938] WARN: ffffffffa05cea90 ffff880282d2abf8 ffff88026b59cc80 ffff88026b59cc00 [3326150.987951] WARN: ffff88022acf32c0 ffff880289491800 ffff880255a80800 0000000000000400 [3326150.987964] WARN: Call Trace: [3326150.987975] WARN: [<ffffffffa05cea90>] iscsi_xmitworker+0x2f0/0x360 [libiscsi] [3326150.987988] WARN: [<ffffffff8108862c>] process_one_work+0x1fc/0x3b0 [3326150.987997] WARN: [<ffffffff81088f95>] worker_thread+0x2a5/0x470 [3326150.988006] WARN: [<ffffffff8159cad8>] ? __schedule+0x648/0x870 [3326150.988015] WARN: [<ffffffff81088cf0>] ? rescuer_thread+0x300/0x300 [3326150.988023] WARN: [<ffffffff8108ddf5>] kthread+0xd5/0xe0 [3326150.988031] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988040] WARN: [<ffffffff815a0bcf>] ret_from_fork+0x3f/0x70 [3326150.988048] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110 [3326150.988127] ALERT: RIP [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi] [3326150.988138] WARN: RSP <ffff8801f545bdb0> [3326150.988144] WARN: CR2: 0000000000000078 [3326151.020366] WARN: ---[ end trace 1c60974d4678d81b ]--- Commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix list corruption regression") introduced "taskqueuelock" to fix list corruption during the race, but this wasn't enough. Re-setting of conn->task to NULL, could race with iscsi_xmit_task(). iscsi_complete_task() { .... if (conn->task == task) conn->task = NULL; } conn->task in iscsi_xmit_task() could be NULL and so will be task. __iscsi_get_task(task) will crash (NullPtr de-ref), trying to access refcount. iscsi_xmit_task() { struct iscsi_task *task = conn->task; __iscsi_get_task(task); } This commit will take extra conn->session->back_lock in iscsi_xmit_task() to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if iscsi_complete_task() wins the race. If iscsi_xmit_task() wins the race, iscsi_xmit_task() increments task->refcount (__iscsi_get_task) ensuring iscsi_complete_task() will not iscsi_free_task(). Signed-off-by: Anoob Soman <anoob.soman@citrix.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Acked-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-26scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmdBill Kuzeja
commit 388a49959ee4e4e99f160241d9599efa62cd4299 upstream. In qla2x00_async_tm_cmd, we reference off sp after it has been freed. This caused a panic on a system running a slub debug kernel. Since fcport is passed in anyways, just use that instead. Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com> Acked-by: Giridhar Malavali <gmalavali@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-03-20scsi: aacraid: Fix missing break in switch statementGustavo A. R. Silva
commit 5e420fe635813e5746b296cfc8fff4853ae205a2 upstream. Add missing break statement and fix identation issue. This bug was found thanks to the ongoing efforts to enable -Wimplicit-fallthrough. Fixes: 9cb62fa24e0d ("aacraid: Log firmware AIF messages") Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>