The driver team has highlighted that this patch is required to address this issue:

author    James Smart <jsmart2...@gmail.com>  2022-04-12 15:19:44 -0700
committer Martin K. Petersen <martin.peter...@oracle.com>  2022-04-18 22:48:43 -0400
commit    e294647b1aed4247fe52851f3a3b2b19ae906228
tree      fd7e11a3c6f680d5aabd468d523d08ffcd66b59f /drivers/scsi/lpfc
parent    b83a8c21f3fe874e12eb2b6e6c5cfb220d35c446

scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()

In an attempt to log message 0126 with LOG_TRACE_EVENT, the following
hard lockup call trace hangs the system.
Call Trace:
 _raw_spin_lock_irqsave+0x32/0x40
 lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
 lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
 lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
 lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
 lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
 lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
 lpfc_do_work+0x1485/0x1d70 [lpfc]
 kthread+0x112/0x130
 ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP

The same CPU tries to claim the phba->port_list_lock twice.

Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg(). There is no need to
take the phba->port_list_lock within lpfc_dmp_dbg().

Link: https://lore.kernel.org/r/20220412222008.126521-3-jsmart2...@gmail.com
Co-developed-by: Justin Tee <justin....@broadcom.com>
Signed-off-by: Justin Tee <justin....@broadcom.com>
Signed-off-by: James Smart <jsmart2...@gmail.com>
Signed-off-by: Martin K. Petersen <martin.peter...@oracle.com>

Diffstat (limited to 'drivers/scsi/lpfc')
-rw-r--r--  drivers/scsi/lpfc/lpfc_init.c    29
-rw-r--r--  drivers/scsi/lpfc/lpfc_logmsg.h   6
2 files changed, 4 insertions, 31 deletions

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 461d333b1b3a8..f9cd4b72d949a 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -15700,34 +15700,7 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba)
 	unsigned int temp_idx;
 	int i;
 	int j = 0;
-	unsigned long rem_nsec, iflags;
-	bool log_verbose = false;
-	struct lpfc_vport *port_iterator;
-
-	/* Don't dump messages if we explicitly set log_verbose for the
-	 * physical port or any vport.
-	 */
-	if (phba->cfg_log_verbose)
-		return;
-
-	spin_lock_irqsave(&phba->port_list_lock, iflags);
-	list_for_each_entry(port_iterator, &phba->port_list, listentry) {
-		if (port_iterator->load_flag & FC_UNLOADING)
-			continue;
-		if (scsi_host_get(lpfc_shost_from_vport(port_iterator))) {
-			if (port_iterator->cfg_log_verbose)
-				log_verbose = true;
-
-			scsi_host_put(lpfc_shost_from_vport(port_iterator));
-
-			if (log_verbose) {
-				spin_unlock_irqrestore(&phba->port_list_lock,
-						       iflags);
-				return;
-			}
-		}
-	}
-	spin_unlock_irqrestore(&phba->port_list_lock, iflags);
+	unsigned long rem_nsec;
 
 	if (atomic_cmpxchg(&phba->dbg_log_dmping, 0, 1) != 0)
 		return;
diff --git a/drivers/scsi/lpfc/lpfc_logmsg.h b/drivers/scsi/lpfc/lpfc_logmsg.h
index 7d480c7987942..a5aafe230c74f 100644
--- a/drivers/scsi/lpfc/lpfc_logmsg.h
+++ b/drivers/scsi/lpfc/lpfc_logmsg.h
@@ -73,7 +73,7 @@ do { \
 #define lpfc_printf_vlog(vport, level, mask, fmt, arg...) \
 do { \
 	{ if (((mask) & (vport)->cfg_log_verbose) || (level[1] <= '3')) { \
-		if ((mask) & LOG_TRACE_EVENT) \
+		if ((mask) & LOG_TRACE_EVENT && !(vport)->cfg_log_verbose) \
 			lpfc_dmp_dbg((vport)->phba); \
 		dev_printk(level, &((vport)->phba->pcidev)->dev, "%d:(%d):" \
 		 fmt, (vport)->phba->brd_no, vport->vpi, ##arg); \
@@ -89,11 +89,11 @@ do { \
 				 (phba)->pport->cfg_log_verbose : \
 				 (phba)->cfg_log_verbose; \
 	if (((mask) & log_verbose) || (level[1] <= '3')) { \
-		if ((mask) & LOG_TRACE_EVENT) \
+		if ((mask) & LOG_TRACE_EVENT && !log_verbose) \
 			lpfc_dmp_dbg(phba); \
 		dev_printk(level, &((phba)->pcidev)->dev, "%d:" \
 			fmt, phba->brd_no, ##arg); \
-	} else if (!(phba)->cfg_log_verbose)\
+	} else if (!log_verbose)\
 		lpfc_dbg_print(phba, "%d:" fmt, phba->brd_no, ##arg); \
 	} \
 } while (0)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1971193

Title:
  Server Crash while running IO and switch port bounce test with 2K login session

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  [Impact]
  Server crash and Call trace reported on one of the servers running IO
  and switch port bounce test from the 2K login session configuration.

  Call Trace:
  [56048.470488] Call Trace:
  [56048.470489] _raw_spin_lock_irqsave+0x32/0x40
  [56048.470489] lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
  [56048.470490] lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
  [56048.470490] lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
  [56048.470490] lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
  [56048.470491] lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
  [56048.470491] lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
  [56048.470492] lpfc_do_work+0x1485/0x1d70 [lpfc]
  [56048.470492] ? __schedule+0x280/0x700
  [56048.470492] ? finish_wait+0x80/0x80
  [56048.470493] ? lpfc_unregister_unused_fcf+0x80/0x80 [lpfc]
  [56048.470493] kthread+0x112/0x130
  [56048.470493] ? kthread_flush_work_fn+0x10/0x10
  [56048.470494] ret_from_fork+0x1f/0x40
  [56048.470494] Kernel panic - not syncing: Hard LOCKUP
  [56048.470495] CPU: 0 PID: 682 Comm: lpfc_worker_0 Kdump: loaded Tainted: G IOE --------- - - 4.18.0-240.el8.x86_64 #1
  [56048.470496] Hardware name: Dell Inc. PowerEdge R740/0DY2X0, BIOS 2.11.2 004/21/2021
  [56048.470496] Call Trace:
  [56048.470496] <NMI>
  [56048.470496] dump_stack+0x5c/0x80
  [56048.470497] panic+0xe7/0x2a9
  [56048.470497] ? __switch_to_asm+0x51/0x70
  [56048.470497] nmi_panic.cold.9+0xc/0xc
  [56048.470498] watchdog_overflow_callback.cold.7+0x5c/0x70
  [56048.470498] __perf_event_overflow+0x52/0xf0
  [56048.470499] handle_pmi_common+0x1db/0x270
  [56048.470499] ? __set_pte_vaddr+0x32/0x50
  [56048.470499] ? __native_set_fixmap+0x24/0x30
  [56048.470500] ? ghes_copy_tofrom_phys+0xd3/0x1c0
  [56048.470500] ? __ghes_peek_estatus.isra.12+0x49/0xa0
  [56048.470500] intel_pmu_handle_irq+0xbf/0x160
  [56048.470501] perf_event_nmi_handler+0x2d/0x50
  [56048.470501] nmi_handle+0x63/0x110
  [56048.470501] default_do_nmi+0x4e/0x100
  [56048.470502] do_nmi+0x128/0x190
  [56048.470502] end_repeat_nmi+0x16/0x6a
  [56048.470503] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470504] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 81 fe 00 01 00 00 75
  [56048.470504] RSP: 0018:ffffacebc7877ca8 EFLAGS: 00000002
  [56048.470505] RAX: 0000000000000101 RBX: 0000000000000246 RCX: 000000000000001f
  [56048.470505] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94dcf5341dc0
  [56048.470506] RBP: ffff94dcf5340000 R08: 0000000000000002 R09: 0000000000029600
  [56048.470506] R10: 000060d29656a45c R11: ffff94dcf534fd12 R12: ffff94dcf5341db0
  [56048.470507] R13: ffff94dcf5341dc0 R14: ffff94dcc4ae8a00 R15: 0000000000000003
  [56048.470507] ? native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470507] ? native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470508] </NMI>
  [56048.470508] _raw_spin_lock_irqsave+0x32/0x40
  [56048.470509] lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
  [56048.470509] lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
  [56048.470509] lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
  [56048.470510] lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
  [56048.470510] lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
  [56048.470510] lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
  [56048.470511] lpfc_do_work+0x1485/0x1d70 [lpfc]
  [56048.470511] ? __schedule+0x280/0x700
  [56048.470511] ? finish_wait+0x80/0x80
  [56048.470512] ? lpfc_unregister_unused_fcf+0x80/0x80 [lpfc]
  [56048.470512] kthread+0x112/0x130
  [56048.470513] ? kthread_flush_work_fn+0x10/0x10
  [56048.470513] ret_from_fork+0x1f/0x40

  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]#
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /etc/redhat-release
  Red Hat Enterprise Linux release 8.3 (Ootpa)
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/module/lpfc/version
  0:14.0.390.2
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/class/scsi_host/host*/modeldesc
  Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/class/scsi_host/host*/fwrev
  14.0.390.1, sli-4:2:c
  14.0.390.1, sli-4:2:c
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/class/fc_host/host*/port_name
  0x10000090faf09459
  0x10000090faf0945a
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]#

  HBA Attributes for 10:00:00:90:fa:f0:94:59

  Host Name                 : ms-svr3-10-231-131-160
  Manufacturer              : Emulex Corporation
  Serial Number             : FC70793283
  Model                     : LPe32002-M2
  Model Desc                : Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  Node WWN                  : 20 00 00 90 fa f0 94 59
  Node Symname              :
  HW Version                : 0000000c 00000001 00000000
  FW Version                : 14.0.390.1
  Vendor Spec ID            : 10DF
  Number of Ports           : 1
  Driver Name               : lpfc
  Driver Version            : 14.0.390.2; HBAAPI(I) v2.3.d, 07-12-10
  Device ID                 : E300
  HBA Type                  : LPe32002-M2
  Operational FW            : 14.0.390.1
  IEEE Address              : 00 90 fa f0 94 59
  Boot Code                 : Enabled
  Boot Version              : 14.0.390.1
  Board Temperature         : Normal
  Function Type             : FC
  Sub Device ID             : E300
  PCI Bus Number            : 94
  PCI Func Number           : 0
  Sub Vendor ID             : 10DF
  IPL Filename              : H62LEX1
  Service Processor FW Name : 14.0.390.1
  ULP FW Name               : 14.0.390.1
  FC Universal BIOS Version : 14.0.390.1
  FC x86 BIOS Version       : 14.0.390.1
  FC EFI BIOS Version       : 14.0.388.0
  FC FCODE Version          : 14.0.386.0
  Flash Firmware Version    : 14.0.390.1
  Secure Firmware           : Enabled

  [root@ms-svr3-10-231-131-160 log]# hbacmd portattrib 10:00:00:90:fa:f0:94:59

  Port Attributes for 10:00:00:90:fa:f0:94:59

  Node WWN              : 20 00 00 90 fa f0 94 59
  Port WWN              : 10 00 00 90 fa f0 94 59
  Port Symname          :
  Port FCID             : 0000
  Port Type             : Unknown
  Port State            : Link Down
  Port Service Type     : 8
  Port Supported FC4    : 00 00 01 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  Port Active FC4       : 00 00 01 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  Port Supported Speed  : 8 16 32 Gbit/sec
  Configured Port Speed : Auto Detect
  Port Speed            : Not Available
  Max Frame Size        : 2048
  OS Device Name        : /sys/class/scsi_host/host15
  Num Discovered Ports  : 0
  Fabric Name           : 00 00 00 00 00 00 00 00
  Function Type         : FC
  FEC                   : Enabled

  [Fixes]
  The following patch will resolve the issue:

  scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()

  In an attempt to log message 0126 with LOG_TRACE_EVENT, the following
  hard lockup call trace hangs the system.

  [Testcase]
  [root@ms-svr3-10-231-131-160 log]#

  Comment 3 James Smart 2022-04-13 09:12:37 PDT
  Patches pushed upstream 4/12/22:
  https://lore.kernel.org/linux-scsi/20220412222008.126521-1-jsmart2...@gmail.com/T/#t
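
For readers less familiar with this failure mode, the lockup described in the commit message (the same CPU claiming phba->port_list_lock twice) can be modelled in user space. What follows is only a rough sketch and not driver code: toy_lock, toy_hba, dmp_dbg_old(), dmp_dbg_new(), log_trace_event_new() and log_under_lock() are all hypothetical names invented for illustration, and the toy lock reports a second acquire instead of spinning forever as a real spinlock would. It also assumes, per the commit message, that some caller on the logging path already holds the list lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock. A second acquire by the
 * current holder is reported as a failure instead of spinning forever. */
struct toy_lock {
	bool held;
};

bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;	/* a real spinlock would hang this CPU here */
	l->held = true;
	return true;
}

void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical, heavily reduced model of the HBA state. */
struct toy_hba {
	struct toy_lock port_list_lock;
	unsigned int cfg_log_verbose;
	int dumps;		/* number of completed debug dumps */
};

/* Pre-patch shape: the dump routine takes port_list_lock itself in
 * order to scan the ports for a verbose setting. */
bool dmp_dbg_old(struct toy_hba *hba)
{
	if (hba->cfg_log_verbose)
		return true;
	if (!toy_lock_acquire(&hba->port_list_lock))
		return false;	/* self-deadlock detected */
	/* ... the scan of the port list would happen here ... */
	toy_lock_release(&hba->port_list_lock);
	hba->dumps++;
	return true;
}

/* Post-patch shape: the dump routine no longer touches the list lock. */
bool dmp_dbg_new(struct toy_hba *hba)
{
	hba->dumps++;
	return true;
}

/* Post-patch shape of the logging macro: verbosity is checked by the
 * caller before the dump routine is invoked. */
bool log_trace_event_new(struct toy_hba *hba)
{
	if (!hba->cfg_log_verbose)
		return dmp_dbg_new(hba);
	return true;
}

/* Simulate a code path that logs while already holding port_list_lock. */
bool log_under_lock(struct toy_hba *hba, bool (*log_fn)(struct toy_hba *))
{
	bool ok;

	if (!toy_lock_acquire(&hba->port_list_lock))
		return false;
	ok = log_fn(hba);
	toy_lock_release(&hba->port_list_lock);
	return ok;
}
```

With a zero-initialised toy_hba, log_under_lock(&hba, dmp_dbg_old) reports failure because the lock is requested a second time on the same path, while log_under_lock(&hba, log_trace_event_new) completes and records one dump.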