The driver team has highlighted that the following patch is required to
address this issue:

author  James Smart <jsmart2...@gmail.com>      2022-04-12 15:19:44 -0700
committer       Martin K. Petersen <martin.peter...@oracle.com> 2022-04-18 22:48:43 -0400
commit  e294647b1aed4247fe52851f3a3b2b19ae906228 (patch)
tree    fd7e11a3c6f680d5aabd468d523d08ffcd66b59f /drivers/scsi/lpfc
parent  b83a8c21f3fe874e12eb2b6e6c5cfb220d35c446 (diff)
scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()

In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
lockup call trace hangs the system.

Call Trace:
 _raw_spin_lock_irqsave+0x32/0x40
 lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
 lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
 lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
 lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
 lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
 lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
 lpfc_do_work+0x1485/0x1d70 [lpfc]
 kthread+0x112/0x130
 ret_from_fork+0x1f/0x40
Kernel panic - not syncing: Hard LOCKUP

The same CPU tries to claim the phba->port_list_lock twice.
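
Kernel spinlocks are not recursive, so a second acquisition on the CPU that
already holds the lock can never succeed. The following is only a userspace
sketch of that failure mode, not driver code: toy_trylock(), toy_unlock() and
nested_acquire_would_hang() are hypothetical names, and a C11 atomic_flag
stands in for phba->port_list_lock. It assumes an outer caller already holds
the lock when the logging path tries to take it again, as the sentence above
implies.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive lock such as port_list_lock.
 * Returns true if the lock was free and is now held by the caller. */
static bool toy_trylock(atomic_flag *lock)
{
	return !atomic_flag_test_and_set(lock);
}

static void toy_unlock(atomic_flag *lock)
{
	atomic_flag_clear(lock);
}

/* Models an outer caller that holds the lock and then reaches a helper
 * that takes the same lock again. A real spin_lock_irqsave() would spin
 * forever here; the sketch uses a trylock so it can report the outcome
 * instead of hanging. Returns true if the nested acquisition would hang. */
static bool nested_acquire_would_hang(atomic_flag *lock)
{
	bool hang;

	if (!toy_trylock(lock))		/* outer acquisition */
		return false;		/* lock was not free; scenario not applicable */

	hang = !toy_trylock(lock);	/* inner acquisition on the same thread */
	toy_unlock(lock);
	return hang;
}
```

With interrupts disabled by the irqsave variant, nothing on that CPU can
break the spin, which is consistent with the NMI watchdog reporting a hard
lockup in the trace above.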

Move the cfg_log_verbose checks as part of the lpfc_printf_vlog() and
lpfc_printf_log() macros before calling lpfc_dmp_dbg().  There is no need
to take the phba->port_list_lock within lpfc_dmp_dbg().

Link: https://lore.kernel.org/r/20220412222008.126521-3-jsmart2...@gmail.com
Co-developed-by: Justin Tee <justin....@broadcom.com>
Signed-off-by: Justin Tee <justin....@broadcom.com>
Signed-off-by: James Smart <jsmart2...@gmail.com>
Signed-off-by: Martin K. Petersen <martin.peter...@oracle.com>
Diffstat (limited to 'drivers/scsi/lpfc')
-rw-r--r--      drivers/scsi/lpfc/lpfc_init.c   29      
-rw-r--r--      drivers/scsi/lpfc/lpfc_logmsg.h 6       
2 files changed, 4 insertions, 31 deletions
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 461d333b1b3a8..f9cd4b72d949a 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -15700,34 +15700,7 @@ void lpfc_dmp_dbg(struct lpfc_hba *phba)
        unsigned int temp_idx;
        int i;
        int j = 0;
-       unsigned long rem_nsec, iflags;
-       bool log_verbose = false;
-       struct lpfc_vport *port_iterator;
-
-       /* Don't dump messages if we explicitly set log_verbose for the
-        * physical port or any vport.
-        */
-       if (phba->cfg_log_verbose)
-               return;
-
-       spin_lock_irqsave(&phba->port_list_lock, iflags);
-       list_for_each_entry(port_iterator, &phba->port_list, listentry) {
-               if (port_iterator->load_flag & FC_UNLOADING)
-                       continue;
-               if (scsi_host_get(lpfc_shost_from_vport(port_iterator))) {
-                       if (port_iterator->cfg_log_verbose)
-                               log_verbose = true;
-
-                       scsi_host_put(lpfc_shost_from_vport(port_iterator));
-
-                       if (log_verbose) {
-                               spin_unlock_irqrestore(&phba->port_list_lock,
-                                                      iflags);
-                               return;
-                       }
-               }
-       }
-       spin_unlock_irqrestore(&phba->port_list_lock, iflags);
+       unsigned long rem_nsec;
 
        if (atomic_cmpxchg(&phba->dbg_log_dmping, 0, 1) != 0)
                return;
diff --git a/drivers/scsi/lpfc/lpfc_logmsg.h b/drivers/scsi/lpfc/lpfc_logmsg.h
index 7d480c7987942..a5aafe230c74f 100644
--- a/drivers/scsi/lpfc/lpfc_logmsg.h
+++ b/drivers/scsi/lpfc/lpfc_logmsg.h
@@ -73,7 +73,7 @@ do { \
 #define lpfc_printf_vlog(vport, level, mask, fmt, arg...) \
 do { \
        { if (((mask) & (vport)->cfg_log_verbose) || (level[1] <= '3')) { \
-               if ((mask) & LOG_TRACE_EVENT) \
+               if ((mask) & LOG_TRACE_EVENT && !(vport)->cfg_log_verbose) \
                        lpfc_dmp_dbg((vport)->phba); \
                dev_printk(level, &((vport)->phba->pcidev)->dev, "%d:(%d):" \
                           fmt, (vport)->phba->brd_no, vport->vpi, ##arg);  \
@@ -89,11 +89,11 @@ do { \
                                 (phba)->pport->cfg_log_verbose : \
                                 (phba)->cfg_log_verbose; \
        if (((mask) & log_verbose) || (level[1] <= '3')) { \
-               if ((mask) & LOG_TRACE_EVENT) \
+               if ((mask) & LOG_TRACE_EVENT && !log_verbose) \
                        lpfc_dmp_dbg(phba); \
                dev_printk(level, &((phba)->pcidev)->dev, "%d:" \
                        fmt, phba->brd_no, ##arg); \
-       } else  if (!(phba)->cfg_log_verbose)\
+       } else if (!log_verbose)\
                lpfc_dbg_print(phba, "%d:" fmt, phba->brd_no, ##arg); \
        } \
 } while (0)
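
For reviewers who want to check the resulting behaviour, the patched
lpfc_printf_log() decision can be modelled as a small pure function. This is
a hedged userspace sketch, not driver code: log_decision(), the ACT_* flags,
LVL_ERR, LVL_INFO and LOG_ELS are hypothetical names introduced for
illustration, and the LOG_TRACE_EVENT value is assumed rather than taken from
this diff. The level strings imitate the kernel's KERN_ERR and KERN_INFO
prefixes, whose second character is the severity digit that the macro
compares against '3'.

```c
#include <stdint.h>

#define LOG_TRACE_EVENT	0x80000000u	/* assumed value, not shown in the diff */
#define LOG_ELS		0x00000001u	/* hypothetical mask bit for the example */

#define LVL_ERR		"\001" "3"	/* imitates KERN_ERR */
#define LVL_INFO	"\001" "6"	/* imitates KERN_INFO */

enum {
	ACT_NONE   = 0,
	ACT_PRINT  = 1,	/* dev_printk() would be called */
	ACT_DUMP   = 2,	/* lpfc_dmp_dbg() would be called */
	ACT_BUFFER = 4	/* lpfc_dbg_print() would be called */
};

/* Mirrors the branch structure of the patched macro: the debug buffer is
 * dumped or filled only when no verbose logging is configured. */
static unsigned int log_decision(const char *level, uint32_t mask,
				 uint32_t log_verbose)
{
	unsigned int act = ACT_NONE;

	if ((mask & log_verbose) || level[1] <= '3') {
		if ((mask & LOG_TRACE_EVENT) && !log_verbose)
			act |= ACT_DUMP;
		act |= ACT_PRINT;
	} else if (!log_verbose) {
		act |= ACT_BUFFER;
	}
	return act;
}
```

Under these assumptions, an error-level LOG_TRACE_EVENT message dumps the
debug buffer only when log_verbose is zero, and a message that is neither
printed nor buffered can occur only when some verbose mask is set but does
not match.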

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1971193

Title:
  Server Crash while running IO and switch port bounce test with 2K
  login session

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  [Impact]
  A server crash with a call trace was reported on one of the servers running
  I/O and a switch port bounce test in the 2K login session configuration.

  Call Trace:
  [56048.470488] Call Trace:
  [56048.470489]  _raw_spin_lock_irqsave+0x32/0x40
  [56048.470489]  lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
  [56048.470490]  lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
  [56048.470490]  lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
  [56048.470490]  lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
  [56048.470491]  lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
  [56048.470491]  lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
  [56048.470492]  lpfc_do_work+0x1485/0x1d70 [lpfc]
  [56048.470492]  ? __schedule+0x280/0x700
  [56048.470492]  ? finish_wait+0x80/0x80
  [56048.470493]  ? lpfc_unregister_unused_fcf+0x80/0x80 [lpfc]
  [56048.470493]  kthread+0x112/0x130
  [56048.470493]  ? kthread_flush_work_fn+0x10/0x10
  [56048.470494]  ret_from_fork+0x1f/0x40
  [56048.470494] Kernel panic - not syncing: Hard LOCKUP
  [56048.470495] CPU: 0 PID: 682 Comm: lpfc_worker_0 Kdump: loaded Tainted: G
       IOE    --------- -  - 4.18.0-240.el8.x86_64 #1
  [56048.470496] Hardware name: Dell Inc. PowerEdge R740/0DY2X0, BIOS 2.11.2
  004/21/2021
  [56048.470496] Call Trace:
  [56048.470496]  <NMI>
  [56048.470496]  dump_stack+0x5c/0x80
  [56048.470497]  panic+0xe7/0x2a9
  [56048.470497]  ? __switch_to_asm+0x51/0x70
  [56048.470497]  nmi_panic.cold.9+0xc/0xc
  [56048.470498]  watchdog_overflow_callback.cold.7+0x5c/0x70
  [56048.470498]  __perf_event_overflow+0x52/0xf0
  [56048.470499]  handle_pmi_common+0x1db/0x270
  [56048.470499]  ? __set_pte_vaddr+0x32/0x50
  [56048.470499]  ? __native_set_fixmap+0x24/0x30
  [56048.470500]  ? ghes_copy_tofrom_phys+0xd3/0x1c0
  [56048.470500]  ? __ghes_peek_estatus.isra.12+0x49/0xa0
  [56048.470500]  intel_pmu_handle_irq+0xbf/0x160
  [56048.470501]  perf_event_nmi_handler+0x2d/0x50
  [56048.470501]  nmi_handle+0x63/0x110
  [56048.470501]  default_do_nmi+0x4e/0x100
  [56048.470502]  do_nmi+0x128/0x190
  [56048.470502]  end_repeat_nmi+0x16/0x6a
  [56048.470503] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470504] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4
  09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75
  f8 b8 01 00 00 00 66 89 07 c3 8b 37 81 fe 00 01 00 00 75
  [56048.470504] RSP: 0018:ffffacebc7877ca8 EFLAGS: 00000002
  [56048.470505] RAX: 0000000000000101 RBX: 0000000000000246 RCX:
  000000000000001f
  [56048.470505] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
  ffff94dcf5341dc0
  [56048.470506] RBP: ffff94dcf5340000 R08: 0000000000000002 R09:
  0000000000029600
  [56048.470506] R10: 000060d29656a45c R11: ffff94dcf534fd12 R12:
  ffff94dcf5341db0
  [56048.470507] R13: ffff94dcf5341dc0 R14: ffff94dcc4ae8a00 R15:
  0000000000000003
  [56048.470507]  ? native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470507]  ? native_queued_spin_lock_slowpath+0x5d/0x1d0
  [56048.470508]  </NMI>
  [56048.470508]  _raw_spin_lock_irqsave+0x32/0x40
  [56048.470509]  lpfc_dmp_dbg.part.32+0x28/0x220 [lpfc]
  [56048.470509]  lpfc_cmpl_els_fdisc+0x145/0x460 [lpfc]
  [56048.470509]  lpfc_sli_cancel_jobs+0x92/0xd0 [lpfc]
  [56048.470510]  lpfc_els_flush_cmd+0x43c/0x670 [lpfc]
  [56048.470510]  lpfc_els_flush_all_cmd+0x37/0x60 [lpfc]
  [56048.470510]  lpfc_sli4_async_event_proc+0x956/0x1720 [lpfc]
  [56048.470511]  lpfc_do_work+0x1485/0x1d70 [lpfc]
  [56048.470511]  ? __schedule+0x280/0x700
  [56048.470511]  ? finish_wait+0x80/0x80
  [56048.470512]  ? lpfc_unregister_unused_fcf+0x80/0x80 [lpfc]
  [56048.470512]  kthread+0x112/0x130
  [56048.470513]  ? kthread_flush_work_fn+0x10/0x10
  [56048.470513]  ret_from_fork+0x1f/0x40
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]#

  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /etc/redhat-release
  Red Hat Enterprise Linux release 8.3 (Ootpa)

  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/module/lpfc/version
  0:14.0.390.2

  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/class/scsi_host/host*/modeldesc
  Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter

  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/class/scsi_host/host*/fwrev
  14.0.390.1, sli-4:2:c
  14.0.390.1, sli-4:2:c

  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]# cat /sys/class/fc_host/host*/port_name
  0x10000090faf09459
  0x10000090faf0945a
  [root@ms-svr3-10-231-131-160 127.0.0.1-2021-11-20-05:14:30]#

  HBA Attributes for 10:00:00:90:fa:f0:94:59

  Host Name                     : ms-svr3-10-231-131-160
  Manufacturer                  : Emulex Corporation
  Serial Number                 : FC70793283
  Model                         : LPe32002-M2
  Model Desc                    : Emulex LightPulse LPe32002-M2 2-Port 32Gb Fibre Channel Adapter
  Node WWN                      : 20 00 00 90 fa f0 94 59
  Node Symname                  :
  HW Version                    : 0000000c 00000001 00000000
  FW Version                    : 14.0.390.1
  Vendor Spec ID                : 10DF
  Number of Ports               : 1
  Driver Name                   : lpfc
  Driver Version                : 14.0.390.2; HBAAPI(I) v2.3.d, 07-12-10
  Device ID                     : E300
  HBA Type                      : LPe32002-M2
  Operational FW                : 14.0.390.1
  IEEE Address                  : 00 90 fa f0 94 59
  Boot Code                     : Enabled
  Boot Version                  : 14.0.390.1
  Board Temperature             : Normal
  Function Type                 : FC
  Sub Device ID                 : E300
  PCI Bus Number                : 94
  PCI Func Number               : 0
  Sub Vendor ID                 : 10DF
  IPL Filename                  : H62LEX1
  Service Processor FW Name     : 14.0.390.1
  ULP FW Name                   : 14.0.390.1
  FC Universal BIOS Version     : 14.0.390.1
  FC x86 BIOS Version           : 14.0.390.1
  FC EFI BIOS Version           : 14.0.388.0
  FC FCODE Version              : 14.0.386.0
  Flash Firmware Version        : 14.0.390.1
  Secure Firmware               : Enabled

  [root@ms-svr3-10-231-131-160 log]# hbacmd portattrib 10:00:00:90:fa:f0:94:59

  Port Attributes for 10:00:00:90:fa:f0:94:59

  Node WWN                  : 20 00 00 90 fa f0 94 59
  Port WWN                  : 10 00 00 90 fa f0 94 59
  Port Symname              :
  Port FCID                 : 0000
  Port Type                 : Unknown
  Port State                : Link Down
  Port Service Type         : 8
  Port Supported FC4        : 00 00 01 00 00 00 00 01
                              00 00 00 00 00 00 00 00
                              00 00 00 00 00 00 00 00
                              00 00 00 00 00 00 00 00
  Port Active FC4           : 00 00 01 00 00 00 00 01
                              00 00 00 00 00 00 00 00
                              00 00 00 00 00 00 00 00
                              00 00 00 00 00 00 00 00
  Port Supported Speed      : 8 16 32 Gbit/sec
  Configured Port Speed     : Auto Detect
  Port Speed                : Not Available
  Max Frame Size            : 2048
  OS Device Name            : /sys/class/scsi_host/host15
  Num Discovered Ports      : 0
  Fabric Name               : 00 00 00 00 00 00 00 00
  Function Type             : FC
  FEC                       : Enabled

  [Fixes]
  The following patch will resolve the issue:
  scsi: lpfc: Move cfg_log_verbose check before calling lpfc_dmp_dbg()
  In an attempt to log message 0126 with LOG_TRACE_EVENT, the following hard
  lockup call trace hangs the system.

  [Testcase]


  [root@ms-svr3-10-231-131-160 log]#
  Comment 3, James Smart, 2022-04-13 09:12:37 PDT
  Patches pushed upstream 4/12/22:

  https://lore.kernel.org/linux-scsi/20220412222008.126521-1-jsmart2...@gmail.com/T/#t

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1971193/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
