------- Comment From bren...@br.ibm.com 2017-09-11 14:23 EDT-------
Lata, Chandan,

Canonical has built a one-off test kernel containing this fix and needs
us to test it before the patch is integrated into the kernel. Could you
please test it and let them know the result?

https://bugs.launchpad.net/bugs/1702998

Title:
  Ubuntu 17.04: Guest crashed @writeback_sb_inodes+0x310/0x590

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Zesty:
  In Progress

Bug description:
  == Comment: #0 - Lata Kuntal <lakun...@in.ibm.com> - 2017-03-03 00:50:54 ==
  The Ubuntu 17.04 guest dropped into xmon after crashing at
  writeback_sb_inodes+0x310/0x590.
  The guest has an XFS rootfs and an NPIV disk. It crashed after 30+ hours
  of BASE and NFS stress testing.

  Crash logs
  =======
  root@guskvm:~# virsh console gusg1 --force
  Connected to domain gusg1
  Escape character is ^]

  0:mon>
  0:mon> t
  [c0000000a4bc7940] c00000000036f790 writeback_sb_inodes+0x310/0x590
  [c0000000a4bc7a50] c00000000036faf4 __writeback_inodes_wb+0xe4/0x150
  [c0000000a4bc7ab0] c00000000036ff1c wb_writeback+0x2cc/0x440
  [c0000000a4bc7b80] c000000000370c30 wb_workfn+0x150/0x560
  [c0000000a4bc7c90] c0000000000ed8c0 process_one_work+0x2b0/0x5a0
  [c0000000a4bc7d20] c0000000000edc58 worker_thread+0xa8/0x650
  [c0000000a4bc7dc0] c0000000000f67b4 kthread+0x154/0x1a0
  [c0000000a4bc7e30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74
  0:mon> r
  R00 = c00000000036f790   R16 = c0000000eca70300
  R01 = c0000000a4bc78e0   R17 = c0000000f7035240
  R02 = c00000000143c900   R18 = 0000000000000000
  R03 = c0000000f7035150   R19 = 0000000000000000
  R04 = 0000000000000019   R20 = c0000000a4bc4000
  R05 = 0000000000000100   R21 = ffffffffffffff7f
  R06 = 0000000000000000   R22 = c00000000433d758
  R07 = 0000000000000000   R23 = c00000000433d738
  R08 = 0000000000034995   R24 = 0000000000000000
  R09 = 0000000000000000   R25 = 0000000000000000
  R10 = 0000000080000000   R26 = c0000000f70351d8
  R11 = c0000000a4bc7a40   R27 = 0000000000000000
  R12 = 0000000000002200   R28 = 0000000000000001
  R13 = c00000000fb80000   R29 = c00000000433d728
  R14 = 0000000000000000   R30 = c0000000f7035150
  R15 = c0000000f70351d8   R31 = 0000000000000000
  pc  = c00000000036c120 locked_inode_to_wb_and_lock_list+0x50/0x290
  cfar= c0000000000b2a14 kvmppc_save_tm+0x168/0x16c
  lr  = c00000000036f790 writeback_sb_inodes+0x310/0x590
  msr = 8000000000009033   cr  = 24002482
  ctr = c000000000381e30   xer = 0000000000000000   trap =  300
  dar = 0000000000000000   dsisr = 40000000
  0:mon> e
  cpu 0x0: Vector: 300 (Data Access) at [c0000000a4bc7660]
      pc: c00000000036c120: locked_inode_to_wb_and_lock_list+0x50/0x290
      lr: c00000000036f790: writeback_sb_inodes+0x310/0x590
      sp: c0000000a4bc78e0
     msr: 8000000000009033
     dar: 0
   dsisr: 40000000
    current = 0xc0000000fbe96000
    paca    = 0xc00000000fb80000   softe: 0        irq_happened: 0x01
      pid   = 17305, comm = kworker/u16:0
  Linux version 4.10.0-8-generic (buildd@bos01-ppc64el-001) (gcc version 6.3.0 20161229 (Ubuntu 6.3.0-2ubuntu1) ) #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 (Ubuntu 4.10.0-8.10-generic 4.10.0-rc8)
  0:mon> d
  0000000000000000 **************** ****************  |                |
  0:mon>

  
  Host and guest kernel build
  =====================
  4.10.0-8-generic

  
  OPAL firmware version
  ----------------------------------------
    T side    : FW860.20 (SV860_078)
    Boot side : FW860.20 (SV860_078)

  
  == Comment: #4 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-03-03 02:55:20 ==
  [140071.761707] Adding 153536k swap on /dev/loop0.  Priority:-2 extents:1 across:153536k FS
  [140072.153143] Adding 153472k swap on /dev/loop0.  Priority:-2 extents:1 across:153472k FS
  [140072.441833] Unable to handle kernel paging request for data at address 0x00000000
  [140072.442064] Faulting instruction address: 0xc00000000036c120
  0:mon>

  0:mon> e
  cpu 0x0: Vector: 300 (Data Access) at [c0000000a4bc7660]
      pc: c00000000036c120: locked_inode_to_wb_and_lock_list+0x50/0x290
      lr: c00000000036f790: writeback_sb_inodes+0x310/0x590
      sp: c0000000a4bc78e0
     msr: 8000000000009033
     dar: 0
   dsisr: 40000000
    current = 0xc0000000fbe96000
    paca    = 0xc00000000fb80000         softe: 0        irq_happened: 0x01
      pid   = 17305, comm = kworker/u16:0
  Linux version 4.10.0-8-generic (buildd@bos01-ppc64el-001) (gcc version 6.3.0 20161229 (Ubuntu 6.3.0-2ubuntu1) ) #10-Ubuntu SMP Mon Feb 13 14:00:06 UTC 2017 (Ubuntu 4.10.0-8.10-generic 4.10.0-rc8)
  0:mon> t
  [c0000000a4bc7940] c00000000036f790 writeback_sb_inodes+0x310/0x590
  [c0000000a4bc7a50] c00000000036faf4 __writeback_inodes_wb+0xe4/0x150
  [c0000000a4bc7ab0] c00000000036ff1c wb_writeback+0x2cc/0x440
  [c0000000a4bc7b80] c000000000370c30 wb_workfn+0x150/0x560
  [c0000000a4bc7c90] c0000000000ed8c0 process_one_work+0x2b0/0x5a0
  [c0000000a4bc7d20] c0000000000edc58 worker_thread+0xa8/0x650
  [c0000000a4bc7dc0] c0000000000f67b4 kthread+0x154/0x1a0
  [c0000000a4bc7e30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74
  0:mon> r
  R00 = c00000000036f790   R16 = c0000000eca70300
  R01 = c0000000a4bc78e0   R17 = c0000000f7035240
  R02 = c00000000143c900   R18 = 0000000000000000
  R03 = c0000000f7035150   R19 = 0000000000000000
  R04 = 0000000000000019   R20 = c0000000a4bc4000
  R05 = 0000000000000100   R21 = ffffffffffffff7f
  R06 = 0000000000000000   R22 = c00000000433d758
  R07 = 0000000000000000   R23 = c00000000433d738
  R08 = 0000000000034995   R24 = 0000000000000000
  R09 = 0000000000000000   R25 = 0000000000000000
  R10 = 0000000080000000   R26 = c0000000f70351d8
  R11 = c0000000a4bc7a40   R27 = 0000000000000000
  R12 = 0000000000002200   R28 = 0000000000000001
  R13 = c00000000fb80000   R29 = c00000000433d728
  R14 = 0000000000000000   R30 = c0000000f7035150
  R15 = c0000000f70351d8   R31 = 0000000000000000
  pc  = c00000000036c120 locked_inode_to_wb_and_lock_list+0x50/0x290
  cfar= c0000000000b2a14 kvmppc_save_tm+0x168/0x16c
  lr  = c00000000036f790 writeback_sb_inodes+0x310/0x590
  msr = 8000000000009033   cr  = 24002482
  ctr = c000000000381e30   xer = 0000000000000000   trap =  300
  dar = 0000000000000000   dsisr = 40000000
  0:mon> S
  msr    = 8000000000001033  sprg0 = 0000000000000000
  pvr    = 00000000004b0201  sprg1 = c00000000fb80000
  dec    = 00000000b56746ff  sprg2 = c00000000fb80000
  sp     = c0000000a4bc7100  sprg3 = 0000000000000000
  toc    = c00000000143c900  dar   = 0000000000000400
  srr0   = 000000000008c59c  srr1  = 0000000000001033 dsisr  = 40000000
  dscr   = 0000000000000000  ppr   = 0000000000000000 pir    = 00000030
  dpdes  = 0000000000000000  tir   = 0000000000000000 cir    = 00000000
  fscr   = 0000000000000180  tar   = 0000000000000000 pspb   = 00000000
  mmcr0  = 0000000080000000  mmcr1 = 0000000000000000 mmcr2  = 0000000000000000
  pmc1   = 00000000 pmc2 = 00000000  pmc3 = 00000000  pmc4   = 00000000
  mmcra  = 0000000000000000   siar = 0000000000000000 pmc5   = b9ad0e28
  sdar   = 0000000000000000   sier = 0000000000000000 pmc6   = 7f0fdfbe
  ebbhr  = 0000000000000000  ebbrr = 0000000000000000 bescr  = 0000000000000000
  0:mon> 

  
  The crash is due to the kernel hitting a DSI (data storage interrupt,
  vector 0x300) on an access to address 0 while executing the
  locked_inode_to_wb_and_lock_list routine.
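
To make the register values in the dump easier to read, here is a minimal
sketch that classifies a data access fault from its dsisr and dar values.
The three masks are the ones defined in the Linux kernel's
arch/powerpc/include/asm/reg.h; the enum and the classify_dsi() helper are
hypothetical names introduced only for this illustration and are not part
of xmon or the kernel.

```c
#include <stdint.h>

/* DSISR bit masks (subset) as defined in arch/powerpc/include/asm/reg.h. */
#define DSISR_NOHPTE    0x40000000u /* no translation found for the address */
#define DSISR_PROTFAULT 0x08000000u /* protection fault */
#define DSISR_ISSTORE   0x02000000u /* faulting access was a store */

/* Hypothetical classification used only in this sketch. */
enum dsi_kind {
    DSI_LOAD_ADDR_ZERO,   /* load from address 0, no translation */
    DSI_STORE_ADDR_ZERO,  /* store to address 0, no translation */
    DSI_UNMAPPED,         /* no translation, non-zero address */
    DSI_PROTECTION,       /* translation exists, access not permitted */
    DSI_OTHER
};

/* Classify a data storage interrupt from the dsisr and dar values. */
static enum dsi_kind classify_dsi(uint32_t dsisr, uint64_t dar)
{
    if (dsisr & DSISR_NOHPTE) {
        if (dar != 0)
            return DSI_UNMAPPED;
        return (dsisr & DSISR_ISSTORE) ? DSI_STORE_ADDR_ZERO
                                       : DSI_LOAD_ADDR_ZERO;
    }
    if (dsisr & DSISR_PROTFAULT)
        return DSI_PROTECTION;
    return DSI_OTHER;
}
```

With the values from the dump (dsisr = 40000000, dar = 0), classify_dsi()
returns DSI_LOAD_ADDR_ZERO, that is, a load through a NULL pointer, which
matches the "Unable to handle kernel paging request for data at address
0x00000000" message.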

  == Comment: #8 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-03-03 05:07:03 ==
  It is crashing in fs/fs-writeback.c, at the marked line:

  static struct bdi_writeback *
  locked_inode_to_wb_and_lock_list(struct inode *inode)
          __releases(&inode->i_lock)
          __acquires(&wb->list_lock)
  {
          while (true) {
                  struct bdi_writeback *wb = inode_to_wb(inode);

                  /*
                   * inode_to_wb() association is protected by both
                   * @inode->i_lock and @wb->list_lock but list_lock nests
                   * outside i_lock.  Drop i_lock and verify that the
                   * association hasn't changed after acquiring list_lock.
                   */
                  wb_get(wb);           <-----------
                  spin_unlock(&inode->i_lock);
                  spin_lock(&wb->list_lock);
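
One reading that is consistent with dar = 0, though not confirmed anywhere
in this report, is that inode_to_wb() returned NULL and wb_get() then read
through it. The user-space sketch below illustrates that reading. It
assumes that wb_get() first reads wb->bdi and that bdi is the first member
of struct bdi_writeback. All mock_* names and first_load_address() are
hypothetical stand-ins, and the NULL check in mock_wb_get() is not the
kernel's code or the proposed fix.

```c
#include <stddef.h>
#include <stdint.h>

struct mock_bdi;

/* Stand-in for struct bdi_writeback; only the fields used here are kept. */
struct mock_wb {
    struct mock_bdi *bdi;  /* assumed to be the first member (offset 0) */
    long refcnt;           /* stands in for the per-cpu reference count */
};

/* Stand-in for struct backing_dev_info with its embedded root wb. */
struct mock_bdi {
    struct mock_wb wb;
};

/* Address of the first load a wb_get()-like function performs: &wb->bdi. */
static uintptr_t first_load_address(const struct mock_wb *wb)
{
    return (uintptr_t)wb + offsetof(struct mock_wb, bdi);
}

/*
 * Same shape as the assumed wb_get(): take a reference unless wb is the
 * root wb embedded in its bdi. The NULL check is added only so that this
 * sketch can run in user space; without it, a NULL wb faults on wb->bdi.
 */
static int mock_wb_get(struct mock_wb *wb)
{
    if (wb == NULL)
        return -1;
    if (wb != &wb->bdi->wb)
        wb->refcnt++;
    return 0;
}
```

Under these assumptions, first_load_address(NULL) is 0, which is the
address reported in dar, and mock_wb_get(NULL) returns -1 at the point
where the kernel would take the data access fault.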
