------- Comment From aleksandra.pa...@de.ibm.com 2021-09-16 04:22 EDT-------
Hi, we have hit the same problem:

vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:
...
[645967.289658] Unable to handle kernel pointer dereference in virtual kernel address space
[645967.289665] Failing address: 001ffc004bf14000 TEID: 001ffc004bf14403
[645967.289668] Fault in home space mode while using kernel ASCE.
[645967.289671] AS:00000001d839c00b R2:000000038bbec00b R3:00000003010c0007 S:0000000302260000 P:0000000000000400
[645967.289715] Oops: 0011 ilc:2 [#1] SMP
[645967.289721] Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache 8021q garp mrp stp llc bonding binfmt_misc dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt s390_trng ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core sunrpc ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear crc32_vx_s390 zfcp scsi_transport_fc qeth_l2 dasd_eckd_mod dasd_mod qeth qdio ccwgroup [last unloaded: tracedev]
[645967.289791] CPU: 4 PID: 1891047 Comm: kgnrdwr_dvtc2b Kdump: loaded Tainted: G           OE     5.4.0-74-generic #83-Ubuntu
[645967.289795] Hardware name: IBM 3906 M05 710 (LPAR)
[645967.289798] Krnl PSW : 0404e00180000000 00000001d73e20ce (try_to_wake_up+0x4e/0x700)
[645967.289809]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[645967.289814] Krnl GPRS: 0000000370d32488 001ffc0000000000 001ffc0000000005 0000000000000003
[645967.289817]            0000000000000000 ffffffff00000005 041ffbff80bcb9e0 0000000000000003
[645967.289858]            0000000000000003 001ffc004bf141bc 0000000000000000 001ffc004bf13878
[645967.289860]            0000000095190000 00000001d7c1aa40 001ffbff80bcba10 001ffbff80bcb990
[645967.289872] Krnl Code: 00000001d73e20c2: 41902944           la      %r9,2372(%r2)
                           00000001d73e20c6: 582003ac           l       %r2,940
                          #00000001d73e20ca: a7180000           lhi     %r1,0
                          >00000001d73e20ce: ba129000           cs      %r1,%r2,0(%r9)
                           00000001d73e20d2: a77401c9           brc     7,00000001d73e2464
                           00000001d73e20d6: e310b0080004       lg      %r1,8(%r11)
                           00000001d73e20dc: b9800018           ngr     %r1,%r8
                           00000001d73e20e0: a774001f           brc     7,00000001d73e211e
[645967.289894] Call Trace:
[645967.289899] ([<0000000000000000>] 0x0)
[645967.289906]  [<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0
[645967.289913]  [<00000001d74004c2>] __wake_up_common+0xa2/0x1b0
[645967.289915]  [<00000001d74009c4>] __wake_up_common_lock+0x94/0xe0
[645967.289918]  [<00000001d7400a3a>] __wake_up+0x2a/0x40
[645967.289923]  [<00000001d7873870>] wbt_done+0x90/0xe0
[645967.289925]  [<00000001d785c942>] __rq_qos_done+0x42/0x60
[645967.289928]  [<00000001d78486c0>] blk_mq_free_request+0xe0/0x140
[645967.289949]  [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
[645967.289951]  [<00000001d7848938>] blk_mq_complete_request+0xb8/0x160
[645967.289957]  [<001fffff801c43c8>] dasd_block_tasklet+0x148/0x470 [dasd_mod]
[645967.289962]  [<00000001d73b12d2>] tasklet_action_common.isra.0+0x82/0x160
[645967.289968]  [<00000001d7c117b4>] __do_softirq+0x104/0x360
[645967.289971]  [<00000001d73b1a4e>] irq_exit+0x9e/0xc0
[645967.289974]  [<00000001d733cb28>] do_IRQ+0x78/0xb0
[645967.289977]  [<00000001d7c10a20>] io_int_handler+0x12c/0x294
[645967.289985]  [<001fffff805f3c30>] _DTrace3+0x10/0xb0 [tracedev]
[645967.290048] ([<001fffff80a8d2ca>] gpfs_f_llseek+0x4a/0x280 [mmfslinux])
[645967.290053]  [<00000001d75f5ed2>] ksys_lseek+0x92/0xe0
[645967.290055]  [<00000001d7c10498>] system_call+0xdc/0x2c8
[645967.290056] Last Breaking-Event-Address:
[645967.290060]  [<00000001d73e278e>] wake_up_process+0xe/0x20

Please let me know whether I should upload the data to this bug or open a new one.

------- Comment From aleksandra.pa...@de.ibm.com 2021-09-16 04:27 EDT-------
dvtc2b-2.gpfs.net:  OS: Debian Linux: Ubuntu 20.04.2 LTS  =>  Kernel: 
5.4.0-74-generic on s390x

------- Comment From boris.m...@de.ibm.com 2021-09-17 09:36 EDT-------
I double-checked with Canonical regarding the current in-service release 
level of 20.04 LTS. The current level is 20.04.3, more precisely 20.04.3 
with Kernel 5.4.0.84. Please try to reproduce the bug on the 20.04.3 
release.

Apart from the 20.04.2 release level, which is no longer supported, a first
look at the error messages strongly hints at a GPFS error
("vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332").

@Aleksandra: can you please confirm if GPFS is running on the system?

------- Comment From aleksandra.pa...@de.ibm.com 2021-09-20 05:29 EDT-------
Yes, gpfs is running on the system.

However, what you took for an error message is actually the name of the file 
in which the dmesg output from the crash is stored. The name encodes the 
machine where the dump was taken and the date:
vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:
The actual dmesg data starts after the '...'.

Is there anything else besides that line that makes you think this is a
GPFS problem?
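
For comparison with the trace in the bug description, here is a minimal
sketch in Python that extracts the symbol names from both call traces and
reports the leading frames they share. The helper names frame_symbols and
common_prefix are illustrative only and not part of any existing tool; the
frames are copied from the two traces in this report.

```python
import re

# Matches frames such as "[<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0".
# \s+ also spans a line break, so mail-wrapped frames are still recognised.
# Frames without a symbol (e.g. "[<0000000000000000>] 0x0") are skipped.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_symbols(dmesg_text):
    """Return the symbol names of all call-trace frames, in order."""
    return FRAME_RE.findall(dmesg_text)


def common_prefix(a, b):
    """Return the leading elements that two lists have in common."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


# Top frames of the 5.4.0-74 trace from the comment above.
trace_5_4_0_74 = """
[645967.289906]  [<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0
[645967.289913]  [<00000001d74004c2>] __wake_up_common+0xa2/0x1b0
[645967.289915]  [<00000001d74009c4>] __wake_up_common_lock+0x94/0xe0
[645967.289918]  [<00000001d7400a3a>] __wake_up+0x2a/0x40
[645967.289923]  [<00000001d7873870>] wbt_done+0x90/0xe0
[645967.289925]  [<00000001d785c942>] __rq_qos_done+0x42/0x60
[645967.289928]  [<00000001d78486c0>] blk_mq_free_request+0xe0/0x140
[645967.289949]  [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
"""

# Top frames of the 5.4.0-71 trace from the bug description.
trace_5_4_0_71 = """
May 15 20:21:04 data1 kernel:  [<0000003e10047e3a>] rq_qos_wake_function+0x8a/0xa0
May 15 20:21:04 data1 kernel:  [<0000003e0fbec482>] __wake_up_common+0xa2/0x1b0
May 15 20:21:04 data1 kernel:  [<0000003e0fbec984>] __wake_up_common_lock+0x94/0xe0
May 15 20:21:04 data1 kernel:  [<0000003e0fbec9fa>] __wake_up+0x2a/0x40
May 15 20:21:04 data1 kernel:  [<0000003e1005ee70>] wbt_done+0x90/0xe0
May 15 20:21:04 data1 kernel:  [<0000003e10047f42>] __rq_qos_done+0x42/0x60
May 15 20:21:04 data1 kernel:  [<0000003e10033cb0>] blk_mq_free_request+0xe0/0x140
May 15 20:21:04 data1 kernel:  [<0000003e101d46f0>] dm_softirq_done+0x140/0x230
"""

shared = common_prefix(
    frame_symbols(trace_5_4_0_74), frame_symbols(trace_5_4_0_71)
)
for name in shared:
    print(name)
```

On the frames above, the shared prefix runs from rq_qos_wake_function
through blk_mq_free_request. The traces diverge at the driver frame:
dasd_request_done here, dm_softirq_done in the original report.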

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1929923

Title:
  [UBUNTU 20.04] LPAR becomes unresponsive after the Kernel panic -
  rq_qos_wake_function

Status in Ubuntu on IBM z Systems:
  Invalid
Status in linux package in Ubuntu:
  Invalid

Bug description:
  ---Problem Description---
  kernel panic rq_qos_wake_function
   
  ---uname output---
  Linux version 5.4.0-71-generic
   
  Machine Type = s390x 
   
  ---Debugger---
  A debugger is not configured
   
  Stack trace output:
  May 15 20:21:04 data1 kernel: Call Trace:
  May 15 20:21:04 data1 kernel: ([<000000234091e670>] 0x234091e670)
  May 15 20:21:04 data1 kernel:  [<0000003e10047e3a>] rq_qos_wake_function+0x8a/0xa0
  May 15 20:21:04 data1 kernel:  [<0000003e0fbec482>] __wake_up_common+0xa2/0x1b0
  May 15 20:21:04 data1 kernel:  [<0000003e0fbec984>] __wake_up_common_lock+0x94/0xe0
  May 15 20:21:04 data1 kernel:  [<0000003e0fbec9fa>] __wake_up+0x2a/0x40
  May 15 20:21:04 data1 kernel:  [<0000003e1005ee70>] wbt_done+0x90/0xe0
  May 15 20:21:04 data1 kernel:  [<0000003e10047f42>] __rq_qos_done+0x42/0x60
  May 15 20:21:04 data1 kernel:  [<0000003e10033cb0>] blk_mq_free_request+0xe0/0x140
  May 15 20:21:04 data1 kernel:  [<0000003e101d46f0>] dm_softirq_done+0x140/0x230
  May 15 20:21:04 data1 kernel:  [<0000003e100326c0>] blk_done_softirq+0xc0/0xe0
  May 15 20:21:04 data1 kernel:  [<0000003e103fc084>] __do_softirq+0x104/0x360
  May 15 20:21:04 data1 kernel:  [<0000003e0fb9da1e>] irq_exit+0x9e/0xc0
  May 15 20:21:04 data1 kernel:  [<0000003e0fb28ae8>] do_IRQ+0x78/0xb0
  May 15 20:21:04 data1 kernel:  [<0000003e103fb588>] ext_int_handler+0x130/0x134
  May 15 20:21:04 data1 kernel:  [<0000003e101d4416>] dm_mq_queue_rq+0x36/0x1d0
  May 15 20:21:04 data1 kernel: Last Breaking-Event-Address:
  May 15 20:21:04 data1 kernel:  [<0000003e0fbce75e>] wake_up_process+0xe/0x20
  May 15 20:21:04 data1 kernel: Kernel panic - not syncing: Fatal exception in interrupt
   
  Oops output:
   no
   
  System Dump Info:
    The system was configured to capture a dump, however a dump was not produced.

  -Attach sysctl -a output to the bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1929923/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
