------- Comment From aleksandra.pa...@de.ibm.com 2021-09-16 04:22 EDT-------
Hi, we have hit the same problem:

vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:
...
[645967.289658] Unable to handle kernel pointer dereference in virtual kernel address space
[645967.289665] Failing address: 001ffc004bf14000 TEID: 001ffc004bf14403
[645967.289668] Fault in home space mode while using kernel ASCE.
[645967.289671] AS:00000001d839c00b R2:000000038bbec00b R3:00000003010c0007 S:0000000302260000 P:0000000000000400
[645967.289715] Oops: 0011 ilc:2 [#1] SMP
[645967.289721] Modules linked in: mmfs26(OE) mmfslinux(OE) tracedev(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache 8021q garp mrp stp llc bonding binfmt_misc dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt s390_trng ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core sunrpc ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear crc32_vx_s390 zfcp scsi_transport_fc qeth_l2 dasd_eckd_mod dasd_mod qeth qdio ccwgroup [last unloaded: tracedev]
[645967.289791] CPU: 4 PID: 1891047 Comm: kgnrdwr_dvtc2b Kdump: loaded Tainted: G OE 5.4.0-74-generic #83-Ubuntu
[645967.289795] Hardware name: IBM 3906 M05 710 (LPAR)
[645967.289798] Krnl PSW : 0404e00180000000 00000001d73e20ce (try_to_wake_up+0x4e/0x700)
[645967.289809] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[645967.289814] Krnl GPRS: 0000000370d32488 001ffc0000000000 001ffc0000000005 0000000000000003
[645967.289817] 0000000000000000 ffffffff00000005 041ffbff80bcb9e0 0000000000000003
[645967.289858] 0000000000000003 001ffc004bf141bc 0000000000000000 001ffc004bf13878
[645967.289860] 0000000095190000 00000001d7c1aa40 001ffbff80bcba10 001ffbff80bcb990
[645967.289872] Krnl Code: 00000001d73e20c2: 41902944 la %r9,2372(%r2)
 00000001d73e20c6: 582003ac l %r2,940
 #00000001d73e20ca: a7180000 lhi %r1,0
 >00000001d73e20ce: ba129000 cs %r1,%r2,0(%r9)
 00000001d73e20d2: a77401c9 brc 7,00000001d73e2464
 00000001d73e20d6: e310b0080004 lg %r1,8(%r11)
 00000001d73e20dc: b9800018 ngr %r1,%r8
 00000001d73e20e0: a774001f brc 7,00000001d73e211e
[645967.289894] Call Trace:
[645967.289899] ([<0000000000000000>] 0x0)
[645967.289906] [<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0
[645967.289913] [<00000001d74004c2>] __wake_up_common+0xa2/0x1b0
[645967.289915] [<00000001d74009c4>] __wake_up_common_lock+0x94/0xe0
[645967.289918] [<00000001d7400a3a>] __wake_up+0x2a/0x40
[645967.289923] [<00000001d7873870>] wbt_done+0x90/0xe0
[645967.289925] [<00000001d785c942>] __rq_qos_done+0x42/0x60
[645967.289928] [<00000001d78486c0>] blk_mq_free_request+0xe0/0x140
[645967.289949] [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
[645967.289951] [<00000001d7848938>] blk_mq_complete_request+0xb8/0x160
[645967.289957] [<001fffff801c43c8>] dasd_block_tasklet+0x148/0x470 [dasd_mod]
[645967.289962] [<00000001d73b12d2>] tasklet_action_common.isra.0+0x82/0x160
[645967.289968] [<00000001d7c117b4>] __do_softirq+0x104/0x360
[645967.289971] [<00000001d73b1a4e>] irq_exit+0x9e/0xc0
[645967.289974] [<00000001d733cb28>] do_IRQ+0x78/0xb0
[645967.289977] [<00000001d7c10a20>] io_int_handler+0x12c/0x294
[645967.289985] [<001fffff805f3c30>] _DTrace3+0x10/0xb0 [tracedev]
[645967.290048] ([<001fffff80a8d2ca>] gpfs_f_llseek+0x4a/0x280 [mmfslinux])
[645967.290053] [<00000001d75f5ed2>] ksys_lseek+0x92/0xe0
[645967.290055] [<00000001d7c10498>] system_call+0xdc/0x2c8
[645967.290056] Last Breaking-Event-Address:
[645967.290060] [<00000001d73e278e>] wake_up_process+0xe/0x20

Please tell me whether I should upload the data to this bug or open a new one.

------- Comment From aleksandra.pa...@de.ibm.com 2021-09-16 04:27 EDT-------
dvtc2b-2.gpfs.net:
OS: Debian Linux: Ubuntu 20.04.2 LTS => Kernel: 5.4.0-74-generic on s390x

------- Comment From boris.m...@de.ibm.com 2021-09-17 09:36 EDT-------
I double-checked with Canonical regarding the current in-service release level of 20.04 LTS.
The current level is 20.04.3, to be more precise 20.04.3 with Kernel 5.4.0.84.
Therefore, you should try to reproduce the bug on the 20.04.3 release.

Apart from the (no longer supported) 20.04.2 release level, a first look at the error messages strongly hints at a GPFS error.
("vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332")

@Aleksandra: can you please confirm whether GPFS is running on the system?

------- Comment From aleksandra.pa...@de.ibm.com 2021-09-20 05:29 EDT-------
Yes, GPFS is running on the system.
But what you think is an error message is actually the name of the file in which the dmesg output from the crash is stored; the name encodes the machine on which the crash was taken and the date:

vmcore/dvtc2b-2.gpfs.net_202106240332/dmesg.202106240332:

The actual dmesg data starts after '...'.
Is there anything else besides that line that makes you think this is a GPFS problem?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1929923

Title:
  [UBUNTU 20.04] LPAR becomes unresponsive after the Kernel panic - rq_qos_wake_function

Status in Ubuntu on IBM z Systems:
  Invalid
Status in linux package in Ubuntu:
  Invalid

Bug description:
  ---Problem Description---
  kernel panic rq_qos_wake_function

  ---uname output---
  Linux version 5.4.0-71-generic

  Machine Type = s390x

  ---Debugger---
  A debugger is not configured

  Stack trace output:
  May 15 20:21:04 data1 kernel: Call Trace:
  May 15 20:21:04 data1 kernel: ([<000000234091e670>] 0x234091e670)
  May 15 20:21:04 data1 kernel: [<0000003e10047e3a>] rq_qos_wake_function+0x8a/0xa0
  May 15 20:21:04 data1 kernel: [<0000003e0fbec482>] __wake_up_common+0xa2/0x1b0
  May 15 20:21:04 data1 kernel: [<0000003e0fbec984>] __wake_up_common_lock+0x94/0xe0
  May 15 20:21:04 data1 kernel: [<0000003e0fbec9fa>] __wake_up+0x2a/0x40
  May 15 20:21:04 data1 kernel: [<0000003e1005ee70>] wbt_done+0x90/0xe0
  May 15 20:21:04 data1 kernel: [<0000003e10047f42>] __rq_qos_done+0x42/0x60
  May 15 20:21:04 data1 kernel: [<0000003e10033cb0>] blk_mq_free_request+0xe0/0x140
  May 15 20:21:04 data1 kernel: [<0000003e101d46f0>] dm_softirq_done+0x140/0x230
  May 15 20:21:04 data1 kernel: [<0000003e100326c0>] blk_done_softirq+0xc0/0xe0
  May 15 20:21:04 data1 kernel: [<0000003e103fc084>] __do_softirq+0x104/0x360
  May 15 20:21:04 data1 kernel: [<0000003e0fb9da1e>] irq_exit+0x9e/0xc0
  May 15 20:21:04 data1 kernel: [<0000003e0fb28ae8>] do_IRQ+0x78/0xb0
  May 15 20:21:04 data1 kernel: [<0000003e103fb588>] ext_int_handler+0x130/0x134
  May 15 20:21:04 data1 kernel: [<0000003e101d4416>] dm_mq_queue_rq+0x36/0x1d0
  May 15 20:21:04 data1 kernel: Last Breaking-Event-Address:
  May 15 20:21:04 data1 kernel: [<0000003e0fbce75e>] wake_up_process+0xe/0x20
  May 15 20:21:04 data1 kernel: Kernel panic - not syncing: Fatal exception in interrupt

  Oops output:
   no

  System Dump Info:
    The system was configured to capture a dump, however a dump was not produced.

  -Attach sysctl -a output to the bug.

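The call trace in the 2021-09-16 comment and the one in the bug description appear to share the same innermost frames, from rq_qos_wake_function down to blk_mq_free_request, and differ only in the driver that completes the request (dasd_mod versus device-mapper). The following is a minimal sketch for checking that mechanically. It assumes frames are written as `[<address>] symbol+0xoffset/0xsize`, as in the excerpts copied from this report; the helper names `frame_symbols` and `common_prefix` are hypothetical and not part of any kernel tooling.

```python
import re

# One frame looks like "[<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0".
# Frames without a symbol (for example "[<0000000000000000>] 0x0") do not match.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_symbols(trace_text):
    """Return the symbol names of all frames in trace_text, in order."""
    return FRAME_RE.findall(trace_text)


def common_prefix(first, second):
    """Return the leading run of entries that both lists share."""
    shared = []
    for a, b in zip(first, second):
        if a != b:
            break
        shared.append(a)
    return shared


# Excerpt of the trace from the 2021-09-16 comment (kernel 5.4.0-74).
COMMENT_TRACE = """
[645967.289899] ([<0000000000000000>] 0x0)
[645967.289906] [<00000001d785c83a>] rq_qos_wake_function+0x8a/0xa0
[645967.289913] [<00000001d74004c2>] __wake_up_common+0xa2/0x1b0
[645967.289915] [<00000001d74009c4>] __wake_up_common_lock+0x94/0xe0
[645967.289918] [<00000001d7400a3a>] __wake_up+0x2a/0x40
[645967.289923] [<00000001d7873870>] wbt_done+0x90/0xe0
[645967.289925] [<00000001d785c942>] __rq_qos_done+0x42/0x60
[645967.289928] [<00000001d78486c0>] blk_mq_free_request+0xe0/0x140
[645967.289949] [<001fffff801bf18a>] dasd_request_done+0x2a/0x40 [dasd_mod]
"""

# Excerpt of the trace from the bug description (kernel 5.4.0-71).
DESCRIPTION_TRACE = """
May 15 20:21:04 data1 kernel: ([<000000234091e670>] 0x234091e670)
May 15 20:21:04 data1 kernel: [<0000003e10047e3a>] rq_qos_wake_function+0x8a/0xa0
May 15 20:21:04 data1 kernel: [<0000003e0fbec482>] __wake_up_common+0xa2/0x1b0
May 15 20:21:04 data1 kernel: [<0000003e0fbec984>] __wake_up_common_lock+0x94/0xe0
May 15 20:21:04 data1 kernel: [<0000003e0fbec9fa>] __wake_up+0x2a/0x40
May 15 20:21:04 data1 kernel: [<0000003e1005ee70>] wbt_done+0x90/0xe0
May 15 20:21:04 data1 kernel: [<0000003e10047f42>] __rq_qos_done+0x42/0x60
May 15 20:21:04 data1 kernel: [<0000003e10033cb0>] blk_mq_free_request+0xe0/0x140
May 15 20:21:04 data1 kernel: [<0000003e101d46f0>] dm_softirq_done+0x140/0x230
"""

shared = common_prefix(
    frame_symbols(COMMENT_TRACE), frame_symbols(DESCRIPTION_TRACE)
)

if __name__ == "__main__":
    for symbol in shared:
        print(symbol)
```

Matching frames only indicate that both crashes pass through the same wakeup path; they do not by themselves show which component corrupted the task pointer.
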
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1929923/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp