Looking at the new logs, I see that many different disks and disk systems are
in use.
I see DASDs, some of which report errors, like:
Dec  8 07:43:29 ilabg3 kernel: [651348.534047] dasd(eckd): I/O status report for device 0.0.248b:
Dec  8 07:43:29 ilabg3 kernel: [651348.534047] dasd(eckd): in req: 000000006004d11c CC:00 FC:01 AC:00 SC:01 DS:00 CS:00 RC:0
Dec  8 07:43:29 ilabg3 kernel: [651348.534047] dasd(eckd): device 0.0.248b: Failing CCW:           (null)
Dec  8 07:43:29 ilabg3 kernel: [651348.534047] dasd(eckd): SORRY - NO VALID SENSE AVAILABLE
Dec  8 07:43:29 ilabg3 kernel: [651348.534072] dasd(eckd): Related CP in req: 000000006004d11c
Dec  8 07:43:29 ilabg3 kernel: [651348.534072] dasd(eckd): CCW 00000000584ecfe4: 2740000C 7FE86BC0 DAT:  18000000 00000e00  00000000
Dec  8 07:43:29 ilabg3 kernel: [651348.534072] dasd(eckd): CCW 00000000a235ae5d: 3E000200 7FEDD400 DAT:  00000000 00000000  00000000
(of course not related to the reported issue)
I also see 'local' SCSI disks:
scsi 3:0:12:0: Direct-Access     IBM      FlashSystem-9840 1600 PQ: 0 ANSI: 5
and I see XIV Storage as well.

I noticed a lot of 'Power on, reset, or bus device reset' events, as well as
many LUN scans and multipath path failures, within a single day in that
environment, like:
sd 9:0:0:1: tag#58 Add. Sense: Power on, reset, or bus device reset occurred
...
Dec  8 14:01:07 ilabg3 kernel: [    2.127745] sd 3:0:0:0: scsi scan: REPORT LUN scan
...
[458582.527232] device-mapper: multipath: Failing path 66:112.
[458582.527234] print_req_error: I/O error, dev sdan, sector 15001576
That is very odd and points to an issue with the SAN infrastructure and/or
its setup.
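
To put rough numbers on this, a minimal Python sketch along the following
lines could tally such messages from a saved log. The category names and the
regular expressions are my own assumptions, derived only from the four
messages quoted above; they are not an exhaustive list of what the kernel can
print.

```python
import re
from collections import Counter

# Hypothetical categories; the patterns are taken from the messages quoted above.
PATTERNS = {
    "device_reset": re.compile(r"Power on, reset, or bus device reset occurred"),
    "lun_scan": re.compile(r"scsi scan: REPORT LUN\s+scan"),
    "path_failure": re.compile(r"multipath: Failing path"),
    "io_error": re.compile(r"print_req_error: I/O error"),
}


def count_events(lines):
    """Return a Counter mapping each category to the number of matching lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# The four example lines from this comment, with the wrapped line rejoined.
sample = [
    "sd 9:0:0:1: tag#58 Add. Sense: Power on, reset, or bus device reset occurred",
    "Dec  8 14:01:07 ilabg3 kernel: [    2.127745] sd 3:0:0:0: scsi scan: REPORT LUN scan",
    "[458582.527232] device-mapper: multipath: Failing path 66:112.",
    "[458582.527234] print_req_error: I/O error, dev sdan, sector 15001576",
]

for name, n in sorted(count_events(sample).items()):
    print(f"{name}: {n}")
```

For a real log, passing an open file object instead of `sample` should work,
since the function only iterates over lines.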

The error messages that belong to the Oops seem to be known and point to a
potential race discussed here:
[RFC,1/1] libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
https://patchwork.kernel.org/patch/10501773/
The SAN instability described above may even make a race condition like this
more likely.
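
The register dump in the bug description below fits that picture. The PSW
points into __iscsi_get_task, the marked instruction is 'asi 120(%r2),1' (add
1 to the word at offset 120 from r2), and on s390x the first function argument
is passed in r2. A small Python sketch, assuming the dump layout shown in the
oops, extracts r2 from the 'Krnl GPRS' block. That the operand is the task's
reference count is my reading and is not confirmed by the logs.

```python
import re


def parse_gprs(oops_text):
    """Return the general purpose registers r0..r15 as a list of integers."""
    # Take the text between "Krnl GPRS:" and "Krnl Code:" and collect the
    # 16-digit hex words; the "[seconds.micros]" timestamps never match.
    block = oops_text.split("Krnl GPRS:", 1)[1].split("Krnl Code:", 1)[0]
    words = re.findall(r"\b[0-9a-f]{16}\b", block)
    return [int(word, 16) for word in words[:16]]


# Register lines copied from the oops, with the mail line wrapping undone.
OOPS = """\
[1363037.322589] Krnl GPRS: 0000000000000000 0000000002923ce0 0000000000000000 0000000000000400
[1363037.322591]            000003ff80277640 0000000000000008 00000000788efc00 000003ff802777cc
[1363037.322592]            000003ff8027769c 0000000000000000 0000000000000000 0000000078ce8310
[1363037.322594]            00000000f3689800 0000000078ce83a2 000003ff80272e8e 0000000002923c90
[1363037.322601] Krnl Code: 000003ff80272624: c0f4fffffcf6      brcl    15,3ff80272010
"""

gprs = parse_gprs(OOPS)
base = gprs[2]   # r2, the first argument under the s390x calling convention
disp = 120       # displacement from 'asi 120(%r2),1'
print(f"r2 = {base:#x}, operand address = {base + disp:#x}")
```

With r2 being zero, the instruction touches address 0x78 in page zero, which
matches the reported failing address of 0000000000000000 and suggests that
__iscsi_get_task was called with a NULL task pointer.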

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1804149

Title:
  Kernel panic,Oops: 0004 ilc:3 [#1] SMP,  iscsi_q_20 iscsi_xmitworker
  [libiscsi]

Status in Ubuntu on IBM z Systems:
  Triaged
Status in linux package in Ubuntu:
  Triaged

Bug description:
  == Comment: #0 - Michael Finnegan <finne...@us.ibm.com> - 2018-11-19 14:14:40 
==
  ---Problem Description---
  Kernel panic,Oops: 0004 ilc:3 [#1] SMP,  iscsi_q_20 iscsi_xmitworker 
[libiscsi]
   
  ---uname output---
  Linux ilabg3 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:13:24 UTC 2018 
s390x s390x s390x GNU/Linux
   
  Machine Type = IBM 3906 M05 7G4 (z/VM 7.1.0) 
   
  ---Debugger---
  A debugger is not configured
   
  Contact Information = Michael Finnegan/finne...@us.ibm.com 
   
  Stack trace output:
   dmesg.201811161956
  [1363037.322472] Unable to handle kernel pointer dereference in virtual kernel address space
  [1363037.322481] Failing address: 0000000000000000 TEID: 0000000000000483
  [1363037.322483] Fault in home space mode while using kernel ASCE.
  [1363037.322486] AS:0000000000ea0007 R3:00000000f37d0007 S:00000000f37ff000 P:000000000000013d
  [1363037.322524] Oops: 0004 ilc:3 [#1] SMP
  [1363037.322529] Modules linked in: iptable_filter binfmt_misc rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache qeth_l3 qeth_l2 s390_trng ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common qeth vmur ccwgroup vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables dm_round_robin dm_service_time crc32_vx_s390 dasd_eckd_mod zfcp qdio scsi_transport_fc dasd_fba_mod dasd_mod scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath
  [1363037.322567] CPU: 3 PID: 37970 Comm: kworker/u128:19 Not tainted 4.15.0-36-generic #39-Ubuntu
  [1363037.322573] Hardware name: IBM 3906 M05 7G4 (z/VM 7.1.0)
  [1363037.322581] Workqueue: iscsi_q_20 iscsi_xmitworker [libiscsi]
  [1363037.322583] Krnl PSW : 00000000c8051cf6 00000000e49a28f4 (__iscsi_get_task+0x6/0x18 [libiscsi])
  [1363037.322587]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
  [1363037.322589] Krnl GPRS: 0000000000000000 0000000002923ce0 0000000000000000 0000000000000400
  [1363037.322591]            000003ff80277640 0000000000000008 00000000788efc00 000003ff802777cc
  [1363037.322592]            000003ff8027769c 0000000000000000 0000000000000000 0000000078ce8310
  [1363037.322594]            00000000f3689800 0000000078ce83a2 000003ff80272e8e 0000000002923c90
  [1363037.322601] Krnl Code: 000003ff80272624: c0f4fffffcf6      brcl    15,3ff80272010
                              000003ff8027262a: c0f40000377b      brcl    15,3ff80279520
                             #000003ff80272630: c00400000000      brcl    0,3ff80272630
                             >000003ff80272636: eb012078006a      asi     120(%r2),1
                              000003ff8027263c: c0f400003772      brcl    15,3ff80279520
                              000003ff80272642: 0707              bcr     0,%r7
                              000003ff80272644: 0707              bcr     0,%r7
                              000003ff80272646: 0707              bcr     0,%r7
  [1363037.322618] Call Trace:
  [1363037.322621] ([<000003ff80272ec6>] iscsi_xmit_task+0x86/0x138 [libiscsi])
  [1363037.322625]  [<000003ff8027769c>] iscsi_data_xmit+0x44c/0x548 [libiscsi]
  [1363037.322636]  [<000003ff802777cc>] iscsi_xmitworker+0x34/0x58 [libiscsi]
  [1363037.322642]  [<00000000001918f2>] process_one_work+0x262/0x4d8
  [1363037.322644]  [<0000000000191bc0>] worker_thread+0x58/0x4e8
  [1363037.322648]  [<0000000000198d24>] kthread+0x14c/0x168
  [1363037.322652]  [<00000000008e3eb2>] kernel_thread_starter+0xa/0x10
  [1363037.322654]  [<00000000008e3ea8>] kernel_thread_starter+0x0/0x10
  [1363037.322655] Last Breaking-Event-Address:
  [1363037.322657]  [<000003ff80272e88>] iscsi_xmit_task+0x48/0x138 [libiscsi]
  [1363037.322658]
  [1363037.322660] Kernel panic - not syncing: Fatal exception in interrupt

  
   
  Oops output:
    Oops: 0004 ilc:3 [#1] SMP


  *Additional Instructions for Michael Finnegan/finne...@us.ibm.com: 
  -Post a private note with access information to the machine that the bug is 
occurring on. 
  -Attach sysctl -a output to the bug.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1804149/+subscriptions
