------- Comment From dougm...@us.ibm.com 2018-04-19 06:51 EDT-------
Two different panics are shown here: the kernel assert in usercopy.c and the
crash in qla2xxx. A single bug should not be used to track two different
issues. If the kernel assert is no longer happening, close this bug.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1761729

Title:
  Ubuntu 18.04  Machine crashed while running ltp.

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  Incomplete

Bug description:
  ---Problem Description---
  Ubuntu 18.04 [ Briggs P8 ]: Machine crashed while running ltp.

  ---Environment---
  Kernel Build:  Ubuntu 18.04
  System Name :  ltc-briggs2
  Model/Type  :  P8
  Platform    :  BML

  ---Uname output---

  root@ltc-briggs2:~# uname -a
  Linux ltc-briggs2 4.15.0-13-generic #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

  ---Steps to reproduce---

  $ git clone https://github.com/linux-test-project/ltp.git
  $ cd ltp
  $ make autotools
  $ ./configure
  $ make
  $ make install

  
  ltp
  =====

  root@ltc-briggs2:~# 
  root@ltc-briggs2:~# [10781.098337] LTP: starting fs_inod01 (fs_inod $TMPDIR 10 10 10)
  [10782.837910] LTP: starting linker01 (linktest.sh 1000 1000)
  [10784.504474] LTP: starting openfile01 (openfile -f10 -t10)
  [10784.534953] LTP: starting inode01
  [10784.550767] LTP: starting inode02
  [10784.739104] LTP: starting stream01
  [10784.740840] LTP: starting stream02
  [10784.742487] LTP: starting stream03
  [10784.744532] LTP: starting stream04
  [10784.746087] LTP: starting stream05
  [10784.747722] LTP: starting ftest01
  [10785.142054] LTP: starting ftest02
  [10785.158852] LTP: starting ftest03
  [10785.404760] LTP: starting ftest04
  [10785.527197] LTP: starting ftest05
  [10785.937164] LTP: starting ftest06
  [10785.958360] LTP: starting ftest07
  [10786.463382] LTP: starting ftest08
  [10786.592998] LTP: starting lftest01 (lftest 100)
  [10786.672707] LTP: starting writetest01 (writetest)
  [10786.774292] LTP: starting fs_di (fs_di -d $TMPDIR)
  [10792.973510] LTP: starting proc01 (proc01 -m 128)
  [10793.865686] ICMPv6: process `proc01' is using deprecated sysctl (syscall) net.ipv6.neigh.default.base_reachable_time - use net.ipv6.neigh.default.base_reachable_time_ms instead
  [10795.785593] LTP: starting read_all_dev (read_all -d /dev -e '/dev/watchdog?(0)' -q -r 10)
  [10795.895774] NET: Registered protocol family 40
  [10795.918763] Bluetooth: Core ver 2.22
  [10795.918866] NET: Registered protocol family 31
  [10795.918909] Bluetooth: HCI device and connection manager initialized
  [10795.918955] Bluetooth: HCI socket layer initialized
  [10795.918991] Bluetooth: L2CAP socket layer initialized
  [10795.919032] Bluetooth: SCO socket layer initialized
  [10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
  [10798.374952] ------------[ cut here ]------------
  [10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
  [10798.375041] Oops: Exception in kernel mode, sig: 5 [#1]
  [10798.375080] LE SMP NR_CPUS=2048 [10871.343999650,5] OPAL: Switch to 
big-endian OS
  NUMA PowerNV
  [10798.375117] [10876.190849323,5] OPAL: Switch to little-endian OS
  Modules linked in: hci_vhci bluetooth ecdh_generic vhost_vsock cuse 
vmw_vsock_virtio_transport_common userio vsock uhid vhost_net vhost tap snd_seq 
snd_seq_device snd_timer snd soundcore binfmt_misc sctp quota_v2 quota_tree 
nls_iso8859_1 ntfs xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE 
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp 
bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables 
iptable_filter kvm_hv kvm idt_89hpesx vmx_crypto ofpart cmdlinepart 
ipmi_powernv powernv_flash ipmi_devintf mtd ipmi_msghandler ibmpowernv opal_prd 
at24 powernv_rng joydev input_leds mac_hid uio_pdrv_genirq uio sch_fq_codel 
nfsd ib_iser rdma_cm auth_rpcgss iw_cm nfs_acl lockd ib_cm grace iscsi_tcp
  [10798.375636]  libiscsi_tcp libiscsi sunrpc scsi_transport_iscsi ip_tables 
x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov 
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 
multipath linear mlx5_ib ses enclosure scsi_transport_sas hid_generic usbhid 
hid ib_core qla2xxx ast i2c_algo_bit ttm mlx5_core drm_kms_helper syscopyarea 
sysfillrect sysimgblt fb_sys_fops nvme_fc crct10dif_vpmsum nvme_fabrics ahci 
mlxfw crc32c_vpmsum i40e drm devlink scsi_transport_fc megaraid_sas libahci
  [10798.375961] CPU: 87 PID: 4085 Comm: read_all Not tainted 4.15.0-13-generic 
#14-Ubuntu
  [10798.376013] NIP:  c0000000003c76f0 LR: c0000000003c76ec CTR: 
00000000300378e8
  [10798.376068] REGS: c0000076c63aba00 TRAP: 0700   Not tainted  
(4.15.0-13-generic)
  [10798.376120] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 28002222 
 XER: 20000000
  [10798.376176] CFAR: c00000000018cce4 SOFTE: 1 
  [10798.376176] GPR00: c0000000003c76ec c0000076c63abc80 c0000000016eaf00 
0000000000000064 
  [10798.376176] GPR04: c000007ffc1cce18 c000007ffc1e4368 9000000000009033 
000000000000040f 
  [10798.376176] GPR08: 0000000000000007 c0000000011c3a74 0000007ffb010000 
9000000000001003 
  [10798.376176] GPR12: 0000000000002200 c000000007a8bd00 0000000000000000 
0000000000000000 
  [10798.376176] GPR16: 0000000000000000 0000000000000000 0000000000000006 
00007ffff7a0a018 
  [10798.376176] GPR20: 000008bb551c8908 000008bb551c88f8 000008bb551c88c8 
c0000076c63abe00 
  [10798.376176] GPR24: 0000000000010000 0000000000000000 00007ffff7a0a018 
c0000076c63abe00 
  [10798.376176] GPR28: c0000000000003ff 0000000000000001 00000000000003ff 
c000000000000000 
  [10798.376619] NIP [c0000000003c76f0] __check_object_size+0x140/0x270
  [10798.376662] LR [c0000000003c76ec] __check_object_size+0x13c/0x270
  [10798.376706] Call Trace:
  [10798.376724] [c0000076c63abc80] [c0000000003c76ec] 
__check_object_size+0x13c/0x270 (unreliable)
  [10798.376787] [c0000076c63abd00] [c0000000008268a4] read_mem+0x84/0x220
  [10798.376835] [c0000076c63abd70] [c0000000003d109c] __vfs_read+0x3c/0x70
  [10798.376880] [c0000076c63abd90] [c0000000003d118c] vfs_read+0xbc/0x1b0
  [10798.376925] [c0000076c63abde0] [c0000000003d1788] SyS_read+0x68/0x110
  [10798.377012] [c0000076c63abe30] [c00000000000b184] system_call+0x58/0x6c
  [10798.377057] Instruction dump:
  [10798.377086] 2fbd0000 419e010c 3c82ff8b 3ca2ff94 3884c360 38a5ad68 3c62ff8b 
7fc8f378 
  [10798.377140] 7fe6fb78 3863c370 4bdc55b5 60000000 <0fe00000> 60000000 
60000000 60420000 
  [10798.377195] ---[ end trace 21abd4753a69334c ]---
  [10798.445038] 
  [10798.445135] Sending IPI to other CPUs
  [10798.446688] IPI complete
  [10798.449081] kexec: waiting for cpu 0 (physical 16) to enter OPAL
  [10798.450224] kexec: waiting for cpu 23 (physical 47) to enter OPAL
  [10798.451396] kexec: waiting for cpu 54 (physical 94) to enter OPAL
  [10800.049202] kexec: Starting switchover sequence.
  [    1.078053] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [    1.078057] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [    1.165219] vio vio: uevent: failed to send synthetic uevent
  /dev/nvme0n1p2: recovering journal
  /dev/nvme0n1p2: clean, 14017353/122101760 files, 57953106/488376576 blocks
  -.mount
  sys-kernel-debug.mount
  setvtrgb.service
  dev-hugepages.mount
  dev-mqueue.mount
  kmod-static-nodes.service
  lvm2-lvmetad.service
  systemd-remount-fs.service
  systemd-tmpfiles-setup-dev.service
  systemd-random-seed.service
  lvm2-monitor.service
  systemd-udevd.service
  systemd-modules-load.service
  sys-fs-fuse-connections.mount
  sys-kernel-config.mount
  systemd-sysctl.service
  systemd-networkd.service
  swapfile.swap
  [    5.177490] vio vio: uevent: failed to send synthetic uevent
  systemd-udev-trigger.service
  keyboard-setup.service
  systemd-journald.service
  [    5.458352] qla2xxx [0020:01:00.0]-00c6:17: MSI-X: Failed to enable 
support with 32 vectors, using 10 vectors.
  apparmor.service
  systemd-journal-flush.service
  systemd-tmpfiles-setup.service
  systemd-update-utmp.service
  [    6.119284] qla2xxx [0020:01:00.1]-00c6:18: MSI-X: Failed to enable 
support with 32 vectors, using 10 vectors.
  systemd-timesyncd.service
  [   10.052141] megaraid_sas 0001:03:00.0: Init cmd return status SUCCESS for 
SCSI host 1
  systemd-networkd-wait-online.service
  iscsid.service
  blk-availability.service
  [   10.675964] kdump-tools[2222]: Starting kdump-tools:  * running 
makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050340/dump-incomplete
  lvm2-pvscan@8:195.service
  lvm2-pvscan@8:179.service
  Copying data                                      : [100.0 %] /           
eta: 0s
  [   55.227083] kdump-tools[2222]: The kernel version is not supported.
  [   55.227300] kdump-tools[2222]: The makedumpfile operation may be 
incomplete.
  [   55.227471] kdump-tools[2222]: The dumpfile is saved to 
/var/crash/201804050340/dump-incomplete.
  [   55.227583] kdump-tools[2222]: makedumpfile Completed.
  [   55.230250] kdump-tools[2222]:  * kdump-tools: saved vmcore in 
/var/crash/201804050340
  [   55.311695] kdump-tools[2222]:  * running makedumpfile --dump-dmesg 
/proc/vmcore /var/crash/201804050340/dmesg.201804050340
  [   55.330032] kdump-tools[2222]: The kernel version is not supported.
  [   55.330206] kdump-tools[2222]: The makedumpfile operation may be 
incomplete.
  [   55.330302] kdump-tools[2222]: The dmesg log is saved to 
/var/crash/201804050340/dmesg.201804050340.
  [   55.330416] kdump-tools[2222]: makedumpfile Completed.
  [   55.330533] kdump-tools[2222]:  * kdump-tools: saved dmesg content in 
/var/crash/201804050340
  [   55.334722] kdump-tools[2222]: Thu, 05 Apr 2018 03:40:44 -0500
  [   55.338419] kdump-tools[2222]: Rebooting.
  [   55.546343] mlx5_core 0021:01:00.1: mlx5_enter_error_state:121:(pid 2715): 
start
  [   55.546414] mlx5_core 0021:01:00.1: mlx5_enter_error_state:128:(pid 2715): 
end
  [   55.942498] mlx5_core 0021:01:00.0: mlx5_enter_error_state:121:(pid 2715): 
start
  [   55.942631] mlx5_core 0021:01:00.0: mlx5_enter_error_state:128:(pid 2715): 
end
  [   59.836381] reboot: Restarting system
  [10963.485916127,5] OPAL: Reboot request...
    5.31149|Ignoring boot flags, incorrect version 0x0
    5.52090|ISTEP  6. 3
    6.16670|ISTEP  6. 4
    6.16957|ISTEP  6. 5
    8.74865|HWAS|PRESENT> DIMM[03]=00AA00AA00AA00AA
    8.74865|HWAS|PRESENT> Membuf[04]=4444000000000000
    8.74866|HWAS|PRESENT> Proc[05]=C000000000000000
   14.03690|ISTEP  6. 6
   14.11948|ISTEP  6. 7
   16.75478|ISTEP  6. 8
   16.91585|ISTEP  6. 9
   17.47534|ISTEP  6.10
   17.55249|ISTEP  6.11
   19.29629|ISTEP  6.12
   19.29926|ISTEP  6.13
   19.30139|ISTEP  7. 1
   19.51889|ISTEP  7. 2

  == Comment: #7 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2018-04-06 04:52:31 ==
  A kernel memory exposure attempt was detected, and BUG() was called from mm/usercopy.c:72, shown in the code snippet below:

        KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-13-generic
      DUMPFILE: dump.201804050340  [PARTIAL DUMP]
          CPUS: 160
          DATE: Thu Apr  5 03:39:16 2018
        UPTIME: 00:48:44
  LOAD AVERAGE: 2.78, 11.61, 106.19
         TASKS: 1748
      NODENAME: ltc-briggs2
       RELEASE: 4.15.0-13-generic
       VERSION: #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018
       MACHINE: ppc64le  (2926 Mhz)
        MEMORY: 512 GB
         PANIC: "kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!"
           PID: 4085
       COMMAND: "read_all"
          TASK: c000007659f23f00  [THREAD_INFO: c0000076c63a8000]
           CPU: 87
         STATE: TASK_RUNNING (PANIC)

  crash> bt
  PID: 4085   TASK: c000007659f23f00  CPU: 87  COMMAND: "read_all"
   #0 [c0000076c63ab740] crash_kexec at c0000000001e22b0
   #1 [c0000076c63ab780] oops_end at c000000000025888
   #2 [c0000076c63ab800] _exception at c000000000026684
   #3 [c0000076c63ab990] program_check_common at c000000000008da4
   Program Check [700] exception frame:
   R0:  c0000000003c76ec    R1:  c0000076c63abc80    R2:  c0000000016eaf00   
   R3:  0000000000000064    R4:  c000007ffc1cce18    R5:  c000007ffc1e4368   
   R6:  9000000000009033    R7:  000000000000040f    R8:  0000000000000007   
   R9:  c0000000011c3a74    R10: 0000007ffb010000    R11: 9000000000001003   
   R12: 0000000000002200    R13: c000000007a8bd00    R14: 0000000000000000   
   R15: 0000000000000000    R16: 0000000000000000    R17: 0000000000000000   
   R18: 0000000000000006    R19: 00007ffff7a0a018    R20: 000008bb551c8908   
   R21: 000008bb551c88f8    R22: 000008bb551c88c8    R23: c0000076c63abe00   
   R24: 0000000000010000    R25: 0000000000000000    R26: 00007ffff7a0a018   
   R27: c0000076c63abe00    R28: c0000000000003ff    R29: 0000000000000001   
   R30: 00000000000003ff    R31: c000000000000000   
   NIP: c0000000003c76f0    MSR: 9000000000029033    OR3: c00000000018cce4
   CTR: 00000000300378e8    LR:  c0000000003c76ec    XER: 0000000020000000
   CCR: 0000000028002222    MQ:  0000000000000001    DAR: 0000000000000000
   DSISR: 0000000000000000     Syscall Result: 0000000000000000
   #4 [c0000076c63abc80] __check_object_size at c0000000003c76f0
   [Link Register] [c0000076c63abc80] __check_object_size at c0000000003c76ec  
(unreliable)
   #5 [c0000076c63abd00] read_mem at c0000000008268a4
   #6 [c0000076c63abd70] __vfs_read at c0000000003d109c
   #7 [c0000076c63abd90] vfs_read at c0000000003d118c
   #8 [c0000076c63abde0] sys_read at c0000000003d1788
   #9 [c0000076c63abe30] system_call at c00000000000b184
   System Call [c01] exception frame:
   R0:  0000000000000003    R1:  00007ffff7a09ae0    R2:  0000753ec21b7f00   
   R3:  0000000000000006    R4:  00007ffff7a0a018    R5:  00000000000003ff   
   R6:  0000000000004000    R7:  0000753ec21898c4    R8:  900000010000d033   
   R9:  0000000000000000    R10: 0000000000000000    R11: 0000000000000000   
   R12: 0000000000000000    R13: 0000753ec224a8d0   
   NIP: 0000753ec2188580    MSR: 900000010000d033    OR3: 0000000000000006
   CTR: 0000000000000000    LR:  000008bb551b5f20    XER: 0000000000000000
   CCR: 0000000042002244    MQ:  0000000000000001    DAR: 0000753ec21affa8
   DSISR: 0000000040000000     Syscall Result: 0000000000000006
  crash> dis -s c0000000003c76f0
  FILE: /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c
  LINE: 72

  static void report_usercopy(const void *ptr, unsigned long len,
                              bool to_user, const char *type)
  {
          pr_emerg("kernel memory %s attempt detected %s %p (%s) (%lu bytes)\n",
                  to_user ? "exposure" : "overwrite",
                  to_user ? "from" : "to", ptr, type ? : "unknown", len);
          /*
           * For greater effect, it would be nice to do do_group_exit(),
           * but BUG() actually hooks all the lock-breaking and per-arch
           * Oops code, so that is used here instead.
           */
          BUG();
  }
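
The "(<kernel text>)" tag in the usercopy message indicates that the copy was rejected because its source range intersects the kernel text section. The following is a minimal Python model of that range-overlap test only; it is not the kernel code. The helper names are hypothetical, the text-section bounds are placeholder values, and the pointer and length are taken from R31 and R30 of the exception frame on the assumption that those registers hold the copy source and size. The address printed in the log (0000000029431ea4) is most likely a hashed %p value, so it is not used here.

```python
def overlaps(ptr, n, low, high):
    """Return True if [ptr, ptr + n) intersects [low, high)."""
    check_low = ptr
    check_high = ptr + n
    # No overlap if the copy range lies entirely above or entirely below.
    if check_low >= high or check_high <= low:
        return False
    return True


def classify(ptr, n, text_start, text_end):
    """Return "<kernel text>" for a rejected copy, or None if this check passes."""
    if overlaps(ptr, n, text_start, text_end):
        return "<kernel text>"
    return None


# Assumed values, for illustration only.
# PTR and LEN come from R31 and R30 in the exception frame, assuming those
# registers hold the copy source and length.
PTR = 0xC000000000000000
LEN = 0x3FF  # 1023 bytes, matching the size in the usercopy message
# Hypothetical bounds standing in for the kernel's text start/end symbols.
TEXT_START = 0xC000000000000000
TEXT_END = 0xC000000001000000

print(classify(PTR, LEN, TEXT_START, TEXT_END))
```

Under these assumptions the range starts exactly at the beginning of the text section, which would be consistent with read_mem() in the backtrace copying from the lowest physical addresses via /dev/mem.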

  
  From the logs, I see that the memory exposure attempt occurs shortly after the Bluetooth driver is initialized. This might be an issue with the default Bluetooth driver provided by the distro.

  [10795.918866] NET: Registered protocol family 31
  [10795.918909] Bluetooth: HCI device and connection manager initialized
  [10795.918955] Bluetooth: HCI socket layer initialized
  [10795.918991] Bluetooth: L2CAP socket layer initialized
  [10795.919032] Bluetooth: SCO socket layer initialized
  [10798.374850] usercopy: kernel memory exposure attempt detected from 
0000000029431ea4 (<kernel text>) (1023 bytes)
  [10798.374952] ------------[ cut here ]------------
  [10798.374988] kernel BUG at 
/build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
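
Besides looking at the nearest driver messages, it can help to check which LTP test was running when the BUG fired. A minimal Python sketch, assuming the console log format shown in this report (the function name is hypothetical): it scans for the last "LTP: starting" line before the first "kernel BUG at" line, and tolerates lines that the mail client has wrapped.

```python
import re

# Matches console lines such as "[10795.785593] LTP: starting read_all_dev (...)".
LTP_START = re.compile(r"\[\s*(\d+\.\d+)\]\s+LTP: starting (\S+)")
# Matches the start of the BUG report, e.g. "[10798.374988] kernel BUG at ...".
BUG_LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+kernel BUG at")


def last_test_before_bug(log_text):
    """Return (test_name, start_time, bug_time) for the first kernel BUG, or None."""
    last = None
    for line in log_text.splitlines():
        m = LTP_START.search(line)
        if m:
            last = (m.group(2), float(m.group(1)))
            continue
        m = BUG_LINE.search(line)
        if m and last is not None:
            return last[0], last[1], float(m.group(1))
    return None


# Excerpt copied from the console log in this report (wrapped lines truncated).
sample = """\
[10792.973510] LTP: starting proc01 (proc01 -m 128)
[10795.785593] LTP: starting read_all_dev (read_all -d /dev -e
[10795.918763] Bluetooth: Core ver 2.22
[10798.374988] kernel BUG at
"""
print(last_test_before_bug(sample))
```

For the log in this report, the test found this way is read_all_dev, which matches COMMAND: "read_all" in the crash output.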

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1761729/+subscriptions
