------- Comment From dougm...@us.ibm.com 2018-04-19 06:51 EDT-------
There are two different panics being shown here. One is the kernel assert in usercopy.c, the other is the crash in qla2xxx. You should not be using one bug to handle two different issues. If the kernel assert is no longer happening, then close this bug.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1761729

Title: Ubuntu 18.04 Machine crashed while running ltp.

Status in The Ubuntu-power-systems project: Incomplete
Status in linux package in Ubuntu: Incomplete
Status in linux source package in Bionic: Incomplete

Bug description:

---Problem Description---
Ubuntu 18.04 [ Briggs P8 ]: Machine crashed while running ltp.

---Environment--
Kernel Build: Ubuntu 18.04
System Name : ltc-briggs2
Model/Type : P8
Platform : BML

---Uname output---
root@ltc-briggs2:~# uname -a
Linux ltc-briggs2 4.15.0-13-generic #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

---Steps to reproduce--
$ git clone https://github.com/linux-test-project/ltp.git
$ cd ltp
$ make autotools
$ ./configure
$ make
$ make install

ltp
=====
root@ltc-briggs2:~#
root@ltc-briggs2:~# [10781.098337] LTP: starting fs_inod01 (fs_inod $TMPDIR 10 10 10)
[10782.837910] LTP: starting linker01 (linktest.sh 1000 1000)
[10784.504474] LTP: starting openfile01 (openfile -f10 -t10)
[10784.534953] LTP: starting inode01
[10784.550767] LTP: starting inode02
[10784.739104] LTP: starting stream01
[10784.740840] LTP: starting stream02
[10784.742487] LTP: starting stream03
[10784.744532] LTP: starting stream04
[10784.746087] LTP: starting stream05
[10784.747722] LTP: starting ftest01
[10785.142054] LTP: starting ftest02
[10785.158852] LTP: starting ftest03
[10785.404760] LTP: starting ftest04
[10785.527197] LTP: starting ftest05
[10785.937164] LTP: starting ftest06
[10785.958360] LTP: starting ftest07
[10786.463382] LTP: starting ftest08
[10786.592998] LTP: starting lftest01 (lftest 100)
[10786.672707] LTP: starting writetest01 (writetest)
[10786.774292] LTP: starting fs_di (fs_di -d $TMPDIR)
[10792.973510] LTP: starting proc01 (proc01 -m 128)
[10793.865686] ICMPv6: process `proc01' is using deprecated sysctl (syscall) net.ipv6.neigh.default.base_reachable_time - use net.ipv6.neigh.default.base_reachable_time_ms instead
[10795.785593] LTP: starting read_all_dev (read_all -d /dev -e '/dev/watchdog?(0)' -q -r 10)
[10795.895774] NET: Registered protocol family 40
[10795.918763] Bluetooth: Core ver 2.22
[10795.918866] NET: Registered protocol family 31
[10795.918909] Bluetooth: HCI device and connection manager initialized
[10795.918955] Bluetooth: HCI socket layer initialized
[10795.918991] Bluetooth: L2CAP socket layer initialized
[10795.919032] Bluetooth: SCO socket layer initialized
[10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
[10798.374952] ------------[ cut here ]------------
[10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
[10798.375041] Oops: Exception in kernel mode, sig: 5 [#1]
[10798.375080] LE SMP NR_CPUS=2048 [10871.343999650,5] OPAL: Switch to big-endian OS
NUMA PowerNV
[10798.375117] [10876.190849323,5] OPAL: Switch to little-endian OS
Modules linked in: hci_vhci bluetooth ecdh_generic vhost_vsock cuse vmw_vsock_virtio_transport_common userio vsock uhid vhost_net vhost tap snd_seq snd_seq_device snd_timer snd soundcore binfmt_misc sctp quota_v2 quota_tree nls_iso8859_1 ntfs xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm idt_89hpesx vmx_crypto ofpart cmdlinepart ipmi_powernv powernv_flash ipmi_devintf mtd ipmi_msghandler ibmpowernv opal_prd at24 powernv_rng joydev input_leds mac_hid uio_pdrv_genirq uio sch_fq_codel nfsd ib_iser rdma_cm auth_rpcgss iw_cm nfs_acl lockd ib_cm grace iscsi_tcp
[10798.375636] libiscsi_tcp libiscsi sunrpc scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ses enclosure scsi_transport_sas hid_generic usbhid hid ib_core qla2xxx ast i2c_algo_bit ttm mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvme_fc crct10dif_vpmsum nvme_fabrics ahci mlxfw crc32c_vpmsum i40e drm devlink scsi_transport_fc megaraid_sas libahci
[10798.375961] CPU: 87 PID: 4085 Comm: read_all Not tainted 4.15.0-13-generic #14-Ubuntu
[10798.376013] NIP: c0000000003c76f0 LR: c0000000003c76ec CTR: 00000000300378e8
[10798.376068] REGS: c0000076c63aba00 TRAP: 0700 Not tainted (4.15.0-13-generic)
[10798.376120] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28002222 XER: 20000000
[10798.376176] CFAR: c00000000018cce4 SOFTE: 1
[10798.376176] GPR00: c0000000003c76ec c0000076c63abc80 c0000000016eaf00 0000000000000064
[10798.376176] GPR04: c000007ffc1cce18 c000007ffc1e4368 9000000000009033 000000000000040f
[10798.376176] GPR08: 0000000000000007 c0000000011c3a74 0000007ffb010000 9000000000001003
[10798.376176] GPR12: 0000000000002200 c000000007a8bd00 0000000000000000 0000000000000000
[10798.376176] GPR16: 0000000000000000 0000000000000000 0000000000000006 00007ffff7a0a018
[10798.376176] GPR20: 000008bb551c8908 000008bb551c88f8 000008bb551c88c8 c0000076c63abe00
[10798.376176] GPR24: 0000000000010000 0000000000000000 00007ffff7a0a018 c0000076c63abe00
[10798.376176] GPR28: c0000000000003ff 0000000000000001 00000000000003ff c000000000000000
[10798.376619] NIP [c0000000003c76f0] __check_object_size+0x140/0x270
[10798.376662] LR [c0000000003c76ec] __check_object_size+0x13c/0x270
[10798.376706] Call Trace:
[10798.376724] [c0000076c63abc80] [c0000000003c76ec] __check_object_size+0x13c/0x270 (unreliable)
[10798.376787] [c0000076c63abd00] [c0000000008268a4] read_mem+0x84/0x220
[10798.376835] [c0000076c63abd70] [c0000000003d109c] __vfs_read+0x3c/0x70
[10798.376880] [c0000076c63abd90] [c0000000003d118c] vfs_read+0xbc/0x1b0
[10798.376925] [c0000076c63abde0] [c0000000003d1788] SyS_read+0x68/0x110
[10798.377012] [c0000076c63abe30] [c00000000000b184] system_call+0x58/0x6c
[10798.377057] Instruction dump:
[10798.377086] 2fbd0000 419e010c 3c82ff8b 3ca2ff94 3884c360 38a5ad68 3c62ff8b 7fc8f378
[10798.377140] 7fe6fb78 3863c370 4bdc55b5 60000000 <0fe00000> 60000000 60000000 60420000
[10798.377195] ---[ end trace 21abd4753a69334c ]---
[10798.445038]
[10798.445135] Sending IPI to other CPUs
[10798.446688] IPI complete
[10798.449081] kexec: waiting for cpu 0 (physical 16) to enter OPAL
[10798.450224] kexec: waiting for cpu 23 (physical 47) to enter OPAL
[10798.451396] kexec: waiting for cpu 54 (physical 94) to enter OPAL
[10800.049202] kexec: Starting switchover sequence.
[ 1.078053] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
[ 1.078057] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
[ 1.165219] vio vio: uevent: failed to send synthetic uevent
/dev/nvme0n1p2: recovering journal
/dev/nvme0n1p2: clean, 14017353/122101760 files, 57953106/488376576 blocks
-.mount sys-kernel-debug.mount setvtrgb.service dev-hugepages.mount dev-mqueue.mount kmod-static-nodes.service lvm2-lvmetad.service systemd-remount-fs.service systemd-tmpfiles-setup-dev.service systemd-random-seed.service lvm2-monitor.service systemd-udevd.service systemd-modules-load.service sys-fs-fuse-connections.mount sys-kernel-config.mount systemd-sysctl.service systemd-networkd.service swapfile.swap
[ 5.177490] vio vio: uevent: failed to send synthetic uevent
systemd-udev-trigger.service keyboard-setup.service systemd-journald.service
[ 5.458352] qla2xxx [0020:01:00.0]-00c6:17: MSI-X: Failed to enable support with 32 vectors, using 10 vectors.
apparmor.service systemd-journal-flush.service systemd-tmpfiles-setup.service systemd-update-utmp.service
[ 6.119284] qla2xxx [0020:01:00.1]-00c6:18: MSI-X: Failed to enable support with 32 vectors, using 10 vectors.
systemd-timesyncd.service
[ 10.052141] megaraid_sas 0001:03:00.0: Init cmd return status SUCCESS for SCSI host 1
systemd-networkd-wait-online.service iscsid.service blk-availability.service
[ 10.675964] kdump-tools[2222]: Starting kdump-tools: * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050340/dump-incomplete
lvm2-pvscan@8:195.service lvm2-pvscan@8:179.service
Copying data : [100.0 %] / eta: 0s
[ 55.227083] kdump-tools[2222]: The kernel version is not supported.
[ 55.227300] kdump-tools[2222]: The makedumpfile operation may be incomplete.
[ 55.227471] kdump-tools[2222]: The dumpfile is saved to /var/crash/201804050340/dump-incomplete.
[ 55.227583] kdump-tools[2222]: makedumpfile Completed.
[ 55.230250] kdump-tools[2222]: * kdump-tools: saved vmcore in /var/crash/201804050340
[ 55.311695] kdump-tools[2222]: * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050340/dmesg.201804050340
[ 55.330032] kdump-tools[2222]: The kernel version is not supported.
[ 55.330206] kdump-tools[2222]: The makedumpfile operation may be incomplete.
[ 55.330302] kdump-tools[2222]: The dmesg log is saved to /var/crash/201804050340/dmesg.201804050340.
[ 55.330416] kdump-tools[2222]: makedumpfile Completed.
[ 55.330533] kdump-tools[2222]: * kdump-tools: saved dmesg content in /var/crash/201804050340
[ 55.334722] kdump-tools[2222]: Thu, 05 Apr 2018 03:40:44 -0500
[ 55.338419] kdump-tools[2222]: Rebooting.
[ 55.546343] mlx5_core 0021:01:00.1: mlx5_enter_error_state:121:(pid 2715): start
[ 55.546414] mlx5_core 0021:01:00.1: mlx5_enter_error_state:128:(pid 2715): end
[ 55.942498] mlx5_core 0021:01:00.0: mlx5_enter_error_state:121:(pid 2715): start
[ 55.942631] mlx5_core 0021:01:00.0: mlx5_enter_error_state:128:(pid 2715): end
[ 59.836381] reboot: Restarting system
[10963.485916127,5] OPAL: Reboot request...
5.31149|Ignoring boot flags, incorrect version 0x0
5.52090|ISTEP 6. 3
6.16670|ISTEP 6. 4
6.16957|ISTEP 6. 5
8.74865|HWAS|PRESENT> DIMM[03]=00AA00AA00AA00AA
8.74865|HWAS|PRESENT> Membuf[04]=4444000000000000
8.74866|HWAS|PRESENT> Proc[05]=C000000000000000
14.03690|ISTEP 6. 6
14.11948|ISTEP 6. 7
16.75478|ISTEP 6. 8
16.91585|ISTEP 6. 9
17.47534|ISTEP 6.10
17.55249|ISTEP 6.11
19.29629|ISTEP 6.12
19.29926|ISTEP 6.13
19.30139|ISTEP 7. 1
19.51889|ISTEP 7. 2

== Comment: #7 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2018-04-06 04:52:31 ==
A kernel memory exposure attempt was detected, and BUG() is called from the code at mm/usercopy.c:72, shown in the crash session below.

KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-13-generic
DUMPFILE: dump.201804050340 [PARTIAL DUMP]
CPUS: 160
DATE: Thu Apr 5 03:39:16 2018
UPTIME: 00:48:44
LOAD AVERAGE: 2.78, 11.61, 106.19
TASKS: 1748
NODENAME: ltc-briggs2
RELEASE: 4.15.0-13-generic
VERSION: #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018
MACHINE: ppc64le (2926 Mhz)
MEMORY: 512 GB
PANIC: "kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!"
PID: 4085
COMMAND: "read_all"
TASK: c000007659f23f00 [THREAD_INFO: c0000076c63a8000]
CPU: 87
STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 4085 TASK: c000007659f23f00 CPU: 87 COMMAND: "read_all"
#0 [c0000076c63ab740] crash_kexec at c0000000001e22b0
#1 [c0000076c63ab780] oops_end at c000000000025888
#2 [c0000076c63ab800] _exception at c000000000026684
#3 [c0000076c63ab990] program_check_common at c000000000008da4
Program Check [700] exception frame:
R0: c0000000003c76ec R1: c0000076c63abc80 R2: c0000000016eaf00
R3: 0000000000000064 R4: c000007ffc1cce18 R5: c000007ffc1e4368
R6: 9000000000009033 R7: 000000000000040f R8: 0000000000000007
R9: c0000000011c3a74 R10: 0000007ffb010000 R11: 9000000000001003
R12: 0000000000002200 R13: c000000007a8bd00 R14: 0000000000000000
R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
R18: 0000000000000006 R19: 00007ffff7a0a018 R20: 000008bb551c8908
R21: 000008bb551c88f8 R22: 000008bb551c88c8 R23: c0000076c63abe00
R24: 0000000000010000 R25: 0000000000000000 R26: 00007ffff7a0a018
R27: c0000076c63abe00 R28: c0000000000003ff R29: 0000000000000001
R30: 00000000000003ff R31: c000000000000000
NIP: c0000000003c76f0 MSR: 9000000000029033 OR3: c00000000018cce4
CTR: 00000000300378e8 LR: c0000000003c76ec XER: 0000000020000000
CCR: 0000000028002222 MQ: 0000000000000001 DAR: 0000000000000000
DSISR: 0000000000000000 Syscall Result: 0000000000000000
#4 [c0000076c63abc80] __check_object_size at c0000000003c76f0
[Link Register] [c0000076c63abc80] __check_object_size at c0000000003c76ec (unreliable)
#5 [c0000076c63abd00] read_mem at c0000000008268a4
#6 [c0000076c63abd70] __vfs_read at c0000000003d109c
#7 [c0000076c63abd90] vfs_read at c0000000003d118c
#8 [c0000076c63abde0] sys_read at c0000000003d1788
#9 [c0000076c63abe30] system_call at c00000000000b184
System Call [c01] exception frame:
R0: 0000000000000003 R1: 00007ffff7a09ae0 R2: 0000753ec21b7f00
R3: 0000000000000006 R4: 00007ffff7a0a018 R5: 00000000000003ff
R6: 0000000000004000 R7: 0000753ec21898c4 R8: 900000010000d033
R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000
R12: 0000000000000000 R13: 0000753ec224a8d0
NIP: 0000753ec2188580 MSR: 900000010000d033 OR3: 0000000000000006
CTR: 0000000000000000 LR: 000008bb551b5f20 XER: 0000000000000000
CCR: 0000000042002244 MQ: 0000000000000001 DAR: 0000753ec21affa8
DSISR: 0000000040000000 Syscall Result: 0000000000000006

crash> dis -s c0000000003c76f0
FILE: /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c
LINE: 72

static void report_usercopy(const void *ptr, unsigned long len,
                            bool to_user, const char *type)
{
        pr_emerg("kernel memory %s attempt detected %s %p (%s) (%lu bytes)\n",
                to_user ? "exposure" : "overwrite",
                to_user ? "from" : "to", ptr, type ? : "unknown", len);
        /*
         * For greater effect, it would be nice to do do_group_exit(),
         * but BUG() actually hooks all the lock-breaking and per-arch
         * Oops code, so that is used here instead.
         */
        BUG();
}

From the logs, I see that the memory exposure happens after the Bluetooth driver is initialized.
This might be an issue with the default Bluetooth driver provided by the distro.

[10795.918866] NET: Registered protocol family 31
[10795.918909] Bluetooth: HCI device and connection manager initialized
[10795.918955] Bluetooth: HCI socket layer initialized
[10795.918991] Bluetooth: L2CAP socket layer initialized
[10795.919032] Bluetooth: SCO socket layer initialized
[10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
[10798.374952] ------------[ cut here ]------------
[10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1761729/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp