Thanks! I will try, but this issue did not show up in our general kernel testing on the node we normally use (rizzo), so it may take some time to find an affected node first.
You can run this with the source tree:

  $ make -C tools/testing/selftests TARGETS=memory-hotplug run_tests

Ref: https://www.kernel.org/doc/html/v4.15/dev-tools/kselftest.html

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1862312

Title:
  Segmentation fault (kernel oops) with memory-hotplug in
  ubuntu_kernel_selftests on Bionic kernel

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Bionic:
  New

Bug description:
  It looks like memory-hotplug test in ubuntu_kernel_selftests will
  trigger this issue.

  This issue cannot be reproduced with the kernel in -updates, but can
  be reproduced quite easily with the proposed kernel (X-4.15 Oracle
  4.15.0-1032.35~16.04.1).

  This was spotted on the following kernels (for now):
    * X-oracle-4.15
    * B
    * B-oracle-4.15

  It's not very easy to spot this as the jenkins job will just hang and
  you won't see the test result on the report page, for example the
  jenkins job "sru-misc__B_ppc64el-generic__using_baltar__for_kernel"
  hung at the same spot (the beginning of the KVM unit test) for two out
  of two attempts:

  05:06:37 INFO | GOOD ubuntu_kvm_unit_tests.setup ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04 05:06:37 completed successfully
  05:06:37 INFO | END GOOD ubuntu_kvm_unit_tests.setup ubuntu_kvm_unit_tests.setup timestamp=1580792797 localtime=Feb 04 05:06:37
  05:06:37 DEBUG| Persistent state client._record_indent now set to 1
  05:06:37 DEBUG| Persistent state client.unexpected_reboot deleted
  05:06:37 INFO | START ubuntu_kvm_unit_tests.emulator ubuntu_kvm_unit_tests.emulator timestamp=1580792797 localtime=Feb 04 05:06:37
  05:06:37 DEBUG| Persistent state client._record_indent now set to 2
  05:06:37 DEBUG| Persistent state client.unexpected_reboot now set to ('ubuntu_kvm_unit_tests.emulator', 'ubuntu_kvm_unit_tests.emulator')
  05:06:37 DEBUG| Running 'kvm-ok'
  05:06:37 DEBUG| [stdout] INFO: /dev/kvm exists
  05:06:37 DEBUG| [stdout] KVM acceleration can be used
  05:06:37 DEBUG| Running 'ppc64_cpu --smt=off'
  Build was aborted

  Check the syslog, there is a call trace before the test_bpf and after
  page offline:

  [ 1195.321441] Offlined Pages 4096
  [ 1195.335056] Offlined Pages 4096
  [ 1195.354614] Offlined Pages 4096
  [ 1198.491967] Offlined Pages 4096
  [ 1199.457587] Injecting error (-12) to MEM_GOING_ONLINE
  [ 1200.473838] ------------[ cut here ]------------
  [ 1200.473841] kernel BUG at /build/linux-CWyQTi/linux-4.15.0/kernel/rcu/sync.c:128!
  [ 1200.473909] Oops: Exception in kernel mode, sig: 5 [#1]
  [ 1200.473953] LE SMP NR_CPUS=2048 NUMA PowerNV
  [ 1200.473999] Modules linked in: memory_notifier_error_inject notifier_error_inject overlay veth xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc iptable_filter binfmt_misc joydev input_leds mac_hid idt_89hpesx opal_prd ofpart at24 cmdlinepart powernv_flash ipmi_powernv uio_pdrv_genirq uio mtd ipmi_devintf ibmpowernv ipmi_msghandler sch_fq_codel vmx_crypto ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas ast i2c_algo_bit hid_generic ttm drm_kms_helper
  [ 1200.474641] syscopyarea usbhid sysfillrect sysimgblt hid fb_sys_fops crct10dif_vpmsum crc32c_vpmsum drm i40e aacraid [last unloaded: test_bpf]
  [ 1200.474792] CPU: 12 PID: 139071 Comm: mem-on-off-test Not tainted 4.15.0-87-generic #87-Ubuntu
  [ 1200.474894] NIP: c0000000001a8490 LR: c0000000001a8478 CTR: c00000000026c5e0
  [ 1200.474981] REGS: c000000c830ff7c0 TRAP: 0700 Not tainted (4.15.0-87-generic)
  [ 1200.475084] MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28222888 XER: 20040000
  [ 1200.475219] CFAR: c00000000001940c SOFTE: 1
  [ 1200.475219] GPR00: c0000000001a8434 c000000c830ffa40 c00000000172c900 0000000000000001
  [ 1200.475219] GPR04: 00000000000001f0 c000000c7a4d2480 0000000028228882 c00000000001e730
  [ 1200.475219] GPR08: 0000000ff9a10000 0000000000000001 0000000000000000 c000000c61bab790
  [ 1200.475219] GPR12: 0000000000002000 c00000000fa88400 0000058d97936070 0000000000000000
  [ 1200.475219] GPR16: 0000058d6b6e9690 0000058d6b776ab0 0000058d6b7a8204 0000058d6b776ae8
  [ 1200.475219] GPR20: 0000058d6b7ad5d8 0000000000000001 0000000000000000 00007fffd1cb80e4
  [ 1200.475219] GPR24: 00007fffd1cb80e0 c000000001763428 c0000000015f6ba8 0000000000000000
  [ 1200.475219] GPR28: 0000000000000020 c0000000015f6bb0 ffffffffffffffff c0000000015f6ba8
  [ 1200.476036] NIP [c0000000001a8490] rcu_sync_enter+0xa0/0x1e0
  [ 1200.476124] LR [c0000000001a8478] rcu_sync_enter+0x88/0x1e0
  [ 1200.476180] Call Trace:
  [ 1200.476215] [c000000c830ffa40] [c000000c830ffaa0] 0xc000000c830ffaa0 (unreliable)
  [ 1200.476311] [c000000c830ffab0] [c0000000001889a8] percpu_down_write+0x38/0x140
  [ 1200.476407] [c000000c830ffb00] [c00000000039fa6c] online_pages+0x1fc/0x440
  [ 1200.476456] [c000000c830ffbd0] [c0000000008a7320] memory_subsys_online+0x180/0x250
  [ 1200.476495] [c000000c830ffc60] [c000000000879f54] device_online+0x84/0x120
  [ 1200.476528] [c000000c830ffca0] [c0000000008a7ee8] store_mem_state+0xb8/0x180
  [ 1200.476566] [c000000c830ffce0] [c0000000008744bc] dev_attr_store+0x3c/0x60
  [ 1200.476599] [c000000c830ffd00] [c0000000004ae254] sysfs_kf_write+0x64/0x90
  [ 1200.476631] [c000000c830ffd20] [c0000000004acf2c] kernfs_fop_write+0x1ac/0x240
  [ 1200.476670] [c000000c830ffd70] [c0000000003e147c] __vfs_write+0x3c/0x70
  [ 1200.476703] [c000000c830ffd90] [c0000000003e16d8] vfs_write+0xd8/0x220
  [ 1200.476735] [c000000c830ffde0] [c0000000003e1a38] SyS_write+0x78/0x140
  [ 1200.476768] [c000000c830ffe30] [c00000000000b288] system_call+0x5c/0x70
  [ 1200.476799] Instruction dump:
  [ 1200.476819] 409e00b0 7c2004ac 39200000 38600001 913f0008 4be70f85 60000000 2fbe0000
  [ 1200.476858] 39200000 419e000c 7f9c0034 5789d97e <0b090000> 4092008c 813f0038 3d42fffb
  [ 1200.476909] ---[ end trace 5ef11694541f2535 ]---
  [ 1200.527850]
  [ 1224.784549] test_bpf: #0 TAX jited:1 36 35 33 PASS
  [ 1224.785669] test_bpf: #1 TXA jited:1 11 11 11 PASS
  [ 1224.786073] test_bpf: #2 ADD_SUB_MUL_K jited:1 10 PASS
  [ 1224.786236] test_bpf: #3 DIV_MOD_KX jited:1 15 PASS
  [ 1224.786444] test_bpf: #4 AND_OR_LSH_K jited:1 10 10 PASS

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-87-generic 4.15.0-87.87
  ProcVersionSignature: Ubuntu 4.15.0-87.87-generic 4.15.18
  Uname: Linux 4.15.0-87-generic ppc64le
  .sys.firmware.opal.msglog: Error: [Errno 13] Permission denied: '/sys/firmware/opal/msglog'
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116, 1 Feb 6 06:35 seq
   crw-rw---- 1 root audio 116, 33 Feb 6 06:35 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.10
  Architecture: ppc64el
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CurrentDmesg:
  Date: Fri Feb 7 07:57:32 2020
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  Lsusb:
   Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
   Bus 001 Device 003: ID 0451:80ff Texas Instruments, Inc.
   Bus 001 Device 004: ID 0557:2419 ATEN International Co., Ltd
   Bus 001 Device 002: ID 0557:7000 ATEN International Co., Ltd Hub
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
  PciMultimedia:
  ProcFB: 0 astdrmfb
  ProcKernelCmdLine: root=UUID=acd1a0d7-f6fc-4130-928c-c8b11ad6e4be ro console=hvc0
  ProcLoadAvg: 2.02 1.31 1.11 1/1377 37783
  ProcSwaps:
   Filename Type Size Used Priority
   /swap.img file 8388544 0 -2
  ProcVersion: Linux version 4.15.0-87-generic (buildd@bos02-ppc64el-002) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #87-Ubuntu SMP Fri Jan 31 19:32:29 UTC 2020
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-87-generic N/A
   linux-backports-modules-4.15.0-87-generic N/A
   linux-firmware 1.173.15
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  VarLogDump_list: total 0
  cpu_cores: Number of cores present = 40
  cpu_coreson: Number of cores online = 39
  cpu_smt: SMT=4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1862312/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp