------- Comment From mdr...@us.ibm.com 2017-11-14 18:07 EDT-------
(In reply to comment #34)
> I built a 17.10(Artful) test kernel with a pick of the following commit:
> 67f8a8c1151c ("KVM: PPC: Book3S HV: Fix bug causing host SLB to be restored
> incorrectly")
>
> The test kernel can be downloaded from:
> http://kernel.ubuntu.com/~jsalisbury/lp1725350/
>
> Can you test this kernel and see if it resolves this bug?
Thanks! I've retried test cases a) and b) above using this kernel, and it does appear to resolve the issue.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1725350

Title:
  KVM on 17.10 crashes the machine

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Artful:
  In Progress

Bug description:
  When you start qemu on a 17.10 machine, the whole machine goes down and crashes:

[ 90.689627] Unable to handle kernel paging request for data at address 0xf000000002d3bda0 [ 90.689705] Faulting instruction address: 0xc000000000361224 [ 90.689840] Oops: Kernel access of bad area, sig: 11 [#1] [ 90.689911] SMP NR_CPUS=2048 [ 90.689912] NUMA [ 90.690053] PowerNV [ 90.690092] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc kvm_hv kvm_pr kvm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack_netlink nf_conntrack nfnetlink idt_89hpesx snd_hda_codec_hdmi xfs joydev input_leds mac_hid snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore ofpart opal_prd cmdlinepart powernv_flash mtd at24 ipmi_powernv ipmi_devintf ipmi_msghandler powernv_rng uio_pdrv_genirq vmx_crypto ibmpowernv uio ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables [ 90.690724] autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq libcrc32c raid1 raid0 multipath linear uas usb_storage ast crct10dif_vpmsum i2c_algo_bit crc32c_vpmsum ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm tg3 ahci libahci [ 
90.690937] CPU: 48 PID: 3986 Comm: qemu-system-ppc Not tainted 4.13.0-12-generic #13-Ubuntu [ 90.691001] task: c000000b122d8700 task.stack: c000000b431cc000 [ 90.691167] NIP: c000000000361224 LR: c000000000998960 CTR: c0000000009a19b0 [ 90.691223] REGS: c000000bff61b800 TRAP: 0300 Not tainted (4.13.0-12-generic) [ 90.691277] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> [ 90.691282] CR: 88002844 XER: 00000000 [ 90.691347] CFAR: c00000000099895c DAR: f000000002d3bda0 DSISR: 40000000 SOFTE: 0 [ 90.691347] GPR00: c000000000998960 c000000bff61ba80 c0000000015e3000 c000000b4ef61f20 [ 90.691347] GPR04: c000000b44c61680 0000000000000000 000000000000001f 000000000000001f [ 90.691347] GPR08: 000000000000001f 0000000002d3bd80 c00000000178e8e8 c000000b5a0c26f0 [ 90.691347] GPR12: 0000000028002842 c00000000fadf800 c000000b52d07880 c000000b44c61680 [ 90.691347] GPR16: 0000000000000000 000000000000001f 000000000000001f c00000000553a560 [ 90.691347] GPR20: 0000000000000001 0000000000000002 080000000553a560 c000000b5c62a228 [ 90.691347] GPR24: c000000005531110 c000000b5c632238 0000000000000210 0000000000000000 [ 90.691347] GPR28: c000000000998960 c000000bff61bc20 c000000b4ef61f20 f000000002d3bd80 [ 90.692089] NIP [c000000000361224] kfree+0x54/0x270 [ 90.692133] LR [c000000000998960] xhci_urb_free_priv+0x20/0x40 [ 90.692325] Call Trace: [ 90.692345] [c000000bff61ba80] [c000000bff61bad0] 0xc000000bff61bad0 (unreliable) [ 90.692402] [c000000bff61bac0] [c000000000998960] xhci_urb_free_priv+0x20/0x40 [ 90.692459] [c000000bff61bae0] [c00000000099bfc8] xhci_giveback_urb_in_irq.isra.22+0x78/0x190 [ 90.692645] [c000000bff61bb40] [c00000000099c350] xhci_td_cleanup+0x130/0x200 [ 90.692702] [c000000bff61bbc0] [c0000000009a175c] handle_tx_event+0x74c/0x1380 [ 90.692759] [c000000bff61bcc0] [c0000000009a2894] xhci_irq+0x504/0xf20 [ 90.692808] [c000000bff61bde0] [c00000000017b110] __handle_irq_event_percpu+0x90/0x300 [ 90.692977] [c000000bff61bea0] [c00000000017b3b8] 
handle_irq_event_percpu+0x38/0x90 [ 90.693038] [c000000bff61bee0] [c00000000017b474] handle_irq_event+0x64/0xb0 [ 90.693094] [c000000bff61bf10] [c000000000180da0] handle_fasteoi_irq+0xc0/0x230 [ 90.693155] [c000000bff61bf40] [c00000000017972c] generic_handle_irq+0x4c/0x70 [ 90.693332] [c000000bff61bf60] [c00000000001767c] __do_irq+0x7c/0x1c0 [ 90.693383] [c000000bff61bf90] [c00000000002ab70] call_do_irq+0x14/0x24 [ 90.693431] [c000000b431cf9d0] [c00000000001785c] do_IRQ+0x9c/0x130 [ 90.693478] [c000000b431cfa20] [c000000000008ac4] hardware_interrupt_common+0x114/0x120 [ 90.693663] --- interrupt: 501 at __copy_tofrom_user_power7+0x1f4/0x7cc [ 90.693663] LR = _copy_to_user+0x3c/0x60 [ 90.693736] [c000000b431cfd10] [c000000b431cfdc0] 0xc000000b431cfdc0 (unreliable) [ 90.693797] [c000000b431cfd30] [c0000000003bfa90] poll_select_copy_remaining+0x180/0x1b0 [ 90.693853] [c000000b431cfda0] [c0000000003c1934] SyS_ppoll+0x104/0x1e0 [ 90.694018] [c000000b431cfe30] [c00000000000b184] system_call+0x58/0x6c [ 90.694064] Instruction dump: [ 90.694094] Unable to handle kernel paging request for data at address 0xf000000002ffd860 [ 90.694153] Faulting instruction address: 0xc000000000399624 [ 90.694198] Oops: Kernel access of bad area, sig: 11 [#2] [ 90.694351] SMP NR_CPUS=2048 [ 90.694351] NUMA [ 90.694381] PowerNV

I am using the latest kernel at the moment, version 4.13-12.

I just reproduced it with a different stack this time:

[ 2764.725547] Severe Machine check interrupt [Recovered] [ 2764.725676] NIP [c000000000089268]: __copy_tofrom_user_power7+0x1f4/0x7cc [ 2764.725743] Initiator: CPU [ 2764.725764] Error type: SLB [Multihit] [ 2764.725786] Effective address: 00007fffd16e82c8 [ 2796.015384] Severe Machine check interrupt [Recovered] [ 2796.015509] NIP [c000000000089268]: __copy_tofrom_user_power7+0x1f4/0x7cc [ 2796.015586] Initiator: CPU [ 2796.015701] Error type: SLB [Parity] [ 2796.015723] Effective address: 00007fffddabe278 [ 2796.073775] Unable to handle kernel paging 
request for data at address 0xf000000002378020 [ 2796.073949] Faulting instruction address: 0xc000000000309a18 [ 2796.074075] Oops: Kernel access of bad area, sig: 11 [#1] [ 2796.074104] SMP NR_CPUS=2048 [ 2796.074104] NUMA [ 2796.074126] PowerNV [ 2796.074156] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc kvm_hv kvm_pr kvm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack_netlink nf_conntrack nfnetlink xfs idt_89hpesx snd_hda_codec_hdmi joydev input_leds mac_hid snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore ipmi_powernv at24 uio_pdrv_genirq ofpart cmdlinepart powernv_flash ipmi_devintf powernv_rng mtd ipmi_msghandler opal_prd uio ibmpowernv vmx_crypto sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables [ 2796.074643] autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx hid_generic usbhid hid xor raid6_pq libcrc32c raid1 raid0 multipath linear uas usb_storage ast i2c_algo_bit crct10dif_vpmsum ttm crc32c_vpmsum drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm tg3 ahci libahci [ 2796.074902] CPU: 40 PID: 21964 Comm: CPU 0/KVM Tainted: G M 4.13.0-15-generic #16-Ubuntu [ 2796.074955] task: c000000a0b255900 task.stack: c000000a0bf9c000 [ 2796.074990] NIP: c000000000309a18 LR: c000000000309a14 CTR: c00000000030a280 [ 2796.075031] REGS: c000000a0bf9f560 TRAP: 0300 Tainted: G M (4.13.0-15-generic) [ 2796.075080] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> [ 2796.075083] CR: 48024244 XER: 20000000 [ 2796.075133] CFAR: c00000000006c508 DAR: f000000002378020 DSISR: 40000000 SOFTE: 0 [ 2796.075133] GPR00: c000000000309a14 c000000a0bf9f7e0 c0000000015f3400 f000000002378000 [ 
2796.075133] GPR04: 00000000d9458000 0000000000000012 00000000834c0000 0000000000000008 [ 2796.075133] GPR08: f000000000000000 0000000000000001 0000000002378000 c00000000179e958 [ 2796.075133] GPR12: 0000000028004248 c00000000fada400 000072882e440000 000072882e440000 [ 2796.075133] GPR16: 0000000000010000 000074882e430000 c000000ad9458000 0000000000000001 [ 2796.075133] GPR20: 4000000000002000 c00000000179e968 000072882e43ffff 000072882e440000 [ 2796.075133] GPR24: c000000a0bf9f988 0008000000000040 07000000000000c0 0000000000000001 [ 2796.075133] GPR28: c0800008de002386 862300de080080c0 c0000009834c0170 0000000000000004 [ 2796.075513] NIP [c000000000309a18] __get_user_pages_fast+0x798/0xfd0 [ 2796.075549] LR [c000000000309a14] __get_user_pages_fast+0x794/0xfd0 [ 2796.075652] Call Trace: [ 2796.075699] [c000000a0bf9f7e0] [d0000000070f89e4] kvmppc_run_core+0xeec/0x1370 [kvm_hv] (unreliable) [ 2796.075749] [c000000a0bf9f900] [c00000000030a390] get_user_pages_fast+0x110/0x160 [ 2796.075793] [c000000a0bf9f950] [d0000000070fe21c] kvmppc_book3s_hv_page_fault+0x384/0xc60 [kvm_hv] [ 2796.075844] [c000000a0bf9fa40] [d0000000070fa94c] kvmppc_vcpu_run_hv+0x314/0x790 [kvm_hv] [ 2796.075891] [c000000a0bf9fb10] [d000000006f759ec] kvmppc_vcpu_run+0x34/0x48 [kvm] [ 2796.075941] [c000000a0bf9fb30] [d000000006f71aa0] kvm_arch_vcpu_ioctl_run+0x108/0x320 [kvm] [ 2796.076100] [c000000a0bf9fbd0] [d000000006f65018] kvm_vcpu_ioctl+0x400/0x7c8 [kvm] [ 2796.076144] [c000000a0bf9fd40] [c0000000003bd6a4] do_vfs_ioctl+0xd4/0xa00 [ 2796.076181] [c000000a0bf9fde0] [c0000000003be094] SyS_ioctl+0xc4/0x130 [ 2796.076217] [c000000a0bf9fe30] [c00000000000b184] system_call+0x58/0x6c [ 2796.076252] Instruction dump: [ 2796.076275] Unable to handle kernel paging request for data at address 0xf00000000282fe60 [ 2796.076339] Faulting instruction address: 0xc0000000003995c4 [ 2796.076444] Oops: Kernel access of bad area, sig: 11 [#2] [ 2796.076473] SMP NR_CPUS=2048 [ 2796.076473] NUMA [ 2796.076494] PowerNV 
[ 2796.076523] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc kvm_hv kvm_pr kvm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter openvswitch nf_conntrack_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat nf_conntrack_netlink nf_conntrack nfnetlink xfs idt_89hpesx snd_hda_codec_hdmi joydev input_leds mac_hid snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore ipmi_powernv at24 uio_pdrv_genirq ofpart cmdlinepart powernv_flash ipmi_devintf powernv_rng mtd ipmi_msghandler opal_prd uio ibmpowernv vmx_crypto sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables [ 2796.078461] autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx hid_generic usbhid hid xor raid6_pq libcrc32c raid1 raid0 multipath linear uas usb_storage ast i2c_algo_bit crct10dif_vpmsum ttm crc32c_vpmsum drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm tg3 ahci libahci [ 2796.080130] CPU: 40 PID: 21964 Comm: CPU 0/KVM Tainted: G M 4.13.0-15-generic #16-Ubuntu [ 2796.080797] task: c000000a0b255900 task.stack: c000000a0bf9c000 [ 2796.081128] NIP: c0000000003995c4 LR: c0000000002bf778 CTR: 00000000300303f0 [ 2796.081474] REGS: c000000a0bf9efc0 TRAP: 0300 Tainted: G M (4.13.0-15-generic) [ 2796.081819] MSR: 9000000000001033 <SF,HV,ME,IR,DR,RI,LE> [ 2796.081822] CR: 48024228 XER: 20000000 [ 2796.082458] CFAR: c0000000002bf774 DAR: f00000000282fe60 DSISR: 40000000 SOFTE: 0 [ 2796.082458] GPR00: c0000000002bf778 c000000a0bf9f240 c0000000015f3400 c000000a0bf9f360 [ 2796.082458] GPR04: 0000000000000004 f00000000282fe40 9000000000001033 0000000000000060 [ 2796.082458] GPR08: 000000000000a0b0 000000000282fe40 c00000000179e8e8 9000000000001003 [ 2796.082458] GPR12: 0000000000004400 c00000000fada400 000072882e440000 
000072882e440000 [ 2796.082458] GPR16: 0000000000010000 000074882e430000 c000000ad9458000 0000000000000001 [ 2796.082458] GPR20: 4000000000002000 c00000000179e968 000072882e43ffff 000072882e440000 [ 2796.082458] GPR24: c000000a0bf9f988 c000000000e98308 c000000000e98318 c000000a0bf9f560 [ 2796.082458] GPR28: c000000a0bf9f364 0000000000000000 0000000000000004 c000000a0bf9f360 [ 2796.088348] NIP [c0000000003995c4] __check_object_size+0xc4/0x250 [ 2796.088427] LR [c0000000002bf778] __probe_kernel_read+0x68/0xd0 [ 2796.088750] Call Trace: [ 2796.089060] [c000000a0bf9f240] [c000000a0bf9f2c0] 0xc000000a0bf9f2c0 (unreliable) [ 2796.089405] [c000000a0bf9f2c0] [c0000000002bf778] __probe_kernel_read+0x68/0xd0 [ 2796.090048] [c000000a0bf9f300] [c00000000001e010] show_regs+0x300/0x430 [ 2796.090394] [c000000a0bf9f3c0] [c00000000002647c] __die+0xec/0x130 [ 2796.090732] [c000000a0bf9f440] [c000000000026524] die+0x64/0xe0 [ 2796.091091] [c000000a0bf9f480] [c000000000069fb0] bad_page_fault+0xe0/0x14c [ 2796.091404] [c000000a0bf9f4f0] [c00000000000a4b8] handle_page_fault+0x34/0x38 [ 2796.091745] --- interrupt: 300 at __get_user_pages_fast+0x798/0xfd0 [ 2796.091745] LR = __get_user_pages_fast+0x794/0xfd0 [ 2796.092403] [c000000a0bf9f7e0] [d0000000070f89e4] kvmppc_run_core+0xeec/0x1370 [kvm_hv] (unreliable) [ 2796.093083] [c000000a0bf9f900] [c00000000030a390] get_user_pages_fast+0x110/0x160 [ 2796.093418] [c000000a0bf9f950] [d0000000070fe21c] kvmppc_book3s_hv_page_fault+0x384/0xc60 [kvm_hv] [ 2796.094073] [c000000a0bf9fa40] [d0000000070fa94c] kvmppc_vcpu_run_hv+0x314/0x790 [kvm_hv] [ 2796.094423] [c000000a0bf9fb10] [d000000006f759ec] kvmppc_vcpu_run+0x34/0x48 [kvm] [ 2796.094777] [c000000a0bf9fb30] [d000000006f71aa0] kvm_arch_vcpu_ioctl_run+0x108/0x320 [kvm] [ 2796.096433] [c000000a0bf9fbd0] [d000000006f65018] kvm_vcpu_ioctl+0x400/0x7c8 [kvm] [ 2796.096785] [c000000a0bf9fd40] [c0000000003bd6a4] do_vfs_ioctl+0xd4/0xa00 [ 2796.097121] [c000000a0bf9fde0] [c0000000003be094] 
SyS_ioctl+0xc4/0x130 [ 2796.097467] [c000000a0bf9fe30] [c00000000000b184] system_call+0x58/0x6c [ 2796.098127] Instruction dump: ...

It repeats the above.

Breno got some information that the problem is most likely related to an SLB multi-hit.

Mirroring to Launchpad to advise Canonical of this KVM issue...

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1725350/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp