** Summary changed:

- ISST-LTE:KOP:1060FW:evelp2 :L2 Guest migration: evelp2g4[L2]: while running NFS guest migration continuously dumping smp_call_function_many_cond+0x500/0x738 (unreliable) and watchdog: BUG: soft lockup - CPU#14 stuck for 223s! [systemd-homed} (Fedora)
+ L2 Guest migration: continuously dumping while running NFS guest migration

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2076406

Title:
  L2 Guest migration: continuously dumping while running NFS guest migration

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Noble:
  Triaged
Status in linux source package in Oracular:
  Fix Committed

Bug description:
  == Comment: #0 - SEETEENA THOUFEEK <sthou...@in.ibm.com> - 2024-08-09 03:50:24 ==
  +++ This bug was initially created as a clone of Bug #206737 +++

  ---Problem Description---
  L2 Guest migration: evelp2g4[L2]: while running NFS guest migration continuously dumping smp_call_function_many_cond+0x500/0x738 (unreliable) and watchdog: BUG: soft lockup - CPU#14 stuck for 223s! [systemd-homed}

  ---uname output---
  NA

  Machine Type = NA

  Contact Information = NA

  [79205.163691] Hardware name: IBM pSeries (emulated by qemu) POWER10 (raw) 0x800200 0xf000006 of:SLOF,HEAD hv:linux,kvm pSeries
  [79205.163834] NIP: c0000000002bb7a4 LR: c0000000002bb750 CTR: c0000000000d192c
  [79205.163929] REGS: c0000003871cf1b0 TRAP: 0900 Tainted: G L
  [79205.165041] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44042222 XER: 20040004
  [79205.165266] CFAR: 0000000000000000 IRQMASK: 0
  GPR00: c0000000002bbc58 c0000003871cf450 c0000000020ded00 0000000000000009
  GPR04: 0000000000000009 0000000000000009 0000000000000080 0000000000000200
  GPR08: 00000000000001ff 0000000000000001 c000000740f57ee0 0000000044048222
  GPR12: c0000000000d192c c000000743ddc980 0000000000000000 0000000000000000
  GPR16: 0000000000000000 c00000000d86e200 0000000000000001 0000000000000001
  GPR20: 000000000000000c c000000003d06188 c0000000000ac4d0 c00000000a374e00
  GPR24: c000000003d06840 0000000000000000 c000000741193188 c000000741193188
  GPR28: c000000741193180 c000000003d06840 0000000000000048 0000000000000009
  [79205.171660] NIP [c0000000002bb7a4] smp_call_function_many_cond+0x1e0/0x738
  [79205.171752] LR [c0000000002bb750] smp_call_function_many_cond+0x18c/0x738
  [79205.171835] Call Trace:
  [79205.171869] [c0000003871cf450] [c0000000002bbc58] smp_call_function_many_cond+0x694/0x738 (unreliable)
  [79205.171986] [c0000003871cf520] [c0000000000ac4d0] radix__tlb_flush+0x4c/0x140
  [79205.173636] [c0000003871cf560] [c00000000052e900] tlb_finish_mmu+0x130/0x1f0
  [79205.173754] [c0000003871cf590] [c00000000052a280] exit_mmap+0x1cc/0x574
  [79205.173848] [c0000003871cf6c0] [c00000000016ec9c] __mmput+0x54/0x1d4
  [79205.173939] [c0000003871cf6f0] [c0000000006385c4] begin_new_exec+0x6dc/0xefc
  [79205.174037] [c0000003871cf780] [c0000000006edea8] load_elf_binary+0x4c8/0x1a50
  [79205.174136] [c0000003871cf880] [c0000000006361c8] bprm_execve+0x2b4/0x7a0
  [79205.174219] [c0000003871cf950] [c000000000637988] do_execveat_common+0x1c0/0x2d8
  [79205.174316] [c0000003871cf9f0] [c000000000638e38] sys_execve+0x54/0x6c
  [79205.174399] [c0000003871cfa20] [c00000000002fec8] system_call_exception+0x168/0x310
  [79205.174497] [c0000003871cfe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
  [79205.176245] --- interrupt: 3000 at 0x7fff95b10b08
  [79205.176326] NIP: 00007fff95b10b08 LR: 00007fff95b10b08 CTR: 0000000000000000
  [79205.176438] REGS: c0000003871cfe80 TRAP: 3000 Tainted: G L (
  [79205.176558] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 48044424 XER: 00000000
  [79205.176686] IRQMASK: 0
  GPR00: 000000000000000b 00007fffe6919aa0 00007fff95c47c00 0000000152598c80
  GPR04: 00007fffe6919bf8 00000001525db6e0 ffffffffffffffff 00007fffe6919a20
  GPR08: 0000000152598c88 0000000000000000 0000000000000000 0000000000000000
  GPR12: 0000000000000000 00007fff969a4220 0000000152585570 0000000000000000
  GPR16: 00007fffe6919c48 0000000000000570 0000000152598c80 0000000000000000
  GPR20: 0000000000000000 0000000000009998 000000015259a450 0000000152586460
  GPR24: 00000001525bca90 00007fffe6919e48 0000000000000000 00000001525db6e0
  GPR28: 0000000117e98448 00000001525d0b00 0000000000000000 0000000000100000
  [79205.177505] NIP [00007fff95b10b08] 0x7fff95b10b08
  [79205.177578] LR [00007fff95b10b08] 0x7fff95b10b08
  [79205.177649] --- interrupt: 3000

  Steps to reproduce:
  Install the build on NFS storage guest kernel 6.8.10-300
  Start the HTX workload - mdt.less
  Start the NFS guest migration between the L2 hosts.

  Source L2 host : evelp2
  Target L2 host : rinlp1

  migration command : virsh migrate --live --domain $vm_name qemu+ssh://$target_host/system --verbose --undefinesource --persistent --timeout 120

  Share the same NFS storage between two hosts [here /kvm_pool]
  10.33.4.52:/kvm_pool nfs4 650G 304G 347G 47% /kvm_pool

  Test running : HTX
  Guest state : up

  -------------------------------------------------------------------------------------
  L2 guest Config:
  (1) Problem on Guest: evelp2g4
  (2) PHYP/ Processor Type: KVM/P10/Everest
  (3) Rootvg Filesystem: EXT4
  (5) Network Bridge: Macvtap
  (6) IO Disk Type/Driver: qemu-img/ qcow2
  (7) Install Disk Type: Single
  -------------------------------------------------------------------------------------
  L1 host details :
  MDC mode : off
  (1) PHYP/ Processor Type: KVM/P10/Everest
  (2) CEC Name: evelp2
  (3) Rootvg Filesystem: xfs
  (5) Network Interface: Dedicated Network
  (6) IO Type: NVME
  (8) Multipath Enabled: no
  (9) Install Disk Type: Single
  (10) MMU: RPT

  The kernel patches are at
  https://lore.kernel.org/kvm/d1sloycqgiq6.17y5c9xjdh...@gmail.com/T/#t

  Qemu patches are at
  https://lore.kernel.org/qemu-devel/171760304518.1127.12881297254648658843.stgit@ad1b393f0e09/

  powerpc/topic/ppc-kvm.

  [1/8] KVM: PPC: Book3S HV: Fix the set_one_reg for MMCR3
        https://git.kernel.org/powerpc/c/f9ca6a10be20479d526f27316cc32cfd1785ed39
  [2/8] KVM: PPC: Book3S HV: Fix the get_one_reg of SDAR
        https://git.kernel.org/powerpc/c/009f6f42c67e9de737d6d3d199f92b21a8cb9622
  [3/8] KVM: PPC: Book3S HV: Add one-reg interface for DEXCR register
        https://git.kernel.org/powerpc/c/1a1e6865f516696adcf6e94f286c7a0f84d78df3
  [4/8] KVM: PPC: Book3S HV nestedv2: Keep nested guest DEXCR in sync
        https://git.kernel.org/powerpc/c/2d6be3ca3276ab30fb14f285d400461a718d45e7
  [5/8] KVM: PPC: Book3S HV: Add one-reg interface for HASHKEYR register
        https://git.kernel.org/powerpc/c/e9eb790b25577a15d3f450ed585c59048e4e6c44
  [6/8] KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHKEYR in sync
        https://git.kernel.org/powerpc/c/1e97c1eb785fe2dc863c2bd570030d6fcf4b5e5b
  [7/8] KVM: PPC: Book3S HV: Add one-reg interface for HASHPKEYR register
        https://git.kernel.org/powerpc/c/9a0d2f4995ddde3022c54e43f9ece4f71f76f6e8
  [8/8] KVM: PPC: Book3S HV nestedv2: Keep nested guest HASHPKEYR in sync
        https://git.kernel.org/powerpc/c/0b65365f3fa95c2c5e2094739151a05cabb3c48a

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/2076406/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp