Let's concentrate on the hang without KSM in this bug; I've split the KSM-in-nested-virt issue out into bug 1414153.
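To reproduce with KSM ruled out, KSM has to be switched off on the L1 compute hosts first. A minimal sketch, assuming the stock trusty qemu-kvm packaging (which turns KSM on via KSM_ENABLED=1 in /etc/default/qemu-kvm); the sysfs path is the one shown in the original description:

$ echo 0 | sudo tee /sys/kernel/mm/ksm/run    # turn KSM off immediately
$ sudo sed -i 's/^KSM_ENABLED=1/KSM_ENABLED=0/' /etc/default/qemu-kvm    # keep it off across qemu-kvm service restarts
$ more /sys/kernel/mm/ksm/run                 # verify: should print 0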
** Summary changed:

- issues with KSM enabled for nested KVM VMs
+ soft lockup issues with nested KVM VMs running tempest

** No longer affects: qemu (Ubuntu)

** Description changed:

+ [Impact]
+ Users of nested KVM for testing openstack hit soft lockups as follows:
+
+ [74180.076007] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-x86:14590]
+ <snip>
+ [74180.076007] Call Trace:
+ [74180.076007] [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
+ [74180.076007] [<ffffffff810dbf75>] smp_call_function_single+0xe5/0x190
+ [74180.076007] [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
+ [74180.076007] [<ffffffffa00c4300>] ? rmap_write_protect+0x80/0x80 [kvm]
+ [74180.076007] [<ffffffff810dc3a6>] smp_call_function_many+0x286/0x2d0
+ [74180.076007] [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
+ [74180.076007] [<ffffffff8105c8f7>] native_flush_tlb_others+0x37/0x40
+ [74180.076007] [<ffffffff8105c9cb>] flush_tlb_mm_range+0x5b/0x230
+ [74180.076007] [<ffffffff8105b80d>] pmdp_splitting_flush+0x3d/0x50
+ [74180.076007] [<ffffffff811ac95b>] __split_huge_page+0xdb/0x720
+ [74180.076007] [<ffffffff811ad008>] split_huge_page_to_list+0x68/0xd0
+ [74180.076007] [<ffffffff811ad9a6>] __split_huge_page_pmd+0x136/0x330
+ [74180.076007] [<ffffffff8117728d>] unmap_page_range+0x7dd/0x810
+ [74180.076007] [<ffffffffa00a66b5>] ? kvm_mmu_notifier_invalidate_range_start+0x75/0x90 [kvm]
+ [74180.076007] [<ffffffff81177341>] unmap_single_vma+0x81/0xf0
+ [74180.076007] [<ffffffff811784ed>] zap_page_range+0xed/0x150
+ [74180.076007] [<ffffffff8108ed74>] ? hrtimer_start_range_ns+0x14/0x20
+ [74180.076007] [<ffffffff81174fbf>] SyS_madvise+0x3bf/0x850
+ [74180.076007] [<ffffffff810db841>] ? SyS_futex+0x71/0x150
+ [74180.076007] [<ffffffff8173186d>] system_call_fastpath+0x1a/0x1f
+
+ [Test Case]
+ - Deploy openstack on openstack
+ - Run tempest on the L1 cloud
+ - Check the kernel log of the L1 nova-compute nodes for soft lockups (a grep sketch follows below)
+
+ --
+
+ Original Description:
+
+ When installing qemu-kvm on a VM, KSM gets enabled. I have encountered this problem on trusty:
+
+ $ lsb_release -a
+ Distributor ID: Ubuntu
+ Description:    Ubuntu 14.04.1 LTS
+ Release:        14.04
+ Codename:       trusty
+
+ $ uname -a
+ Linux juju-gema-machine-2 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
+
+ The way to see the behaviour:
+ 1) $ more /sys/kernel/mm/ksm/run
+    0
+ 2) $ sudo apt-get install qemu-kvm
+ 3) $ more /sys/kernel/mm/ksm/run
+    1
+
+ To see the soft lockups, deploy a cloud on a virtualised environment such as ctsstack and run tempest on it at least twice; the compute nodes of the virtualised deployment will eventually stop responding with:
+
+ [24096.072003] BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-x86:24791]
+ [24124.072003] BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-x86:24791]
+ [24152.072002] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
+ [24180.072003] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
+ [24208.072004] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
+ [24236.072004] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
+ [24264.072003] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
+
+ I am not sure whether the problem is that we are enabling KSM inside a VM or that nested KSM is not behaving properly. Either way I can reproduce it easily; please contact me if you need further details.
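For the last step of the test case, a minimal way to spot the lockups on the L1 nova-compute nodes is to grep the kernel log (the log path is the Ubuntu default; adjust for your deployment):

$ dmesg | grep -i 'soft lockup'
# or, if the node has rebooted since, search the persisted log:
$ grep -i 'soft lockup' /var/log/kern.log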
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1413540

Title:
  soft lockup issues with nested KVM VMs running tempest

Status in linux package in Ubuntu:
  Confirmed