I've added instructions for a workaround. The code paths I've seen in crashes have been the following:

  kvm_sched_in -> kvm_arch_vcpu_load -> vmx_vcpu_load -> loaded_vmcs_clear -> smp_call_function_single

  pmdp_clear_flush -> flush_tlb_mm_range -> native_flush_tlb_others -> smp_call_function_many

Generally this has been caused by workloads that use nested VMs and stress the L2/L1 VMs (causing non-local CPU TLB flushing or VMCS clearing). The hang is in csd_lock_wait, waiting for the CSD_FLAG_LOCK bit to be cleared, which can only be triggered by non-local smp_call_function_* calls.

Another data point is that this can happen with x2apic as well as flat apic (as tested with nox2apic).

-- 
https://bugs.launchpad.net/bugs/1413540
Title: Trusty soft lockup issues with nested KVM