Yes, I am clear on this, Dexuan. I found the same information tonight,
including the table of VM exit causes. The issue is not the VM exit
itself; it is that Hyper-V adversely affects the state of 32-bit
programs. Please keep posting progress on the fix as it goes along, if
you can.
Regards,
K.
On Wednesday, December 16, 2020, 7:55:44 PM PST, Dexuan Cui
<[email protected]> wrote:
VM exits are pretty frequent and normal. "VM exits occur in response to
certain instructions and events in VMX non-root operation" (see "CHAPTER 27
VM EXITS" in
https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-sdm-volume-3c-system-programming-guide-part-3.html).
--
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1904632
Title:
Ubuntu 18.04 Azure VM host kernel panic
Status in linux-azure package in Ubuntu:
New
Bug description:
When running a container on a DV3 Standard_D8_v3 Azure host, the Azure
host VM kernel panics as the container comes up, per the logs below.
I isolated the issue to a process in the container that uses the
virtual NICs available on the Azure host. The container also runs
Ubuntu 18.04-based packages. The problem happens every time the
container is started, unless its NIC access process is not started.
Has this sort of kernel panic on Azure been seen before, and what are
the root cause and remedy, please?
Also, the kernel logs on the Azure host show it as vulnerable to the
following CVE. Other VMs and containers can run on the Azure host
without a kernel panic, but I am providing this information in case
there is some tie-in to the panic.
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-3646
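For anyone wanting to cross-check the CVE status mentioned above, here is a minimal sketch that reads the L1TF entry the kernel exposes in sysfs. It assumes a kernel that carries the L1TF mitigation patches and therefore provides /sys/devices/system/cpu/vulnerabilities/l1tf. The helper names (read_l1tf_status, is_reported_vulnerable) are hypothetical, and the substring check is only a rough heuristic, not an authoritative assessment.

```python
from pathlib import Path

# Standard sysfs location on kernels that carry the L1TF mitigations.
L1TF_PATH = Path("/sys/devices/system/cpu/vulnerabilities/l1tf")


def read_l1tf_status(path: Path = L1TF_PATH) -> str:
    """Return the kernel's L1TF status line, or a note if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        # Older kernels (or non-Linux systems) do not expose this file.
        return "unknown (sysfs entry not present)"


def is_reported_vulnerable(status: str) -> bool:
    """Heuristic (assumption): treat any status mentioning 'vulnerable' as a hit."""
    return "vulnerable" in status.lower()


if __name__ == "__main__":
    status = read_l1tf_status()
    print(f"l1tf: {status}")
    print(f"reported vulnerable: {is_reported_vulnerable(status)}")
```

Running this on the Azure host and on a VM that does not panic would show whether the two report the same L1TF status, which may help rule the CVE in or out as a factor.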
Kernel panic from the Azure Host console:
Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux_1.13.33_e857c609-bc35-4b66-9a8b-e86fd8707e82.scope
2020-11-17T00:50:11.537914Z INFO MonitorHandler ExtHandler Stopped tracking
cgroup: Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux-1.13.33, path:
/sys/fs/cgroup/memory/system.slice/Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux_1.13.33_e857c609-bc35-4b66-9a8b-e86fd8707e82.scope
2020-11-17T00:50:23.291433Z INFO ExtHandler ExtHandler Checking for agent
updates (family: Prod)
2020-11-17T00:51:11.677191Z INFO ExtHandler ExtHandler [HEARTBEAT] Agent
WALinuxAgent-2.2.52 is running as the goal state agent [DEBUG HeartbeatCounter:
7;HeartbeatId: 8A2DD5B7-02E5-46E2-9EDB-F8CCBA274479;DroppedPackets:
0;UpdateGSErrors: 0;AutoUpdate: 1]
[11218.537937] PANIC: double fault, error_code: 0x0
[11218.541423] Kernel panic - not syncing: Machine halted.
[11218.541423] CPU: 0 PID: 9281 Comm: vmxt Not tainted 4.15.18+test #1
[11218.541423] Hardware name: Microsoft Corporation Virtual Machine/Virtual
Machine, BIOS 090008 12/07/2018
[11218.541423] Call Trace:
[11218.541423] <#DF>
[11218.541423] dump_stack+0x63/0x8b
[11218.541423] panic+0xe4/0x244
[11218.541423] df_debug+0x2d/0x30
[11218.541423] do_double_fault+0x9a/0x130
[11218.541423] double_fault+0x1e/0x30
[11218.541423] RIP: 0010:0x1a80
[11218.541423] RSP: 0018:0000000000002200 EFLAGS: 00010096
[11218.541423] RAX: 0000000000000102 RBX: 00000000f7a40768 RCX:
000000000000002f
[11218.541423] RDX: 00000000f7ee9970 RSI: 00000000f7a40700 RDI:
00000000f7c3a000
[11218.541423] RBP: 00000000fffd6430 R08: 0000000000000000 R09:
0000000000000000
[11218.541423] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[11218.541423] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[11218.541423] </#DF>
[11218.541423] Kernel Offset: 0x2a400000 from 0xffffffff81000000 (relocation
range: 0xffffffff80000000-0xffffffffbfffffff)
[11218.541423] ---[ end Kernel panic - not syncing: Machine halted.
[11218.636804] ------------[ cut here ]------------
[11218.640802] sched: Unexpected reschedule of offline CPU#2!
[11218.640802] WARNING: CPU: 0 PID: 9281 at arch/x86/kernel/smp.c:128
native_smp_send_reschedule+0x3f/0x50
[11218.640802] Modules linked in: xt_nat xt_u32 vxlan ip6_udp_tunnel
udp_tunnel veth nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype
br_netfilter xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
ebtables ip6table_filter ip6_tables iptable_filter aufs xt_owner
iptable_security xt_conntrack overlay openvswitch nsh nf_conntrack_ipv6
nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat
nf_conntrack nls_iso8859_1 joydev input_leds mac_hid kvm_intel hv_balloon kvm
serio_raw irqbypass intel_rapl_perf sch_fq_codel ib_iser rdma_cm iw_cm ib_cm
ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables
autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov
[11218.640802] async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid1 raid0 multipath linear hid_generic crct10dif_pclmul
crc32_pclmul hid_hyperv ghash_clmulni_intel hv_utils hv_storvsc pcbc ptp
hv_netvsc hid pps_core scsi_transport_fc hyperv_keyboard aesni_intel aes_x86_64
crypto_simd hyperv_fb floppy glue_helper cryptd psmouse hv_vmbus i2c_piix4
pata_acpi
[11218.640802] CPU: 0 PID: 9281 Comm: vmxt Not tainted 4.15.18+test #1
[11218.640802] Hardware name: Microsoft Corporation Virtual Machine/Virtual
Machine, BIOS 090008 12/07/2018
[11218.640802] RIP: 0010:native_smp_send_reschedule+0x3f/0x50
[11218.640802] RSP: 0018:ffff9446bfc03e08 EFLAGS: 00010082
[11218.640802] RAX: 0000000000000000 RBX: 0000000000000002 RCX:
0000000000000006
[11218.640802] RDX: 0000000000000007 RSI: 0000000000000082 RDI:
ffff9446bfc16490
[11218.640802] RBP: ffff9446bfc03e08 R08: 0000000000000000 R09:
0000000000001480
[11218.640802] R10: 0000000000000549 R11: 0000000000000038 R12:
ffff9446bfca2880
[11218.640802] R13: 0000000000000000 R14: 000000010029a6b8 R15:
ffff9446bfc1cd28
[11218.640802] FS: 0000000000000000(0000) GS:ffff9446bfc00000(0063)
knlGS:00000000f7a40700
[11218.640802] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[11218.640802] CR2: 00000000000021f8 CR3: 000000084c576004 CR4:
00000000003626f0
[11218.640802] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[11218.640802] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[11218.640802] Call Trace:
[11218.640802] <IRQ>
[11218.640802] trigger_load_balance+0x12a/0x230
[11218.640802] scheduler_tick+0xae/0xd0
[11218.640802] ? tick_sched_do_timer+0x40/0x40
[11218.640802] update_process_times+0x47/0x60
[11218.640802] tick_sched_handle+0x2a/0x60
[11218.640802] tick_sched_timer+0x39/0x80
[11218.640802] __hrtimer_run_queues+0xe7/0x230
[11218.640802] hrtimer_interrupt+0xb1/0x200
[11218.640802] vmbus_isr+0x16c/0x2a0 [hv_vmbus]
[11218.640802] hyperv_vector_handler+0x3f/0x6e
[11218.640802] hyperv_callback_vector+0x84/0x90
[11218.640802] </IRQ>
[11218.640802] <#DF>
[11218.640802] RIP: 0010:panic+0x1fe/0x244
[11218.640802] RSP: 0018:fffffe0000007e90 EFLAGS: 00000286 ORIG_RAX:
ffffffffffffff0c
[11218.640802] RAX: 0000000000000034 RBX: fffffe0000007f00 RCX:
0000000000000006
[11218.640802] RDX: 0000000000000000 RSI: 0000000000000096 RDI:
ffff9446bfc16490
[11218.640802] RBP: fffffe0000007f08 R08: 0000000000000000 R09:
000000000000147e
[11218.640802] R10: 0000000000000000 R11: 0000000000000038 R12:
0000000000000000
[11218.640802] R13: 0000000000000000 R14: 000000084c577804 R15:
0000000000000000
[11218.640802] df_debug+0x2d/0x30
[11218.640802] do_double_fault+0x9a/0x130
[11218.640802] double_fault+0x1e/0x30
[11218.640802] RIP: 0010:0x1a80
[11218.640802] RSP: 0018:0000000000002200 EFLAGS: 00010096
[11218.640802] RAX: 0000000000000102 RBX: 00000000f7a40768 RCX:
000000000000002f
[11218.640802] RDX: 00000000f7ee9970 RSI: 00000000f7a40700 RDI:
00000000f7c3a000
[11218.640802] RBP: 00000000fffd6430 R08: 0000000000000000 R09:
0000000000000000
[11218.640802] R10: 0000000000000000 R11: 0000000000000000 R12:
0000000000000000
[11218.640802] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[11218.640802] </#DF>
[11218.640802] Code: 92 c0 84 c0 74 17 48 8b 05 7f 71 15 01 be fd 00 00 00 48
8b 40 30 e8 31 b6 ba 00 5d c3 89 fe 48 c7 c7 10 e3 4b ac e8 b1 49 03 00 <0f> 0b
5d c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
[11218.640802] ---[ end trace 857d64f92b00ceb2 ]---
[11221.669166] hyperv_fb: Unable to send packet via vmbus
[11221.669167] hyperv_fb: Unable to send packet via vmbus
[11221.669167] hyperv_fb: Unable to send packet via vmbus
[11221.669167] hyperv_fb: Unable to send packet via vmbu
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1904632/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp