SEV has been verified to work on the v5.15 kernel, which is where we want
SEV to be supported. Closing this bug for v5.11.
** Changed in: linux-oracle-5.11 (Ubuntu)
Status: New => Invalid
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-oracle-5.11 in Ubuntu.
https://bugs.launchpad.net/bugs/1980884
Title:
ubuntu guest kernel panics when a sev guest with passthrough mlx5 VF
is used
Status in linux-oracle-5.11 package in Ubuntu:
Invalid
Bug description:
A guest kernel panic occurs when an Ubuntu SEV guest with an mlx5 VF passed
through via vfio-pci runs an iperf3 server ("iperf3 -s"). The panic happens as
soon as a client tries to connect to it.
Steps to reproduce:
HOST INFO
Host type : OCI (Oracle Cloud) Bare-Metal Server
Server/Machine: ORACLE SERVER E4-2c
CPU model : AMD EPYC 7J13 64-Core Processor
Architecture : x86_64
Host OS : Oracle Linux Server release 7.9
Host Kernel : 5.4.17-2136.309.3.el7uek.x86_64 #2 SMP Tue Jun 14 21:58:29 PDT 2022
Hypervisor : QEMU emulator version 4.2.1 (qemu-4.2.1-17.1.el7)
OVMF/AAVMF : OVMF-1.6.2-2.el7.noarch
libiscsi : libiscsi-1.19.0-1.el7.x86_64
Guest Kernel : 5.11.0-1028-ORACLE
1) Start Ubuntu 20.04/18.04 SEV guest with vfio-pci:
/usr/bin/qemu-system-x86_64 -machine q35 -name OL20.04-uefi -enable-kvm \
  -nodefaults -cpu host,+host-phys-bits -m 8G -smp 8,maxcpus=240 \
  -monitor stdio -vnc 0.0.0.0:0,to=999 -vga std \
  -drive file=/usr/share/OVMF/OVMF_CODE.pure-efi.fd,index=0,if=pflash,format=raw,readonly \
  -drive file=OVMF_VARS.pure-efi.fd.ol20.04,index=1,if=pflash,format=raw \
  -device virtio-scsi-pci,id=virtio-scsi-pci0,disable-legacy=on,iommu_platform=true \
  -drive file=/systest/atanveer/scripts/Ubuntu-20.04-2022.02.15-0-uefi-x86_64.qcow2,if=none,id=local_disk0,format=qcow2,media=disk \
  -device ide-hd,drive=local_disk0,id=local_disk1,bootindex=0 \
  -net none \
  -device vfio-pci,host=0000:21:10.1 \
  -qmp tcp:127.0.0.1:3334,server,nowait \
  -serial telnet:127.0.0.1:3333,server,nowait \
  -D ./OL20.04-uefi.log \
  -device virtio-rng-pci,disable-legacy=on,iommu_platform=true \
  -object sev-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 \
  -machine memory-encryption=sev0
2) Start a client guest OL/Ubuntu:
/usr/bin/qemu-system-x86_64 -machine q35 -name OL18.04-uefi -enable-kvm \
  -nodefaults -cpu host,+host-phys-bits -m 8G -smp 8,maxcpus=240 \
  -monitor stdio -vnc 0.0.0.0:0,to=999 -vga std \
  -drive file=/usr/share/OVMF/OVMF_CODE.pure-efi.fd,index=0,if=pflash,format=raw,readonly \
  -drive file=OVMF_VARS.pure-efi.fd.ol18.04,index=1,if=pflash,format=raw \
  -device virtio-scsi-pci,id=virtio-scsi-pci0,disable-legacy=on,iommu_platform=true \
  -drive file=/systest/atanveer/scripts/Ubuntu-18.04-2022.02.13-0-uefi-x86_64.qcow2,if=none,id=local_disk0,format=qcow2,media=disk \
  -device ide-hd,drive=local_disk0,id=local_disk1,bootindex=0 \
  -net none \
  -device vfio-pci,host=0000:21:10.2 \
  -qmp tcp:127.0.0.1:6666,server,nowait \
  -serial telnet:127.0.0.1:5555,server,nowait \
  -D ./OL18.04-uefi.log \
  -device virtio-rng-pci,disable-legacy=on,iommu_platform=true \
  -object sev-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 \
  -machine memory-encryption=sev0
3) Flush iptables on both VMs using "iptables -F"
4) Start the iperf3 server on the first VM using "iperf3 -s"
5) Start the iperf3 client on the second VM using
"iperf3 -c <server_ip> -4 -f M -i 0 -t 70 -O 10 -P 64"
The kernel panic is seen on the first VM (Ubuntu 20.04), and iperf3 on that VM
also reports a "Bad address" error.
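Both QEMU command lines above pass "cbitpos=51" to the sev-guest object, which
tells the guest that bit 51 of a page-table entry is the SEV encryption bit
(C-bit). As a rough illustration only, the sketch below checks that bit in the
CR3 value printed in the console log further down (0008000105c12006) and masks
it off. The helper names has_c_bit and strip_c_bit are made up for this
example and are not part of QEMU or the kernel.

```python
# Assumption: the C-bit position is taken from "cbitpos=51" in the QEMU
# command lines of this report; other hosts may use a different position.
C_BIT_POS = 51


def has_c_bit(value: int, cbitpos: int = C_BIT_POS) -> bool:
    """Return True if the encryption bit is set in a 64-bit value."""
    return bool((value >> cbitpos) & 1)


def strip_c_bit(value: int, cbitpos: int = C_BIT_POS) -> int:
    """Return the value with the encryption bit cleared."""
    return value & ~(1 << cbitpos)


# CR3 as printed in the guest oops in this report.
cr3 = 0x0008000105C12006
print(has_c_bit(cr3))         # True
print(hex(strip_c_bit(cr3)))  # 0x105c12006
```

The set bit suggests the guest page tables were mapped as encrypted at the
time of the fault, i.e. SEV appears to have been active in the guest.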
Console logs:
root@ubuntu-20-04:~# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.196.246.104, port 33732
[ 5] local 10.196.247.88 port 5201 connected to 10.196.246.104 port 33734
[ 8] local 10.196.247.88 port 5201 connected to 10.196.246.104 port 33736
[ 10] local 10.196.247.88 port 5201 connected to 10.196.246.104 port 33738
iperf3: error - unable to read from stream socket: Bad address
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ 91.083856] general protection fault: 0000 [#1] SMP NOPTI
[ 91.084591] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 5.11.0-1028-oracle #31~20.04.1-Ubuntu
[ 91.085393] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.6.2 06/01/2022
[ 91.086205] RIP: 0010:memcpy_erms+0x6/0x10
[ 91.086640] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[ 91.088559] RSP: 0018:ffffa9c1408e4b60 EFLAGS: 00010282
[ 91.089105] RAX: ffff938cd8e48000 RBX: 0000000000001000 RCX: 0000000000001000
[ 91.089843] RDX: 0000000000001000 RSI: bb62fcf4fd5bf3d6 RDI: ffff938cd8e48000
[ 91.090578] RBP: ffffa9c1408e4c00 R08: ffffef2745639200 R09: 0000000000000000
[ 91.091309] R10: ffffef27456399c8 R11: 0000000000004209 R12: 0000000000001000
[ 91.092043] R13: ffffef2745639200 R14: 0000000000001000 R15: 000000000d558380
[ 91.092782] FS: 0000000000000000(0000) GS:ffff938df4300000(0000) knlGS:0000000000000000
[ 91.093615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 91.094206] CR2: 00005573bf7ac958 CR3: 0008000105c12006 CR4: 0000000000770ee0
[ 91.094943] PKRU: 55555554
[ 91.095230] Call Trace:
[ 91.095490] <IRQ>
[ 91.095709] ? skb_copy_ubufs+0x448/0x5e0
[ 91.096130] __netif_receive_skb_core+0xdbf/0xf60
[ 91.096623] ? irqentry_exit+0x20/0x30
[ 91.097018] ? asm_common_interrupt+0x1e/0x40
[ 91.097471] __netif_receive_skb_list_core+0x102/0x250
[ 91.098007] netif_receive_skb_list_internal+0x1a1/0x2b0
[ 91.098560] ? inet_gro_receive+0x24b/0x310
[ 91.098996] gro_normal_list.part.0+0x1e/0x40
[ 91.099447] gro_normal_one+0x46/0x50
[ 91.099832] napi_gro_receive+0x161/0x1a0
[ 91.100251] mlx5e_handle_rx_cqe_mpwrq+0x127/0x230 [mlx5_core]
[ 91.100886] mlx5e_poll_rx_cq+0x20c/0xa30 [mlx5_core]
[ 91.101430] mlx5e_napi_poll+0xda/0x670 [mlx5_core]
[ 91.101958] ? mlx5_eq_comp_int+0x149/0x1b0 [mlx5_core]
[ 91.102520] net_rx_action+0x13f/0x3f0
[ 91.102913] __do_softirq+0xe0/0x29b
[ 91.103288] asm_call_irq_on_stack+0x12/0x20
[ 91.103736] </IRQ>
[ 91.103959] do_softirq_own_stack+0x3d/0x50
[ 91.104394] irq_exit_rcu+0xa4/0xb0
[ 91.104766] common_interrupt+0x7d/0x150
[ 91.105177] asm_common_interrupt+0x1e/0x40
[ 91.105616] RIP: 0010:native_safe_halt+0xe/0x10
[ 91.106087] Code: 7b ff ff ff eb bd cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d e6 76 59 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d d6 76 59 00 fb f4 <c3> cc 0f 1f 44 00 00 55 48 89 e5 53 65 8b 15 5f 3f 1a 4e 0f 1f 44
[ 91.107593] RSP: 0018:ffffa9c1400b3e90 EFLAGS: 00000202
[ 91.108133] RAX: ffffffffb1e75470 RBX: 0000000000000004 RCX: ffff938df4334fc0
[ 91.109235] RDX: 00000000000181c6 RSI: 000000151f3f7720 RDI: 0000000000000082
[ 91.110329] RBP: ffffa9c1400b3e98 R08: 000000cd42e4dffb R09: 0000001532524720
[ 91.111419] R10: 0000000000000001 R11: 000000000000000c R12: ffff938c8034b100
[ 91.112503] R13: ffff938c8034b100 R14: 0000000000000000 R15: 0000000000000000
[ 91.113587] ? __cpuidle_text_start+0x8/0x8
[ 91.114358] ? default_idle+0xe/0x20
[ 91.115068] arch_cpu_idle+0x15/0x20
[ 91.115765] default_idle_call+0x38/0xc0
[ 91.116481] do_idle+0x1f8/0x260
[ 91.117130] ? complete+0x3f/0x50
[ 91.117776] cpu_startup_entry+0x20/0x30
[ 91.118479] start_secondary+0x11f/0x160
[ 91.119183] secondary_startup_64_no_verify+0xc2/0xcb
[ 91.120003] Modules linked in: ip6table_filter ip6_tables xt_comment
xt_owner ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_state xt_conntrack
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nls_iso8859_1
dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua intel_rapl_msr
intel_rapl_common amd_energy kvm lz4hc lz4hc_compress joydev input_leds
efi_pstore serio_raw qemu_fw_cfg mac_hid sch_fq_codel msr sunrpc virtio_rng
ip_tables x_tables autofs4 btrfs blake2b_generic iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi iscsi_ibft iscsi_boot_sysfs raid10 raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1
raid0 multipath linear mlx5_ib ib_uverbs ib_core bochs_drm drm_vram_helper
crct10dif_pclmul drm_ttm_helper crc32_pclmul ghash_clmulni_intel ttm
virtio_scsi aesni_intel crypto_simd cryptd glue_helper drm_kms_helper
mlx5_core syscopyarea sysfillrect sysimgblt fb_sys_fops ahci pci_hyperv_intf
i2c_i801 mlxfw i2c_smbus drm psmouse libahci lpc_ich
[ 91.131465] ---[ end trace 742180202e4ffeea ]---
[ 91.578040] RIP: 0010:memcpy_erms+0x6/0x10
[ 91.578993] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[ 91.581955] RSP: 0018:ffffa9c1408e4b60 EFLAGS: 00010282
[ 91.582991] RAX: ffff938cd8e48000 RBX: 0000000000001000 RCX: 0000000000001000
[ 91.584231] RDX: 0000000000001000 RSI: bb62fcf4fd5bf3d6 RDI: ffff938cd8e48000
[ 91.585471] RBP: ffffa9c1408e4c00 R08: ffffef2745639200 R09: 0000000000000000
[ 91.586720] R10: ffffef27456399c8 R11: 0000000000004209 R12: 0000000000001000
[ 91.587967] R13: ffffef2745639200 R14: 0000000000001000 R15: 000000000d558380
[ 91.589212] FS: 0000000000000000(0000) GS:ffff938df4300000(0000) knlGS:0000000000000000
[ 91.590574] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 91.591673] CR2: 00005573bf7ac958 CR3: 0008000105c12006 CR4: 0000000000770ee0
[ 91.592931] PKRU: 55555554
[ 91.593709] Kernel panic - not syncing: Fatal exception in interrupt
[ 91.604082] Kernel Offset: 0x30200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 92.049672] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
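One observation on the register dump, offered as a reading aid rather than a
root-cause analysis: the faulting bytes "<f3> a4" in memcpy_erms decode to
"rep movsb", which copies RCX bytes from the address in RSI to the address in
RDI. RSI holds bb62fcf4fd5bf3d6, which is not a canonical x86-64 address, and
touching a non-canonical address raises a general protection fault rather than
a page fault, which matches the first line of the oops. The sketch below
checks canonicality for the values in the log. The function name is_canonical
and the default of 48 virtual-address bits (4-level paging) are assumptions
made for this example.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) of a 64-bit address are all equal.

    va_bits=48 assumes 4-level paging; pass 57 for 5-level paging.
    """
    top = addr >> (va_bits - 1)                  # sign bit and everything above
    all_ones = (1 << (64 - va_bits + 1)) - 1     # those bits all set
    return top == 0 or top == all_ones


# Register values copied from the oops in this report.
rsi = 0xBB62FCF4FD5BF3D6  # memcpy source
rdi = 0xFFFF938CD8E48000  # memcpy destination

print(is_canonical(rsi))  # False
print(is_canonical(rdi))  # True
```

The destination in RDI looks like an ordinary kernel address, so only the
source pointer handed to the copy in skb_copy_ubufs appears to be bad. Why it
is bad is not something this check can tell.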