Public bug reported:

[ Impact ]

Booting a large-memory guest (focal or jammy) with the 5.15 kernel on an SEV-enabled host fails, and dmesg shows the following error:

  software IO TLB: Cannot allocate buffer

Booting a Fedora 36 guest on an SEV-enabled host works fine.

Since this kernel commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e998879d4fb7991856916972168cf27c0d86ed12
SWIOTLB allocates between 64MB and 1G of contiguous memory, depending on how much memory the system has. The size is computed in sev_setup_arch:

  size = total_mem * 6 / 100;
  size = clamp_val(size, IO_TLB_DEFAULT_SIZE, SZ_1G);
  swiotlb_adjust_size(size);

Looking at the memory block layout produced by Fedora grub, the available memory blocks are:

  [ 0.005879] memory[0x0] [0x0000000000001000-0x000000000009ffff], 0x000000000009f000 bytes flags: 0x0
  [ 0.005881] memory[0x1] [0x0000000000100000-0x000000007e9ecfff], 0x000000007e8ed000 bytes flags: 0x0
  [ 0.005883] memory[0x2] [0x000000007eb1b000-0x000000007fb9afff], 0x0000000001080000 bytes flags: 0x0
  [ 0.005885] memory[0x3] [0x000000007fbff000-0x000000007ffdffff], 0x00000000003e1000 bytes flags: 0x0
  [ 0.005886] memory[0x4] [0x0000000100000000-0x00000004ffffffff], 0x0000000400000000 bytes flags: 0x0

The biggest one is:

  [ 0.005881] memory[0x1] [0x0000000000100000-0x000000007e9ecfff], 0x000000007e8ed000 bytes flags: 0x0

Its size is close to 2G, which is sufficient for SWIOTLB to allocate 1G of contiguous memory.

Next we need to exclude the reserved memory blocks that overlap this region. They are:

  [ 0.005892] reserved[0x2] [0x00000000574a7000-0x0000000059313fff], 0x0000000001e6d000 bytes flags: 0x0
  [ 0.005894] reserved[0x3] [0x000000007e133018-0x000000007e17e057], 0x000000000004b040 bytes flags: 0x0
  [ 0.005896] reserved[0x4] [0x000000007e845018-0x000000007e845857], 0x0000000000000840 bytes flags: 0x0
  [ 0.005897] reserved[0x5] [0x000000007ee95698-0x000000007ee95af7], 0x0000000000000460 bytes flags: 0x0

Now the biggest available range is [0x0000000000100000-0x00000000574a7000].

Before SWIOTLB allocates its buffer, EFI also reserves some memory. The reservation that falls inside memory[0x1] is:

  [ 0.005942] memblock_reserve: [0x000000007bfbe000-0x000000007bfddfff] efi_reserve_boot_services+0x8a/0xdb

It lies above 0x00000000574a7000, so SWIOTLB can still allocate 1G of contiguous memory from [0x0000000000100000-0x00000000574a7000]:

  [ 1.089832] software IO TLB: mapped [mem 0x00000000174a7000-0x00000000574a7000] (1024MB)

With the memory block layout produced by Ubuntu grub, however, the available memory blocks are:

  [ 0.005833] memory[0x0] [0x0000000000001000-0x000000000009ffff], 0x000000000009f000 bytes flags: 0x0
  [ 0.005835] memory[0x1] [0x0000000000100000-0x000000007e9ecfff], 0x000000007e8ed000 bytes flags: 0x0
  [ 0.005837] memory[0x2] [0x000000007eb1b000-0x000000007fb9afff], 0x0000000001080000 bytes flags: 0x0
  [ 0.005838] memory[0x3] [0x000000007fbff000-0x000000007ffdffff], 0x00000000003e1000 bytes flags: 0x0
  [ 0.005840] memory[0x4] [0x0000000100000000-0x00000004ffffffff], 0x0000000400000000 bytes flags: 0x0

The biggest one is again:

  [ 0.005835] memory[0x1] [0x0000000000100000-0x000000007e9ecfff], 0x000000007e8ed000 bytes flags: 0x0

Then excluding the reserved memory blocks:

  [ 0.005846] reserved[0x2] [0x000000003a9ba000-0x000000003c7cdfff], 0x0000000001e14000 bytes flags: 0x0
  [ 0.005848] reserved[0x3] [0x000000007e133018-0x000000007e17e057], 0x000000000004b040 bytes flags: 0x0
  [ 0.005849] reserved[0x4] [0x000000007e847018-0x000000007e847887], 0x0000000000000870 bytes flags: 0x0
  [ 0.005851] reserved[0x5] [0x000000007ee95698-0x000000007ee95af7], 0x0000000000000460 bytes flags: 0x0

Now the biggest one is [0x000000003e133018-0x000000007e133018].

Then excluding the EFI reserved memory block that overlaps the above range:

  [ 0.005896] memblock_reserve: [0x000000007bfbe000-0x000000007bfddfff] efi_reserve_boot_services+0x8a/0xdb

The biggest contiguous range now becomes [0x000000003c7ce000-0x000000007bfbe000], which is less than 1G. This is why SWIOTLB cannot allocate 1G of contiguous memory.

This commit from rhboot/grub2 fixes the issue:
https://github.com/rhboot/grub2/commit/9e6c1d803ade111b8719502ff25e86d8b4564de8
It adjusts the memory block layout so that SWIOTLB, or any other driver that needs 1G or more of contiguous memory, can be satisfied.

[ Test Plan ]

Enable SEV on an AMD machine; refer to
https://docs.ovh.com/us/en/dedicated/enable-and-use-amd-sme-sev/#references-and-additional-resources_1

Create an Ubuntu VM with SEV enabled (--launchSecurity sev) and 18G of memory as below:

  virt-install \
    --name <guest-name> \
    --memory 18874368 \
    --memtune hard_limit=36507216 \
    --boot uefi \
    --disk /var/lib/libvirt/images/<guest-name.img>,device=disk,bus=scsi \
    --disk /var/lib/libvirt/images/<guest-name>-config.iso,device=cdrom \
    --os-type linux \
    --os-variant <variant> \
    --import \
    --controller type=scsi,model=virtio-scsi,driver.iommu=on \
    --controller type=virtio-serial,driver.iommu=on \
    --network network=default,model=virtio,driver.iommu=on \
    --memballoon driver.iommu=on \
    --graphics none \
    --launchSecurity sev \
    --noautoconsole

Check that the guest boots successfully with the above patch.

[ Where problems could occur ]

This patch only adjusts the memory block layout; it shouldn't affect any other functions.

[ Other Info ]

Related bugs:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1983625
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842320

** Affects: grub2-unsigned (Ubuntu)
     Importance: High
     Assignee: gerald.yang (gerald-yang-tw)
         Status: In Progress

** Package changed: linux (Ubuntu) => grub2-unsigned (Ubuntu)

** Changed in: grub2-unsigned (Ubuntu)
     Assignee: (unassigned) => gerald.yang (gerald-yang-tw)

** Changed in: grub2-unsigned (Ubuntu)
   Importance: Undecided => High

** Changed in: grub2-unsigned (Ubuntu)
       Status: New => In Progress

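
The arithmetic in [ Impact ] can be checked with a small script. This is only a sketch: the addresses are copied from the dmesg excerpts in the report, the helper names (swiotlb_size, largest_free_range) are made up for illustration, and it assumes total_mem is the full 18G given to the test guest and that only the reserved blocks listed in the report matter. It is not how the kernel's memblock allocator is implemented.

```python
# Sketch only: re-do the SWIOTLB sizing and free-range arithmetic from the
# report. Helper names are hypothetical; addresses come from the dmesg output.

MIB = 1 << 20
SZ_1G = 1 << 30
IO_TLB_DEFAULT_SIZE = 64 * MIB  # the 64MB lower bound mentioned in the report


def swiotlb_size(total_mem):
    """Mirror the sev_setup_arch snippet: 6% of memory, clamped to [64MB, 1G]."""
    size = total_mem * 6 // 100
    return max(IO_TLB_DEFAULT_SIZE, min(size, SZ_1G))


def largest_free_range(block, reserved):
    """Largest gap in `block` not covered by `reserved` (half-open ranges)."""
    best = (block[0], block[0])
    cursor = block[0]
    for r_start, r_end in sorted(reserved):
        # Clip the reservation to the block; skip it if it lies outside.
        r_start, r_end = max(r_start, block[0]), min(r_end, block[1])
        if r_start >= r_end:
            continue
        if r_start - cursor > best[1] - best[0]:
            best = (cursor, r_start)
        cursor = max(cursor, r_end)
    if block[1] - cursor > best[1] - best[0]:
        best = (cursor, block[1])
    return best


# memory[0x1], identical for both guests (inclusive end + 1).
MEM1 = (0x00100000, 0x7E9ED000)
# efi_reserve_boot_services reservation, identical for both guests.
EFI = (0x7BFBE000, 0x7BFDE000)

FEDORA_RESERVED = [
    (0x574A7000, 0x59314000),
    (0x7E133018, 0x7E17E058),
    (0x7E845018, 0x7E845858),
    (0x7EE95698, 0x7EE95AF8),  # outside memory[0x1], ignored by the helper
    EFI,
]
UBUNTU_RESERVED = [
    (0x3A9BA000, 0x3C7CE000),
    (0x7E133018, 0x7E17E058),
    (0x7E847018, 0x7E847888),
    (0x7EE95698, 0x7EE95AF8),
    EFI,
]

if __name__ == "__main__":
    need = swiotlb_size(18 * 1024**3)  # assumed: 18G guest from the test plan
    print(f"SWIOTLB wants {need // MIB} MiB")
    for name, reserved in (("Fedora", FEDORA_RESERVED), ("Ubuntu", UBUNTU_RESERVED)):
        start, end = largest_free_range(MEM1, reserved)
        verdict = "fits" if end - start >= need else "too small"
        print(f"{name}: [{start:#x}-{end:#x}] {(end - start) / MIB:.1f} MiB, {verdict}")
```

For the Ubuntu layout the largest gap is 0x3f7f0000 bytes, roughly 8MB short of 1G, while the Fedora layout leaves well over 1G below 0x574a7000.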
https://bugs.launchpad.net/bugs/1989446

Title:
  [SRU] unable to boot guest with large memory when SEV is enabled on host

Status in grub2-unsigned package in Ubuntu:
  In Progress