This should target the Plucky kernel, and should be triaged with medium importance, since it has a notable performance impact on virtual machines, but does not otherwise impact their functionality.
--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2111861

Title: VM boots slowly with large-BAR GPU Passthrough (Root Cause Fix SRU)

Status in linux package in Ubuntu: In Progress

Bug description:

SRU Justification:

[ Impact ]
Due to an inefficiency in the way older host kernels manage pfnmaps for guest VM memory ranges [1], guests with large-BAR GPUs passed through have a very long (multiple minutes) initialization time when the MMIO window advertised by OVMF is sufficiently sized for the passed-through BARs (i.e., the correct OVMF behavior).

We have already integrated a partial efficiency improvement [2] which is transparent to the user in 6.8+ kernels, as well as an OVMF-based approach to allow the user to force Jammy-like, faster boot speeds via fw_ctl [3], but the approach in the patch series outlined in this report is the full fix for the underlying cause of the issue on kernels that have support for huge pfnmaps.

With this series [0] applied to both the host and guest of an impacted system, BAR initialization times are reduced substantially: in the commonly achieved optimal case, this results in a reduction of pfn lookups by a factor of 256k. For a local test system, an overhead of ~1s for DMA mapping a 32GB PCI BAR is reduced to sub-millisecond (8M page sized operations reduced to 32 pud sized operations).

[ Test Plan ]
On a machine with GPUs with sufficiently sized BARs:
1. Create a virtual machine with 4 GPUs passed through and CPU host-passthrough enabled. (We use DGX H100 or A100, typically.)
2. Observe that, on an unaltered 6.14 kernel, the VM boot time exceeds 5 minutes.
3. After applying this series to both the host and guest kernels, boot the guest and observe that the VM boot time is under 30 seconds, with the BAR initialization steps occurring significantly faster in dmesg output.
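
As a sanity check on the figures quoted in the Impact section (8M page sized operations, 32 pud sized operations, a factor of 256k), the arithmetic can be reproduced with a short sketch. It assumes x86-64 sizes of 4 KiB for a base page and 1 GiB for a pud mapping, which the report does not state explicitly. The helper name `lookups` is purely illustrative and is not a kernel function.

```python
# Assumed x86-64 mapping sizes; the report itself does not spell these out.
PAGE_SIZE = 4 * 1024        # 4 KiB base page
PUD_SIZE = 1024 ** 3        # 1 GiB pud-level mapping


def lookups(bar_bytes: int, stride_bytes: int) -> int:
    """Number of pfn lookups needed to walk a BAR in fixed-size strides."""
    return -(-bar_bytes // stride_bytes)  # ceiling division


bar = 32 * 1024 ** 3                  # the 32GB PCI BAR from the report
per_page = lookups(bar, PAGE_SIZE)    # one lookup per base page
per_pud = lookups(bar, PUD_SIZE)      # one lookup per pud mapping
reduction = per_page // per_pud       # ratio between the two walks

print(per_page, per_pud, reduction)
```

Under those assumptions, the page-sized walk needs 8 * 1024 * 1024 lookups and the pud-sized walk needs 32, so the ratio is 256 * 1024, matching the "8M", "32", and "256k" figures above.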
[ Fix ]
This series attempts to fully address the issue by leveraging the huge pfnmap support added in v6.12. When we insert pfnmaps using pud and pmd mappings, we can later take advantage of the knowledge of the mapping level page mask to iterate on the relevant mapping stride.

[ Where problems could occur ]
I do not expect any regressions. The only callers of ABIs changed by this series are also adjusted within this series.

[ Additional Context ]
[0]: https://lore.kernel.org/all/20250205231728.2527186-1-alex.william...@redhat.com/
[1]: https://lore.kernel.org/all/cahta-uyp07fgm6t1ozqkqadsa5jrzo0reneyzgqzub4mdrr...@mail.gmail.com/
[2]: https://bugs.launchpad.net/bugs/2097389
[3]: https://bugs.launchpad.net/bugs/2101903