On Thu, Mar 19, 2026 at 12:29 PM Alex Williamson <[email protected]> wrote:
>
> On Thu, 19 Mar 2026 19:04:37 +0000
> David Matlack <[email protected]> wrote:
>
> > On 2026-03-17 02:42 PM, Rubin Du wrote:
> > > Add a new VFIO PCI driver for NVIDIA GPUs that enables DMA testing
> > > via the Falcon (Fast Logic Controller) microcontrollers. This driver
> > > extracts and adapts the DMA test functionality from the NVIDIA
> > > gpu-admin-tools project and integrates it into the existing VFIO
> > > selftest framework.
> > >
> > > The Falcon is a general-purpose microcontroller present on NVIDIA GPUs
> > > that can perform DMA operations between system memory and device memory.
> > > By leveraging Falcon DMA, this driver allows NVIDIA GPUs to be tested
> > > alongside Intel IOAT and DSA devices using the same selftest
> > > infrastructure.
> > >
> > > Supported GPUs:
> > > - Kepler: K520, GTX660, K4000, K80, GT635
> > > - Maxwell Gen1: GTX750, GTX745
> > > - Maxwell Gen2: M60
> > > - Pascal: P100, P4, P40
> > > - Volta: V100
> > > - Turing: T4
> > > - Ampere: A16, A100, A10
> > > - Ada: L4, L40S
> > > - Hopper: H100
> > >
> > > The PMU falcon on Kepler and Maxwell Gen1 GPUs uses legacy FBIF register
> > > offsets and requires enabling via PMC_ENABLE with the HUB bit set.
> > >
> > > Limitations and tradeoffs:
> > >
> > > 1. Architecture support:
> > >    Blackwell and newer architectures may require additional work
> > >    due to firmware.
> > >
> > > 2. Synchronous DMA operations:
> > >    Each transfer blocks until completion because the reference
> > >    implementation does not expose command queuing - only one
> > >    DMA operation can be in flight at a time.
> >
> > Asynchronous DMA will be important for testing Live Update:
> >
> >   https://lore.kernel.org/kvm/[email protected]/
> >
> > That is why I split memcpy_start() and memcpy_wait() from the beginning.
> >
> > Would it be possible to add support for it here even though it is not in
> > the reference implementation?
>
> I'll leave the can-we questions to Rubin, but do you see either the MSI
> or asynchronous issues as blockers?

No, I don't consider either to be hard blockers.

> Currently our driver tests are
> limited to a very narrow range of Intel server platforms, whereas this
> is a plug'able endpoint we can install anywhere. I'd think that's
> sufficiently valuable in expanding the test base to make some
> compromises. Thanks,

Yeah, we can compromise if both issues cannot be resolved. I hope to
limit differences between the selftests drivers as much as possible, so
that tests don't have to care. But I also recognize that some divergence
will be inevitable if we want to support a broad set of devices while
also supporting more advanced features like asynchronous memcpy.
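
To illustrate the kind of compromise I mean, here is a rough sketch (not
taken from the patch) of how a device that can only run one blocking
transfer at a time could still sit behind a split start/wait interface.
Everything with a demo_ prefix is hypothetical, and memcpy() stands in
for the Falcon DMA. Note that this only preserves the shape of the API:
the copy has already finished by the time start returns, so it would not
exercise DMA that is genuinely in flight, which is what Live Update
testing needs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a selftest driver's per-device state. */
struct demo_device {
	int in_flight;   /* 1 while a started copy has not been waited on */
	int last_status; /* result of the most recent copy, 0 on success */
};

/*
 * Start a copy. A device without command queuing does the whole
 * transfer here and only records the result. memcpy() is a placeholder
 * for the real hardware DMA.
 */
static void demo_memcpy_start(struct demo_device *dev, void *dst,
			      const void *src, size_t size)
{
	assert(!dev->in_flight); /* only one operation in flight at a time */
	memcpy(dst, src, size);
	dev->last_status = 0;
	dev->in_flight = 1;
}

/* Wait for the copy started above. Returns 0 on success. */
static int demo_memcpy_wait(struct demo_device *dev)
{
	assert(dev->in_flight);
	dev->in_flight = 0;
	return dev->last_status;
}

/* Synchronous helper built on the split calls, so tests see one interface. */
static int demo_memcpy(struct demo_device *dev, void *dst,
		       const void *src, size_t size)
{
	demo_memcpy_start(dev, dst, src, size);
	return demo_memcpy_wait(dev);
}
```

With something like this, tests that only need a completed copy could
call the start/wait pair on any device, and tests that depend on real
overlap would have to skip devices that cannot provide it.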

