Hi Honglei,

On 11/12/25 13:10, [email protected] wrote:
>>> Paravirtualized environments exacerbate this issue, as KVM's memory backing
>>> is often non-contiguous at the host level. In virtualized environments,
>>> guest physical memory appears contiguous to the VM but is actually
>>> scattered across host memory pages. This fragmentation means that what
>>> appears as a single large allocation in the guest may require multiple
>>> discrete SVM registrations to properly handle the underlying host memory
>>> layout, further multiplying the number of required ioctl calls.
>> SVM with dynamic migration under KVM is most likely a dead end to begin with.
>>
>> The only possibility to implement it is with memory pinning which is
>> basically userptr.
>>
>> Or a rather slow client side IOMMU emulation to catch concurrent DMA
>> transfers to get the necessary information onto the host side.
>>
>> Intel calls this approach coIOMMU:
>> https://www.usenix.org/system/files/atc20-paper236-slides-tian.pdf
>>
>
> This is very helpful context. Your confirmation that memory pinning
> (userptr-style) is the practical approach helps me understand that what I
> initially saw as a "workaround" is actually the intended solution for this
> use case.

Well, "intended" is maybe not the right term; I would rather say "possible"
with the current SW/HW stack design in virtualization.

In general, fault-based SVM/HMM would still be nice to have even in a
virtualized environment; it's just not really feasible at the moment.

> For coIOMMU, I'll study it to better understand the alternatives and their
> trade-offs.

I haven't looked into it in detail either. It was mostly developed with the
pass-through use case in mind, but it avoids pinning memory on the host side,
which is one of many prerequisites for getting HMM-based migration working as
well.

...

>>> Why Submit This RFC?
>>> ====================
>>>
>>> Despite the limitations above, I am submitting this series to:
>>>
>>> 1. **Start the Discussion**: I want community feedback on whether batch
>>>    registration is a useful feature worth pursuing.
>>>
>>> 2. **Explore Better Alternatives**: Is there a way to achieve batch
>>>    registration without pinning? Could I extend HMM to better support
>>>    this use case?
>>
>> There is an ongoing unification project between KFD and KGD, we are
>> currently looking into the SVM part on a weekly basis.
>>
>> Saying that we probably need a really good justification to add new features
>> to the KFD interfaces cause this is going to delay the unification.
>>
>> Regards,
>> Christian.
>
> Thank you for sharing this critical information. Is there a public discussion
> forum or mailing list for the KFD/KGD unification where I could follow
> progress and understand the design direction?

Alex is driving this. There is no mailing list, but IIRC Alex has organized a
lot of topics on some Confluence page; I can't find it offhand.

> Regarding the use case justification: I need to be honest here - the
> primary driver for this feature is indeed KVM/virtualized environments.
> The scattered allocation problem exists in native environments too, but
> the overhead is tolerable there.
> However, I do want to raise one
> consideration for the unified interface design:
>
> GPU computing in virtualized/cloud environments is growing rapidly: major
> cloud providers (AWS, Azure) now offer GPU instances, and ROCm in
> containers/VMs is becoming more common. So while my current use case is
> specific to KVM, the virtualized GPU workload pattern may become more
> prevalent.
>
> So during the unified interface design, please keep the door open for
> batch-style operations if they don't complicate the core design.

Oh, yes! That's definitely valuable information to have and more or less a
new requirement for the SVM userspace API.

I already expected that we would sooner or later run into such things, but
having it definitely confirmed is really good.

Regards,
Christian.

>
> I really appreciate your time and guidance on this.
>
> Regards,
> Honglei
>
>
>
>>
>>>
>>> 3. **Understand Trade-offs**: For some workloads, the performance benefit
>>>    of batch registration might outweigh the drawbacks of pinning. I'd
>>>    like to understand where the balance lies.
>>>
>>> Questions for the Community
>>> ============================
>>>
>>> 1. Are there existing mechanisms in HMM or mm that could support batch
>>>    operations without pinning?
>>>
>>> 2. Would a different approach (e.g., async registration, delayed validation)
>>>    be more acceptable?
>>>
>>> Alternative Approaches Considered
>>> ==================================
>>>
>>> I've considered several alternatives:
>>>
>>> A) **Pure HMM approach**: Register ranges without pinning, rely entirely on
>>>
>>> B) **Userspace batching library**: Hide multiple ioctls behind a library.
>>>
>>> Patch Series Overview
>>> =====================
>>>
>>> Patch 1: Add KFD_IOCTL_SVM_ATTR_MAPPED attribute type
>>> Patch 2: Define data structures for batch SVM range registration
>>> Patch 3: Add new AMDKFD_IOC_SVM_RANGES ioctl command
>>> Patch 4: Implement page pinning mechanism for scattered ranges
>>> Patch 5: Wire up the ioctl handler and attribute processing
>>>
>>> Testing
>>> =======
>>>
>>> The series has been tested with:
>>> - Multiple scattered malloc() allocations (2-2000+ ranges)
>>> - Various allocation sizes (4KB to 1G+)
>>> - GPU compute workloads using the registered ranges
>>> - Memory pressure scenarios
>>> - OpenCL CTS in KVM guest environment
>>> - HIP catch tests in KVM guest environment
>>> - Some AI applications like Stable Diffusion, ComfyUI, 3B LLM models based
>>>   on HuggingFace transformers
>>>
>>> I understand this approach is not ideal and am committed to working on a
>>> better solution based on community feedback. This RFC is the starting point
>>> for that discussion.
>>>
>>> Thank you for your time and consideration.
>>>
>>> Best regards,
>>> Honglei Huang
>>>
>>> ---
>>>
>>> Honglei Huang (5):
>>>   drm/amdkfd: Add KFD_IOCTL_SVM_ATTR_MAPPED attribute
>>>   drm/amdkfd: Add SVM ranges data structures
>>>   drm/amdkfd: Add AMDKFD_IOC_SVM_RANGES ioctl command
>>>   drm/amdkfd: Add support for pinned user pages in SVM ranges
>>>   drm/amdkfd: Wire up SVM ranges ioctl handler
>>>
>>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  67 +++++++++++
>>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 232 +++++++++++++++++++++++++++++--
>>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |   3 +
>>>  include/uapi/linux/kfd_ioctl.h           |  52 +++++++-
>>>  4 files changed, 348 insertions(+), 6 deletions(-)
>>
>
