On Tue, 2025-11-18 at 17:15 +1100, Alistair Popple wrote:
> On 2025-11-12 at 03:43 +1100, Thomas Hellström
> <[email protected]> wrote...
> > This series aims at providing an initial implementation of multi-device
> > SVM, where communication with peers (migration and direct execution out
> > of peer memory) uses some form of fast interconnect. In this series
> > we're using pcie p2p.
> >
> > In a multi-device environment, the struct pages for device-private memory
> > (the dev_pagemap) may take up a significant amount of system memory. We
> > therefore want to provide a means of revoking / removing the dev_pagemaps
> > not in use. In particular when a device is offlined, we want to block
> > migrating *to* the device memory and migrate data already existing in the
> > device's memory to system. The dev_pagemap then becomes unused and can be
> > removed.
> >
> > Removing and setting up a large dev_pagemap is also quite time-consuming,
> > so removal of unused dev_pagemaps only happens on system memory pressure
> > using a shrinker.
>
> Agree it is quite time-consuming, we have run into this problem as well
> including with the pcie p2p dma pages. On the mm side I've started looking
> at if/how we can remove the need for struct pages at all for supporting this.
> Doesn't help you at all now of course, but hopefully one day we can avoid the
> need for this. I will be discussing this at LPC if you happen to be there.

Yeah that sounds great. Will not be at LPC in person but will make sure
to join remotely.

Thanks,
Thomas

>
>  - Alistair
>
> > Patch 1 is a small debug printout fix.
> > Patches 2-7 deal with dynamic drm_pagemaps as described above.
> > Patches 8-12 add infrastructure to handle remote drm_pagemaps with
> > fast interconnects.
> > Patch 13 extends the xe madvise() UAPI to handle remote drm_pagemaps.
> > Patch 14 adds a pcie-p2p dma SVM interconnect to the xe driver.
> > Patch 15 adds some SVM-related debug printouts for xe.
> > Patch 16 adds direct interconnect migration.
> > Patch 17 adds some documentation.
> >
> > What's still missing is implementation of migration policies.
> > That will be implemented in follow-up series.
> >
> > v2:
> > - Address review comments from Matt Brost.
> > - Fix compilation issues reported by automated testing
> > - Add patch 1, 17.
> > - What's now patch 16 was extended to support p2p migration.
> >
> > Thomas Hellström (17):
> >   drm/xe/svm: Fix a debug printout
> >   drm/pagemap, drm/xe: Add refcounting to struct drm_pagemap
> >   drm/pagemap: Add a refcounted drm_pagemap backpointer to struct
> >     drm_pagemap_zdd
> >   drm/pagemap, drm/xe: Manage drm_pagemap provider lifetimes
> >   drm/pagemap: Add a drm_pagemap cache and shrinker
> >   drm/xe: Use the drm_pagemap cache and shrinker
> >   drm/pagemap: Remove the drm_pagemap_create() interface
> >   drm/pagemap_util: Add a utility to assign an owner to a set of
> >     interconnected gpus
> >   drm/xe: Use the drm_pagemap_util helper to get a svm pagemap owner
> >   drm/xe: Pass a drm_pagemap pointer around with the memory advise
> >     attributes
> >   drm/xe: Use the vma attibute drm_pagemap to select where to migrate
> >   drm/xe: Simplify madvise_preferred_mem_loc()
> >   drm/xe/uapi: Extend the madvise functionality to support foreign
> >     pagemap placement for svm
> >   drm/xe: Support pcie p2p dma as a fast interconnect
> >   drm/xe/vm: Add a couple of VM debug printouts
> >   drm/pagemap, drm/xe: Support migration over interconnect
> >   drm/xe/svm: Document how xe keeps drm_pagemap references
> >
> >  drivers/gpu/drm/Makefile             |   3 +-
> >  drivers/gpu/drm/drm_gpusvm.c         |   4 +-
> >  drivers/gpu/drm/drm_pagemap.c        | 354 ++++++++++++---
> >  drivers/gpu/drm/drm_pagemap_util.c   | 568 ++++++++++++++++++++++++
> >  drivers/gpu/drm/xe/xe_device.c       |  20 +
> >  drivers/gpu/drm/xe/xe_device.h       |   2 +
> >  drivers/gpu/drm/xe/xe_device_types.h |   5 +
> >  drivers/gpu/drm/xe/xe_svm.c          | 631 ++++++++++++++++++++++-----
> >  drivers/gpu/drm/xe/xe_svm.h          |  82 +++-
> >  drivers/gpu/drm/xe/xe_tile.c         |  34 +-
> >  drivers/gpu/drm/xe/xe_tile.h         |  21 +
> >  drivers/gpu/drm/xe/xe_userptr.c      |   2 +-
> >  drivers/gpu/drm/xe/xe_vm.c           |  65 ++-
> >  drivers/gpu/drm/xe/xe_vm.h           |   1 +
> >  drivers/gpu/drm/xe/xe_vm_madvise.c   | 106 ++++-
> >  drivers/gpu/drm/xe/xe_vm_types.h     |  21 +-
> >  drivers/gpu/drm/xe/xe_vram_types.h   |  15 +-
> >  include/drm/drm_pagemap.h            |  91 +++-
> >  include/drm/drm_pagemap_util.h       |  92 ++++
> >  include/uapi/drm/xe_drm.h            |  18 +-
> >  20 files changed, 1898 insertions(+), 237 deletions(-)
> >  create mode 100644 drivers/gpu/drm/drm_pagemap_util.c
> >  create mode 100644 include/drm/drm_pagemap_util.h
> >
> > --
> > 2.51.1
> >
