-----Original Message-----
From: Brost, Matthew <[email protected]>
Sent: Tuesday, December 2, 2025 8:06 AM
To: Cavitt, Jonathan <[email protected]>
Cc: [email protected]; Gupta, saurabhg <[email protected]>;
Zuo, Alex <[email protected]>; [email protected]; Zhang, Jianxun
<[email protected]>; Lin, Shuicheng <[email protected]>;
[email protected]; Wajdeczko, Michal
<[email protected]>; Mrozek, Michal <[email protected]>; Jadav,
Raag <[email protected]>; Briano, Ivan <[email protected]>; Auld,
Matthew <[email protected]>; Hirschfeld, Dafna <[email protected]>
Subject: Re: [PATCH v28 4/4] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
>
> On Mon, Dec 01, 2025 at 11:55:44PM +0000, Jonathan Cavitt wrote:
> > Add support for userspace to request a list of observed faults
> > from a specified VM.
> >
> > v2:
> > - Only allow querying of failed pagefaults (Matt Brost)
> >
> > v3:
> > - Remove unnecessary size parameter from helper function, as it
> > is a property of the arguments. (jcavitt)
> > - Remove unnecessary copy_from_user (Jianxun)
> > - Set address_precision to 1 (Jianxun)
> > - Report max size instead of dynamic size for memory allocation
> > purposes. Total memory usage is reported separately.
> >
> > v4:
> > - Return int from xe_vm_get_property_size (Shuicheng)
> > - Fix memory leak (Shuicheng)
> > - Remove unnecessary size variable (jcavitt)
> >
> > v5:
> > - Rename ioctl to xe_vm_get_faults_ioctl (jcavitt)
> > - Update fill_property_pfs to eliminate need for kzalloc (Jianxun)
> >
> > v6:
> > - Repair and move fill_faults break condition (Dan Carpenter)
> > - Free vm after use (jcavitt)
> > - Combine assertions (jcavitt)
> > - Expand size check in xe_vm_get_faults_ioctl (jcavitt)
> > - Remove return mask from fill_faults, as return is already -EFAULT or 0
> > (jcavitt)
> >
> > v7:
> > - Revert back to using xe_vm_get_property_ioctl
> > - Apply better copy_to_user logic (jcavitt)
> >
> > v8:
> > - Fix and clean up error value handling in ioctl (jcavitt)
> > - Reapply return mask for fill_faults (jcavitt)
> >
> > v9:
> > - Future-proof size logic for zero-size properties (jcavitt)
> > - Add access and fault types (Jianxun)
> > - Remove address type (Jianxun)
> >
> > v10:
> > - Remove unnecessary switch case logic (Raag)
> > - Compress size get, size validation, and property fill functions into a
> > single helper function (jcavitt)
> > - Assert valid size (jcavitt)
> >
> > v11:
> > - Remove unnecessary else condition
> > - Correct backwards helper function size logic (jcavitt)
> >
> > v12:
> > - Use size_t instead of int (Raag)
> >
> > v13:
> > - Remove engine class and instance (Ivan)
> >
> > v14:
> > - Map access type, fault type, and fault level to user macros (Matt
> > Brost, Ivan)
> >
> > v15:
> > - Remove unnecessary size assertion (jcavitt)
> >
> > v16:
> > - Nit fixes (Matt Brost)
> >
> > v17:
> > - Rebase and refactor (jcavitt)
> >
> > v18:
> > - Do not copy_to_user in critical section (Matt Brost)
> > - Assert args->size is multiple of sizeof(struct xe_vm_fault) (Matt
> > Brost)
> >
> > Signed-off-by: Jonathan Cavitt <[email protected]>
> > Suggested-by: Matthew Brost <[email protected]>
> > Cc: Jianxun Zhang <[email protected]>
> > Cc: Shuicheng Lin <[email protected]>
> > Cc: Raag Jadav <[email protected]>
> > Cc: Ivan Briano <[email protected]>
> > ---
> > drivers/gpu/drm/xe/xe_device.c | 2 +
> > drivers/gpu/drm/xe/xe_vm.c | 119 +++++++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_vm.h | 3 +
> > 3 files changed, 124 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 1197f914ef77..69baf01f008a 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -207,6 +207,8 @@ static const struct drm_ioctl_desc xe_ioctls[] = {
> > DRM_IOCTL_DEF_DRV(XE_MADVISE, xe_vm_madvise_ioctl, DRM_RENDER_ALLOW),
> > DRM_IOCTL_DEF_DRV(XE_VM_QUERY_MEM_RANGE_ATTRS,
> > xe_vm_query_vmas_attrs_ioctl,
> > DRM_RENDER_ALLOW),
> > + DRM_IOCTL_DEF_DRV(XE_VM_GET_PROPERTY, xe_vm_get_property_ioctl,
> > + DRM_RENDER_ALLOW),
> > };
> >
> > static long xe_drm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > index dc6c36191274..ccc0aa3afe58 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.c
> > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > @@ -3850,6 +3850,125 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> > return err;
> > }
> >
> > +/*
> > + * Map access type, fault type, and fault level from current bspec
> > + * specification to user spec abstraction. The current mapping is
> > + * 1-to-1, but if there is ever a hardware change, we will need
> > + * this abstraction layer to maintain API stability through the
> > + * hardware change.
> > + */
> > +static u8 xe_to_user_access_type(u8 access_type)
> > +{
> > + return access_type;
> > +}
> > +
> > +static u8 xe_to_user_fault_type(u8 fault_type)
> > +{
> > + return fault_type;
> > +}
> > +
> > +static u8 xe_to_user_fault_level(u8 fault_level)
> > +{
> > + return fault_level;
> > +}
> > +
> > +static int fill_faults(struct xe_vm *vm,
> > + struct drm_xe_vm_get_property *args)
> > +{
> > + struct xe_vm_fault __user *usr_ptr = u64_to_user_ptr(args->data);
> > + struct xe_vm_fault *fault_list, fault_entry;
> > + struct xe_vm_fault_entry *entry;
> > + int ret = 0, i = 0, count, entry_size;
> > +
> > + entry_size = sizeof(struct xe_vm_fault);
> > + count = args->size / entry_size;
> > +
> > + fault_list = kcalloc(count, sizeof(struct xe_vm_fault), GFP_KERNEL);
> > + if (!fault_list)
> > + return -ENOMEM;
> > +
> > + spin_lock(&vm->faults.lock);
> > + list_for_each_entry(entry, &vm->faults.list, list) {
> > + if (i == count)
> > + break;
> > +
> > + memset(&fault_entry, 0, entry_size);
>
> This memset only needs to happen once, right?
>
> So maybe when declaring 'fault_entry', do this: 'fault_entry = {};'.
This is true from a theoretical and practical standpoint. But from a design
perspective, it's generally bad practice to reuse a memory region without
clearing it first (at least in the case where that memory region holds a
struct).
On the other hand, there's apparently no precedent anywhere in the XE code
for calling memset on the same memory region repeatedly in a loop, so it
would probably be more fitting to do it the way you suggested. I'll apply
the change once more revision notes come in.
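
To make the trade-off concrete, here is a minimal userspace sketch of the two
variants. The struct layouts are hypothetical stand-ins (the real
struct xe_vm_fault and struct xe_vm_fault_entry are not reproduced in this
patch), and all demo_* names are made up for illustration. It uses '= {0}'
for portable C where the kernel would use '= {}'. Since every field the loop
writes is reassigned on each iteration and the remaining fields are never
written, zeroing once at declaration should give the same result as the
per-iteration memset:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct xe_vm_fault (explicit pad, no implicit padding). */
struct demo_fault {
	uint64_t address;
	uint32_t address_precision;
	uint8_t access_type;
	uint8_t fault_type;
	uint8_t fault_level;
	uint8_t pad;
	uint64_t reserved[4];
};

/* Hypothetical stand-in for struct xe_vm_fault_entry (list linkage omitted). */
struct demo_entry {
	uint64_t address;
	uint32_t address_precision;
	uint8_t access_type;
	uint8_t fault_type;
	uint8_t fault_level;
};

/* Assign every field the fill loop touches; pad and reserved are left alone. */
static void demo_copy_fields(struct demo_fault *dst, const struct demo_entry *src)
{
	dst->address = src->address;
	dst->address_precision = src->address_precision;
	dst->access_type = src->access_type;
	dst->fault_type = src->fault_type;
	dst->fault_level = src->fault_level;
}

/* Variant A: clear the scratch entry on every iteration. */
static size_t demo_fill_memset_each(struct demo_fault *out, size_t cap,
				    const struct demo_entry *in, size_t n)
{
	struct demo_fault tmp;
	size_t i;

	assert(out != NULL && in != NULL);
	for (i = 0; i < n && i < cap; i++) {
		memset(&tmp, 0, sizeof(tmp));
		demo_copy_fields(&tmp, &in[i]);
		memcpy(&out[i], &tmp, sizeof(tmp));
	}
	return i;
}

/* Variant B: zero the scratch entry once, at declaration. */
static size_t demo_fill_init_once(struct demo_fault *out, size_t cap,
				  const struct demo_entry *in, size_t n)
{
	struct demo_fault tmp = {0};
	size_t i;

	assert(out != NULL && in != NULL);
	for (i = 0; i < n && i < cap; i++) {
		demo_copy_fields(&tmp, &in[i]);
		memcpy(&out[i], &tmp, sizeof(tmp));
	}
	return i;
}
```

The equivalence only holds as long as the loop keeps assigning the same set
of fields unconditionally. If a later revision fills a field only for some
entries, the single initialization would let a stale value carry over to the
next entry.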
-Jonathan Cavitt
>
> Otherwise LGTM.
>
> Matt
>
> > +
> > + fault_entry.address = entry->address;
> > + fault_entry.address_precision = entry->address_precision;
> > +
> > + fault_entry.access_type = xe_to_user_access_type(entry->access_type);
> > + fault_entry.fault_type = xe_to_user_fault_type(entry->fault_type);
> > + fault_entry.fault_level = xe_to_user_fault_level(entry->fault_level);
> > +
> > + memcpy(&fault_list[i], &fault_entry, entry_size);
> > +
> > + i++;
> > + }
> > + spin_unlock(&vm->faults.lock);
> > +
> > + ret = copy_to_user(usr_ptr, fault_list, args->size);
> > +
> > + kfree(fault_list);
> > + return ret ? -EFAULT : 0;
> > +}
> > +
> > +static int xe_vm_get_property_helper(struct xe_vm *vm,
> > + struct drm_xe_vm_get_property *args)
> > +{
> > + size_t size;
> > +
> > + switch (args->property) {
> > + case DRM_XE_VM_GET_PROPERTY_FAULTS:
> > + spin_lock(&vm->faults.lock);
> > + size = size_mul(sizeof(struct xe_vm_fault), vm->faults.len);
> > + spin_unlock(&vm->faults.lock);
> > +
> > + if (!args->size) {
> > + args->size = size;
> > + return 0;
> > + }
> > +
> > + /*
> > + * Number of faults may increase between calls to
> > + * xe_vm_get_property_ioctl, so just report the number of
> > + * faults the user requests if it's less than or equal to
> > + * the number of faults in the VM fault array.
> > + *
> > + * We should also at least assert that the args->size value
> > + * is a multiple of the xe_vm_fault struct size.
> > + */
> > + if (args->size > size || args->size % sizeof(struct xe_vm_fault))
> > + return -EINVAL;
> > +
> > + return fill_faults(vm, args);
> > + }
> > + return -EINVAL;
> > +}
> > +
> > +int xe_vm_get_property_ioctl(struct drm_device *drm, void *data,
> > + struct drm_file *file)
> > +{
> > + struct xe_device *xe = to_xe_device(drm);
> > + struct xe_file *xef = to_xe_file(file);
> > + struct drm_xe_vm_get_property *args = data;
> > + struct xe_vm *vm;
> > + int ret = 0;
> > +
> > + if (XE_IOCTL_DBG(xe, args->reserved[0] || args->reserved[1] ||
> > + args->reserved[2]))
> > + return -EINVAL;
> > +
> > + vm = xe_vm_lookup(xef, args->vm_id);
> > + if (XE_IOCTL_DBG(xe, !vm))
> > + return -ENOENT;
> > +
> > + ret = xe_vm_get_property_helper(vm, args);
> > +
> > + xe_vm_put(vm);
> > + return ret;
> > +}
> > +
> > /**
> > * xe_vm_bind_kernel_bo - bind a kernel BO to a VM
> > * @vm: VM to bind the BO to
> > diff --git a/drivers/gpu/drm/xe/xe_vm.h b/drivers/gpu/drm/xe/xe_vm.h
> > index e9f2de4189e0..f2675ec9e8c4 100644
> > --- a/drivers/gpu/drm/xe/xe_vm.h
> > +++ b/drivers/gpu/drm/xe/xe_vm.h
> > @@ -210,6 +210,9 @@ int xe_vm_destroy_ioctl(struct drm_device *dev, void *data,
> > int xe_vm_bind_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file);
> > int xe_vm_query_vmas_attrs_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file);
> > +int xe_vm_get_property_ioctl(struct drm_device *dev, void *data,
> > + struct drm_file *file);
> > +
> > void xe_vm_close_and_put(struct xe_vm *vm);
> >
> > static inline bool xe_vm_in_fault_mode(struct xe_vm *vm)
> > --
> > 2.43.0
> >
>
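
For reference, the size handling in xe_vm_get_property_helper() amounts to a
two-call protocol: userspace first passes size == 0 to learn how many bytes
are available, then calls again with a buffer of at most that many bytes, in
whole entries. A rough, self-contained model of just that check is below.
DEMO_ENTRY_SIZE and demo_check_size() are hypothetical names, and the 48-byte
entry size is an assumption, not the real sizeof(struct xe_vm_fault):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed entry size for illustration only. */
#define DEMO_ENTRY_SIZE ((size_t)48)

/*
 * Model of the size negotiation:
 *  - *req_size == 0: report the total bytes available and return 0.
 *  - otherwise: accept only a whole number of entries that does not
 *    exceed what is currently recorded; reject anything else.
 */
static int demo_check_size(size_t *req_size, size_t nr_faults)
{
	size_t total = DEMO_ENTRY_SIZE * nr_faults;

	if (*req_size == 0) {
		*req_size = total;
		return 0;
	}

	if (*req_size > total || *req_size % DEMO_ENTRY_SIZE)
		return -EINVAL;

	return 0;
}
```

One consequence of this shape is that a request for fewer entries than are
recorded is accepted, while a request for more is rejected outright instead
of being clamped.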