[Public] Regards, Prike
> -----Original Message----- > From: Alex Deucher <alexdeuc...@gmail.com> > Sent: Wednesday, September 17, 2025 10:10 PM > To: Liang, Prike <prike.li...@amd.com> > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > <alexander.deuc...@amd.com>; Koenig, Christian <christian.koe...@amd.com> > Subject: Re: [PATCH v2 9/9] drm/amdgpu: validate userq va for GEM unmap > > On Fri, Sep 12, 2025 at 2:04 AM Prike Liang <prike.li...@amd.com> wrote: > > > > When an user unmaps a userq VA, the driver must ensure the queue has > > no in-flight jobs. If there is pending work, the kernel should wait > > for the attached eviction (bookkeeping) fence to signal before > > deleting the mapping. > > > > Suggested-by: Christian König <christian.koe...@amd.com> > > Signed-off-by: Prike Liang <prike.li...@amd.com> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 29 > > +++++++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | > 3 +++ > > drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 11 +++++++++ > > 3 files changed, 43 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > > index a13f186f0a8a..e14dcdcfe36e 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c > > @@ -1182,3 +1182,32 @@ int > amdgpu_userq_start_sched_for_enforce_isolation(struct amdgpu_device *adev, > > mutex_unlock(&adev->userq_mutex); > > return ret; > > } > > + > > +int amdgpu_userq_gem_va_unmap_validate(struct amdgpu_device *adev, > > + struct amdgpu_bo_va_mapping *mapping, > > + uint64_t saddr) { > > + u32 ip_mask = amdgpu_userq_get_supported_ip_mask(adev); > > + struct amdgpu_bo_va *bo_va = mapping->bo_va; > > + struct dma_resv *resv = bo_va->base.bo->tbo.base.resv; > > + int ret; > > + > > + if (!ip_mask) > > + return 0; > > + > > + dev_warn(adev->dev, "now unmapping a vital queue va:%llx\n", > > + saddr); > > dev_warn_once() so we don't spam the logs. It seems unlikely to run into this case, but I'm okay to change this to dev_warn_once() or dev_warn_ratelimited(). > > > + /** > > + * The userq VA mapping reservation should include the eviction > > fence, if the > eviction fence > > + * can't signal successfully during unmapping, then driver will > > warn to flag > this improper unmap > > + * of the userq VA. > > + * Note: The eviction fence may be attached to different BOs, and > > this unmap > is only for one kind > > + * of userq VAs, so at this point suppose the eviction fence is > > always > unsignaled. > > + */ > > + if (!dma_resv_test_signaled(resv, DMA_RESV_USAGE_BOOKKEEP)) { > > + ret = dma_resv_wait_timeout(resv, DMA_RESV_USAGE_BOOKKEEP, > true, MAX_SCHEDULE_TIMEOUT); > > + if (ret <= 0) > > + return -EBUSY; > > + } > > + > > + return 0; > > +} > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h > > index 8cd307be7256..c9a41876f10e 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h > > @@ -157,4 +157,7 @@ int amdgpu_userq_buffer_va_put(struct amdgpu_vm > > *vm, u64 addr); int amdgpu_userq_buffer_vas_put(struct amdgpu_device *adev, > > struct amdgpu_usermode_queue *queue); > > bool amdgpu_userq_buffer_vas_mapped(struct amdgpu_usermode_queue > > *queue); > > +int amdgpu_userq_gem_va_unmap_validate(struct amdgpu_device *adev, > > + struct amdgpu_bo_va_mapping *mapping, > > + uint64_t saddr); > > #endif > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > > index bd12d8ff15a4..ccde1f040cef 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c > > @@ -1941,6 +1941,7 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device > *adev, > > struct amdgpu_bo_va_mapping *mapping; > > struct amdgpu_vm *vm = bo_va->base.vm; > > bool valid = true; > > + int r; > > > > saddr /= AMDGPU_GPU_PAGE_SIZE; > > > > @@ -1961,6 +1962,16 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device > *adev, > > return -ENOENT; > > } > > > > + /* It's unlikely to happen that the mapping userq hasn't been idled > > + * during user requests GEM unmap IOCTL except for forcing the unmap > > + * from user space. > > + */ > > + if (unlikely(atomic_read(&bo_va->userq_va_mapped) > 0)) { > > + r = amdgpu_userq_gem_va_unmap_validate(adev, mapping, > > saddr); > > + if (unlikely(r == -EBUSY)) > > + dev_warn(adev->dev, "Here should be an > > + improper unmap request from user space\n"); > > dev_warn_once(). > > This looks good to me, but it would be good to get Christian's input as well. > > Alex > > > + } > > + > > list_del(&mapping->list); > > amdgpu_vm_it_remove(mapping, &vm->va); > > mapping->bo_va = NULL; > > -- > > 2.34.1 > >