On 5/30/2025 2:11 PM, Ma, Li wrote:
Hi Arun,
This patch is not for the issue we discussed in the other mail.
It addresses the risk of leaking the 'next' fence when amdgpu_ttm_fill_mem fails.
Best regards,
Li
-----Original Message-----
From: Paneer Selvam, Arunpravin <[email protected]>
Sent: Friday, May 30, 2025 1:57 PM
To: Ma, Li <[email protected]>; [email protected]
Cc: Deucher, Alexander <[email protected]>; Koenig, Christian
<[email protected]>; Yuan, Perry <[email protected]>
Subject: Re: [PATCH] drm/amdgpu: Fix potential dma_fence leak in
amdgpu_ttm_clear_buffer
Hi Ma,
On 5/29/2025 6:37 PM, Li Ma wrote:
The original code did not properly release the dma_fence `next` in case
amdgpu_ttm_fill_mem failed during buffer clearing.
Signed-off-by: Li Ma <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 9c5df35f05b7..b7284f0a5ac0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -2296,6 +2296,7 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
struct amdgpu_res_cursor cursor;
+ struct dma_fence *next = NULL;
u64 addr;
int r = 0;
@@ -2311,7 +2312,6 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
mutex_lock(&adev->mman.gtt_window_lock);
while (cursor.remaining) {
- struct dma_fence *next = NULL;
u64 size;
if (amdgpu_res_cleared(&cursor)) {
@@ -2334,10 +2334,13 @@ int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
dma_fence_put(*fence);
*fence = next;
+ next = NULL;
amdgpu_res_next(&cursor, size);
}
err:
+ if (next)
+ dma_fence_put(next);
This is okay for the error case, but in the success case we would drop
the same fence twice. The last returned fence is added to the bo, and
it is already dropped right after the amdgpu_ttm_clear_buffer() call in
amdgpu_bo_create().
Regards,
Arun.
Since you are observing a use-after-free warning for the compute dispatch
test in amdgpu_test with this patch, could we try the code below in
amdgpu_bo_create() instead?
r = amdgpu_ttm_clear_buffer(bo, bo->tbo.base.resv, &fence);
if (unlikely(r)) {
if (fence)
dma_fence_put(fence);
goto fail_unreserve;
}
Regards,
Arun.
mutex_unlock(&adev->mman.gtt_window_lock);
return r;
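
The ownership question in this thread can be checked with a small userspace model. The sketch below is not kernel code: `struct fence`, `fence_create()`, `fence_put()`, `fill_step()`, `clear_buffer()` and `live_fences` are hypothetical stand-ins for `struct dma_fence`, `dma_fence_put()`, `amdgpu_ttm_fill_mem()` and `amdgpu_ttm_clear_buffer()`. It also assumes that the fill step can fail after it has already stored a fence in `next`, which is the scenario the patch description is concerned about and may not match the real function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct dma_fence: only a refcount. */
struct fence {
	int refcount;
};

/* Number of fences allocated and not yet freed; used to detect leaks. */
static int live_fences;

static struct fence *fence_create(void)
{
	struct fence *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;
	live_fences++;
	return f;
}

/* Modelled on dma_fence_put(): NULL-safe, frees on the last reference. */
static void fence_put(struct fence *f)
{
	if (f && --f->refcount == 0) {
		free(f);
		live_fences--;
	}
}

/*
 * Stand-in for the fill step. ASSUMPTION: it may fail on step fail_at
 * after it has already stored a new fence in *out.
 */
static int fill_step(int step, int fail_at, struct fence **out)
{
	*out = fence_create();
	if (!*out)
		return -1;
	return step == fail_at ? -1 : 0;
}

/* Model of the patched loop; release_on_error toggles the err: cleanup. */
static int clear_buffer(int steps, int fail_at, int release_on_error,
			struct fence **fence)
{
	struct fence *next = NULL;
	int r = 0;

	for (int i = 0; i < steps; i++) {
		r = fill_step(i, fail_at, &next);
		if (r)
			goto err;

		fence_put(*fence);	/* drop the previous step's fence */
		*fence = next;		/* reference now belongs to the caller */
		next = NULL;		/* err: path must not put it again */
	}
err:
	if (release_on_error)
		fence_put(next);	/* non-NULL only after a failed step */
	return r;
}
```

In this model, calling `clear_buffer(3, 1, 0, &f)` with `f` initialised to NULL leaves one fence allocated after the caller puts `f`, while `clear_buffer(3, 1, 1, &f)` leaves none. On success `next` is always NULL at `err:`, so the caller still holds exactly one reference to drop.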