On 5/8/25 09:08, Liang, Prike wrote:
> [Public]
> 
>> From: Koenig, Christian <[email protected]>
>> Sent: Tuesday, May 6, 2025 4:39 PM
>> To: Liang, Prike <[email protected]>; [email protected]
>> Cc: Deucher, Alexander <[email protected]>
>> Subject: Re: [PATCH v3 4/5] drm/amdgpu: validate the eviction fence before
>> attaching/detaching
>>
>> On 5/6/25 10:22, Liang, Prike wrote:
>>>>> -   /* attach gfx eviction fence */
>>>>> +   /* attach gfx the validated eviction fence */
>>>>>     r = amdgpu_eviction_fence_attach(&fpriv->evf_mgr, abo);
>>>>>     if (r) {
>>>>>             DRM_DEBUG_DRIVER("Failed to attach eviction fence to BO\n");
>>>>> +           amdgpu_bo_unreserve(abo);
>>>> Adding this here looks like the only valid fix in the patch.
>>> As the eviction fence will be invalidated until the user queue is created
>>> from the user space, here it requires validating the eviction fence before
>>> trying to attach and detach it to the reservation.
>>> I will try to draft a patch for validating the eviction fence at
>>> attach/detach separately with this attach error handler change.
>>
>>
>> No, that is clearly incorrect.
>>
>> See the eviction fence works like this:
>>
>> Validating thread
>> * Create new eviction fence
>> * Publish eviction fence
>> * Lock all BOs
>> * Replace eviction fence
>>
>> Attaching:
>> * Lock BO
>> * Attach current eviction fence
>> * Unlock BO
>>
>> Detaching:
>> * Lock BO
>> * Unconditionally detach all possible eviction fences, no matter if new or
>>   old.
>> * Unlock BO
>>
>> This order is necessary or otherwise you break the logic here.
>>
>> Any additional check will completely mess that up because it makes the
>> operation racy.
> As the user queue eviction fence isn't created until user queue submission,
> the eviction fence will be NULL without userq submission. So do we still try
> to attach/detach the NULL eviction fence for the kernel queue case?

Yes, the problem is that we can't check the eviction fence before we have taken
the reservation lock.

Otherwise an eviction fence can always be created between the check and the
attach.
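
To make the ordering concrete, here is a minimal userspace sketch of the
attach/detach/validate sequence quoted above. It is only a model: every name
in it (model_fence, model_evf_mgr, model_bo, model_attach and so on) is
hypothetical and not an amdgpu or dma-buf API, a C11 atomic_flag spinlock
stands in for the reservation lock, and the "replace only if something is
attached" step is an assumption of the model. The point it illustrates is that
attach reads the published fence only after taking the lock, so either
interleaving with the validating thread leaves the newest fence on the BO.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model. None of these names are amdgpu or dma-buf APIs. */
struct model_fence {
	int seqno;
};

struct model_evf_mgr {
	/* Currently published eviction fence. */
	struct model_fence *_Atomic current;
};

struct model_bo {
	atomic_flag resv_lock;		/* spinlock standing in for the reservation lock */
	struct model_fence *attached;	/* eviction fence slot, protected by resv_lock */
};

static void model_bo_lock(struct model_bo *bo)
{
	while (atomic_flag_test_and_set(&bo->resv_lock))
		;
}

static void model_bo_unlock(struct model_bo *bo)
{
	atomic_flag_clear(&bo->resv_lock);
}

/* Attaching: lock BO, attach the current eviction fence, unlock BO. */
static void model_attach(struct model_evf_mgr *mgr, struct model_bo *bo)
{
	model_bo_lock(bo);
	/* The fence is only looked at once the lock is held. */
	bo->attached = atomic_load(&mgr->current);
	model_bo_unlock(bo);
}

/* Detaching: lock BO, drop whatever fence is there (new or old), unlock BO. */
static void model_detach(struct model_bo *bo)
{
	model_bo_lock(bo);
	bo->attached = NULL;
	model_bo_unlock(bo);
}

/* Validating: publish the new fence first, then lock the BO and replace. */
static void model_validate(struct model_evf_mgr *mgr, struct model_bo *bo,
			   struct model_fence *new_fence)
{
	atomic_store(&mgr->current, new_fence);
	model_bo_lock(bo);
	if (bo->attached)	/* model assumption: replace only touches an attached fence */
		bo->attached = new_fence;
	model_bo_unlock(bo);
}

/* Run attach and validate in either order; return the seqno left on the BO. */
static int model_demo(int validate_first)
{
	struct model_fence old_fence = { .seqno = 1 }, new_fence = { .seqno = 2 };
	struct model_evf_mgr mgr;
	struct model_bo bo = { .attached = NULL };
	int seqno;

	atomic_init(&mgr.current, &old_fence);
	atomic_flag_clear(&bo.resv_lock);

	if (validate_first) {
		model_validate(&mgr, &bo, &new_fence);
		model_attach(&mgr, &bo);
	} else {
		model_attach(&mgr, &bo);
		model_validate(&mgr, &bo, &new_fence);
	}

	seqno = bo.attached->seqno;
	model_detach(&bo);
	assert(bo.attached == NULL);
	return seqno;
}
```

In this model both model_demo(0) and model_demo(1) end with the new fence on
the BO. A check of the fence done before model_bo_lock() would have no such
guarantee, since the validating thread may publish in between.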

I also suggested before that the eviction fence should never be NULL; we just
start with a dummy stub fence (see function dma_fence_get_stub()). This way we
can avoid all the NULL checks.
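
As a rough illustration of the stub idea, the following userspace sketch
starts the manager with an already-signaled placeholder fence instead of NULL.
All names here (sketch_fence, sketch_fence_get_stub, sketch_evf_mgr and so on)
are hypothetical stand-ins and not the kernel API; the only thing borrowed
from dma_fence_get_stub() is the idea of a shared, already-signaled fence that
is handed out with a reference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a fence; not struct dma_fence. */
struct sketch_fence {
	bool signaled;
	unsigned int refcount;
};

/* One shared placeholder fence that is signaled from the start. */
static struct sketch_fence sketch_stub = { .signaled = true, .refcount = 1 };

/* Hand out the placeholder with an extra reference. */
static struct sketch_fence *sketch_fence_get_stub(void)
{
	sketch_stub.refcount++;
	return &sketch_stub;
}

struct sketch_evf_mgr {
	struct sketch_fence *ev_fence;	/* never NULL after init */
};

/* Start with the stub so that later users need no NULL check. */
static void sketch_evf_mgr_init(struct sketch_evf_mgr *mgr)
{
	mgr->ev_fence = sketch_fence_get_stub();
}

/* Can dereference unconditionally: there is always some fence. */
static bool sketch_evf_is_idle(const struct sketch_evf_mgr *mgr)
{
	return mgr->ev_fence->signaled;
}
```

With this layout, attaching the "current" fence before any user queue exists
simply attaches a fence that is already signaled, so nothing ever waits on it.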

> It's OK to skip validating the eviction fence or the user queue work before
> attaching/detaching the eviction fence, but it will cost cycles for walking
> over the reservation fences array in dma_resv_reserve_fences() and
> dma_resv_replace_fences().

That's completely irrelevant. What matters is that we keep the right sequence
so that we don't create a race condition.

Regards,
Christian.


> 
>> Regards,
>> Christian.
>>
>>>
>>> Thanks,
>>> Prike
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>             return r;
>>>>>     }
>>>>>
> 