On 2024-10-04 11:54, Xiaogang.Chen wrote:
From: Xiaogang Chen <[email protected]>

When kfd process has been terminated not restore userptr buffer after mmu
notifier invalidates a range.

Is this fixing a real problem or a hypothetical problem? If there is a real problem, can you include more information here? It looks to me that amdgpu_amdkfd_restore_userptr_worker is already handling the cases where a process or mm_struct no longer exists. Maybe the only user visible change from this patch is, that you no longer print "Failed to quiesce KFD" in a corner case of an MMU notifier for a process that no longer exists?

Or is there a problem with the lifetime of the process_info that contains the work_struct?



Signed-off-by: Xiaogang Chen<[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 12 ++++++++----
  1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index ce5ca304dba9..6b4be7893dfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2524,11 +2524,15 @@ int amdgpu_amdkfd_evict_userptr(struct 
mmu_interval_notifier *mni,
                /* First eviction, stop the queues */
                r = kgd2kfd_quiesce_mm(mni->mm,
                                       KFD_QUEUE_EVICTION_TRIGGER_USERPTR);
-               if (r)
+
+               if (r && r != -ESRCH)
                        pr_err("Failed to quiesce KFD\n");
-               queue_delayed_work(system_freezable_wq,
-                       &process_info->restore_userptr_work,
-                       msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
+
+               if (r != -ESRCH) {

Braces are not required because there is only one statement inside the if.

Regards,
  Felix


+                       queue_delayed_work(system_freezable_wq,
+                               &process_info->restore_userptr_work,
+                               
msecs_to_jiffies(AMDGPU_USERPTR_RESTORE_DELAY_MS));
+               }
        }
        mutex_unlock(&process_info->notifier_lock);

Reply via email to