On 2023-11-06 5:40, ZhenGuo Yin wrote:
[Why]
A warning trace is emitted when the gtt drm_mm allocator is cleaned
up during driver unload, because gang_ctx_bo and wptr_bo are never
freed.
This isn't just a problem with module unloading, but a more general
memory leak. pqm_uninit does not run at module unload; it runs every
time a ROCm process terminates.
[How]
Free gang_ctx_bo and wptr_bo in pqm_uninit.
Signed-off-by: ZhenGuo Yin <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 77649392e233..fdb03b08df72 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -179,6 +179,14 @@ void pqm_uninit(struct process_queue_manager *pqm)
!pqn->q->device->kfd->shared_resources.enable_mes)
amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info,
pqn->q->gws);
+
+ if (pqn->q->device->kfd->shared_resources.enable_mes) {
+ amdgpu_amdkfd_free_gtt_mem(pqn->q->device->kfd->adev,
+ pqn->q->gang_ctx_bo);
+ if (pqn->q->wptr_bo)
+ amdgpu_amdkfd_free_gtt_mem(pqn->q->device->kfd->adev, pqn->q->wptr_bo);
+ }
It looks like we're duplicating more and more code from
pqm_destroy_queue here. I wonder if we should have a common helper
function for freeing a queue's resources that could get used in both places.
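
To illustrate the shape such a helper might take, here is a minimal userspace sketch, not kernel code and not a proposed patch. Everything prefixed with mock_ is a hypothetical stand-in: struct mock_queue models only the fields this patch touches, mock_free_gtt_mem stands in for amdgpu_amdkfd_free_gtt_mem, and mock_free_queue_resources is the assumed common helper that both pqm_uninit and pqm_destroy_queue could call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct queue: only the fields this patch touches. */
struct mock_queue {
	bool enable_mes;   /* models q->device->kfd->shared_resources.enable_mes */
	void *gang_ctx_bo; /* allocated only when MES is enabled */
	void *wptr_bo;     /* optional, may be NULL */
};

/* Counts releases so the behaviour can be checked in a test. */
static int mock_free_count;

/* Hypothetical stand-in for amdgpu_amdkfd_free_gtt_mem(). */
static void mock_free_gtt_mem(void *bo)
{
	free(bo);
	mock_free_count++;
}

/*
 * Assumed common helper: releases the MES-only buffers of one queue.
 * Pointers are cleared afterwards so a second call is harmless.
 */
static void mock_free_queue_resources(struct mock_queue *q)
{
	assert(q != NULL);

	if (!q->enable_mes)
		return;

	mock_free_gtt_mem(q->gang_ctx_bo);
	q->gang_ctx_bo = NULL;

	if (q->wptr_bo) {
		mock_free_gtt_mem(q->wptr_bo);
		q->wptr_bo = NULL;
	}
}
```

With one helper like this, the two teardown paths could not drift apart the way they did here. The GWS removal could plausibly move into it as well, though the two existing paths may differ in the conditions they check, so that would need a closer look.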
Regards,
Felix
+
kfd_procfs_del_queue(pqn->q);
uninit_queue(pqn->q);
list_del(&pqn->process_queue_list);