On 2025-07-18 12:09, Sunday Clement wrote:
For security reasons it is safer to have the kernel driver handle
calculating the sizing for the control stack on queue creation for
gfx9, rather than having it done in userspace where arbitrarily large
values can be passed in potentially wasting space in VMID0.

I thought we already did that. See these two commits:

commit 629568d25fea8ece4f65073f039aeef4e240ab67
Author: Philip Yang <[email protected]>
Date:   Wed Jun 26 15:03:05 2024 -0400

    drm/amdkfd: Validate queue cwsr area and eop buffer size

    When creating KFD user compute queue, check if queue eop buffer size,
    cwsr area size, ctl stack size equal to the size of KFD node
    properities.

    Check the entire cwsr area which may split into multiple svm ranges
    aligned to granularity boundary.

    Signed-off-by: Philip Yang <[email protected]>
    Reviewed-by: Felix Kuehling <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>

commit 517fff221c1e6b8a8db69e7a440116caee120ff5
Author: Philip Yang <[email protected]>
Date:   Wed Jun 26 14:52:28 2024 -0400

    drm/amdkfd: Store queue cwsr area size to node properties

    Use the queue eop buffer size, cwsr area size, ctl stack size
    calculation from Thunk, store the value to KFD node properties.

    Those will be used to validate queue eop buffer size, cwsr area size,
    ctl stack size when creating KFD user compute queue.

    Those will be exposed to user space via sysfs KFD node properties, to
    remove the duplicate calculation code from Thunk.

    Signed-off-by: Philip Yang <[email protected]>
    Reviewed-by: Felix Kuehling <[email protected]>
    Acked-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>


These commits store the CWSR context save area, control stack and EOP buffer sizes in kfd_node_properties and validate the sizes passed in from user mode during queue creation.
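
As a rough illustration of the validation idea described above, here is a minimal userspace sketch. It is not the code from either commit: the struct, its fields and the function name are hypothetical, and it assumes a plain equality check against the kernel-derived values, as the first commit message describes.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the three sizes discussed above. */
struct sketch_sizes {
	uint32_t ctl_stack_size;
	uint32_t cwsr_size;
	uint32_t eop_buffer_size;
};

/*
 * Compare the sizes requested by user mode against the values the
 * kernel stored for the node. Any mismatch rejects the queue, so an
 * arbitrarily large user value never reaches the allocator.
 */
static int sketch_validate_queue_sizes(const struct sketch_sizes *expected,
				       const struct sketch_sizes *requested)
{
	if (requested->ctl_stack_size != expected->ctl_stack_size)
		return -EINVAL;
	if (requested->cwsr_size != expected->cwsr_size)
		return -EINVAL;
	if (requested->eop_buffer_size != expected->eop_buffer_size)
		return -EINVAL;
	return 0;
}
```

The real driver check may differ in detail (for example in how the CWSR area is checked across SVM ranges); the sketch only shows the compare-and-reject pattern.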

Regards,
  Felix



Signed-off-by: Sunday Clement <[email protected]>
---
  .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 19 ++++++++++++++++++-
  1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index 97933d2a3803..8841411050a3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -135,8 +135,25 @@ static struct kfd_mem_obj *allocate_mqd(struct kfd_node *node,
                mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL);
                if (!mqd_mem_obj)
                        return NULL;
+
+               uint16_t xcc_mask = node->adev->gfx.xcc_mask;
+               uint32_t num_xccs = NUM_XCC(xcc_mask);
+               uint32_t num_cu = node->adev->gfx.cu_info.number;
+
+               if (num_xccs == 0) {
+                       pr_err("Invalid XCC mask: %u\n", xcc_mask);
+                       kfree(mqd_mem_obj);
+                       return NULL;
+               }
+
+               num_cu /= num_xccs;
+
+               uint32_t num_waves = num_cu * 40;
+               /* Add Bytes to accommodate ContextSaveAreaHeader */
+               uint32_t ctl_stack_size = (num_waves * 8) + 8 + 42;
+
                retval = amdgpu_amdkfd_alloc_gtt_mem(node->adev,
-                       (ALIGN(q->ctl_stack_size, PAGE_SIZE) +
+                       (ALIGN(ctl_stack_size, PAGE_SIZE) +
                        ALIGN(sizeof(struct v9_mqd), PAGE_SIZE)) *
                        NUM_XCC(node->xcc_mask),
                        &(mqd_mem_obj->gtt_mem),
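
To make the arithmetic in the hunk above easier to check, here is a small standalone sketch of the same calculation. The constants 40, 8, 8 and 42 are copied from the patch; the function names, the `SKETCH_` macros and the 4 KiB page size are assumptions made for illustration and are not kernel definitions.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define SKETCH_ALIGN(x, a) ((((x) + (a) - 1) / (a)) * (a))

/*
 * Mirror of the patch arithmetic: CUs are split evenly across XCCs,
 * each CU contributes 40 waves, each wave takes 8 bytes of control
 * stack, plus 8 + 42 bytes of fixed overhead.
 * Returns 0 when num_xccs is 0, where the patch bails out instead.
 */
static uint32_t sketch_ctl_stack_size(uint32_t num_cu, uint32_t num_xccs)
{
	uint32_t num_waves;

	if (num_xccs == 0)
		return 0;

	num_waves = (num_cu / num_xccs) * 40;
	return (num_waves * 8) + 8 + 42;
}

/* Page-aligned size, as passed to the allocator per XCC in the hunk. */
static uint32_t sketch_ctl_stack_alloc_size(uint32_t num_cu, uint32_t num_xccs)
{
	return SKETCH_ALIGN(sketch_ctl_stack_size(num_cu, num_xccs),
			    SKETCH_PAGE_SIZE);
}
```

For example, with 64 CUs on a single XCC this gives 2560 waves and a control stack of 20530 bytes, which rounds up to 24576 bytes (six pages) under the 4 KiB assumption.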
