Re: [PATCH v3] drm/amdgpu: Increase KIQ invalidate_tlbs timeout

Jay Cornwall Wed, 02 Apr 2025 09:50:26 -0700

On 4/2/2025 02:37, Christian König wrote:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ffca74a476da..3cdb5f8325aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -356,7 +356,6 @@ enum amdgpu_kiq_irq {
        AMDGPU_CP_KIQ_IRQ_DRIVER0 = 0,
        AMDGPU_CP_KIQ_IRQ_LAST
  };
-#define SRIOV_USEC_TIMEOUT  1200000 /* wait 12 * 100ms for SRIOV */
  #define MAX_KIQ_REG_WAIT       5000 /* in usecs, 5ms */
  #define MAX_KIQ_REG_BAILOUT_INTERVAL   5 /* in msecs, 5ms */
  #define MAX_KIQ_REG_TRY 1000


Unrelated to this patch here, but defines like those *must* have an AMDGPU_ 
prefix.

Please fix in a follow up patch.

Sure. A deeper problem which has led to these macros is the duplication of polling logic across several different files.

We could instead move this code into amdgpu_fence_wait_polling. All clients would then abort early on in_reset or in_interrupt. There are a couple of users with different timeouts (adev->usec_timeout and a hard-coded 2100ms) which could be unified or retained with a fixed 5ms polling interval.


adev->usec_timeout is too low for this particular system under load.
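
To make the consolidation idea concrete, here is a rough user-space sketch of what a single shared polling helper might look like. It is only a model, not driver code: struct poll_ctx, fence_signaled_sketch and fence_wait_polling_sketch are hypothetical names, the in_reset/in_interrupt flags stand in for the real driver checks, and time is simulated rather than slept. The caller passes its own timeout, and the polling interval is a plain parameter (5000 us here to match the 5ms interval mentioned above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for driver state; not the real amdgpu structures. */
struct poll_ctx {
	bool in_reset;         /* models a GPU reset in progress */
	bool in_interrupt;     /* models being called from interrupt context */
	uint32_t signal_after; /* simulated: fence signals on this poll number */
	uint32_t polls;        /* number of polls performed so far */
};

enum poll_result { POLL_SIGNALED, POLL_TIMEOUT, POLL_ABORTED };

/* Simulated fence check: counts the poll and reports whether it signaled. */
static bool fence_signaled_sketch(struct poll_ctx *ctx)
{
	ctx->polls++;
	return ctx->polls >= ctx->signal_after;
}

/*
 * One shared polling loop. Every caller gets the same early abort on
 * reset or interrupt context, and only the timeout differs per caller.
 */
static enum poll_result fence_wait_polling_sketch(struct poll_ctx *ctx,
						  uint32_t timeout_us,
						  uint32_t interval_us)
{
	uint32_t waited_us = 0;

	assert(interval_us > 0);

	for (;;) {
		if (fence_signaled_sketch(ctx))
			return POLL_SIGNALED;
		/* Do not keep waiting where sleeping is unsafe or pointless. */
		if (ctx->in_reset || ctx->in_interrupt)
			return POLL_ABORTED;
		if (waited_us >= timeout_us)
			return POLL_TIMEOUT;
		/* A real implementation would sleep or delay here. */
		waited_us += interval_us;
	}
}
```

With this shape, a caller that needs a longer timeout under load would only change the timeout_us argument, and the abort conditions would no longer need to be repeated at each call site.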
