On 2/14/2025 12:14 PM, Zhang, Jesse(Jie) wrote:
>
> Hi Lijo,
> -----Original Message-----
> From: Lazar, Lijo <[email protected]>
> Sent: Friday, February 14, 2025 2:10 PM
> To: Zhang, Jesse(Jie) <[email protected]>; [email protected]
> Cc: Deucher, Alexander <[email protected]>; Kim, Jonathan
> <[email protected]>; Zhu, Jiadong <[email protected]>; Prosyak, Vitaly
> <[email protected]>
> Subject: Re: [PATCH 2/2] drm/amdgpu: Enable per-queue reset support
>
>
>
> On 2/14/2025 11:25 AM, [email protected] wrote:
>> From: "[email protected]" <[email protected]>
>>
>> This patch updates the SDMA v4.4.2 software initialization to enable
>> per-queue reset support when the MEC firmware version is 0xb0 or
>> higher and the PMFW supports SDMA reset.
>>
>> The following changes are included:
>> - Added a condition to check if the MEC firmware version is at least 0xb0
>>   and if the PMFW supports SDMA reset using `amdgpu_dpm_reset_sdma_is_supported`.
>> - If both conditions are met, the `AMDGPU_RESET_TYPE_PER_QUEUE` flag is set
>>   in `adev->sdma.supported_reset`.
>>
>> Suggested-by: Jonathan Kim <[email protected]>
>> Signed-off-by: Vitaly Prosyak <[email protected]>
>> Signed-off-by: Jesse Zhang <[email protected]>
>> ---
>> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> index b24a1ff5d743..e01d97b96655 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
>> @@ -1481,9 +1481,10 @@ static int sdma_v4_4_2_sw_init(struct amdgpu_ip_block *ip_block)
>> }
>> }
>>
>> - /* TODO: Add queue reset mask when FW fully supports it */
>> adev->sdma.supported_reset =
>> amdgpu_get_soft_full_reset_mask(&adev->sdma.instance[0].ring);
>> + if (adev->gfx.mec_fw_version >= 0xb0 && amdgpu_dpm_reset_sdma_is_supported(adev))
>> + adev->sdma.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
>
> This function is reused across multiple IP versions. MEC fw versions aren't
> the same across those IP versions.
>
> In fact, the user queue relies on both the MEC FW and the PMFW when an SDMA
> queue is reset, so we need to check both of them here to skip old MEC FW and
> PMFW versions.
>
To be clear: MEC FW >= 0xb0 has reset support for, say, GC 9.4.3. With GC 9.5.0,
MEC FW 0x20 may have the same support.
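
A minimal standalone sketch of what keying the MEC FW threshold on the GC IP
version could look like. This is an illustration only, not the actual driver
change: `struct fake_dev`, `GC_VER()` and `sdma_queue_reset_supported()` are
hypothetical names, the `pmfw_sdma_reset` field merely stands in for the result
of `amdgpu_dpm_reset_sdma_is_supported()`, and the 0x20 value for GC 9.5.0 is
just the "may have" example from above, not a confirmed minimum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packing of a GC IP version, for illustration only. */
#define GC_VER(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

/* Hypothetical stand-in for the fields the real check would read from adev. */
struct fake_dev {
	uint32_t gc_version;     /* GC IP version, packed with GC_VER() */
	uint32_t mec_fw_version; /* MEC firmware version */
	bool pmfw_sdma_reset;    /* stand-in for amdgpu_dpm_reset_sdma_is_supported() */
};

/* Returns true when per-queue SDMA reset could be advertised for this device. */
static bool sdma_queue_reset_supported(const struct fake_dev *dev)
{
	uint32_t min_mec_fw;

	assert(dev != NULL);

	/* The minimum MEC FW version depends on the GC IP version. */
	switch (dev->gc_version) {
	case GC_VER(9, 4, 3):
		min_mec_fw = 0xb0;
		break;
	case GC_VER(9, 5, 0):
		min_mec_fw = 0x20; /* assumed value, taken from the example above */
		break;
	default:
		/* Unknown IP version: do not advertise per-queue reset. */
		return false;
	}

	/* Both the MEC FW and the PMFW have to support the reset. */
	return dev->mec_fw_version >= min_mec_fw && dev->pmfw_sdma_reset;
}
```

With a check of this shape, an IP version that has no known threshold simply
does not get the per-queue reset flag, instead of inheriting the 0xb0 cutoff.
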
Thanks,
Lijo
> Thanks
> Jesse
>
> Thanks,
> Lijo
>
>>
>> if (amdgpu_sdma_ras_sw_init(adev)) {
>> dev_err(adev->dev, "fail to initialize sdma ras block\n");
>