[AMD Official Use Only - AMD Internal Distribution Only]

Signed-off-by: Tony Yi <[email protected]>

________________________________
From: Zhang, Hawking <[email protected]>
Sent: Wednesday, April 2, 2025 12:23 AM
To: Skvortsov, Victor <[email protected]>; [email protected] <[email protected]>
Cc: Luo, Zhigang <[email protected]>; Zhou1, Tao <[email protected]>; Zhao, Victor <[email protected]>; Yi, Tony <[email protected]>
Subject: RE: [PATCH] drm/amdgpu: Fix CPER error handling on VFs

[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Hawking Zhang <[email protected]>

Regards,
Hawking

-----Original Message-----
From: Skvortsov, Victor <[email protected]>
Sent: Wednesday, April 2, 2025 04:44
To: [email protected]
Cc: Luo, Zhigang <[email protected]>; Zhang, Hawking <[email protected]>; Zhou1, Tao <[email protected]>; Zhao, Victor <[email protected]>; Yi, Tony <[email protected]>; Skvortsov, Victor <[email protected]>
Subject: [PATCH] drm/amdgpu: Fix CPER error handling on VFs

From: Tony Yi <[email protected]>

CPER read will loop infinitely if an error is encountered and the more
bit is set. Add error checks to break upon failure.

Suggested-by: Tony Yi <[email protected]>
Signed-off-by: Victor Skvortsov <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 0bb8cbe0dcc0..8d2da3a27440 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -1378,14 +1378,16 @@ amdgpu_virt_write_cpers_to_ring(struct amdgpu_device *adev,
 	used_size = host_telemetry->header.used_size;
 
 	if (used_size > (AMD_SRIOV_RAS_TELEMETRY_SIZE_KB << 10))
-		return 0;
+		return -EINVAL;
 
 	cper_dump = kmemdup(&host_telemetry->body.cper_dump, used_size, GFP_KERNEL);
 	if (!cper_dump)
 		return -ENOMEM;
 
-	if (checksum != amd_sriov_msg_checksum(cper_dump, used_size, 0, 0))
+	if (checksum != amd_sriov_msg_checksum(cper_dump, used_size, 0, 0)) {
+		ret = -EINVAL;
 		goto out;
+	}
 
 	*more = cper_dump->more;
 
@@ -1434,7 +1436,7 @@ static int amdgpu_virt_req_ras_cper_dump_internal(struct amdgpu_device *adev)
 			adev, virt->fw_reserve.ras_telemetry, &more);
 		else
 			ret = 0;
-	} while (more);
+	} while (more && !ret);
 
 	return ret;
 }
-- 
2.34.1
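
For readers following the thread, the loop change can be illustrated with a small userspace sketch. This is not driver code: `struct chunk`, `read_chunk()` and `drain()` are hypothetical names invented for illustration, and the `i < n` bound exists only so the sketch cannot run past its test data. The point it tries to show is that a failed read leaves `more` at its previous value, so the loop condition also has to check `ret`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one telemetry read; not the driver's struct. */
struct chunk {
	int err;   /* 0 on success, negative errno on failure */
	bool more; /* producer reports that another chunk follows */
};

/*
 * Hypothetical reader. On failure it returns early and leaves *more
 * untouched, so the caller still sees the value from the previous read.
 */
static int read_chunk(const struct chunk *c, bool *more)
{
	assert(c != NULL && more != NULL);
	if (c->err)
		return c->err;
	*more = c->more;
	return 0;
}

/*
 * Read chunks until "more" is cleared or a read fails.
 * Stores the number of reads attempted in *reads and returns the
 * status of the last read.
 */
static int drain(const struct chunk *chunks, size_t n, size_t *reads)
{
	bool more = false;
	size_t i = 0;
	int ret;

	assert(chunks != NULL && reads != NULL && n > 0);
	do {
		ret = read_chunk(&chunks[i++], &more);
		/* Checking only "more" here would keep looping after an error. */
	} while (more && !ret && i < n);

	*reads = i;
	return ret;
}
```

With a sequence whose second read fails while `more` is still set from the first, `drain()` stops after two reads and returns the error instead of continuing.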
