On 4/17/2024 12:05 AM, Ahmad Rehman wrote:
> In a passthrough environment, the driver triggers a mode-1 reset on
> reload. The reset triggers core dump collection, which runs as a delayed
> task and prevents the driver from unloading until it completes. Since we
> do not need to collect data in the "reset on reload" case, skip core dump
> collection.
> 
> Signed-off-by: Ahmad Rehman <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 1 +
>  3 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 1b2e177bc2d6..b4a41f075512 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5430,7 +5430,8 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
>  
>                               vram_lost = amdgpu_device_check_vram_lost(tmp_adev);
>  
> -                             amdgpu_coredump(tmp_adev, vram_lost, reset_context);
> +                             if (!test_bit(AMDGPU_SKIP_COREDUMP, &reset_context->flags))

In addition, check this flag earlier and skip the call to
amdgpu_reset_reg_dumps() when it is set.
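
To illustrate the idea, here is a rough userspace sketch (not kernel code,
and not tested against the driver). Everything in it is a stand-in:
struct reset_context, ctx_set_flag(), ctx_test_flag() and model_reset() are
hypothetical names that only mimic the set_bit()/test_bit() usage in the
patch, and the two counters stand in for the register dump and the coredump.
The point is that the flag is read once up front and gates both steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the enum in the patch: values are bit positions, not masks. */
enum reset_flags {
	NEED_FULL_RESET = 0,
	SKIP_HW_RESET = 1,
	SKIP_COREDUMP = 2,
};

/* Simplified, hypothetical stand-in for struct amdgpu_reset_context. */
struct reset_context {
	unsigned long flags;
	int reg_dumps_taken;	/* counts modelled register dumps */
	int coredumps_taken;	/* counts modelled coredumps */
};

/* Non-atomic userspace stand-in for the kernel's set_bit(). */
static void ctx_set_flag(struct reset_context *ctx, enum reset_flags bit)
{
	assert(ctx != NULL);
	ctx->flags |= 1UL << bit;
}

/* Non-atomic userspace stand-in for the kernel's test_bit(). */
static bool ctx_test_flag(const struct reset_context *ctx, enum reset_flags bit)
{
	assert(ctx != NULL);
	return (ctx->flags >> bit) & 1UL;
}

/* Models a reset path where one flag gates both dump steps. */
static void model_reset(struct reset_context *ctx)
{
	bool skip_dump = ctx_test_flag(ctx, SKIP_COREDUMP);

	/* Before the reset: register dump, skipped together with the coredump. */
	if (!skip_dump)
		ctx->reg_dumps_taken++;

	/* ... the reset itself would happen here ... */

	/* After the reset: coredump, as already gated in the patch. */
	if (!skip_dump)
		ctx->coredumps_taken++;
}
```

A caller on the reload path would set SKIP_COREDUMP before calling
model_reset() and end up with both counters at zero; a caller that leaves
the flag clear gets one register dump and one coredump.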

Thanks,
Lijo

> +                                     amdgpu_coredump(tmp_adev, vram_lost, reset_context);
>  
>                               if (vram_lost) {
>                                       DRM_INFO("VRAM is lost due to GPU reset!\n");
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index 6ea893ad9a36..c512f70b8272 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -2481,6 +2481,7 @@ static void amdgpu_drv_delayed_reset_work_handler(struct work_struct *work)
>  
>       /* Use a common context, just need to make sure full reset is done */
>       set_bit(AMDGPU_SKIP_HW_RESET, &reset_context.flags);
> +     set_bit(AMDGPU_SKIP_COREDUMP, &reset_context.flags);
>       r = amdgpu_do_asic_reset(&device_list, &reset_context);
>  
>       if (r) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> index 66125d43cf21..b11d190ece53 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> @@ -32,6 +32,7 @@ enum AMDGPU_RESET_FLAGS {
>  
>       AMDGPU_NEED_FULL_RESET = 0,
>       AMDGPU_SKIP_HW_RESET = 1,
> +     AMDGPU_SKIP_COREDUMP = 2,
>  };
>  
>  struct amdgpu_reset_context {
