amdgpu:fix gart table vram pin

Liu, Monk Mon, 06 Feb 2017 07:56:15 -0800

I recall why I made this patch

When testing the SRIOV GPU reset feature, the reset always hangs waiting and 
never returns without this patch. After looking into it further:


gpu_srio_reset (I will send a patch for this routine later) doesn't call 
amdgpu_suspend(), so the GART table BO never gets unpinned, which leads to an 
infinite wait loop in the driver if we pin it again in resume.
 
In the bare-metal case, gpu_reset calls amdgpu_suspend(), so the GART BO does 
get unpinned.
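
The pairing described above can be illustrated with a toy model. This is only 
a sketch: struct toy_gart and every toy_* function are hypothetical stand-ins, 
not the amdgpu API, and the counter models only the missing unpin, not the 
actual wait the driver gets stuck in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the GART table state (not struct amdgpu_device). */
struct toy_gart {
    uint64_t table_addr; /* non-zero while the table is pinned */
    int pin_count;       /* outstanding pins on the table BO */
};

/* Pin the table: take one more pin and publish a (made-up) address. */
void toy_pin(struct toy_gart *g)
{
    g->pin_count++;
    g->table_addr = 0x100000;
}

/* Unpin the table: must match an earlier pin. */
void toy_unpin(struct toy_gart *g)
{
    assert(g->pin_count > 0);
    g->pin_count--;
    if (g->pin_count == 0)
        g->table_addr = 0;
}

/* Bare-metal style reset: suspend unpins, then resume pins again. */
int toy_bare_metal_reset(struct toy_gart *g)
{
    toy_unpin(g); /* suspend half */
    toy_pin(g);   /* resume half */
    return g->pin_count;
}

/* SRIOV style reset as described above: no suspend, so resume pins
 * a table that is still pinned and the pins go out of balance. */
int toy_sriov_reset(struct toy_gart *g)
{
    toy_pin(g); /* resume half only */
    return g->pin_count;
}
```

Starting from one pin, toy_bare_metal_reset() leaves the count at one, while 
toy_sriov_reset() raises it to two because nothing released the first pin.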

BTW:
GPU_SRIOV_RESET is invoked after the hypervisor issues a VF_FLR on this VF 
device, so the suspend routines of the IP blocks are not needed at all.

What about:
>> +    if (adev->gart.table_addr && amdgpu_sriov_vf(adev)) {
>> +            /* it's a resume call, gart already pin */
>> +            return 0;
>> +    }
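
As a rough illustration of what this narrower check would do, here is a 
self-contained sketch. struct toy_adev, its fields and toy_gart_table_vram_pin() 
are hypothetical stand-ins for the driver structures, and the address is made 
up; the real function also reserves and pins the BO, which is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state the check looks at. */
struct toy_adev {
    uint64_t gart_table_addr; /* non-zero once the table has been pinned */
    bool is_sriov_vf;         /* stand-in for amdgpu_sriov_vf(adev) */
    int pin_calls;            /* how often the real pin path was taken */
};

/* Sketch of the pin entry point with the proposed early return. */
int toy_gart_table_vram_pin(struct toy_adev *adev)
{
    /* Only a VF with an already pinned table skips the pin logic. */
    if (adev->gart_table_addr && adev->is_sriov_vf)
        return 0;

    /* Everything else, including bare metal, takes the normal path. */
    adev->pin_calls++;
    adev->gart_table_addr = 0x100000;
    return 0;
}
```

With this shape, a bare-metal device whose table is still pinned falls through 
to the normal path instead of returning early, so a missing unpin in suspend is 
not silently skipped there.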


BR Monk


-----Original Message-----
From: Christian König [mailto:[email protected]] 
Sent: Monday, February 06, 2017 10:31 PM
To: Liu, Monk <[email protected]>; [email protected]
Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin

Hui? We shouldn't need to call this function from a GPU reset, do we really do 
so?

But even if we call it from GPU reset we certainly should have called the 
matching unpin function before.

Otherwise we certainly won't be able to resume from the next suspend after a 
GPU reset.

Regards,
Christian.

Am 06.02.2017 um 15:25 schrieb Liu, Monk:
> Emmmm, looks like I missed the S3 part of this function.
>
> But if this is called from a GPU reset, we also shouldn't go on to run this 
> function, otherwise the GPU reset will fail (SRIOV reset test).
>
> BR Monk
>
> -----Original Message-----
> From: Christian König [mailto:[email protected]]
> Sent: Monday, February 06, 2017 4:14 PM
> To: Liu, Monk <[email protected]>; [email protected]
> Subject: Re: [PATCH 07/21] drm/amdgpu:fix gart table vram pin
>
> A big NAK on this! amdgpu_gart_table_vram_unpin() must be called during 
> suspend.
>
> Otherwise the GART table can be corrupted and we run into a whole bunch of 
> problems.
>
> We could add a "BUG_ON(adev->gart.table_addr != NULL);" here to double check 
> that, but just ignoring that something went horribly wrong is clearly the 
> wrong approach.
>
> Regards,
> Christian.
>
> Am 04.02.2017 um 11:34 schrieb Monk Liu:
>> if this call is from resume, shouldn't enter pin logic at all
>>
>> Change-Id: I40a5cdc2a716c4c20d2812fd74ece4ea284b6765
>> Signed-off-by: Monk Liu <[email protected]>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 5 +++++
>>    1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> index 964d2a9..5e907f7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
>> @@ -151,6 +151,11 @@ int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev)
>>      uint64_t gpu_addr;
>>      int r;
>>    
>> +    if (adev->gart.table_addr) {
>> +            /* it's a resume call, gart already pin */
>> +            return 0;
>> +    }
>> +
>>      r = amdgpu_bo_reserve(adev->gart.robj, false);
>>      if (unlikely(r != 0))
>>              return r;
>
> _______________________________________________
> amd-gfx mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


