On Wed, Dec 7, 2022 at 11:35 PM Zhang, Yifan <[email protected]> wrote:
>
> [AMD Official Use Only - General]
>
> We encountered some issues in recent APUs when tried to pin a large
> framebuffer (e.g. 64MB w/ dual 4K display), switch to display SG could
> resolve such issue. Actually we received various kinds of VRAM shortage
> issues recently, there is more and more pressure on APU 512MB VRAM as FWs
> reserve more memory, buddy system in 5.19 creates more fragment and multiple
> 4k display scenario is used more often.. Since there is no difference b/w
> access VRAM and System memory in APUs from HW perspective, I think we can
> switch some of framebuffers to system memory to mitigate VRAM pressure.
>
> [ 52.798705] [TTM] Failed to find memory space for buffer
> 0x00000000833a4c59 eviction
> [ 52.798707] [TTM] No space for 00000000833a4c59 (16470 pages, 65880K, 64M)
> [ 52.798788] amdgpu 0000:e2:00.0: amdgpu: 000000003dbf313e pin failed
> [ 52.798790] [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to
> pin framebuffer with error -12
But from the patch:
- if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
+ if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
+ ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type ==
CHIP_STONEY))) {
domain = AMDGPU_GEM_DOMAIN_VRAM;
if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
domain = AMDGPU_GEM_DOMAIN_GTT;
AMDGPU_SG_THRESHOLD is only used in this one place which only applies
to CZ and ST. There are not likely to be new CZ or ST boards any time
soon so I don't think it matters for newer APUs if we change
AMDGPU_SG_THRESHOLD.
Does the patch fix the issues you are seeing?
Alex
>
> Best Regards,
> Yifan
>
> -----Original Message-----
> From: Alex Deucher <[email protected]>
> Sent: Thursday, December 8, 2022 12:19 PM
> To: Zhang, Yifan <[email protected]>
> Cc: Christian König <[email protected]>; Zhang, Jesse(Jie)
> <[email protected]>; amd-gfx <[email protected]>;
> Paneer Selvam, Arunpravin <[email protected]>;
> [email protected]; Deucher, Alexander
> <[email protected]>; Koenig, Christian <[email protected]>
> Subject: Re: [PATCH] drm/amdgpu: try allowed domain when pin framebuffer
> failed
>
> On Wed, Dec 7, 2022 at 11:10 PM Zhang, Yifan <[email protected]> wrote:
> >
> > [AMD Official Use Only - General]
> >
> > Hi Alex,
> >
> > We need to adjust the AMDGPU_SG_THRESHOLD as well since recent APUs are
> > configured w/ 512MB VRAM. Pls check attached patch.
>
> Why do we need to increase this threshold? The condition only applies to CZ
> and ST, not more recent APUs.
>
> Alex
>
> >
> > Best Regards,
> > Yifan
> >
> > -----Original Message-----
> > From: Alex Deucher <[email protected]>
> > Sent: Thursday, December 8, 2022 12:21 AM
> > To: Christian König <[email protected]>
> > Cc: Zhang, Jesse(Jie) <[email protected]>; Zhang, Yifan
> > <[email protected]>; amd-gfx
> > <[email protected]>; Paneer Selvam, Arunpravin
> > <[email protected]>; [email protected];
> > Deucher, Alexander <[email protected]>; Koenig, Christian
> > <[email protected]>
> > Subject: Re: [PATCH] drm/amdgpu: try allowed domain when pin
> > framebuffer failed
> >
> > On Wed, Dec 7, 2022 at 11:10 AM Christian König
> > <[email protected]> wrote:
> > >
> > >
> > >
> > > Am 07.12.22 um 17:08 schrieb Alex Deucher:
> > > > On Wed, Dec 7, 2022 at 10:52 AM Christian König
> > > > <[email protected]> wrote:
> > > >> Am 07.12.22 um 16:38 schrieb Alex Deucher:
> > > >>> On Wed, Dec 7, 2022 at 10:23 AM Christian König
> > > >>> <[email protected]> wrote:
> > > >>>> I would go a step further and just allow GTT domain on ASICs !=
> > > >>>> CARRIZO
> > > >>>> | STONEY.
> > > >>>>
> > > >>>> I can't see a good reason we should still have any limitation
> > > >>>> here, VRAM doesn't have any advantage any more as far as I know.
> > > >>> Well, if VRAM is available we want to make sure someone uses it
> > > >>> otherwise it's just wasted.
> > > >> Well it still gets used when it's free. So now problem at all here.
> > > >>
> > > >> We should just not force anything into VRAM or GTT any more if
> > > >> it's technically not necessary.
> > > > So just this?
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > index 919bbea2e3ac..8e8f07fa7a93 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > > @@ -1506,7 +1506,7 @@ u64 amdgpu_bo_gpu_offset_no_check(struct
> > > > amdgpu_bo *bo)
> > > > uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> > > > uint32_t domain)
> > > > {
> > > > - if (domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT))
> > > > {
> > >
> > > We still need to keep this check to avoid trying to adjust VRAM only
> > > allocations (the cursor still needs this IIRC).
> > >
> > > Apart from that I think that should work.
> >
> > Attached. Thanks,
> >
> > Alex
> >
> > >
> > > Christian.
> > >
> > > > + if ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type
> > > > + ==
> > > > CHIP_STONEY)) {
> > > > domain = AMDGPU_GEM_DOMAIN_VRAM;
> > > > if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> > > > domain = AMDGPU_GEM_DOMAIN_GTT;
> > > >
> > > >
> > > >
> > > >> Christian.
> > > >>
> > > >>> Alex
> > > >>>
> > > >>>
> > > >>>> Christian.
> > > >>>>
> > > >>>> Am 07.12.22 um 16:10 schrieb Alex Deucher:
> > > >>>>> Does this patch fix the problem?
> > > >>>>>
> > > >>>>> Alex
> > > >>>>>
> > > >>>>> On Wed, Dec 7, 2022 at 2:27 AM Zhang, Jesse(Jie)
> > > >>>>> <[email protected]> wrote:
> > > >>>>>> [AMD Official Use Only - General]
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> drm/amdgpu: try allowed domain when pin framebuffer failed.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> [WHY&HOW]
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> in some scenarios, the allocate memory often failed. such
> > > >>>>>> as do hot plug or play games.
> > > >>>>>>
> > > >>>>>> so we can try allowed domain, if the preferred domain
> > > >>>>>> cannot allocate memory.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Signed-off-by: jie1zhan [email protected]
> > > >>>>>>
> > > >>>>>> Change-Id: I4b62e2ff072d02c515f901000a5789339d481273
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > >>>>>>
> > > >>>>>> index 1ae0c8723348..05fcaf7f9d92 100644
> > > >>>>>>
> > > >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > >>>>>>
> > > >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > > >>>>>>
> > > >>>>>> @@ -39,6 +39,7 @@
> > > >>>>>>
> > > >>>>>> #include "amdgpu.h"
> > > >>>>>>
> > > >>>>>> #include "amdgpu_trace.h"
> > > >>>>>>
> > > >>>>>> #include "amdgpu_amdkfd.h"
> > > >>>>>>
> > > >>>>>> +#include "amdgpu_display.h"
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> /**
> > > >>>>>>
> > > >>>>>> * DOC: amdgpu_object
> > > >>>>>>
> > > >>>>>> @@ -942,8 +943,14 @@ int amdgpu_bo_pin_restricted(struct
> > > >>>>>> amdgpu_bo *bo, u32 domain,
> > > >>>>>>
> > > >>>>>> bo->placements[i].lpfn = lpfn;
> > > >>>>>>
> > > >>>>>> }
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> + retry:
> > > >>>>>>
> > > >>>>>> r = ttm_bo_validate(&bo->tbo, &bo->placement,
> > > >>>>>> &ctx);
> > > >>>>>>
> > > >>>>>> if (unlikely(r)) {
> > > >>>>>>
> > > >>>>>> + //try allowed domain when pin failed. just a
> > > >>>>>> workaround.
> > > >>>>>>
> > > >>>>>> + if (unlikely(r == -ENOMEM) && domain !=
> > > >>>>>> + bo->allowed_domains) {
> > > >>>>>>
> > > >>>>>> + amdgpu_bo_placement_from_domain(bo,
> > > >>>>>> + bo->allowed_domains);
> > > >>>>>>
> > > >>>>>> + goto retry;
> > > >>>>>>
> > > >>>>>> + }
> > > >>>>>>
> > > >>>>>> dev_err(adev->dev, "%p pin failed\n", bo);
> > > >>>>>>
> > > >>>>>> goto error;
> > > >>>>>>
> > > >>>>>> }
> > >