diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 97a8f786cf85..9352fcb77fe9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -812,6 +812,13 @@ void amdgpu_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
 int amdgpu_gfx_get_num_kcq(struct amdgpu_device *adev)
 {
 	if (amdgpu_num_kcq == -1) {
+		/* raven firmware currently can not load balance jobs
+		 * among multiple compute queues. Enable only one
+		 * compute queue till we have a firmware fix.
+		 */
+		if (adev->asic_type == CHIP_RAVEN)
+			return 1;
+
 		return 8;
 	} else if (amdgpu_num_kcq > 8 || amdgpu_num_kcq < 0) {
 		dev_warn(adev->dev, "set kernel compute queue number to 8 due to invalid parameter provided by user\n");
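The queue-count selection in the hunk above can be modelled outside the kernel as a minimal sketch. The `kcq_device` struct, the two-value `kcq_asic_type` enum, and `get_num_kcq()` are hypothetical stand-ins for illustration, not the driver's own definitions. The final branch (an in-range user value is returned unchanged) is an assumption, since the hunk is cut off before that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the real amdgpu types. */
enum kcq_asic_type { CHIP_OTHER, CHIP_RAVEN };

struct kcq_device {
    enum kcq_asic_type asic_type;
};

#define MAX_KCQ 8

/*
 * num_kcq_param models the amdgpu_num_kcq value from the hunk:
 * -1 means "no user preference, pick the default".
 */
static int get_num_kcq(const struct kcq_device *dev, int num_kcq_param)
{
    assert(dev != NULL);

    if (num_kcq_param == -1) {
        /* Raven: a single compute queue, as in the hunk. */
        if (dev->asic_type == CHIP_RAVEN)
            return 1;
        return MAX_KCQ;
    }

    if (num_kcq_param > MAX_KCQ || num_kcq_param < 0) {
        /* Out-of-range request: warn and fall back to the maximum. */
        fprintf(stderr, "invalid queue count %d, using %d\n",
                num_kcq_param, MAX_KCQ);
        return MAX_KCQ;
    }

    /* Assumption: an in-range user value is used as given. */
    return num_kcq_param;
}
```

In this model the Raven special case applies only to the default path, so an explicit user value would still take precedence on Raven.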
And I am glad to see that we now have a fix for this issue.
Nice work, Changfeng!
Best Regards,
Ray
*From:* Deucher, Alexander <[email protected]>
*Sent:* Wednesday, May 19, 2021 11:04 AM
*To:* Chen, Guchun <[email protected]>; Zhu, Changfeng
<[email protected]>; Alex Deucher <[email protected]>; Das,
Nirmoy <[email protected]>
*Cc:* Huang, Ray <[email protected]>; amd-gfx list
<[email protected]>
*Subject:* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
avoid compute hang
[Public]
I thought we had disabled all but one of the compute queues on raven
due to this issue or at least disabled the schedulers for the
additional queues, but maybe I'm misremembering.
Alex
------------------------------------------------------------------------
*From:* Chen, Guchun <[email protected]>
*Sent:* Tuesday, May 18, 2021 11:00 PM
*To:* Zhu, Changfeng <[email protected]>; Deucher, Alexander
<[email protected]>; Alex Deucher <[email protected]>; Das,
Nirmoy <[email protected]>
*Cc:* Huang, Ray <[email protected]>; amd-gfx list
<[email protected]>
*Subject:* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
avoid compute hang
[Public]
Nirmoy’s patch landed already if I understand correctly.
d41a39dda140 drm/scheduler: improve job distribution with multiple queues
Regards,
Guchun
*From:* amd-gfx <[email protected]> *On Behalf Of* Zhu,
Changfeng
*Sent:* Wednesday, May 19, 2021 10:56 AM
*To:* Deucher, Alexander <[email protected]>; Alex Deucher
<[email protected]>; Das, Nirmoy <[email protected]>
*Cc:* Huang, Ray <[email protected]>; amd-gfx list
<[email protected]>
*Subject:* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
avoid compute hang
[Public]
Hi Alex,
This is the issue exposed by Nirmoy's patch that provided better load
balancing across queues.
BR,
Changfeng.
*From:* Deucher, Alexander <[email protected]>
*Sent:* Wednesday, May 19, 2021 10:53 AM
*To:* Zhu, Changfeng <[email protected]>; Alex Deucher
<[email protected]>; Das, Nirmoy <[email protected]>
*Cc:* Huang, Ray <[email protected]>; amd-gfx list
<[email protected]>
*Subject:* Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
avoid compute hang
[Public]
+ Nirmoy
I thought we disabled all but one of the compute queues on raven due
to this issue. Maybe that patch never landed? Wasn't this the same
issue that was exposed by Nirmoy's patch that provided better load
balancing across queues?
Alex
------------------------------------------------------------------------
*From:* amd-gfx <[email protected]> on behalf of Zhu,
Changfeng <[email protected]>
*Sent:* Tuesday, May 18, 2021 10:28 PM
*To:* Alex Deucher <[email protected]>
*Cc:* Huang, Ray <[email protected]>; amd-gfx list
<[email protected]>
*Subject:* RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
avoid compute hang
[AMD Official Use Only - Internal Distribution Only]
Hi Alex.
I have submitted the patch: drm/amdgpu: disable 3DCGCG on
picasso/raven1 to avoid compute hang
Do you mean there is something else we need to do to re-enable the
extra compute queues?
BR,
Changfeng.
-----Original Message-----
From: Alex Deucher <[email protected]>
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng <[email protected]>
Cc: Huang, Ray <[email protected]>; amd-gfx list <[email protected]>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
avoid compute hang
Care to submit a patch to re-enable the extra compute queues?
Alex
On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng <[email protected]> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed that the additional compute queues can be enabled
> with this patch:
>
> [ 41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [ 41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [ 41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [ 41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [ 41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [ 41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [ 41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [ 41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -----Original Message-----
> From: Huang, Ray <[email protected]>
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher <[email protected]>; Zhu, Changfeng <[email protected]>
> Cc: amd-gfx list <[email protected]>
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
> avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM <[email protected]> wrote:
> > >
> > > From: changzhu <[email protected]>
> > >
> > > From: Changfeng <[email protected]>
> > >
> > > There is problem with 3DCGCG firmware and it will cause compute
> > > test hang on picasso/raven1. It needs to disable 3DCGCG in driver
> > > to avoid compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng <[email protected]>
> >
> > Reviewed-by: Alex Deucher <[email protected]>
> >
> > With this applied, can we re-enable the additional compute queues?
> >
>
> I think so.
>
> Changfeng, could you please confirm this on all raven series?
>
> Patch is Reviewed-by: Huang Rui <[email protected]>
>
> > Alex
> >
> > > ---
> > > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++++++---
> > > drivers/gpu/drm/amd/amdgpu/soc15.c | 2 --
> > > 2 files changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > index 22608c45f07c..feaa5e4a5538 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > >         amdgpu_gfx_rlc_enter_safe_mode(adev);
> > >
> > >         /* Enable 3D CGCG/CGLS */
> > > -       if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
> > > +       if (enable) {
> > >                 /* write cmd to clear cgcg/cgls ov */
> > >                 def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
> > >                 /* unset CGCG override */
> > > @@ -4959,8 +4959,12 @@ static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > >                 /* enable 3Dcgcg FSM(0x0000363f) */
> > >                 def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
> > >
> > > -               data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > > -                       RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > > +               if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
> > > +                       data = (0x36 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > > +                               RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > > +               else
> > > +                       data = 0x0 << RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
> > > +
> > >                 if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
> > >                         data |= (0x000F << RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
> > >                                 RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > > index 4b660b2d1c22..080e715799d4 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > > @@ -1393,7 +1393,6 @@ static int soc15_common_early_init(void *handle)
> > >                 adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
> > >                         AMD_CG_SUPPORT_GFX_MGLS |
> > >                         AMD_CG_SUPPORT_GFX_CP_LS |
> > > -                       AMD_CG_SUPPORT_GFX_3D_CGCG |
> > >                         AMD_CG_SUPPORT_GFX_3D_CGLS |
> > >                         AMD_CG_SUPPORT_GFX_CGCG |
> > >                         AMD_CG_SUPPORT_GFX_CGLS |
> > > @@ -1413,7 +1412,6 @@ static int soc15_common_early_init(void *handle)
> > >                         AMD_CG_SUPPORT_GFX_MGLS |
> > >                         AMD_CG_SUPPORT_GFX_RLC_LS |
> > >                         AMD_CG_SUPPORT_GFX_CP_LS |
> > > -                       AMD_CG_SUPPORT_GFX_3D_CGCG |
> > >                         AMD_CG_SUPPORT_GFX_3D_CGLS |
> > >                         AMD_CG_SUPPORT_GFX_CGCG |
> > >                         AMD_CG_SUPPORT_GFX_CGLS |
> > > --
> > > 2.17.1
> > >
> > > _______________________________________________
> > > amd-gfx mailing list
> > > [email protected]
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
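
The value that the gfx_v9_0 hunk of the quoted patch builds for RLC_CGCG_CGLS_CTRL_3D can be worked through in a small standalone sketch. The shift and mask values below are assumptions, chosen so that the fully enabled case reproduces the 0x0000363f mentioned in the patch comment; the driver takes the real values from its register headers. The `cg_device` struct, the `SUPPORT_3D_*` bits, and `cgcg_cgls_ctrl_3d_value()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field layout, consistent with the 0x0000363f comment in the patch. */
#define CGCG_GFX_IDLE_THRESHOLD_SHIFT   8
#define CGCG_EN_MASK                    0x00000001u
#define CGLS_REP_COMPANSAT_DELAY_SHIFT  2
#define CGLS_EN_MASK                    0x00000002u

/* Hypothetical feature bits standing in for AMD_CG_SUPPORT_GFX_3D_CGCG/CGLS. */
#define SUPPORT_3D_CGCG (1u << 0)
#define SUPPORT_3D_CGLS (1u << 1)

/* Hypothetical stand-in for the device; only cg_flags matters here. */
struct cg_device {
    uint32_t cg_flags;
};

/* Mirrors the post-patch logic: CGCG and CGLS fields are set independently. */
static uint32_t cgcg_cgls_ctrl_3d_value(const struct cg_device *dev)
{
    uint32_t data;

    assert(dev != NULL);

    if (dev->cg_flags & SUPPORT_3D_CGCG)
        data = (0x36u << CGCG_GFX_IDLE_THRESHOLD_SHIFT) | CGCG_EN_MASK;
    else
        data = 0; /* idle threshold 0, CGCG enable bit clear */

    if (dev->cg_flags & SUPPORT_3D_CGLS)
        data |= (0x000Fu << CGLS_REP_COMPANSAT_DELAY_SHIFT) | CGLS_EN_MASK;

    return data;
}
```

Under these assumed field positions, a device with both flags set gets 0x363f, while a device with only the 3D CGLS flag (the picasso/raven1 case after the soc15.c change) gets 0x3e, so CGLS stays enabled while CGCG is off.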