Ok, thanks!

From: Grodzovsky, Andrey
Sent: Thursday, February 01, 2018 12:59 AM
To: Yu, Xiangliang <[email protected]>; [email protected]; 
Deng, Emily <[email protected]>
Cc: Deucher, Alexander <[email protected]>; Koenig, Christian 
<[email protected]>; Wu, Haisheng <[email protected]>
Subject: Re: [PATCH] Revert "drm/amdgpu/gfx8: Fix compute ring failure after 
resetting"




On 01/25/2018 11:33 PM, Yu, Xiangliang wrote:

You can add amdgpu_sriov_vf() check to avoid breaking sriov.
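
For illustration only, the kind of guard suggested here might look like the toy model below. It is a standalone sketch, not driver code: `struct toy_adev`, `toy_sriov_vf()` and `toy_needs_late_ring_reset()` are hypothetical names that only mimic the shape of `adev->in_gpu_reset` and `amdgpu_sriov_vf()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two bits of device state that matter here. */
struct toy_adev {
	bool in_gpu_reset;  /* mirrors the adev->in_gpu_reset flag from the patch */
	bool is_sriov_vf;   /* assumption: true when running as an SR-IOV VF */
};

/* Toy equivalent of an "is this an SR-IOV virtual function?" check. */
static bool toy_sriov_vf(const struct toy_adev *adev)
{
	assert(adev != NULL);
	return adev->is_sriov_vf;
}

/*
 * Decide whether the late ring-buffer reset (the workaround the revert
 * removes from the KCQ test loop) would still run. With the extra guard it
 * only runs during a GPU reset on an SR-IOV VF, so bare metal is unaffected.
 */
static bool toy_needs_late_ring_reset(const struct toy_adev *adev)
{
	return adev->in_gpu_reset && toy_sriov_vf(adev);
}
```

With such a guard, a bare-metal reset would skip the workaround while an SR-IOV reset would keep it.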

+ Haisheng

As we found out after more debugging and discussion with Haisheng from the HW team, 
the sequence introduced by this change is wrong: it causes compute ring 
test failure because "the ring buffer has to be filled with valid packets (such 
as NOPs) first before submitting MAP_QUEUEs packet into KIQ. Once a compute 
engine is mapped, it will immediately execute the ring buffer if the RPTR is 
not equal to the WPTR from the MQD. It could lead to engine hang if the ring 
buffer filled with random data."
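
To make the ordering constraint concrete, here is a small standalone model of it. This is only a sketch under assumptions: `struct toy_ring`, `TOY_NOP`, `toy_prepare_ring()` and `toy_map_queue()` are made-up names, the NOP value is an arbitrary stand-in, and the real engine behaviour is far more involved than this loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_DWORDS 16u
#define TOY_NOP         0x80000000u  /* arbitrary stand-in for a valid NOP packet */

/* Hypothetical, heavily simplified compute ring. */
struct toy_ring {
	uint32_t buf[TOY_RING_DWORDS];
	uint32_t rptr;  /* read pointer, as restored from the MQD */
	uint32_t wptr;  /* write pointer */
};

/* Reset the write pointer and fill the whole ring with NOPs before mapping. */
static void toy_prepare_ring(struct toy_ring *ring)
{
	assert(ring != NULL);
	ring->wptr = 0;
	for (size_t i = 0; i < TOY_RING_DWORDS; i++)
		ring->buf[i] = TOY_NOP;
}

/*
 * Model of mapping the queue: the engine starts fetching as soon as
 * rptr != wptr. Returns false ("hang") on the first dword that is not a
 * valid packet, true once rptr has caught up with wptr.
 */
static bool toy_map_queue(struct toy_ring *ring)
{
	assert(ring != NULL);
	while (ring->rptr != ring->wptr) {
		if (ring->buf[ring->rptr] != TOY_NOP)
			return false;
		ring->rptr = (ring->rptr + 1u) % TOY_RING_DWORDS;
	}
	return true;
}
```

In this model, mapping a ring that still holds random data with rptr != wptr fails immediately, while calling `toy_prepare_ring()` first lets the engine run over NOPs until the pointers meet.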

Hence we would like to revert this change in amd-staging-drm-next and continue 
investigating on the SR-IOV side why the correct programming sequence doesn't 
work there. I am currently setting up an SR-IOV environment to take a 
look at that.

Thanks,
Andrey






-----Original Message-----

From: Grodzovsky, Andrey
Sent: Friday, January 26, 2018 11:29 AM
To: Yu, Xiangliang <[email protected]>; amd-[email protected]
Cc: Deucher, Alexander <[email protected]>; Koenig, Christian <[email protected]>
Subject: Re: [PATCH] Revert "drm/amdgpu/gfx8: Fix compute ring failure after resetting"



No, just bare metal. I assumed your problem was with the compute ring test
failure, which I didn't see. Can you please recheck whether the ring test still
fails on SR-IOV with this reverted?

If so, we obviously need to keep looking for a way to fix it.



Thanks,

Andrey



________________________________________

From: Yu, Xiangliang
Sent: 25 January 2018 20:59:45
To: Grodzovsky, Andrey; [email protected]
Cc: Deucher, Alexander; Grodzovsky, Andrey; Koenig, Christian
Subject: RE: [PATCH] Revert "drm/amdgpu/gfx8: Fix compute ring failure after resetting"



Did you test the reset case in SR-IOV?



-----Original Message-----

From: amd-gfx [mailto:[email protected]] On Behalf Of Andrey Grodzovsky
Sent: Friday, January 26, 2018 7:07 AM
To: [email protected]
Cc: Deucher, Alexander <[email protected]>; Grodzovsky, Andrey <[email protected]>; Yu, Xiangliang <[email protected]>; Koenig, Christian <[email protected]>
Subject: [PATCH] Revert "drm/amdgpu/gfx8: Fix compute ring failure after resetting"



This reverts commit 75737cb4eb78c7f185e4700b4aa20cf7a3381aca.



Fixes GFX ring test failure after HW reset.
No compute ring test failures were observed with the change reverted,
so it seems that whatever problem that change was addressing is no
longer present.



Signed-off-by: Andrey Grodzovsky <[email protected]>

---

 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 1207f36..8a65b53 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -4847,6 +4847,9 @@ static int gfx_v8_0_kcq_init_queue(struct amdgpu_ring *ring)
               /* reset MQD to a clean status */
               if (adev->gfx.mec.mqd_backup[mqd_idx])
                       memcpy(mqd, adev->gfx.mec.mqd_backup[mqd_idx], sizeof(struct vi_mqd_allocation));
+             /* reset ring buffer */
+             ring->wptr = 0;
+             amdgpu_ring_clear_ring(ring);
       } else {
               amdgpu_ring_clear_ring(ring);
       }
@@ -4921,13 +4924,6 @@ static int gfx_v8_0_kiq_resume(struct amdgpu_device *adev)
       /* Test KCQs */
       for (i = 0; i < adev->gfx.num_compute_rings; i++) {
               ring = &adev->gfx.compute_ring[i];
-             if (adev->in_gpu_reset) {
-                     /* move reset ring buffer to here to workaround
-                      * compute ring test failed
-                      */
-                     ring->wptr = 0;
-                     amdgpu_ring_clear_ring(ring);
-             }
               ring->ready = true;
               r = amdgpu_ring_test_ring(ring);
               if (r)
--
2.7.4



_______________________________________________

amd-gfx mailing list

[email protected]<mailto:[email protected]>

https://lists.freedesktop.org/mailman/listinfo/amd-gfx
