On 2018-08-21 01:54 PM, Marek Olšák wrote:
Note that WAVES_PER_SH should be 0x3ff on the compute ring for the
ring priorities to be applied.

Correct, we would need to set WAVES_PER_SH during pipeline creation.
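A minimal sketch of what that opt-in could look like on the radv side, assuming radv's existing radeon_set_sh_reg() helper and the R_00B854_*/S_00B854_* macros from sid.h (the wrapper name below is made up, and where exactly this gets emitted during pipeline creation is still to be decided):

/* Sketch only: opt the compute ring in to ring priorities by programming
 * WAVES_PER_SH to 0x3ff, per Marek's note above. The function name is
 * hypothetical; radeon_set_sh_reg() and the register macros are radv's. */
static void
radv_opt_in_compute_ring_priorities(struct radeon_cmdbuf *cs)
{
        radeon_set_sh_reg(cs, R_00B854_COMPUTE_RESOURCE_LIMITS,
                          S_00B854_WAVES_PER_SH(0x3ff));
}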

I don't know if you need to do the same
thing for the gfx ring. You can ask Andres for more info.

The gfx side seems to be okay; it looks like I only borked the compute side.

This was probably still working during testing because we have two mechanisms for controlling scheduling priority at the hardware level. The first is based on CP_HQD_*_PRIORITY and SPI_ARB_PRIORITY.PIPE_ORDER*.

This approach has the advantage of not requiring driver opt-in, and it dynamically adjusts the resource split between different queues. However, it does not offer low latency when high-priority compute initially pre-empts a gfx workload.
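As a rough illustration of mechanism #1 (a made-up sketch, not the actual amdgpu kernel code; the numeric encodings are placeholders), the kernel essentially picks per-queue priority values when it initializes the HQD and then lets the hardware arbitrate dynamically:

#include <stdint.h>

/* Illustrative only: map a context priority to the values that would be
 * written into CP_HQD_PIPE_PRIORITY / CP_HQD_QUEUE_PRIORITY. The real
 * encodings live in the hardware headers. */
enum ctx_priority { CTX_PRIORITY_LOW, CTX_PRIORITY_NORMAL, CTX_PRIORITY_HIGH };

struct hqd_priority {
        uint32_t pipe_priority;  /* CP_HQD_PIPE_PRIORITY */
        uint32_t queue_priority; /* CP_HQD_QUEUE_PRIORITY */
};

static struct hqd_priority
hqd_priority_for(enum ctx_priority prio)
{
        switch (prio) {
        case CTX_PRIORITY_HIGH:
                return (struct hqd_priority){ .pipe_priority = 2, .queue_priority = 15 };
        case CTX_PRIORITY_LOW:
                return (struct hqd_priority){ .pipe_priority = 0, .queue_priority = 1 };
        default:
                return (struct hqd_priority){ .pipe_priority = 1, .queue_priority = 7 };
        }
}

Once those are programmed, SPI_ARB_PRIORITY.PIPE_ORDER* decides the arbitration order dynamically, which is why no driver opt-in is needed for this mechanism.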

The second approach uses SPI_SHADER_PGM_RSRC3_* and WAVES_PER_SH to indicate driver opt-in. When the kernel scheduler then detects that there is high-priority work, it creates a static resource split between all clients using SPI_WCL_PIPE_PERCENT_*. Drivers that did not opt in are not forced by the hardware to abide by the resource split.

This approach has the disadvantage of possibly leaving some resources idle, since the split is static. However, it does provide good latency for a high-priority compute job pre-empting a heavy graphics workload.
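Conceptually, the static split ends up looking something like the sketch below once high-priority compute shows up (made-up code, not the actual kernel scheduler; reg_write(), the register offsets and the percentage encoding are all placeholders):

#include <stdbool.h>
#include <stdint.h>

/* Placeholder offsets and MMIO write; illustrative only. */
#define SPI_WCL_PIPE_PERCENT_GFX 0x0
#define SPI_WCL_PIPE_PERCENT_CS0 0x1

static void
reg_write(uint32_t reg, uint32_t value)
{
        /* Stands in for the real register write in the kernel driver. */
        (void)reg;
        (void)value;
}

/* With a high-priority compute client present, carve the wave-launch budget
 * up statically between gfx and the compute pipe; otherwise give every pipe
 * the full budget so nothing sits idle because of the split. Queues that did
 * not opt in (no WAVES_PER_SH / SPI_SHADER_PGM_RSRC3_* opt-in) are not held
 * to this split by the hardware. */
static void
apply_static_split(bool high_priority_compute_present)
{
        if (high_priority_compute_present) {
                reg_write(SPI_WCL_PIPE_PERCENT_GFX, 10);
                reg_write(SPI_WCL_PIPE_PERCENT_CS0, 90);
        } else {
                reg_write(SPI_WCL_PIPE_PERCENT_GFX, 100);
                reg_write(SPI_WCL_PIPE_PERCENT_CS0, 100);
        }
}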

So even though we didn't opt in to compute resource splits, we still had the first approach to fall back on.

This raises the question: should compute drivers really opt in to static resource splits? Given that the first approach does a better job of managing compute/compute scheduling priorities, do we want to take on the disadvantages of approach #2?

I haven't run into any good real-world examples where this has come into play, since game usage of multiple compute queues has been rare. But it may be something worth keeping in mind and revisiting in the future.

Regards,
Andres


Marek
On Tue, Aug 14, 2018 at 12:10 PM Samuel Pitoiset
<[email protected]> wrote:

The last parameter of radeon_set_sh_reg_seq() is the number of
dwords to emit. We were lucky because WAVES_PER_SH(0x3) happens to
evaluate to 3 (the intended dword count), but COMPUTE_RESOURCE_LIMITS
itself ended up initialized to 0.

COMPUTE_RESOURCE_LIMITS is correctly set when generating
compute pipelines, so we don't need to initialize it.

Signed-off-by: Samuel Pitoiset <[email protected]>
---
  src/amd/vulkan/si_cmd_buffer.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c
index 2337036c67..2cfa7f4c2c 100644
--- a/src/amd/vulkan/si_cmd_buffer.c
+++ b/src/amd/vulkan/si_cmd_buffer.c
@@ -88,9 +88,7 @@ si_emit_compute(struct radv_physical_device *physical_device,
         radeon_emit(cs, 0);
         radeon_emit(cs, 0);

-       radeon_set_sh_reg_seq(cs, R_00B854_COMPUTE_RESOURCE_LIMITS,
-                             S_00B854_WAVES_PER_SH(0x3));
-       radeon_emit(cs, 0);
+       radeon_set_sh_reg_seq(cs, R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0, 2);
         /* R_00B858_COMPUTE_STATIC_THREAD_MGMT_SE0 / SE1 */
        radeon_emit(cs, S_00B858_SH0_CU_EN(0xffff) | S_00B858_SH1_CU_EN(0xffff));
        radeon_emit(cs, S_00B85C_SH0_CU_EN(0xffff) | S_00B85C_SH1_CU_EN(0xffff));
--
2.18.0
