On 27.06.2017 18:59, Marek Olšák wrote:
On Tue, Jun 27, 2017 at 6:50 PM, Nicolai Hähnle <[email protected]> wrote:
On 27.06.2017 17:07, Marek Olšák wrote:
On Tue, Jun 27, 2017 at 9:22 AM, Nicolai Hähnle <[email protected]>
wrote:
On 27.06.2017 02:14, Marek Olšák wrote:
From: Marek Olšák <[email protected]>
Shader key size: 107 -> 47
Nice improvement.
Divisors of 0 and 1 are encoded in the shader key. Greater instance
divisors
are loaded from a constant buffer.
The shader code doing the division is huge. Is it something we need to
worry about? Does any app use instance divisors >= 2?
This reminds me of a certain LLVM improvement that I still need to clear.
I doubt instance divisors >= 2 are used. As a data point, Vulkan doesn't
support it as a feature at all, IIRC.
Can we get an optimized monotholic shader variant built for shaders that
have to fetch? This should help if anybody ever triggers this, because
We can't get optimized variants if we want to keep the shader key
small. If I put all instance divisors into key.opt, it would defeat
the effect of this patch.
What I meant is an optimized variant that still loads the instance divisors
from the constant buffer, but compiles the prolog and main parts together.
That way, LLVM can potentially schedule some of the division code before
waiting for the loads of other attributes that are per-vertex. That should
only require a single bit in the key.opt part.
Like this?
diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c
b/src/gallium/drivers/radeonsi/si_state_shaders.c
index 63cc746..af3f2a9 100644
--- a/src/gallium/drivers/radeonsi/si_state_shaders.c
+++ b/src/gallium/drivers/radeonsi/si_state_shaders.c
@@ -1192,6 +1192,11 @@ static void si_shader_selector_key_vs(struct
si_context *sctx,
prolog_key->instance_divisor_is_fetched =
sctx->vertex_elements->instance_divisor_is_fetched;
+ /* Prefer a monolithic shader to allow scheduling divisions around
+ * VBO loads. */
+ if (prolog_key->instance_divisor_is_fetched)
+ key->opt.prefer_mono = 1;
+
unsigned count = MIN2(vs->info.num_inputs,
sctx->vertex_elements->count);
memcpy(key->mono.vs_fix_fetch, sctx->vertex_elements->fix_fetch,
count);
Yes, that looks good. R-b with this as well, thanks!
Marek
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev