On 27.03.2013 12:02, Christian König wrote:
On 26.03.2013 18:03, Michel Dänzer wrote:
On Tue, 2013-03-26 at 17:37 +0100, Christian König wrote:
On 26.03.2013 15:56, Michel Dänzer wrote:
On Tue, 2013-03-26 at 14:51 +0100, Christian König wrote:
From: Christian König <christian.koe...@amd.com>
Signed-off-by: Christian König <christian.koe...@amd.com>
[...]
diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h b/src/gallium/drivers/radeonsi/radeonsi_shader.h
index 9dae742..e09f297 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
@@ -111,13 +111,18 @@ struct si_shader {
unsigned nr_cbufs;
};
-struct si_shader_key {
- unsigned export_16bpc:8;
- unsigned nr_cbufs:4;
- unsigned color_two_side:1;
- unsigned alpha_func:3;
- unsigned flatshade:1;
- float alpha_ref;
+union si_shader_key {
+ struct {
+ unsigned export_16bpc:8;
+ unsigned nr_cbufs:4;
+ unsigned color_two_side:1;
+ unsigned alpha_func:3;
+ unsigned flatshade:1;
+ float alpha_ref;
+ } ps;
+ struct {
+ unsigned instance_divisors[PIPE_MAX_ATTRIBS];
+ } vs;
};
This grows the shader key from 8 to 128 bytes. I don't suppose the
instance divisors could be encoded in a more compact way? E.g. loading
the divisor values from constants and only tracking which elements use a
divisor in a bitmask in the key.
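
To make the sizes under discussion concrete, here is a minimal sketch in plain C. It assumes PIPE_MAX_ATTRIBS is 32 (inferred from the 128-byte figure above, not checked against the Gallium headers) and a typical ABI with 4-byte unsigned and float. The names ps_key, key_full_divisors, key_divisor_mask and instanced_mask_from_divisors are hypothetical and only illustrate the bitmask idea; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32 is inferred from the 128-byte figure in the thread
 * (128 / sizeof(unsigned)); check the Gallium headers for the real value. */
#define PIPE_MAX_ATTRIBS 32

/* Pixel-shader half of the key, fields copied from the patch. */
struct ps_key {
	unsigned export_16bpc:8;
	unsigned nr_cbufs:4;
	unsigned color_two_side:1;
	unsigned alpha_func:3;
	unsigned flatshade:1;
	float alpha_ref;
};

/* Key as proposed in the patch: one full divisor per vertex element. */
union key_full_divisors {
	struct ps_key ps;
	struct { unsigned instance_divisors[PIPE_MAX_ATTRIBS]; } vs;
};

/* Hypothetical alternative: the key only records WHICH elements are
 * instanced; the divisor values would be loaded from constants. */
union key_divisor_mask {
	struct ps_key ps;
	struct { uint32_t instanced_mask; } vs;
};

/* These hold on common ABIs (4-byte unsigned/float); not guaranteed by C. */
static_assert(sizeof(union key_full_divisors) == 128, "full key size");
static_assert(sizeof(union key_divisor_mask) == 8, "mask key size");

/* Build the bitmask: bit i is set when element i has a non-zero divisor
 * (a divisor of 0 means the element advances per vertex). */
static uint32_t instanced_mask_from_divisors(const unsigned *divisors,
                                             unsigned count)
{
	uint32_t mask = 0;
	for (unsigned i = 0; i < count && i < PIPE_MAX_ATTRIBS; i++)
		if (divisors[i] != 0)
			mask |= (uint32_t)1 << i;
	return mask;
}
```

With the mask variant the key stays at 8 bytes, at the price of the shader having to divide by a value it only learns at run time.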
Considered that also, and I have two problems with that approach:
1. While immediates are converted to shifts & muls, dividing even by a
constant in the shader isn't cheap.
Is that really significant? How much work would it be to come up with a
worst case test and measure the difference?
Well, I have no idea how to measure that on SI, but when I implemented
the same feature on R600, the difference between using reciprocal and
mul compared to mulhi was quite significant.
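
For readers unfamiliar with the trick being compared here, the following is a rough CPU-side sketch in C, not SI or R600 shader code. It shows how a division by a divisor known at compile time (3 is used as an arbitrary example) reduces to one multiply-high and a shift, whereas a divisor loaded at run time needs a generic division. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of a 32x32 -> 64 bit unsigned multiply ("mulhi"). */
static uint32_t mulhi_u32(uint32_t a, uint32_t b)
{
	return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* x / 3 with the divisor known at compile time:
 * 0xAAAAAAAB == (2^33 + 1) / 3, so (x * 0xAAAAAAAB) >> 33 == x / 3
 * for every 32-bit unsigned x. */
static uint32_t div_by_const_3(uint32_t x)
{
	return mulhi_u32(x, 0xAAAAAAABu) >> 1;
}

/* Divisor only known at run time (e.g. loaded from a constant buffer):
 * no precomputed magic number is available, so a real division or a
 * reciprocal-based sequence is required. */
static uint32_t div_by_runtime(uint32_t x, uint32_t divisor)
{
	return x / divisor;
}

/* Sanity check that both paths agree on a few instance IDs. */
static void check_div3(void)
{
	const uint32_t ids[] = { 0, 1, 2, 3, 7, 1000, 0xFFFFFFFFu };
	for (unsigned i = 0; i < sizeof(ids) / sizeof(ids[0]); i++)
		assert(div_by_const_3(ids[i]) == div_by_runtime(ids[i], 3));
}
```

How large the gap is on actual hardware is exactly the open question in the thread; the sketch only shows why baking the divisor into the shader variant makes the cheaper form possible.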
How about storing only a byte for the instance_divisor? That limits the
divisor to its value modulo 256, but I don't think that would be so
extremely bad.
I have no idea what the impact of that would be. What happens if an app
tries to use a divisor >= 256?
It probably would select the wrong shader :(
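
A small sketch of that failure mode, assuming 32 vertex elements and a key that is compared bytewise. The names vs_key_u8, make_key_u8 and keys_equal_u8 are hypothetical, not radeonsi code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ATTRIBS 32 /* assumption, matching the 32-byte figure below */

/* Hypothetical byte-sized variant of the vertex shader key. */
struct vs_key_u8 {
	uint8_t instance_divisors[MAX_ATTRIBS];
};

static_assert(sizeof(struct vs_key_u8) == 32, "byte key size");

/* Storing into uint8_t keeps only the divisor modulo 256. */
static struct vs_key_u8 make_key_u8(const unsigned *divisors)
{
	struct vs_key_u8 key;
	for (unsigned i = 0; i < MAX_ATTRIBS; i++)
		key.instance_divisors[i] = (uint8_t)divisors[i];
	return key;
}

/* Bytewise comparison, as a shader variant lookup might do. */
static int keys_equal_u8(const struct vs_key_u8 *a, const struct vs_key_u8 *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Divisors 1 and 257 both end up as 1 in the key, so the two states compare equal and the variant compiled for divisor 1 would be reused for divisor 257.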
That would reduce the key to 32 bytes instead.
Still seems kind of big.
OK, how about the following compromise: first, we use a short for the
instance divisor, which makes the key 32 bytes in size and should leave
enough room for larger instance divisors, and second, we don't copy the
key around so much any more.
Oops, I wanted to write 64 bytes in size, sorry.
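
As a rough illustration of this compromise, again assuming 32 vertex elements and using hypothetical names (vs_key_u16, vs_key_matches): a 16-bit divisor gives a 64-byte key, keeps divisors up to 65535 distinct, and the key can be handed to lookup code by pointer instead of by value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ATTRIBS 32 /* assumption: 64 bytes / sizeof(uint16_t) */

/* Hypothetical short-sized variant of the vertex shader key. */
struct vs_key_u16 {
	uint16_t instance_divisors[MAX_ATTRIBS];
};

static_assert(sizeof(struct vs_key_u16) == 64, "short key size");

/* Taking the keys by const pointer avoids copying 64 bytes per call. */
static int vs_key_matches(const struct vs_key_u16 *a,
                          const struct vs_key_u16 *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Divisors above 65535 would still be truncated, so the wrong-shader problem is pushed out rather than removed.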
Christian.
Regards,
Christian.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev