On 10/30/2012 08:28 PM, Eric Anholt wrote:
> Here's a patch series to clean up the most glaring failures I think we have
> left in FS code generation other than variable-indexed array access.
> Unfortunately, I haven't found a particular testcase to show that it's a
> performance improvement, but I still think it's a good idea since it does
> remove instructions.
>
> I was hoping this series would let me remove the badly-named
> register_coalesce() pass, but it turns out that doing so increases shader-db
> instruction count by a significant fraction of a percent. A bit of a
> surprise.
Patches 1-3 are:
Reviewed-by: Kenneth Graunke <[email protected]>
(I haven't looked at patch 4 yet, but I will soon.)
This also has another benefit: it cuts compilation time of L4D2's
largest fragment shader from 10.2 to 4.3 seconds (a 57% reduction!).
We used to do 26 iterations through the brw_fs optimization loop; the
first two did a bunch of optimizing, but on iterations 3-25 only
register_coalesce() flagged any progress, which also meant
recalculating live intervals every time. Absurdly expensive.
With your patch, we do exactly 3 iterations.
  1: copy propagation, coalesce, coalesce2, compute->mrf
  2: CSE, copy propagation
  3: (nothing)
This is vastly more reasonable.
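
For anyone not familiar with how a progress-driven loop behaves, here is a toy sketch of the effect described above. It is Python, not the actual brw_fs code, and everything in it (run_to_fixed_point, the two coalesce functions, the "copies" counter) is hypothetical and only meant to show why a pass that makes a little progress each time forces many iterations.

```python
def run_to_fixed_point(passes, state, max_iterations=100):
    """Run every pass once per iteration until no pass reports progress.

    Returns a list of (iteration number, names of passes that made progress).
    """
    log = []
    for iteration in range(1, max_iterations + 1):
        # Each pass returns True if it changed anything (assumed convention).
        made_progress = [name for name, fn in passes if fn(state)]
        log.append((iteration, made_progress))
        if not made_progress:
            break  # fixed point reached: a full iteration changed nothing
    return log


def slow_coalesce(state):
    # Hypothetical pass: removes only one redundant copy per call.
    if state["copies"] > 0:
        state["copies"] -= 1
        return True
    return False


def fast_coalesce(state):
    # Hypothetical pass: removes every redundant copy in a single call.
    if state["copies"] > 0:
        state["copies"] = 0
        return True
    return False


slow_log = run_to_fixed_point([("coalesce", slow_coalesce)], {"copies": 24})
fast_log = run_to_fixed_point([("coalesce", fast_coalesce)], {"copies": 24})
print(len(slow_log), len(fast_log))
```

In the toy version the slow pass needs one iteration per copy plus a final iteration that finds nothing to do. Any per-iteration analysis (live intervals, in our case) would be redone on every one of those iterations, so cutting the iteration count is where the time goes away.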
--Ken