On 10/30/2012 08:28 PM, Eric Anholt wrote:
Here's a patch series to clean up the most glaring failures I think we have
left in FS code generation other than variable-indexed array access.
Unfortunately, I haven't found a particular testcase to show that it's a
performance improvement, but I still think it's a good idea since it does
remove instructions.

I was hoping this series would let me remove the badly-named
register_coalesce() pass, but it turns out that doing so increases shader-db
instruction count by a significant fraction of a percent.  A bit of a
surprise.

Patches 1-3 are:
Reviewed-by: Kenneth Graunke <[email protected]>

(I haven't looked at patch 4 yet, but I will soon.)

This also has another benefit: it cuts compilation time of L4D2's largest fragment shader from 10.2 to 4.3 seconds (a 57% reduction!).

We used to do 26 iterations through the brw_fs optimization loop; the first two did a bunch of optimizing, but on iterations 3-25 only register_coalesce() flagged any progress, which also meant recalculating live intervals every time. Absurdly expensive.

With your patch, we do exactly 3 iterations.
1: copy propagation, coalesce, coalesce2, compute->mrf
2: CSE, copy propagation
3: (nothing)

This is vastly more reasonable.
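
To illustrate why the iteration count matters so much, here is a minimal sketch of a run-until-no-progress pass loop. It is not the actual brw_fs code: Pass, LoopStats, run_to_fixed_point() and make_toy_pass() are hypothetical names, and the assumption that any pass reporting progress invalidates the live intervals is a simplification made for the example.

```cpp
// Illustrative sketch only; none of these names come from Mesa.
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Pass {
    std::string name;
    bool needs_live_intervals;   // assumption: pass depends on the costly analysis
    std::function<bool()> run;   // returns true if the pass changed the program
};

struct LoopStats {
    int iterations = 0;
    int live_interval_recalcs = 0;
};

// Sweep over all passes repeatedly until one full sweep makes no progress.
LoopStats run_to_fixed_point(const std::vector<Pass> &passes,
                             int max_iterations = 1000)
{
    LoopStats stats;
    bool progress = true;
    while (progress && stats.iterations < max_iterations) {
        progress = false;
        ++stats.iterations;
        bool intervals_valid = false;  // stale at the start of every sweep
        for (const Pass &p : passes) {
            if (p.needs_live_intervals && !intervals_valid) {
                ++stats.live_interval_recalcs;  // stands in for the expensive step
                intervals_valid = true;
            }
            if (p.run()) {
                progress = true;
                intervals_valid = false;  // any change invalidates the analysis
            }
        }
    }
    return stats;
}

// Toy pass that reports progress on its first n calls and then goes quiet.
Pass make_toy_pass(std::string name, int n, bool needs_live_intervals)
{
    return Pass{std::move(name), needs_live_intervals,
                [n]() mutable -> bool {
                    if (n > 0) {
                        --n;
                        return true;
                    }
                    return false;
                }};
}
```

In this toy model, a single pass that trickles out one bit of progress per sweep keeps the whole loop running, and forces the analysis to be recomputed, once per sweep; a pass that finishes its work early lets the loop terminate after a handful of sweeps.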

--Ken
