This series reworks how we do integer multiplication in the i965/fs backend and significantly improves code generation for Broadwell's scalar vertex shaders with NIR by allowing constant propagation into the MUL instruction (wow that code was stupid, and it still kind of is!).
Before this series, Jason said enabling NIR for scalar vertex shaders had this result:

total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
instructions in affected programs:     1860859 -> 1848166 (-0.68%)
helped:                                4387
HURT:                                  4758

After this series, the results are:

Broadwell, vertex shaders only, with and without NIR:

total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
instructions in affected programs:     1514770 -> 1454047 (-4.01%)
helped:                                5813
HURT:                                  1120

Along the way, I move the point at which we split integer multiplication into multiple instructions (on Gen < 8), implement SIMD16 support, and reimplement integer multiplication on Gen < 8 without using the accumulator.

Patches 3 and 4 add SIMD16 support (for Haswell only, since pre-Gen7 already works, and IVB/BYT have a bug that prevents it from working). I don't particularly care if those patches go in separately, or if 6/6 is squashed with 4/6, but I think they are instructive -- I'll follow up with a piglit test that demonstrates that setting the wrong quarter control indeed generates incorrect results.

I did not touch the imul_high opcode, though it could be lowered in a similar way (and probably in 3 instructions using the same tricks in 6/6). That would be a nice follow-on project, and it would mean we'd never see a MACH instruction during optimizations.
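For readers unfamiliar with why a 32-bit multiply needs splitting on Gen < 8 at all, here is a rough C sketch of the arithmetic behind an accumulator-free lowering. It is an illustration, not the code from the patches: it assumes the hardware multiplier only consumes the low 16 bits of one operand, and the names mul_32x16 and mul_32x32_low are made up for this example.

```c
#include <stdint.h>

/* Hypothetical model of a multiplier that only reads the low 16 bits of
 * its second operand (an assumption about Gen < 8 MUL, not taken from
 * the patches). The result wraps modulo 2^32. */
static uint32_t mul_32x16(uint32_t a, uint16_t b)
{
    return a * (uint32_t)b;
}

/* Low 32 bits of a * b built from two 32x16 multiplies, a shift and an
 * add. Since b = b_lo + (b_hi << 16), we have
 * a * b = a * b_lo + ((a * b_hi) << 16)   (mod 2^32). */
static uint32_t mul_32x32_low(uint32_t a, uint32_t b)
{
    uint32_t low  = mul_32x16(a, (uint16_t)(b & 0xffff));
    uint32_t high = mul_32x16(a, (uint16_t)(b >> 16));
    return low + (high << 16);
}
```

The low 32 bits of a product are the same for signed and unsigned operands, so the same decomposition covers both cases. Note that nothing here needs a wider intermediate, which is what makes it possible to avoid the accumulator in this model.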
