On Friday, March 13, 2015 04:40:17 PM Matt Turner wrote: > On Fri, Mar 13, 2015 at 4:16 PM, Kenneth Graunke <[email protected]> > wrote: > > Commit 36bc5f06dd22 began allowing immediates in MAD and LRP sources, > > in any position. One unforeseen consequence is that opt_algebraic began > > creating ADD and MUL instructions with src0 immediates. > > > > For example, > > > > mad(8) vgrf86:F, 1.000000f, 1.000000f, vgrf84:F > > > > would be optimized into: > > > > add(8) vgrf86:F, 1.000000f, vgrf84:F > > > > which is illegal. > > > > Fixes assert failures in steam/tiny-and-big-grandpas-leftovers/fp-19. > > > > Signed-off-by: Kenneth Graunke <[email protected]> > > --- > > We've talked about wanting to remove some hardware restrictions from > the IR like this, and I think if we have a good reason to do that then > this is a good way to do it. > > But currently the only source of this problem comes from 13 lines of > code in opt_algebraic. Fixing the problem at the source makes a lot of > sense to me. What advantage of this series do you see over that?
Sure. You could just commute things in opt_algebraic, and that would probably solve the problem we have today. It's also simple, which is nice, especially for backporting. Feel free to go with that solution if you prefer it. The fact that we restrict immediates to src1 on most instructions, but allow them anywhere on MAD and LRP is a bit odd - it's inconsistent, neither matching the hardware limitations, nor offering a clean abstraction so you don't have to think about those limitations. This seemed like a solution to the problem in general - immediates in src0 must be promoted. Let opt_algebraic and any other optimization passes create them, then promote later if we have to. Commute where possible to avoid having to promote. Your combine constants pass seemed like the ideal place to do that, since it already has most of the infrastructure and does a better job than naively emitting MOVs. The one downside is that it currently only handles floats - which is OK since MAD and LRP operate on floats - but we'd have to make it handle D/UD immediates before we'd be ready to allow immediates anywhere. Again...there are a lot of ways to solve this. Feel free to solve this however you like. > Maybe the conversation should be about how allowing immediates in > other sources would allow some additional optimization -- for instance > we don't do proper constant folding because of this restriction. I'd > be interesting to see if we could find some cases that benefit from > that.
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/mesa-dev
