Kenneth Graunke <[email protected]> writes:

> On 02/25/2014 09:38 AM, Eric Anholt wrote:
>> Matt Turner <[email protected]> writes:
>>
>>> On Mon, Feb 24, 2014 at 10:15 AM, Eric Anholt <[email protected]> wrote:
>>>> I think we would do better by emitting
>>>> ADD(y_minus_x, y, negate(x))
>>>> MAC(dst, x, y_minus_x, a)
>>>
>>> MAC only takes two arguments, so
>>> - if you meant MAD, there's no MAD on platforms that don't have LRP
>>> - if you meant MAC(dst, ...) I don't see a way of doing it only two
>>> instructions, but we could do
>>>
>>> MOV(acc, x)
>>> ADD(y_minus_x, y, negate(x)
>>> MAC(dst, y_minus_x, a)
>>
>> Oops, yeah, I was still thinking in terms of MAD. This should still be
>> better I think, while being an obvious translation of the LRP
>> instruction:
>>
>> ADD one_minus_a, negate(a), 1.0f
>> MUL null, y, a
>> MAC dst, x, one_minus_a
>>
>> (multiplying y * a first to slightly reduce the stall pressure from
>> one_minus_a)
>
> Nice. I agree this is better, but it's harder than you think. We would
> have to:
>
> 1. Create a MAC() emitter.
> 2. Add BRW_OPCODE_MAC to vec4_generator.
> 3. Add a new "enable accumulator writes" flag to vec4_instruction
> and make vec4_generator respect that. (The MUL needs this.)
> 4. Fix up dead code elimination and other things to know about implicit
> accumulator writes.
>
> Given the severity of this problem (GPU hangs and crashes) and the fact
> that it's a regression in 10.1---which we plan to ship in three days---I
> would like to commit my existing patches and improve this after the release.

Acked. I had forgotten that "do MAC for pre-gen6 VS again" was still on some TODO list.
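
For reference, a rough scalar model of the two accumulator-based sequences quoted above, to check that both compute the same interpolation. This is only a sketch and not Mesa code: the function names are hypothetical, it assumes LRP here means x * (1 - a) + y * a (which is what the quoted sequences work out to), and it assumes MAC computes acc + src0 * src1 with the MUL/MOV leaving its result in the accumulator.

```python
# Hypothetical scalar model of the instruction sequences from the thread.
# Assumptions (not taken from Mesa source): LRP(x, y, a) = x*(1-a) + y*a,
# and MAC dst, s0, s1 computes dst = acc + s0 * s1.

def lrp_reference(x, y, a):
    """The interpolation both sequences are meant to produce."""
    return x * (1.0 - a) + y * a


def lrp_mul_mac(x, y, a):
    """ADD / MUL / MAC sequence (three instructions, no explicit MOV)."""
    one_minus_a = -a + 1.0          # ADD one_minus_a, negate(a), 1.0f
    acc = y * a                     # MUL null, y, a  (implicit accumulator write)
    return acc + x * one_minus_a    # MAC dst, x, one_minus_a


def lrp_mov_mac(x, y, a):
    """MOV / ADD / MAC sequence (accumulator seeded with x)."""
    acc = x                         # MOV acc, x
    y_minus_x = y + (-x)            # ADD y_minus_x, y, negate(x)
    return acc + y_minus_x * a      # MAC dst, y_minus_x, a
```

Calling the three functions with the same x, y and a should give matching results up to floating-point rounding. The model says nothing about the stall behaviour or the dead-code-elimination issue mentioned in the quoted list; it only checks the arithmetic.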
_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
