https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121925

--- Comment #4 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> (In reply to Tamar Christina from comment #0)
> > Given the following vectors
> > 
> > a = [A1 A0]
> > b = [C  D ]
> 
> b = [C B]  I suppose?

yeah, I double checked the thing and still made a typo :(

> 
> > c = [E  D ]
> 
> [..]
> 
> > rot0   = [E + A0 * C, D + A0 * B]
> > rot90  = [E + A1 * B, D - A1 * C]
> > rot180 = [E - A0 * C, D - A0 * B]
> > rot270 = [E - A1 * B, D + A1 * C]
> 
> so that's all c + mul-with-rot (a, b), I guess fmrot0a fmrot90a fmrot180a
> fmrot270a?
> 
> That is, do the instructions also avoid the extra rounding for the add?

Yeah, they're fused operations, so they need restricting to fp-contraction.
Essentially, after the operand reshuffling they're treated as a normal FMA.

So the accumulator needs to be in the operation.
