https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88497
kelvin at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |SUSPENDED
   Last reconfirmed|                            |2019-01-23
     Ever confirmed|0                           |1

--- Comment #8 from kelvin at gcc dot gnu.org ---
In revisiting this problem report, I have confirmed that if I specify -ffast-math when I compile the original loop that motivated this problem report, I get the desired optimization. I believe I had omitted the -ffast-math option when I originally discovered this optimization opportunity.

I have also confirmed that reassociation does not produce the desired translation of the loop body in isolation, even if -ffast-math is specified on the command line, and even if I experiment with very large reassociation_width values for the rs6000 target. Apparently, the reassociation pass is not clever enough to recognize opportunities to transform multiply-add accumulations into expressions that favor use of the xvmaddadp instruction. Reassociation may change the order in which the sums computed for different vector element products are combined with the accumulator. But in my experience, reassociation does not discover the opportunity to accumulate the products from different vectors using vector sum instructions or, even better, the vector multiply-add instruction.

Since the code produced for -ffast-math auto-vectorization of multiply-add accumulation loops is "optimal", I am recommending that future effort on this issue be treated as low priority.
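
For readers without the original test case at hand, the following is a minimal sketch of the kind of multiply-add accumulation loop discussed above. It is not the loop from the report: the function names dot_scalar and dot_two_lanes are hypothetical and chosen only for illustration. The second function writes out by hand, roughly, what a vectorizer does with two double-precision lanes, and it shows why the transformation reorders the floating-point additions, which is presumably why -ffast-math is needed before the compiler will do it on its own.

```c
#include <stddef.h>

/* Hypothetical stand-in for the loop in the report: a multiply-add
   accumulation (dot product) with a single scalar accumulator.
   Without -ffast-math the additions must happen in exactly this order. */
double dot_scalar(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-written analogue of a two-lane vectorized form: each lane keeps
   its own partial sum (one multiply-add per lane per iteration), and the
   lanes are combined only once, after the loop. The additions are
   performed in a different order than in dot_scalar, so the result can
   differ in the last bits for general inputs. */
double dot_two_lanes(const double *a, const double *b, size_t n)
{
    double lane0 = 0.0, lane1 = 0.0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        lane0 += a[i] * b[i];
        lane1 += a[i + 1] * b[i + 1];
    }
    if (i < n)                  /* leftover element when n is odd */
        lane0 += a[i] * b[i];

    return lane0 + lane1;       /* final horizontal reduction */
}
```

One way to inspect the generated code is to compile dot_scalar with something like "gcc -O3 -mcpu=power9 -S", once with and once without -ffast-math, and look for xvmaddadp in the assembly output. The exact -mcpu value is an assumption; any VSX-capable setting should serve.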