https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, thanks - that helps.  So we're re-associating from

  *_89 = (((*_89) - (f_re_34 * x_re_82)) - (f_im_35 * x_im_88));
  *_91 = (((*_91) + (f_im_35 * x_re_82)) - (f_re_34 * x_im_88));

to

  *_89 = ((*_89) - ((f_re_34 * x_re_82) + (f_im_35 * x_im_88)));
  *_91 = (((*_91) + (f_im_35 * x_re_82)) - (f_re_34 * x_im_88));

which makes the operations unbalanced.  This is (a - b) - c -> a - (b + c)
as we're optimizing this as a + -b + -c.

Even smaller testcase:

double a[1024], b[1024], c[1024];

void foo()
{
  for (int i = 0; i < 256; ++i)
    {
      a[2*i] = a[2*i] + b[2*i] - c[2*i];
      a[2*i+1] = a[2*i+1] - b[2*i+1] - c[2*i+1];
    }
}

Here ranks end up associating the expr as (-b + -c) + a and negate
re-propagation goes (-b - c) + a -> -(b + c) + a -> a - (b + c)
which is all sensible in isolation.

You could say that associating as (-b + -c) + a is worse than
(a + -b) + -c in this respect.  Ranks are

Rank for _8 is 327683 (a)
Rank for _13 is 327684 (-b)
Rank for _21 is 327684 (-c)

where the rank is one more for the negated values because of the
negate operation.  While heuristically ignoring negates for rank
propagation to make all ranks equal helps this new testcase, it doesn't
help for the larger two.  It might still be a generally sound heuristic
improvement though.

For the effects on vectorization I think we need to do something in the
vectorizer itself, for example linearizing expressions.  The first reassoc
pass is supposed to do this but then negate re-propagation undoes it in
this case - which maybe points to that being what needs fixing, by somehow
associating a non-negated operand first.
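
To make the "unbalanced" shape concrete, here is a minimal sketch of the
smaller testcase written both ways: once as in the source, where both lanes
are left-linear chains (a op b) - c, and once with the odd lane hand-written
in the a - (b + c) shape that negate re-propagation arrives at.  The names
foo_source, foo_reassoc and forms_agree, and the pointer/length parameters,
are made up for illustration and do not come from the bug report; the second
function is a manual rendering of the shape described above, not actual GCC
output.

```c
#include <stddef.h>

#define PAIRS ((size_t) 256)   /* same trip count as the testcase */
#define N     (2 * PAIRS)      /* elements actually touched by the loop */

/* Hypothetical helper: the loop as written in the testcase.
   Even and odd lanes differ only in the first operator. */
void foo_source(double *a, const double *b, const double *c, size_t pairs)
{
  for (size_t i = 0; i < pairs; ++i)
    {
      a[2*i]   = (a[2*i]   + b[2*i])   - c[2*i];
      a[2*i+1] = (a[2*i+1] - b[2*i+1]) - c[2*i+1];
    }
}

/* Hypothetical helper: odd lane rewritten as a - (b + c).
   The even lane still starts from a, the odd lane now starts from b + c,
   so the two lanes no longer have matching expression trees. */
void foo_reassoc(double *a, const double *b, const double *c, size_t pairs)
{
  for (size_t i = 0; i < pairs; ++i)
    {
      a[2*i]   = (a[2*i] + b[2*i]) - c[2*i];
      a[2*i+1] = a[2*i+1] - (b[2*i+1] + c[2*i+1]);
    }
}

/* Returns 1 if both forms produce identical results.  Inputs are small
   integers, so every intermediate value is exactly representable and the
   comparison with != is safe here. */
int forms_agree(void)
{
  static double a1[N], a2[N], b[N], c[N];

  for (size_t i = 0; i < N; ++i)
    {
      a1[i] = a2[i] = (double) i;
      b[i] = (double) (2 * i + 1);
      c[i] = 3.0 - (double) i;
    }

  foo_source(a1, b, c, PAIRS);
  foo_reassoc(a2, b, c, PAIRS);

  for (size_t i = 0; i < N; ++i)
    if (a1[i] != a2[i])
      return 0;
  return 1;
}
```

Both functions compute the same values for these inputs; the difference is
only in how the odd lane is associated, which is the property that matters
when the two lanes are supposed to be matched up operation by operation.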