https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |tree-optimization

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, thanks - that helps.  So we're re-associating from

  *_89 = (((*_89) - (f_re_34 * x_re_82)) - (f_im_35 * x_im_88));
  *_91 = (((*_91) + (f_im_35 * x_re_82)) - (f_re_34 * x_im_88));

to

  *_89 = ((*_89) - ((f_re_34 * x_re_82) + (f_im_35 * x_im_88)));
  *_91 = (((*_91) + (f_im_35 * x_re_82)) - (f_re_34 * x_im_88));

which makes the operations in the two statements unbalanced.  This is
(a - b) - c -> a - (b + c), because we're optimizing this as a + -b + -c.

Even smaller testcase:

double a[1024], b[1024], c[1024];

void foo()
{
  for (int i = 0; i < 256; ++i)
    {
      a[2*i] = a[2*i] + b[2*i] - c[2*i];
      a[2*i+1] = a[2*i+1] - b[2*i+1] - c[2*i+1];
    }
}

Here the ranks end up associating the expr as (-b + -c) + a, and negate
re-propagation goes (-b - c) + a -> -(b + c) + a -> a - (b + c),
which is all sensible in isolation.
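
To make the imbalance concrete, here is a rough source-level sketch of the
loop body before and after the transformation.  The function names and the
array parameters are made up for illustration.  The even lane is assumed to
stay as written (by analogy with the larger case above, where the statement
containing the + was left alone); only the a - b - c lane is shown in its
re-associated form.

```c
#include <assert.h>
#include <stddef.h>

/* As written: both lanes have the shape (a op b) - c, so the two
   statements differ only in one operator.  */
void foo_as_written(double *a, const double *b, const double *c, size_t n)
{
  assert(n % 2 == 0);
  for (size_t i = 0; i < n / 2; ++i)
    {
      a[2*i] = (a[2*i] + b[2*i]) - c[2*i];
      a[2*i+1] = (a[2*i+1] - b[2*i+1]) - c[2*i+1];
    }
}

/* After re-association (hypothetical source-level equivalent): the odd
   lane is now a - (b + c) while the even lane is still (a + b) - c, so
   the two expression trees no longer have the same shape.  */
void foo_reassociated(double *a, const double *b, const double *c, size_t n)
{
  assert(n % 2 == 0);
  for (size_t i = 0; i < n / 2; ++i)
    {
      a[2*i] = (a[2*i] + b[2*i]) - c[2*i];
      a[2*i+1] = a[2*i+1] - (b[2*i+1] + c[2*i+1]);
    }
}
```

Both functions compute the same values whenever the arithmetic is exact; the
difference is only in the shape of the two lanes.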

You could say that associating as (-b + -c) + a is worse than
(a + -b) + -c in this respect.  Ranks are

Rank for _8 is 327683 (a)
Rank for _13 is 327684 (-b)
Rank for _21 is 327684 (-c)

where the rank is one more for the negated values because of the
negate operation.  While heuristically ignoring negates for rank
propagation (so that all ranks become equal) helps this new testcase,
it doesn't help for the larger two.

It might still be a generally sound heuristic improvement though.
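
A toy model of the effect of the ranks on the association order, only to
illustrate the point above.  This is not the reassoc pass's actual code:
the struct, the function names and the "sort by descending rank, then
combine left to right" rule are assumptions chosen to reproduce the
orderings described in this comment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical operand entry: printable text plus its rank.  */
struct operand { const char *text; unsigned rank; };

/* Stable insertion sort, highest rank first.  Operands with equal rank
   keep their original relative order.  */
static void sort_by_rank_desc(struct operand *ops, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    {
      struct operand key = ops[i];
      size_t j = i;
      while (j > 0 && ops[j-1].rank < key.rank)
        {
          ops[j] = ops[j-1];
          --j;
        }
      ops[j] = key;
    }
}

/* Sort the operands, then write the left-nested sum into buf,
   e.g. "(x + y) + z" for three operands.  */
void associate(struct operand *ops, size_t n, char *buf, size_t len)
{
  char tmp[128];

  assert(n >= 2);
  sort_by_rank_desc(ops, n);
  snprintf(buf, len, "%s", ops[0].text);
  for (size_t i = 1; i < n; ++i)
    {
      /* Parenthesize the accumulated part once it is itself a sum.  */
      snprintf(tmp, sizeof tmp, i == 1 ? "%s + %s" : "(%s) + %s",
               buf, ops[i].text);
      assert(strlen(tmp) < len);
      snprintf(buf, len, "%s", tmp);
    }
}
```

With the ranks listed above (327683 for a, 327684 for -b and -c) this model
yields (-b + -c) + a; giving all three operands the same rank, as the
"ignore negates" heuristic would, yields (a + -b) + -c.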

For the effects on vectorization I think we need to do something in the
vectorizer itself, for example linearizing expressions.  The
first reassoc pass is supposed to do this, but negate
re-propagation then undoes it in this case, which may mean it is
negate re-propagation that needs fixing, for example by somehow
associating a non-negated operand first.
