https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

--- Comment #34 from Richard Biener <rguenth at gcc dot gnu.org> ---
As noted the effect of

  if(...) {
   ux = 0.005;
   uy = 0.002;
   uz = 0.000;
  }

is PRE of most(!) dependent instructions, creating

  # prephitmp_1099 = PHI <_1098(6), 6.49971724999999889149648879538290202617645263671875e-1(5)>
  # prephitmp_1111 = PHI <_1110(6), 1.089805708333333178483570691241766326129436492919921875e-1(5)>
...

We successfully coalesce the non-constant incoming register with the result,
but have to emit copies for all constants on the other edge, where there are
quite a number of duplicate constants to deal with.
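
To make the shape of the problem concrete, here is a minimal source-level
sketch of what PRE does with such a conditional constant assignment.  The
function names and the dependent expressions (u2, t) are hypothetical
stand-ins, not the benchmark's actual code; only the three constant stores
are taken from the snippet above.  In after_pre the two variables assigned
in both arms play the role of the prephitmp PHIs: one arm gets a folded
constant, the other the computed value, and only part of the dependent
computation is left after the merge.

```c
#include <assert.h>

/* Hypothetical loop body: the velocity is either kept or overridden with
   constants, and the dependent math is evaluated after the merge point. */
static double before_pre(int inlet, double ux, double uy, double uz)
{
    if (inlet) {
        ux = 0.005;
        uy = 0.002;
        uz = 0.000;
    }
    double u2 = 1.5 * (ux * ux + uy * uy + uz * uz);
    double t  = ux + uy;
    return 1.0 - u2 + 3.0 * t;
}

/* Hand-applied equivalent of partial PRE: u2 and t are computed on each
   incoming edge, so after the merge they behave like PHI results. */
static double after_pre(int inlet, double ux, double uy, double uz)
{
    double u2, t;
    if (inlet) {
        /* All operands are literals, so the compiler can fold these. */
        u2 = 1.5 * (0.005 * 0.005 + 0.002 * 0.002 + 0.000 * 0.000);
        t  = 0.005 + 0.002;
    } else {
        u2 = 1.5 * (ux * ux + uy * uy + uz * uz);
        t  = ux + uy;
    }
    /* This expression is not moved: "most", not all, dependents. */
    return 1.0 - u2 + 3.0 * t;
}

/* Tolerant comparison, since folding and FP contraction may differ. */
static int nearly_equal(double a, double b)
{
    double d = a - b;
    return d < 1e-12 && d > -1e-12;
}

/* Assert that both variants agree for one set of inputs. */
static void check_equivalent(int inlet, double ux, double uy, double uz)
{
    assert(nearly_equal(before_pre(inlet, ux, uy, uz),
                        after_pre(inlet, ux, uy, uz)));
}
```

Calling check_equivalent(1, 0.1, 0.2, 0.3) and check_equivalent(0, 0.1, 0.2,
0.3) exercises both edges.  Each folded value on the constant edge needs its
own copy into the merged register, which is where the extra constant moves
come from.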

I've experimented with ensuring we get _full_ PRE of the dependent expressions
by more aggressively re-associating (give PHIs with a constant incoming operand
on at least one edge a rank similar to constants, 1).

This increases the number of PHIs further but reduces the followup computations
more.  We still fail to simply tail-duplicate the merge block, which is another
possible way to save some of the overhead: our tail duplication code
(gimple-ssa-split-paths.cc) doesn't handle this case since the diamond is not
the one immediately preceding the loop exit/latch.

The result of "full PRE" is a little bit worse than the current state (so
it's not a full solution here).
