https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178
--- Comment #34 from Richard Biener <rguenth at gcc dot gnu.org> ---
As noted, the effect of

  if(...) {
    ux = 0.005;
    uy = 0.002;
    uz = 0.000;
  }

is PRE of most(!) dependent instructions, creating

  # prephitmp_1099 = PHI <_1098(6), 6.49971724999999889149648879538290202617645263671875e-1(5)>
  # prephitmp_1111 = PHI <_1110(6), 1.089805708333333178483570691241766326129436492919921875e-1(5)>
  ...

We successfully coalesce the non-constant incoming register with the result,
but have to emit copies for all constants on the other edge, where we have
quite a number of duplicate constants to deal with.

I've experimented with ensuring we get _full_ PRE of the dependent
expressions by more aggressively re-associating (give PHIs with a constant
incoming operand on at least one edge a rank similar to constants, 1).
This increases the number of PHIs further but reduces the follow-up
computations more.

We still fail to simply tail-duplicate the merge block, which is another
possibility to eventually save some of the overhead. Our tail duplication
code (gimple-ssa-split-paths.cc) doesn't handle this case since the diamond
is not the one immediately preceding the loop exit/latch.

The result of "full PRE" is a little bit worse than the current state (so
it's not a full solution here).
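For readers following along, here is a rough hand-written sketch of the transformation described above. It is not taken from the benchmark source or from a GCC dump: the function names, the `inlet` flag and the `1.0 - 1.5 * (...)` expression are all made up for illustration. Only the three constant assignments come from the snippet above.

```c
/* Hypothetical reduction; only the ux/uy/uz constants come from the
   bug report, everything else is an illustrative assumption. */

/* Shape before PRE: the dependent expression sits in the merge block
   and consumes PHI(ux), PHI(uy), PHI(uz). */
double before_pre(int inlet, double ux, double uy, double uz)
{
    if (inlet) {
        ux = 0.005;
        uy = 0.002;
        uz = 0.000;
    }
    return 1.0 - 1.5 * (ux * ux + uy * uy + uz * uz);
}

/* Roughly the shape after PRE, written by hand: the expression is
   evaluated on each incoming edge, and the merge block only holds
   a PHI of the results (t plays the role of a prephitmp_N). */
double after_pre(int inlet, double ux, double uy, double uz)
{
    double t;
    if (inlet) {
        /* All operands are constants here, so this folds to a single
           literal that has to be copied into t on this edge. */
        t = 1.0 - 1.5 * (0.005 * 0.005 + 0.002 * 0.002 + 0.000 * 0.000);
    } else {
        /* Non-constant edge: the computed value can be coalesced
           with t without an extra copy. */
        t = 1.0 - 1.5 * (ux * ux + uy * uy + uz * uz);
    }
    return t;
}
```

With many dependent expressions instead of one, each of them gets its own PHI with a folded constant on the `inlet` edge, which is where the constant copies mentioned above come from. Tail-duplicating the merge block would instead specialize everything after the diamond for each arm.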