[Bug tree-optimization/92335] missed transformation to branchless

rguenth at gcc dot gnu.org Mon, 04 Nov 2019 02:17:27 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92335


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-11-04
                 CC|                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is probably an FP constraint that says we cannot elide ret += 0.0;
otherwise we'd do that, resulting in branchy code for foo as well.
If you add -ffast-math to -O2 you'll see exactly that behavior: RTL
expansion is presented with

  <bb 3> [local count: 1063004407]:
  # ret_19 = PHI <0.0(2), prephitmp_25(5)>
  # ivtmp.13_7 = PHI <0(2), ivtmp.13_4(5)>
  k_12 = MEM[base: y_10(D), index: ivtmp.13_7, offset: 0B];
  _6 = MEM[base: x_13(D), index: ivtmp.13_7, offset: 0B];
  if (_6 > 0.0)
    goto <bb 4>; [59.00%]
  else
    goto <bb 5>; [41.00%]

  <bb 4> [local count: 627172604]:
  _24 = k_12 + ret_19;

  <bb 5> [local count: 1063004407]:
  # prephitmp_25 = PHI <_24(4), ret_19(3)>
  ivtmp.13_4 = ivtmp.13_7 + 4;
  if (ivtmp.13_4 == 4096)
    goto <bb 6>; [1.01%]
  else
    goto <bb 3>; [98.99%]

whereas without -ffast-math 'foo' retains the unconditional accumulation.
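
As a minimal sketch of the kind of FP constraint meant above (my reading,
not something confirmed in this bug): under IEEE 754 round-to-nearest,
-0.0 + 0.0 evaluates to +0.0, so "ret += 0.0" is not a no-op when signed
zeros are honored. The helper names below are hypothetical and only serve
the illustration.

```c
#include <math.h>
#include <stdbool.h>

/* True only for negative zero: compares equal to 0 and has the sign bit set. */
bool is_negative_zero(float v)
{
  return v == 0.0f && signbit(v) != 0;
}

/* Hypothetical stand-in for the "not taken" arm, ret += 0.0.
   The volatile keeps the addition from being folded at compile time. */
float accumulate_zero(float ret)
{
  volatile float zero = 0.0f;
  return ret + zero;
}
```

For any nonzero finite ret the addition returns ret unchanged; only the sign
of a zero can differ, which is one of the things -ffast-math (via
-fno-signed-zeros) tells the compiler to stop caring about.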

Since RTL optimization chickens out on most transforms involving FP, I'm not
surprised it doesn't try to undo this.  We leave most if-conversion
to RTL because it has a better idea of target costs.
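
For illustration, the two loop shapes in question can be sketched in C as
follows. This is a hypothetical reconstruction inferred from the dump above
(float elements, a compare of x[i] against 0.0, a conditional add of y[i]);
the function names and the explicit length parameter are assumptions, not
the testcase from the report.

```c
#include <stddef.h>

/* Branchy shape: matches the dump, where the add sits in its own block. */
float sum_branchy(const float *x, const float *y, size_t n)
{
  float ret = 0.0f;
  for (size_t i = 0; i < n; i++)
    if (x[i] > 0.0f)
      ret += y[i];
  return ret;
}

/* Select shape: the accumulation is unconditional and the untaken case
   adds 0.0, so the loop body needs no control flow. */
float sum_select(const float *x, const float *y, size_t n)
{
  float ret = 0.0f;
  for (size_t i = 0; i < n; i++)
    ret += (x[i] > 0.0f) ? y[i] : 0.0f;
  return ret;
}
```

Both return the same value for ordinary inputs; turning the first shape into
the second is the if-conversion that would have to add a "+ 0.0" on the
untaken path.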
