https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114589

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #9)
> (In reply to Richard Biener from comment #8)
> > The missed sinking is now fixed for GCC 15, VRP is still confused by what
> > IVOPTs does so without -fno-ivopts the loop remains.
> 
> Why is it VRP that should figure out that the loop doesn't iterate?  That
> seems very much a loop analysis thing...  VRP makes no attempt to do loop
> analysis.

Without IVOPTs, DOM2 threads the loop exit, allowing us to optimize:

  <bb 3> [local count: 105119324]:
  # PT = nonlocal
  _11 = &o_3(D)->val;
  # RANGE [irange] sizetype [0, 1] MASK 0x1 VALUE 0x0
  _12 = (sizetype) _5;
  # RANGE [irange] sizetype [0, 0][4, 4] MASK 0x4 VALUE 0x0
  _13 = _12 << 2;
  # PT = nonlocal
  _10 = _11 + _13;

  <bb 4> [local count: 955630224]:
  # PT = nonlocal
  # __for_begin_15 = PHI <__for_begin_8(6), _11(3)>
  i_6 = *__for_begin_15;
  # USE = nonlocal escaped
  # CLB = nonlocal escaped
  f (i_6);
  # PT = nonlocal
  __for_begin_8 = __for_begin_15 + 4;
  if (__for_begin_8 != _10)
    goto <bb 6>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 6> [local count: 850510900]:
  goto <bb 4>; [100.00%]
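
For readers not used to GIMPLE, the loop above can be modelled roughly in
C++ as below.  This is only a sketch: the struct Obj, its member names and
the guard on entry are assumptions made for illustration, not the test case
from the PR.  It shows why the back edge is never taken: the end pointer is
the begin pointer plus a count in [0, 1], so once the loop is entered the
first increment already reaches the end.

```cpp
#include <cstddef>

// Hypothetical stand-in for the object behind o_3(D); the real test case
// is not shown in this comment.
struct Obj {
    bool has;  // plays the role of _5, range [0, 1]
    int val;   // _11 = &o->val
};

// Counts how often the loop body runs, mirroring bb 3 / bb 4 above.
int count_iterations(const Obj *o) {
    int calls = 0;
    if (!o->has)  // assumption: the loop is only entered for a nonzero count
        return calls;
    const int *begin = &o->val;                                  // _11
    const int *end = begin + static_cast<std::size_t>(o->has);   // _10 = _11 + (_12 << 2)
    const int *it = begin;
    do {
        (void)*it;  // stands in for f (i_6)
        ++calls;
        ++it;       // __for_begin_8 = __for_begin_15 + 4 (bytes)
    } while (it != end);
    return calls;
}
```

With has set the body runs once and the exit test succeeds immediately,
which is the fact threading needs to prove to remove the loop.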

while IVOPTs turns the IL into the following which we do not optimize:

  <bb 3> [local count: 105119324]:
  # PT = nonlocal
  _11 = &o_3(D)->val;
  # RANGE [irange] sizetype [0, 1] MASK 0x1 VALUE 0x0
  _12 = (sizetype) _5;
  # RANGE [irange] sizetype [0, 0][4, 4] MASK 0x4 VALUE 0x0
  _13 = _12 << 2;
  # PT = nonlocal
  _10 = _11 + _13;
  _9 = (unsigned long) o_3(D);
  ivtmp.11_14 = _9 + 4;

  <bb 4> [local count: 955630224]:
  # ivtmp.11_17 = PHI <ivtmp.11_2(6), ivtmp.11_14(3)>
  # PT = nonlocal null
  _18 = (void *) ivtmp.11_17;
  i_6 = MEM[(const int *)_18];
  # USE = nonlocal escaped
  # CLB = nonlocal escaped
  f (i_6);
  ivtmp.11_2 = ivtmp.11_17 + 4;
  __for_begin_19 = (const int *) ivtmp.11_2;
  if (__for_begin_19 != _10)
    goto <bb 6>; [89.00%]
  else
    goto <bb 5>; [11.00%]

There's at least one missed optimization here: nothing notices that
ivtmp.11_14 == _11 (that might be a recent regression).  FRE5 doesn't
figure this out, though note it would still need to preserve a cast
from pointer to unsigned long.  Possibly the issue is IVOPTs replacing
the pointer IV with an integer IV (but nothing else really).

I said "VRP" because of how threading works now (though this is caught
only by DOM threading, not by threadfull2, for example).
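
The ivtmp.11_14 == _11 equivalence mentioned above can be illustrated with
a small C++ sketch.  The struct layout is hypothetical (the dump only
implies that val sits at offset 4), and comparing the converted integers
this way relies on the usual flat address space, which is
implementation-defined rather than guaranteed by the language.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout; val at offset 4 is inferred from "_9 + 4" in the dump.
struct Obj {
    bool has;
    int val;
};

// Returns true if the integer IV start value equals the pointer IV start value.
bool integer_iv_matches_pointer(const Obj *o) {
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(o);  // _9 = (unsigned long) o
    std::uintptr_t iv = base + offsetof(Obj, val);              // ivtmp.11_14 = _9 + 4
    const int *p = &o->val;                                     // _11 = &o->val
    return iv == reinterpret_cast<std::uintptr_t>(p);
}
```

The two values agree only after a pointer-to-integer conversion, which is
why any pass proving the equivalence would still have to keep that cast.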
