https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86270

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2025-02-04 00:00:00         |2025-2-12

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
The inner loop is still

.L3:
        movq    %rax, %rdx
        movl    %eax, (%rsi,%rax,4)
        addq    $1, %rax
        cmpq    %rcx, %rdx
        jne     .L3

we RTL expand from

  # ivtmp.7_7 = PHI <ivtmp.7_8(4), 0(3)>
  i_16 = (int) ivtmp.7_7;
  MEM[(int *)a.0_1 + ivtmp.7_7 * 4] = i_16;
  ivtmp.7_14 = ivtmp.7_7;
  ivtmp.7_8 = ivtmp.7_7 + 1;
  if (ivtmp.7_14 != _12)
    goto <bb 4>; [89.00%]

I'll note that IVOPTs did the right thing and transformed the loop to

  _12 = (unsigned long) len.1_15;
  _14 = _12 + 1;

  <bb 4> [local count: 955630224]:
  # ivtmp.7_7 = PHI <ivtmp.7_8(6), 0(3)>
  _6 = (unsigned int) ivtmp.7_7;
  i_16 = (int) _6;
  MEM[(int *)a.0_1 + ivtmp.7_7 * 4] = i_16;
  ivtmp.7_8 = ivtmp.7_7 + 1;
  if (ivtmp.7_8 != _14)

but we wreck that again later, during forwprop.

I think we can pattern match this at RTL expansion time.
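
For readers following along, here is a rough C sketch of the two exit-test
shapes quoted above. It is not the testcase from this PR; the function names,
the CAP buffer size and the forms_agree helper are made up for illustration.
The first function keeps a copy of the pre-increment IV for the compare (the
extra movq in the assembly), the second compares the incremented IV against
n + 1 as in the IVOPTs output. Both should store the same values.

```c
#include <assert.h>
#include <string.h>

enum { CAP = 64 };  /* hypothetical buffer size for this sketch */

/* Shape we currently RTL expand from: the exit test uses the
   pre-increment value, so a copy stays live across the increment.  */
static void
fill_copy_form (int *a, unsigned long n)
{
  unsigned long iv = 0;
  unsigned long prev;
  do
    {
      prev = iv;             /* ivtmp.7_14 = ivtmp.7_7 */
      a[iv] = (int) iv;      /* MEM[a + ivtmp * 4] = i */
      iv = iv + 1;           /* ivtmp.7_8 = ivtmp.7_7 + 1 */
    }
  while (prev != n);         /* if (ivtmp.7_14 != _12) */
}

/* Shape IVOPTs produced: compare the incremented IV against n + 1,
   so only one IV register is needed inside the loop.  */
static void
fill_ivopts_form (int *a, unsigned long n)
{
  unsigned long bound = n + 1;   /* _14 = _12 + 1 */
  unsigned long iv = 0;
  do
    {
      a[iv] = (int) iv;
      iv = iv + 1;
    }
  while (iv != bound);           /* if (ivtmp.7_8 != _14) */
}

/* Returns nonzero if both forms wrote identical contents for bound n.  */
static int
forms_agree (unsigned long n)
{
  int x[CAP], y[CAP];
  assert (n < CAP);              /* both loops write elements 0..n */
  memset (x, -1, sizeof x);
  memset (y, -1, sizeof y);
  fill_copy_form (x, n);
  fill_ivopts_form (y, n);
  return memcmp (x, y, sizeof x) == 0 && x[n] == (int) n;
}
```

Since the two forms are equivalent, rewriting one into the other when we
recognize it should be safe in principle, whichever pass ends up doing it.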