https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256

--- Comment #75 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This looks fixed in GCC 11+; I tried x86_64, i686, powerpc (powerpc-spe is no
longer supported).

For 32-bit powerpc we get:
tuned_STREAM_Copy:
.LFB0:
        .cfi_startproc
        lis 9,.LANCHOR0@ha
        lis 10,0x3
        la 3,.LANCHOR0@l(9)
        ori 0,10,0xd090
        addis 4,3,0xf4
        mtctr 0
        addi 5,3,-8
        addi 8,4,9208
.L2:
        lwz 6,8(5)
        lwz 7,12(5)
        lfd 2,16(5)
        lfd 4,24(5)
        lfd 6,32(5)
        lfd 8,40(5)
        lfd 10,48(5)
        lfd 12,56(5)
        lfdu 0,64(5)
        stw 6,8(8)
        stw 7,12(8)
        stfd 2,16(8)
        stfd 4,24(8)
        stfd 6,32(8)
        stfd 8,40(8)
        stfd 10,48(8)
        stfd 12,56(8)
        stfdu 0,64(8)
        bdnz .L2
        blr

Which seems to be the best.
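
As a reading aid, here is a rough C sketch of what the listing above appears to
compute. The original test source is not quoted in this comment, so the array
names `a` and `c`, the size `N`, and the helper `copy_unrolled_by_8` are
assumptions reconstructed from the constants: the counter loaded by `mtctr` is
0x3d090 = 250000, each iteration moves 8 doubles (64 bytes), and the store base
sits 0xf4 * 65536 + 9208 + 8 = 16000000 bytes past the load base.

```c
#include <stddef.h>

/* Assumed size: 250000 iterations * 8 doubles, i.e. 16000000 bytes per array. */
#define N 2000000

/* Hypothetical arrays; the real test case may name or lay them out differently. */
static double a[N], c[N];

/* Assumed shape of the source-level kernel: a plain element-wise copy. */
void tuned_STREAM_Copy(void)
{
    for (size_t j = 0; j < N; j++)
        c[j] = a[j];
}

/* What the generated loop effectively does: 8 loads, then 8 stores, with both
   pointers advanced by 64 bytes per trip (the lfdu/stfdu update forms) and the
   trip count held in the count register (mtctr/bdnz). */
void copy_unrolled_by_8(void)
{
    const double *src = a;
    double *dst = c;

    for (size_t trip = 0; trip < N / 8; trip++) {
        double t0 = src[0], t1 = src[1], t2 = src[2], t3 = src[3];
        double t4 = src[4], t5 = src[5], t6 = src[6], t7 = src[7];

        dst[0] = t0; dst[1] = t1; dst[2] = t2; dst[3] = t3;
        dst[4] = t4; dst[5] = t5; dst[6] = t6; dst[7] = t7;

        src += 8;
        dst += 8;
    }
}
```

Because N is a multiple of 8, no remainder loop is needed, which matches the
listing falling straight through to `blr` after `bdnz`.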

At the GIMPLE level, the loop is:
  <bb 3> [local count: 1063004409]:
  # ivtmp.10_8 = PHI <ivtmp.10_7(3), ivtmp.10_12(2)>
  # ivtmp.12_14 = PHI <ivtmp.12_15(3), ivtmp.12_16(2)>
  ivtmp.10_7 = ivtmp.10_8 + 8;
  _18 = (void *) ivtmp.10_7;
  _1 = MEM[(double *)_18];
  ivtmp.12_15 = ivtmp.12_14 + 8;
  _19 = (void *) ivtmp.12_15;
  MEM[(double *)_19] = _1;
  if (ivtmp.10_7 != _21)
    goto <bb 3>; [99.00%]
  else
    goto <bb 4>; [1.00%]
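
For readers less used to GIMPLE dumps, the loop above can be paraphrased in C
roughly as follows. This is only a sketch: the initial values set up in bb 2
and the definition of `_21` are not shown in the dump, so the start offsets,
the `last` bound, and the function name `copy_like_gimple` are assumptions.
Note that this form still copies one double per iteration, so the 8x unrolling
seen in the assembly presumably happens in a later pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the two integer induction variables
   (ivtmp.10 for the load address, ivtmp.12 for the store address). */
void copy_like_gimple(const double *src, double *dst, size_t n)
{
    if (n == 0)
        return; /* the dump shows only the loop body; a guard is assumed */

    /* Both IVs start one element before the data because the loop
       increments first and accesses memory afterwards. The dump uses a
       literal 8, which equals sizeof(double) on the targets discussed. */
    uintptr_t iv_src = (uintptr_t)src - sizeof(double);
    uintptr_t iv_dst = (uintptr_t)dst - sizeof(double);

    /* Plays the role of _21: the address of the last source element. */
    uintptr_t last = (uintptr_t)(src + n) - sizeof(double);

    do {
        iv_src += sizeof(double);                /* ivtmp.10_7 = ivtmp.10_8 + 8 */
        double v = *(const double *)iv_src;      /* _1 = MEM[(double *)_18]     */
        iv_dst += sizeof(double);                /* ivtmp.12_15 = ivtmp.12_14 + 8 */
        *(double *)iv_dst = v;                   /* MEM[(double *)_19] = _1     */
    } while (iv_src != last);                    /* if (ivtmp.10_7 != _21)      */
}
```

The exit test compares only the source induction variable against the bound,
which is why a single compare-and-branch suffices in the GIMPLE.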
