Take the following two functions. They should produce the same asm, but the
second is better on powerpc, at least for the inner loop (6 instructions vs 8,
not counting the closing bdnz):
void daxpy(int n, float  da, float  dx[], int incx, float  dy[], int incy)
{
  int i,ix=0,iy=0,m,mp1;
  
  mp1 = 0;
  m = 0;
  for (i = 0;i < n; i++){
    dy[iy] = dy[iy] + dx[ix];
    ix = ix + incx;
    iy = iy + incy;
  }
}
void daxpy1(int n, float  da, float  dx[], int incx, float  dy[], int incy)
{
  int i,ix=0,iy=0,m,mp1;
  mp1 = 0;
  m = 0;
  for (i = 0;i < n; i++){
    *(float*)(((char*)dy)+iy) = *(float*)(((char*)dy)+iy)
                                + *(float*)(((char*)dx)+ix);
    ix = ix + incx*4;
    iy = iy + incy*4;
  }
}

Inner loop for the first one:
L4:
        slwi r2,r9,2
        slwi r0,r11,2
        lfsx f13,r5,r0
        add r11,r11,r6
        lfsx f0,r7,r2
        add r9,r9,r8
        fadds f0,f0,f13
        stfsx f0,r7,r2
        bdnz L4

Inner loop for the second one:
L11:
        lfsx f0,r7,r0
        lfsx f13,r5,r2
        add r2,r2,r6
        fadds f0,f0,f13
        stfsx f0,r7,r0
        add r0,r0,r8
        bdnz L11

Yes, this shows up in real code.
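
The claim that daxpy and daxpy1 should compile to the same asm rests on them computing the same thing. A minimal sketch of a check for that follows. It is not part of the original test case: the names add_indexed, add_byte_offset and forms_agree are hypothetical, the unused parameters and locals (da, m, mp1) are dropped, and sizeof(float) stands in for the literal 4.

```c
#include <assert.h>
#include <string.h>

/* Indexed form, mirroring daxpy: the index is scaled on every access. */
static void add_indexed(int n, const float dx[], int incx,
                        float dy[], int incy)
{
  int i, ix = 0, iy = 0;
  for (i = 0; i < n; i++) {
    dy[iy] = dy[iy] + dx[ix];
    ix = ix + incx;
    iy = iy + incy;
  }
}

/* Byte-offset form, mirroring daxpy1: the strides are pre-scaled by the
   element size, so no shift is needed inside the loop. */
static void add_byte_offset(int n, const float dx[], int incx,
                            float dy[], int incy)
{
  int i, ix = 0, iy = 0;
  for (i = 0; i < n; i++) {
    float *out = (float *)((char *)dy + iy);
    const float *in = (const float *)((const char *)dx + ix);
    *out = *out + *in;
    ix = ix + incx * (int)sizeof(float);
    iy = iy + incy * (int)sizeof(float);
  }
}

/* Returns 1 when both forms leave dy in the same state (hypothetical data). */
static int forms_agree(void)
{
  float dx[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  float dy_a[8] = {10, 20, 30, 40, 50, 60, 70, 80};
  float dy_b[8];

  memcpy(dy_b, dy_a, sizeof dy_a);
  add_indexed(4, dx, 2, dy_a, 2);      /* touches elements 0, 2, 4, 6 */
  add_byte_offset(4, dx, 2, dy_b, 2);

  assert(dy_a[0] == 11.0f);            /* 10 + 1, exact in float */
  assert(dy_a[1] == 20.0f);            /* skipped by the stride of 2 */
  return memcmp(dy_a, dy_b, sizeof dy_a) == 0;
}
```

Calling forms_agree() from a small main and checking that it returns 1 is enough to exercise both loops. This only checks non-negative strides on one small input, so it is a sanity check, not a proof of equivalence.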

-- 
           Summary: Missed IV optimization (redundant instruction in loop)
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P2
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pinskia at gcc dot gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: powerpc-darwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19126
