Hi,
  I found this missed optimization when looking into an ICE on the pointer_plus
branch.
For this code:
quantize_fs_dither (unsigned width, short *errorptr, int dir)
{
  short bpreverr;
  unsigned col;
  for (col = width; col > 0; col--) 
    errorptr += dir;
  errorptr[0] = (short) bpreverr;
}

We get with -O2 -fomit-frame-pointer:
quantize_fs_dither:
        pushl   %ebx
        movl    8(%esp), %ebx
        movl    12(%esp), %eax
        testl   %ebx, %ebx
        je      .L2
        movl    16(%esp), %ecx
        addl    %ecx, %ecx
        leal    (%eax,%ecx), %edx
        leal    -1(%ebx), %eax
        imull   %ecx, %eax
        leal    (%edx,%eax), %eax
.L2:
        movw    %ax, (%eax)
        popl    %ebx
        ret


With -Os we get:
        movl    12(%esp), %edx
        movl    8(%esp), %eax
        addl    %edx, %edx
        imull   4(%esp), %edx
        movw    %ax, (%eax,%edx)
        ret

The -Os version has no branches and is faster, as we will most likely
mispredict the branch in the -O2 case.
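
To make the difference concrete, here is a minimal sketch of the two shapes
at the source level. The function names advance_loop and advance_closed are
made up for illustration; they are not part of the test case or of GCC. The
-O2 code effectively computes errorptr + dir + (width - 1) * dir behind a
width != 0 test, while the -Os code computes errorptr + width * dir directly.
The two agree for every width, including 0, so the guard is not needed:

```c
#include <stddef.h>

/* Loop form, as in the test case: step the pointer once per column. */
short *advance_loop(unsigned width, short *errorptr, int dir)
{
    unsigned col;
    for (col = width; col > 0; col--)
        errorptr += dir;
    return errorptr;
}

/* Closed form (hypothetical helper): the final value of the pointer
   computed in one step.  It also covers width == 0, so no branch is
   required.  Assumes width * dir fits in ptrdiff_t.  */
short *advance_closed(unsigned width, short *errorptr, int dir)
{
    return errorptr + (ptrdiff_t) width * dir;
}
```

Both functions are only meaningful when the resulting pointer stays inside
(or one past the end of) the same array, as usual for pointer arithmetic in C.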

This is related to PR 24574.


-- 
           Summary: Missed optimization caused by copy loop header (yes a
                    weird case)
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pinskia at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32226
