Hi,
I found this missed optimization when looking into an ICE on the pointer_plus branch.
For this code:

quantize_fs_dither (unsigned width, short *errorptr, int dir)
{
  short bpreverr;
  unsigned col;
  for (col = width; col > 0; col--)
    errorptr += dir;
  errorptr[0] = (short) bpreverr;
}

With -O2 -fomit-frame-pointer we get:

quantize_fs_dither:
        pushl   %ebx
        movl    8(%esp), %ebx
        movl    12(%esp), %eax
        testl   %ebx, %ebx
        je      .L2
        movl    16(%esp), %ecx
        addl    %ecx, %ecx
        leal    (%eax,%ecx), %edx
        leal    -1(%ebx), %eax
        imull   %ecx, %eax
        leal    (%edx,%eax), %eax
.L2:
        movw    %ax, (%eax)
        popl    %ebx

With -Os we get:

        movl    12(%esp), %edx
        movl    8(%esp), %eax
        addl    %edx, %edx
        imull   4(%esp), %edx
        movw    %ax, (%eax,%edx)
        ret

The -Os version has no branches and is faster, since we will most likely mispredict the branch in the -O2 case.

This is related to PR 24574.

-- 
           Summary: Missed optimization caused by copy loop header (yes a weird case)
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pinskia at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32226
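
To illustrate why the branch-free form is valid: the loop in the testcase only advances errorptr by dir elements, width times, so its result can be computed with a single multiply and add, with no special case for width == 0. The following is a minimal sketch of that equivalence, not taken from the GCC sources; the helper names advance_loop and advance_closed are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Loop form, as in the testcase: step the pointer `width` times.
   (Hypothetical helper name, for illustration only.) */
static short *advance_loop(unsigned width, short *errorptr, int dir)
{
    unsigned col;
    for (col = width; col > 0; col--)
        errorptr += dir;
    return errorptr;
}

/* Closed form: one multiply and one add, no branch on width.
   When width is 0 the offset is 0, so no guard is needed.
   Assumes width * dir fits in ptrdiff_t and the result stays
   inside the same array object. */
static short *advance_closed(unsigned width, short *errorptr, int dir)
{
    return errorptr + (ptrdiff_t)width * dir;
}
```

Both helpers return the same pointer for any width and dir that keep the pointer inside the array, which is why the zero-trip test introduced in the -O2 output is unnecessary here.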