When compiling with -O2

void foo(int N, int k, int di, double x[N])
{
    long i;
    for (i = 0; k; k--, i += di)
        x[i] = 0;
}

GCC produces for x86_64:

.L9:
        movq    $0, (%rcx)
        addq    %rax, %rcx
        subl    $1, %esi
        jne     .L9

and for ia64:

.L10:
        .mib
        stfd [r35] = f0
        add r35 = r35, r34
        br.cloop.sptk.few .L10
        ;;

However, when `long i' is changed to `int i', the generated code is worse:

.L3:
        movslq  %edi,%rax
        addl    %edx, %edi
        subl    $1, %esi
        movq    $0, (%rcx,%rax,8)
        jne     .L3

and for ia64:

.L3:
        .mii
        nop 0
        sxt4 r14 = r15
        add r15 = r15, r34
        ;;
        .mii
        shladd r14 = r14, 3, r35
        nop 0
        ;;
        nop 0
        .mmb
        stfd [r14] = f0
        nop 0
        br.cloop.sptk.few .L3
        ;;

Basically, the unadjusted loop index (i) is copied, with sign extension, into
a temporary, which is then multiplied by sizeof(x[0]), added to the base
address of the array, and used for addressing. I assume this can be improved
on the grounds that signed integer overflow is undefined.

--
           Summary: suboptimal address generation for int indices on 64-bit
                    targets
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amonakov at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32949
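
For illustration only, the address computation asked for in the report can be written by hand in C. This is a minimal sketch, not GCC's actual transformation: foo_int is the `int i' variant from the report, while foo_ptr, variants_agree and LEN are hypothetical names introduced here. foo_ptr drops the index and advances a pointer by di elements per iteration, which mirrors the shorter loop shown for `long i'. The two are only interchangeable when the int index never overflows, which is the undefined case the report relies on.

```c
#include <assert.h>
#include <string.h>

/* Variant from the report: 32-bit signed index, widened on every use. */
void foo_int(int N, int k, int di, double x[N])
{
    int i;
    for (i = 0; k; k--, i += di)
        x[i] = 0;
}

/* Hand-written sketch of the hoped-for code (hypothetical name):
   no index at all, the pointer itself advances by di elements. */
void foo_ptr(int N, int k, int di, double x[N])
{
    double *p = x;
    for (; k; k--, p += di)
        *p = 0;
}

enum { LEN = 16 };  /* assumed test buffer size */

/* Returns 1 if both variants leave identical array contents. */
int variants_agree(int k, int di)
{
    double a[LEN], b[LEN];
    int j;

    /* Keep every store inside the array and the final pointer value
       no further than one past its end. */
    assert(k >= 0 && di >= 0 && (long)k * di <= LEN);

    for (j = 0; j < LEN; j++)
        a[j] = b[j] = 1.0;

    foo_int(LEN, k, di, a);
    foo_ptr(LEN, k, di, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling variants_agree(4, 4) zeroes elements 0, 4, 8 and 12 in both buffers and compares the results, so it can serve as a quick check when experimenting with either form.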