https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526
Bug ID: 69526
Summary: ivopts candidate strangeness
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rdapp at linux dot vnet.ibm.com
Target Milestone: ---
Target: s390, x86

While inspecting loop generation code on s390, I saw ivopts choose an IV
candidate which does something peculiar. On x86 the same candidate is never
chosen because of different cost estimations, yet the behavior can be
reproduced on x86 by forcing GCC to use the same candidate as on s390. I did
this to confirm my suspicions and I'll show the x86 version here.

This source

    void v(unsigned long *in, unsigned long *out, unsigned int n)
    {
      int i;

      for (i = 0; i < n; i++)
        {
          out[i] = in[i];
        }
    }

results in the following assembly when ivopts candidate 7 is used:

    v:
            testl   %edx, %edx
            je      .L1
            leal    -1(%rdx), %eax
            leaq    8(,%rax,8), %rcx
            xorl    %eax, %eax
    .L3:
            movq    (%rdi,%rax), %rdx
            movq    %rdx, (%rsi,%rax)
            addq    $8, %rax
            cmpq    %rcx, %rax
            jne     .L3
    .L1:
            rep ret

Should the following be happening?

            leal    -1(%rdx), %eax
            leaq    8(,%rax,8), %rcx

i.e.

    %eax = n - 1
    %rcx = 8 * (%rax + 1)

The pattern can already be observed in ivopts' GIMPLE:

    <bb 4>:
    _15 = n_5(D) + 4294967295;
    _2 = (sizetype) _15;
    _1 = _2 + 1;
    _24 = _1 * 8;

Why do we need the - 1 and the subsequent + 1 when %eax is zeroed afterwards
anyway? Granted, this exact situation won't ever be observed on x86, as
another ivopts candidate is chosen there, but on s390 it amounts to three
instructions.

If I see it correctly, the n - 1 comes from estimating the number of loop
iterations, while the + 1 is then correctly added by cand_value_at() because
the loop counter is incremented before the exit test. Perhaps this is intended
behavior and there is nothing wrong with it?

Regards
 Robin
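
For reference, the bound computation from the GIMPLE above can be modelled in plain C. This is only a sketch of the arithmetic, not of what ivopts does internally: the function names are made up for illustration, and uint64_t stands in for sizetype on the assumption of a 64-bit target. It suggests one possible reason the - 1 / + 1 pair does not fold away on its own: the subtraction is done in 32 bits and the addition in 64 bits, so the two forms agree for every n except n == 0, which the loop header test already excludes.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the GIMPLE sequence from the report:
     _15 = n + 4294967295;   32-bit unsigned n - 1, wraps for n == 0
     _2  = (sizetype) _15;   zero-extension to 64 bits
     _1  = _2 + 1;
     _24 = _1 * 8;
   uint64_t is an assumed stand-in for sizetype. */
uint64_t bound_as_emitted(uint32_t n)
{
    uint32_t niter = n - 1u;          /* wraps to 0xFFFFFFFF when n == 0 */
    uint64_t wide = (uint64_t)niter;  /* zero-extend */
    return (wide + 1u) * 8u;
}

/* The simplified bound one might expect instead: 8 * n. */
uint64_t bound_simplified(uint32_t n)
{
    return (uint64_t)n * 8u;
}

/* Hypothetical self-check comparing the two forms. */
void check_bounds(void)
{
    /* For any n != 0 both forms give the same value. */
    assert(bound_as_emitted(1) == bound_simplified(1));
    assert(bound_as_emitted(5) == bound_simplified(5));
    assert(bound_as_emitted(UINT32_MAX) == bound_simplified(UINT32_MAX));

    /* For n == 0 they differ, because n - 1 wrapped in 32 bits
       before being widened. */
    assert(bound_as_emitted(0) == 0x800000000ull);
    assert(bound_simplified(0) == 0);
}
```

Calling check_bounds() from a test driver exercises both cases. Whether the compiler can or should use the n != 0 guard to simplify the expression is exactly the open question of this report.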