https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71275
Bug ID: 71275
Summary: [7 regression] Performance drop after r235660 on x86-64 in 32-bit mode.
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ysrumyan at gmail dot com
Target Milestone: ---

The regression can be seen with the attached test case. In the tail block of
the innermost loop a redundant fill (movl 44(%esp), %edx) was added:

    before r235660                     r235660
    .L3:
        addl  $1, %esi                     addl  $1, %esi
        addl  %eax, %ebx                   addl  %eax, %ebx
        movw  %bp, (%edi,%ecx)             movl  44(%esp), %edx
        movswl %si, %ebp                   movswl %si, %eax
        cmpl  (%esp), %ebp                 cmpl  %edi, %eax
        jl    .L6                          movw  %bp, (%edx,%ecx)
                                           jl    .L6

As a result, we see a slowdown of up to 14% on one important benchmark.

Before r235660 the address base stayed in a register (%edi) and the loop upper
bound was compared directly from its stack slot. After r235660 the upper bound
occupies %edi and the address base has to be reloaded from the stack on every
iteration. It is clearly not profitable to keep the loop upper bound in a
register instead of the address base.
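
For readers without access to the attachment, here is a minimal sketch of the kind of loop that has the same ingredients as the assembly above: a 16-bit induction variable that is sign-extended for the bound check, a 16-bit store through a base pointer, and an accumulator. This is NOT the attached test case. The function name fill_and_sum and its parameters are hypothetical, and the sketch is not claimed to reproduce the regression.

```c
/* Hypothetical illustration only; not the test case attached to the bug. */
int fill_and_sum(short *dst, int step, int upper)
{
    int sum = 0;
    short i;

    /* Assumes upper <= SHRT_MAX so the 16-bit counter cannot wrap.
       The comparison promotes i to int (compare movswl + cmpl). */
    for (i = 0; i < upper; i++) {
        dst[i] = i;      /* 16-bit store through the address base (compare movw) */
        sum += step;     /* accumulator update (compare addl %eax, %ebx) */
    }
    return sum;
}
```

To look at the generated code for such a loop, it could be compiled with, for example, gcc -O2 -m32 -S. The report does not state which options were used, so the optimization level here is an assumption.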