https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71361
--- Comment #5 from amker at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #4)
> So shall we defer this PR to GCC 8 then (i.e. [8 Regression] and Target
> Milestone: 8.0? Richard, are you ok with that?

With ivopt rewriting, we now generate the code below:

.L5:
        movl    (%esi,%ecx,4), %eax
        movl    40(%esp), %edx
        movl    44(%esp), %ebx
        imull   (%edi,%ecx,4), %ebx
        imull   %eax, %edx
        imull   44(%esp), %eax
        addl    %ebx, %edx
        movl    40(%esp), %ebx
        imull   (%edi,%ecx,4), %ebx
        addl    %ebx, %eax
        movl    36(%esp), %ebx
        movl    %edx, (%ebx,%ecx,4)
        movl    32(%esp), %ebx
        movl    %edx, (%edi,%ecx,4)
        movl    %eax, (%ebx,%ecx,4)
        movl    %eax, (%esi,%ecx,4)
        addl    28(%esp), %ecx
        cmpl    $511, %ecx
        jle     .L5

I think this is optimal.  Shall we consider this fixed?