https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82682

--- Comment #2 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
We now generate:
.L3:
        movzbl  (%edx), %esi
        addl    $2, %edx
        addl    $1, %ecx
        movzbl  -1(%edx), %eax
        movl    %esi, %ebx
        imull   $38470, %eax, %eax
        movzbl  %bl, %esi
        imull   $19595, %esi, %esi
        addl    %esi, %eax
        sarl    $16, %eax
        movb    %al, -1(%ecx)
        cmpl    %edi, %edx
        jne     .L3

while with an older GCC I get:
.L3:
        movzbl  (%ecx,%edx,2), %eax
        movzbl  1(%ecx,%edx,2), %edi
        imull   $19595, %eax, %eax
        imull   $38470, %edi, %edi
        addl    %edi, %eax
        sarl    $16, %eax
        movb    %al, (%esi,%edx)
        addl    $1, %edx
        cmpl    %edx, %ebx
        jne     .L3
.L1:

There is clearly a missed optimization on movzbl %bl, %esi: the value was
already zero-extended by the movzbl load into %esi before it was copied to %ebx.
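
For readers without the testcase at hand, a loop of roughly the following shape
would lead to code like the above. This is only a sketch reconstructed from the
assembly, not the testcase attached to the PR; the function name
weighted_pairs and its signature are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reconstruction from the assembly, not the PR testcase:
   combine each pair of input bytes using 16.16 fixed-point weights.  */
void weighted_pairs(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Each uint8_t load is zero-extended once (movzbl from memory);
           the upper 24 bits are already zero, so no second extension
           of the same value should be necessary.  */
        int a = src[2 * i];
        int b = src[2 * i + 1];

        /* Constants taken from the imull instructions in the dump.  */
        dst[i] = (uint8_t)((a * 19595 + b * 38470) >> 16);
    }
}
```

Compiling something like this with -m32 -O2 -S on both revisions and diffing
the .s files should show whether the extra movl/movzbl pair appears.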

I wonder how this can be triggered by the move cost changes; perhaps a
register allocation difference?
Jakub, is it easy for you to get .s files from r253958 and just before?
