https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756
--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Patrick Palka from comment #3)
> Perhaps related to this PR: On x86_64, the following basic wrapper around
> int128 addition
>
> __uint128_t f(__uint128_t x, __uint128_t y) { return x + y; }
>
> gets compiled (/w -O3, -O2 or -Os) to the seemingly suboptimal
>
> movq %rdi, %r9
> movq %rdx, %rax
> movq %rsi, %r8
> movq %rcx, %rdx
> addq %r9, %rax
> adcq %r8, %rdx
> ret
>
> Clang does:
>
> movq %rdi, %rax
> addq %rdx, %rax
> adcq %rcx, %rsi
> movq %rsi, %rdx
> retq
Removing addti3/ashlti3 from i386.md also helps with this.