On Sun, Jun 26, 2022 at 1:12 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch is a follow-up improvement to my recent patch for
> PR rtl-optimization/7061.  That patch added the test case
> gcc.target/i386/pr7061-2.c:
>
> float im(float _Complex a) { return __imag__ a; }
>
> For which GCC on x86_64 currently generates:
>
>         movq    %xmm0, %rax
>         shrq    $32, %rax
>         movd    %eax, %xmm0
>         ret
>
> but with this patch we now generate (the same as LLVM):
>
>         shufps  $85, %xmm0, %xmm0
>         ret
>
> This is achieved by adding a define_insn_and_split that allows a
> truncated lshiftrt:DI by 32 to be performed in either SSE or general
> registers.  If the register allocator prefers SSE, we split to
> shufps_v4si; otherwise we use a regular shrq.
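
For illustration only (this is not part of the patch): a small, portable C
sketch of why the two sequences above agree.  It assumes a little-endian
target and the x86_64 layout in which the real part sits in bits 0-31 and
the imaginary part in bits 32-63 of the low half of %xmm0.  The helper
names im_via_shift, shufps_model and im_via_shufps are hypothetical, and
shufps_model is a scalar model of the instruction with both operands equal,
not an intrinsic.

```c
#include <stdint.h>
#include <string.h>

/* Models the old sequence (movq / shrq $32 / movd) on the 64-bit image of
   the two floats.  Assumes little-endian: parts[0] is the low 32 bits. */
float im_via_shift(float re, float im)
{
    float parts[2] = { re, im };
    uint64_t bits;
    uint32_t hi;
    float out;

    memcpy(&bits, parts, sizeof bits);  /* movq %xmm0, %rax */
    hi = (uint32_t)(bits >> 32);        /* shrq $32, %rax   */
    memcpy(&out, &hi, sizeof out);      /* movd %eax, %xmm0 */
    return out;
}

/* Scalar model of "shufps $imm, %x, %x" (source and destination equal):
   lane i of the result is the source lane chosen by bits 2i..2i+1 of imm. */
void shufps_model(float v[4], unsigned imm)
{
    float r[4];

    for (int i = 0; i < 4; i++)
        r[i] = v[(imm >> (2 * i)) & 3];
    memcpy(v, r, sizeof r);
}

/* Models the new sequence: 85 is 0b01010101, so every lane selects
   lane 1, which holds the imaginary part. */
float im_via_shufps(float re, float im)
{
    float v[4] = { re, im, 0.0f, 0.0f };

    shufps_model(v, 85);
    return v[0];
}
```

Both helpers return the imaginary part unchanged, which is the sense in
which a truncated 64-bit right shift by 32 and a lane-1 broadcast are
interchangeable for this test case.
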
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, with no new failures.  Ok for mainline?
>
>
> 2022-06-26  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR rtl-optimization/7061
>         * config/i386/i386.md (*highpartdisi2): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog
>         PR rtl-optimization/7061
>         * gcc.target/i386/pr7061-2.c: Update to look for shufps.

OK.

Thanks,
Uros.

>
>
> Roger
> --
>
