https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88873

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sa...@gcc.gnu.org>:

https://gcc.gnu.org/g:12b78b0b42d53019eb2c500d386094194e90ad16

commit r14-2406-g12b78b0b42d53019eb2c500d386094194e90ad16
Author: Roger Sayle <ro...@nextmovesoftware.com>
Date:   Mon Jul 10 09:06:52 2023 +0100

    i386: Add new insvti_lowpart_1 and insvdi_lowpart_1 patterns.

    This patch implements another of Uros' suggestions: to investigate an
    insvti_lowpart_1 pattern to improve TImode parameter passing on x86_64.
    In PR 88873, the RTL that the middle-end expands for passing V2DF in
    TImode is subtly different from what it expands for V2DI in TImode,
    sufficiently so that my explanations for why insvti_lowpart_1 isn't
    required don't apply in this case.

    This patch adds an insvti_lowpart_1 pattern, complementing the existing
    insvti_highpart_1 pattern, and also a 32-bit variant, insvdi_lowpart_1.
    Because the middle-end represents 128-bit constants using CONST_WIDE_INT
    and 64-bit constants using CONST_INT, it's easiest to treat these as
    different patterns, rather than attempt <dwi> parameterization.

    This patch also includes a peephole2 (actually a pair) to transform
    xchg instructions into mov instructions, when one of the destinations
    is unused.  This optimization is required to produce the optimal code
    sequences below.

    For the 64-bit case:

    __int128 foo(__int128 x, unsigned long long y)
    {
      __int128 m = ~((__int128)~0ull);
      __int128 t = x & m;
      __int128 r = t | y;
      return r;
    }

    Before:
            xchgq   %rdi, %rsi
            movq    %rdx, %rax
            xorl    %esi, %esi
            xorl    %edx, %edx
            orq     %rsi, %rax
            orq     %rdi, %rdx
            ret

    After:
            movq    %rdx, %rax
            movq    %rsi, %rdx
            ret

    For the 32-bit case:

    long long bar(long long x, int y)
    {
      long long mask = ~0ull << 32;
      long long t = x & mask;
      long long r = t | (unsigned int)y;
      return r;
    }

    Before:
            pushl   %ebx
            movl    12(%esp), %edx
            xorl    %ebx, %ebx
            xorl    %eax, %eax
            movl    16(%esp), %ecx
            orl     %ebx, %edx
            popl    %ebx
            orl     %ecx, %eax
            ret

    After:
            movl    12(%esp), %eax
            movl    8(%esp), %edx
            ret

    2023-07-10  Roger Sayle  <ro...@nextmovesoftware.com>

    gcc/ChangeLog
            * config/i386/i386.md (peephole2): Transform xchg insn with a
            REG_UNUSED note to a (simple) move.
            (*insvti_lowpart_1): New define_insn_and_split.
            (*insvdi_lowpart_1): Likewise.

    gcc/testsuite/ChangeLog
            * gcc.target/i386/insvdi_lowpart-1.c: New test case.
            * gcc.target/i386/insvti_lowpart-1.c: Likewise.