https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118076

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
           Keywords|                            |needs-bisection
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |rguenth at gcc dot gnu.org
            Summary|extra memcpy for passing    |[12/13/14/15 Regression]
                   |large arguments in some     |extra memcpy for passing
                   |cases                       |large arguments in some
                   |                            |cases, introduces STLF
                   |                            |fails
             Target|                            |x86_64-*-*
   Target Milestone|---                         |12.5

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's also very bad for performance, as the two DImode stores cannot be
store-forwarded to the wider TImode load (an STLF failure).

The issue is that we fail to re-use the dead-after-call stack space for the
argument slot (or rather the other way around since the call needs appropriate
placement of the aggregate on the stack).
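
For illustration, the shape of code under discussion might look like the
sketch below. This is an assumption, not the reporter's test case: the
struct layout, the name `struct big` and the function `caller` are
hypothetical, and `extern_func` is defined locally (noinline) only so the
snippet links on its own. In the real case it would live in another
translation unit.

```c
#include <stdint.h>

/* Hypothetical 32-byte aggregate: too large to be passed in registers,
   so the x86-64 SysV ABI passes it by value in memory on the stack. */
struct big { uint64_t a, b, c, d; };

/* Stand-in for the external callee; defined here, and kept out of line,
   only so the sketch is self-contained. */
__attribute__((noinline))
uint64_t extern_func(struct big s)
{
    return s.a + s.b + s.c + s.d;
}

/* Four register arguments are assembled into the local 's', which is
   dead after the call.  Ideally the values would be written straight
   into the outgoing argument slot instead of into 's' first. */
uint64_t caller(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
    struct big s = { a, b, c, d };
    return extern_func(s);
}
```

Compiling something like this with `gcc -O2 -S` under different GCC
releases and comparing the code before the call may show whether the
extra copy appears; the exact output depends on the compiler version and
target.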

In some cases RTL opts are able to elide 's' and directly copy the registers
to the argument slot, but I guess the inline expanded block copy via XMM
confuses RTL opts here.

This is probably a regression (on x86-64) dating from when we started to use
XMM registers to populate aggregate argument slots.

GCC 11 used

        movq    %rdi, (%rsp)
        movq    %rsi, 8(%rsp)
        movq    %rdx, 16(%rsp)
        movq    %rcx, 24(%rsp)
        pushq   24(%rsp)
        .cfi_def_cfa_offset 56
        pushq   24(%rsp)
        .cfi_def_cfa_offset 64
        pushq   24(%rsp)
        .cfi_def_cfa_offset 72
        pushq   24(%rsp)
        .cfi_def_cfa_offset 80
        call    extern_func

and with GCC 12 we started using the bad sequence.
